
Research Collection

Doctoral Thesis

Development of Quantum Applications

Author(s): Heim, Bettina

Publication Date: 2020

Permanent Link: https://doi.org/10.3929/ethz-b-000468201

Rights / License: In Copyright - Non-Commercial Use Permitted


DISS. ETH NO. 27053

DEVELOPMENT OF QUANTUM APPLICATIONS

A thesis submitted to attain the degree of
DOCTOR OF SCIENCES of ETH ZURICH
(Dr. Sc. ETH Zurich)

presented by
BETTINA HEIM
M. Sc. ETH Zurich
born on 2 July 1989
citizen of Switzerland

accepted on the recommendation of
Prof. Dr. M. Troyer, examiner
Prof. Dr. R. Renner, co-examiner
Prof. Dr. H. Katzgraber, co-examiner

2020

To my husband and family.

ABSTRACT

The aim of this thesis is to identify practical applications where quantum computers could have an advantage over what is achievable with conventional means of computing, and what advances are needed in order to actualize that potential. We investigate the possibilities both on annealing devices (analogue quantum computers) as well as on gate-based devices (digital quantum computers). Quantum annealing devices in particular have grown in size and capabilities over the last decade. While they are natural candidates for providing improved solvers for NP-complete optimization problems, the number of qubits, their accuracies and the available couplings are still limited, and their aptitude has yet to be confirmed. We use Monte Carlo methods and conceive an adaptive annealing schedule to assess the options to leverage them for the construction of satisfiability filters. We furthermore investigate their prospects as heuristic solvers for the traveling salesman problem, and what future enhancements in terms of geometries and couplings would be beneficial for that. Based on our simulations there is no reason to expect any benefits from leveraging analogue quantum computers over state-of-the-art classical methods. However, we see an implementation of annealing schemes on future digital devices as a promising approach that does not suffer from the same issues that impede performance on analogue devices. To that effect, we construct and implement an efficient quantization of the Metropolis-Hastings algorithm. As opposed to following the common way of quantization à la Szegedy that is usually defined with respect to an oracle, we reformulate the walk to closely mimic the classical algorithm and thus circumvent having to rely on costly quantum arithmetic. Our proposed realization can thereby lead to substantial savings. While theoretical arguments promise a quadratic speedup in the asymptotic limit, we numerically confirm that a polynomial speedup in terms of minimal total time to solution can be achieved for pragmatic use. We explore the prospects of using quantum walks in a heuristic setting and estimate the gate times that would be required to outperform a classical supercomputer. Finally, we elaborate on the role of programming languages, and how software tools can accelerate the advancement of the field. We discuss unique aspects of quantum programming and the purpose of Q# in particular, and conclude by highlighting what developments are needed for quantum computing to live up to its potential.

ZUSAMMENFASSUNG

Das Ziel dieser Dissertation ist es, praktische Anwendungen zu identifizieren, bei denen Quantencomputer einen Vorteil gegenüber herkömmlichen Methoden erbringen, sowie welche Fortschritte nötig sind, um dieses Potential umzusetzen. Wir untersuchen die Möglichkeiten dafür im Bezug auf analoge Quantencomputer sowie digitale. Insbesondere analoge Hardware ist im Laufe des letzten Jahrzehnts grösser und fähiger geworden und ist ein naheliegender Kandidat zur Lösung von NP-kompletten Problemen. Jedoch sind die Anzahl von Qubits, deren Genauigkeit und die verfügbaren Kopplungen limitiert, sodass ihre Eignung erst noch bestätigt werden muss. Wir verwenden Monte Carlo Methoden und entwickeln einen adaptiven Prozess, um ihre Nützlichkeit für die Konstruktion von SAT-Filtern abzuschätzen. Ebenso evaluieren wir, ob sie als besonders gute heuristische Löser für das Rundreiseproblem fungieren können, und welche zukünftigen Entwicklungen bezüglich Geometrie und Kopplungen der Maschinen dafür vorteilhaft sind. Basierend auf unseren Simulationen schliessen wir, dass es keinen Grund gibt anzunehmen, dass solche analoge Hardware klassischen Methoden überlegen ist. Eine Implementierung dieses Prozesses auf digitaler Hardware hingegen könnte vielversprechend sein. Wir entwickeln daher eine effiziente Quantisierung des Metropolis-Hastings-Algorithmus, die mittels digitalen Quantencomputern eingesetzt werden kann. Im Gegensatz zu einer Quantisierung gemäss Szegedy, welche im Üblichen ein Orakel benutzt, erreichen wir es mit unserer Formulierung, kostspielige Quantenarithmetik zu vermeiden. Die präsentierte Umsetzung erweist sich damit als wesentlich kostensparender. Theoretische Argumente basierend auf dem asymptotischen Limit versprechen, dass ein quadratischer Geschwindigkeitsvorteil mittels Quantisierung erreicht werden kann. Wir bestätigen mit einer numerischen Studie eine polynomiale Überlegenheit für die heuristische Anwendung und geben die nötigen Geschwindigkeiten für die Ausführung von Instruktionen an, um einen klassischen Supercomputer zu übertreffen. Letztendlich diskutieren wir die Rolle von Programmiersprachen und wie geeignete Software die Ausarbeitung von Quantenapplikationen beschleunigen kann. Wir erklären die einzigartigen Aspekte beim Programmieren von Quantencomputern und den Zweck der Sprache Q#. Abschliessend heben wir die Entwicklungen hervor, die stattfinden müssen, damit Quantencomputing seinem Potential gerecht wird.

ACKNOWLEDGEMENTS

I would like to thank my husband Stefan Baumann, without whose support this work would not have been possible. I would like to thank my parents whose unwavering faith in me has always pushed me to be a better person. I would like to thank Matthias Troyer who not only has been a fantastic mentor but is also a true inspiration.

Furthermore, I would like to thank the collaborators that contributed to the work discussed in this thesis. First and foremost, I would like to thank the co-authors of the papers that served as the basis for this thesis (in alphabetical order): Alan Geller, Andres Paz, Christopher Granade, Daniel Herr, Dave Wecker, David Poulin, Ethan Brown, Guglielmo Mazzola, Jessica Lemieux, John Azariah, Krysta Svore, Mariia Mykhailova, Mario Könz, Marlon Azinovi, Martin Roetteler, Mathias Soeken, Matthias Troyer, Sarah Marshall, and Vadym Kliuchnikov. Thanks go to Ilia Zintchenko for providing the translation of a k-SAT problem to an Ising glass problem, and to Alex Kosenkov, Ilia Zintchenko, and Ethan Brown for providing and supporting the whiplash framework for easier management of measurement data. I thank Andreas Elsener, Donjan Rodic, Guang Hao Low, Giuseppe Carleo, Guillaume Duclos-Cianci, Jeongwan Haah, José Luis Hablützel Aceijas, Matt Hastings, and Thomas Häner for stimulating discussions. I thank Ali Javadi-Abhari, Julien Ross, Margaret Martonosi, and Peter Selinger for sharing their insights and input on quantum programming languages. Last but not least, I would like to express my gratitude for the work of the people who set up and maintain the spinglass server [1] that was used for the calculation of energies.

CONTENTS

List of Figures
List of Tables
List of Programs
1 introduction
  1.1 Quantum states and qubits
  1.2 Analogue quantum computing
  1.3 Digital quantum computing
2 traveling salesman problem
  2.1 Mapping the TSP to an annealing problem
    2.1.1 Transition probabilities in quantum annealing
    2.1.2 Encoding as permutation matrix
    2.1.3 An improved encoding
  2.2 Numerical results
  2.3 Digital quantum annealing
3 sat filters
  3.1 Set membership problem and filter construction
  3.2 Quality metrics for filters
    3.2.1 False positive rate
    3.2.2 Filter efficiency
  3.3 Obtaining independent solutions
    3.3.1 Using annealing
    3.3.2 Using SAT-solvers
  3.4 Numerical results
    3.4.1 Diversity of solutions
    3.4.2 Quality of solutions
    3.4.3 Scaling with problem size
  3.5 Possible improvements
4 annealing
  4.1 Adaptive schedule
  4.2 Calculation of observables
  4.3 Numerical results
5 quantum walks
  5.1 Szegedy's quantum walk
    5.1.1 Eigenvectors and eigenvalues
    5.1.2 Adiabatic state preparation
  5.2 Quantization for Metropolis-Hastings algorithm
    5.2.1 Construction of Szegedy's walk oracle
    5.2.2 Alternative walk
  5.3 Optimization heuristics
    5.3.1 Algorithm based on the quantum Zeno effect
    5.3.2 Algorithm based on a unitary walk
  5.4 Numerical results
  5.5 Irreversible parallel walk
6 quantum programming
  6.1 Domain-specific concepts
  6.2 Quantum
  6.3 Quantum model of computation
  6.4 Quantum software frameworks today
    6.4.1 Use cases
    6.4.2 Tools
    6.4.3 Ecosystems
7 quantum programming languages
  7.1 Purpose of programming languages
  7.2 Languages and ecosystems overview
    7.2.1 Q#
    7.2.2 OpenQASM and Qiskit
    7.2.3 Cirq
    7.2.4 Quipper
    7.2.5 Scaffold
8 domain-specific language q#
  8.1 Design principles
  8.2 Program structure and execution
  8.3 Global constructs
    8.3.1 Type declarations
    8.3.2 Callable declarations
    8.3.3 Specialization declarations
  8.4 Statements
    8.4.1 Qubit management
    8.4.2 Variable declarations and updates
    8.4.3 Returns and termination
    8.4.4 Conditional branching
    8.4.5 Loops and iterations
    8.4.6 Call statements
    8.4.7 Other quantum-specific patterns
  8.5 Expressions
    8.5.1 Operators, modifiers, and combinators
    8.5.2 Conditional expressions
    8.5.3 Partial applications
    8.5.4 Copy-and-update expressions
    8.5.5 Item access for user defined types
    8.5.6 Contextual and omitted expressions
  8.6 Type system
    8.6.1 Singleton tuple equivalence
    8.6.2 Immutability
    8.6.3 Quantum-specific data types
    8.6.4 Callables
    8.6.5 Operation characteristics
    8.6.6 Type parameterizations
    8.6.7 Subtyping and variance
  8.7 Debugging and testing
9 into the future

bibliography

LIST OF FIGURES

Figure 1.1 Bloch sphere
Figure 2.1 Required spin flips to resolve tour crossings
Figure 2.2 Number of connections for subtours found by annealing
Figure 2.3 Gate-based implementation of energy penalties for subtours
Figure 3.1 Number of unique SAT-solutions obtained by annealing
Figure 3.2 FPR of the constructed SAT-filters
Figure 3.3 Efficiency of the constructed SAT-filters
Figure 3.4 Hamming distance between randomly selected solutions
Figure 3.5 Probability of finding similar SAT-solutions
Figure 4.1 Optimized annealing schedule
Figure 4.2 Median residual energy for various annealing schedules
Figure 5.1 Quantum and classical minimum total time to solution for a 1D Ising chain
Figure 5.2 Quantum and classical minimum total time to solution for a random sparse Ising model
Figure 5.3 Residual energy of an Ising model on a complete graph using quantum walks
Figure 8.1 Working with clean versus borrowed qubits

LIST OF TABLES

Table 2.1 Comparison of the number of spin flips required to resolve crossings
Table 3.1 Average Hamming distance between two 4-SAT solutions found by different solvers
Table 5.1 Upper bound on the complexity of each component of the walk operator
Table 5.2 Logical gate times required to outperform a supercomputer
Table 7.1 Overview of features of the discussed quantum programming languages
Table 8.1 Overview over available operators in Q#
Table 8.2 Expression modifiers and combinators in Q#

LIST OF PROGRAMS

Program 7.1 Q# code implementing teleportation of an arbitrary quantum state
Program 7.2 OpenQASM code implementing teleportation of an arbitrary quantum state
Program 7.3 Python code used to execute OpenQASM programs using Qiskit
Program 7.4 Qiskit code to generate the OpenQASM circuit for teleportation
Program 7.5 Cirq code implementing teleportation of an arbitrary quantum state
Program 7.6 Quipper code implementing teleportation of an arbitrary quantum state
Program 7.7 Haskell code that instantiates the target machine for Program 7.6
Program 7.8 Scaffold code implementing teleportation of an arbitrary quantum state
Program 8.1 Q# command line application executing a quantum Fourier transformation

1 INTRODUCTION

Quantum computing exploits quantum phenomena such as superposition and entanglement to realize a form of parallelism that is not available to traditional computing. It offers the potential of significant computational speed-ups in a range of application domains, including machine learning. Having been a long-standing topic of discussion among scientists, quantum computing has more and more become the focus of public interest as well. Its potential to be more powerful than any conventional means of computing and claims that it can revolutionize the way hard optimization problems are solved [2] have piqued the interest of industry. While there is great hope for the merits that quantum computing could bring, there is also an understanding that any practical application requires algorithmic advances and specialized tools to conquer the challenges unique to working with quantum systems. To build a quantum computer for commercially viable applications we need innovations both on the hardware and on the software front. Our effort must include not only a scalable hardware platform, but equally important a sustainable software stack that is capable of supporting its design, control, and the innovations that will eventually lead to a clear advantage of quantum over classical. The idea of a software stack is to build layers of abstraction that encapsulate the required knowledge about different aspects relevant for execution, thus making it available for other layers to build on without having to be aware thereof.

In this thesis we dive into what quantum technology can offer, and what developments are needed to enable large scale quantum computing. We start by looking into some potential applications for quantum annealing devices before looking at the corresponding options and implementations on digital quantum computers. Finally, we muse about how to build a scalable software stack for quantum computing, before detailing the role and purpose of quantum programming languages, and concluding with an outlook into the future.


1.1 quantum states and qubits

Fundamentally, a quantum computer is a machine that stores and processes quantum information. While classical computing models a binary system and employs Boolean logic for computation, quantum information is stored in quantum states of matter, and computation is effected by quantum interference. Just as individual binary values are stored in bits, quantum states are stored in quantum bits, or qubits for short. We will start with a brief introduction of the basic units of quantum information and a description of quantum states.

An idealized quantum system consisting of n logical qubits used for computation is described by a complex state vector in C^(2^n). For instance, the state of a single-qubit system can be written as |ψ⟩ = α|0⟩ + β|1⟩ for some complex numbers α and β, where |0⟩ = (1 0)^T and |1⟩ = (0 1)^T are the vectors forming the computational basis for a single qubit. The components α and β are called amplitudes, and a state of the above form is called a superposition of the basis states |0⟩ and |1⟩ if α ≠ 0 ≠ β. The amplitudes of a quantum state cannot be observed directly. Instead, measurements are required to extract any information about a quantum state. Measurements are probabilistic and return only n bits of information for a quantum system with 2^n basis states; measurement selects one output state at random. Evolving under carefully constructed transformations enables quantum interference to modify the state's amplitudes and in turn the probability distribution from which one draws the measurement output. Born's rule provides that the absolute value of the amplitude squared maps to probability space; when measuring the quantum state |ψ⟩ = α|0⟩ + β|1⟩ in the computational basis, for example, either state |0⟩ or |1⟩ is observed with probability |α|^2 or |β|^2, respectively. One may notice that a global phase e^(iδ) does not change these probabilities. In fact, it does not impact any observable properties of the state. This ambiguity is a result of our simplification to represent the state of the quantum system as a vector. More generally, the state of a two-level quantum system would indeed need to be represented not as a vector, but instead as a full density matrix. However, quantum computing is merely concerned with pure states, allowing for the simplification of representing the relevant states as a vector rather than a matrix.

Figure 1.1 Visualization of a single-qubit state via the Bloch sphere, parametrized by the angles θ and φ, with ẑ = |0⟩ and −ẑ = |1⟩.

For a two-level quantum system (a qubit), the isomorphism between the Lie algebra of the group of unitary and Hermitian matrices SU(2) and the Lie algebra of the group of three-dimensional rotations SO(3) allows for a nice geometric representation. A single-qubit state is hence often visualized as a point on the surface of the so-called Bloch sphere. Within this geometric representation, a state is defined by spherical coordinates θ and φ. The angle θ describes the colatitude with respect to the Z-axis and φ the longitude with respect to the Y-axis: |ψ⟩ = cos(θ/2)|0⟩ + e^(iφ) sin(θ/2)|1⟩, with 0 ≤ θ ≤ π and 0 ≤ φ < 2π. The pure states that are of interest for quantum computing lie on the surface of the Bloch sphere, while points within the sphere correspond to mixed states. The visual interpretation unfortunately breaks down for the multi-qubit case.

The states of a multi-qubit system consisting of unentangled subsystems can be described in terms of the tensor products of valid states for the individual subsystems. For instance, a two-qubit state can be constructed as |01⟩ = |0⟩ ⊗ |1⟩ = (0 1 0 0)^T. The set of valid states within the entire computational space is then found by the closure under linear combinations, subject to the consistency requirement that any valid quantum state must be normalized such that |⟨ψ|ψ⟩|^2 = 1. The two-qubit state (|00⟩ + |11⟩)/√2 is therefore a valid state, despite the fact that there does not exist any pair of single-qubit states that we can assign to each individual qubit. That certain states are impossible to express as a tensor product of the states of individual subsystems implies a particular form of correlation between these subsystems that we refer to as entanglement.
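To make these notions concrete, the following minimal sketch builds the states discussed above with numpy (assumed to be available; all variable names are illustrative and not taken from the thesis): a single-qubit superposition with its Born-rule probabilities, the product state |01⟩, and the Bell state, whose Schmidt rank shows that it cannot be written as a tensor product of single-qubit states.

```python
# Minimal state-vector sketch, assuming numpy; names are illustrative.
import numpy as np

ket0 = np.array([1, 0], dtype=complex)          # |0>
ket1 = np.array([0, 1], dtype=complex)          # |1>

# A single-qubit superposition alpha|0> + beta|1>, normalized.
alpha, beta = 1 / np.sqrt(2), 1j / np.sqrt(2)
psi = alpha * ket0 + beta * ket1

# Born's rule: measurement probabilities are the squared amplitude moduli.
probs = np.abs(psi) ** 2                        # -> [0.5, 0.5]

# Unentangled multi-qubit states are tensor products of single-qubit states.
ket01 = np.kron(ket0, ket1)                     # |01> = (0 1 0 0)^T

# The Bell state (|00> + |11>)/sqrt(2) admits no such factorization:
bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

# A Schmidt rank larger than 1 (rank of the reshaped 2x2 amplitude matrix)
# signals entanglement between the two qubits.
schmidt_rank = np.linalg.matrix_rank(bell.reshape(2, 2))
print(probs, schmidt_rank)                      # [0.5 0.5] 2
```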

1.2 analogue quantum computing

Some of the earliest physical realizations aspiring to leverage the power of quantum mechanics for computing were quantum annealing devices. Such devices compute via (quasi-)adiabatic evolution; the adiabatic theorem [3, 4] states that a system that is in an eigenstate of an initial Hamiltonian will remain in the instantaneous eigenstate when the Hamiltonian is changed slowly enough. In principle, adiabatic quantum computing is polynomially equivalent to gate-based quantum computing [5]. In reality, current implementations in commercial quantum annealing devices [6–8] provide only a limited set of couplings such that only certain kinds of problems can be solved. In particular, quadratic unconstrained binary optimization problems (QUBOs) are prime candidates for problems where annealing devices can potentially bring some benefits. QUBO problems can be mapped trivially to an Ising Hamiltonian with local fields as given in Eq. (1.1). Finding the ground state of such Ising spin glass problems in three or more dimensions is an NP-complete problem [9], and many well-known problems from the field of combinatorial optimization can be mapped to such a description [10, 11]. To accommodate hardware restrictions, a QUBO usually needs to be mapped to a more restricted format such as, e.g., the Chimera graph implemented by D-Wave devices [6]. To perform computations, the idea of quantum annealing [12–16] is to start in the easy to prepare ground state of an interacting Hamiltonian and then gradually tune the interaction strength to transition into the ground state of a more complex target Hamiltonian. Within the context of annealing, physical qubits are often referred to as spins, owed in part to the fact that currently realizable target Hamiltonians commonly have to take Ising spin glass form

H_P = \sum_{\langle s,s' \rangle} J_{ss'} \sigma^z_s \sigma^z_{s'} - \sum_s h_s \sigma^z_s,    (1.1)

where σ^z_j is the Pauli-z matrix acting on spin j, and the sum goes over all coupled pairs of spins. The gradual change of the Hamiltonian is described by two monotonic functions A and B, with A(0) = 1, B(0) = 0 and A(T) = 0, B(T) = 1, such that the Hamiltonian at a time t is given by

H(t) = A(t) H_D + B(t) H_P    for t ∈ [0, T].

A common choice is A(t) = 1 − t/T and B(t) = t/T. The commutation relation between the interaction or driver Hamiltonian and the problem Hamiltonian determines the dynamics during the evolution.

A usual choice for H_D is given by the Hamiltonian of a transverse magnetic field:

H_D = -\Gamma \sum_i \sigma^x_i.

In analog devices, thermal as well as quantum fluctuations can excite the system, making quantum annealing an approximate solver that will generally find states close to but not necessarily the exact ground state; crossings or near-crossings in the energy spectrum allow for transitions into excited states (see also Ref. [17]). The size of the spectral gap – if any – determines how slowly the system has to be evolved [18]. Stoquastic Hamiltonians such as the Ising Hamiltonian in a transverse field can be efficiently simulated via quantum Monte Carlo methods. Even though this is true for any stoquastic Hamiltonian, in general it is a highly nontrivial endeavor to identify a suitable set of basis states that would allow one to do so. Furthermore, there are recent efforts to realize nonstoquastic Hamiltonians [19]. To simulate quantum annealing on a classical computer, one option is to use path integral Monte Carlo, where the d-dimensional quantum system is mapped onto a classical system in d+1 dimensions via a Suzuki-Trotter decomposition, introducing an additional "imaginary time" dimension [20].
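As a small illustration of the interplay between H_D, H_P, and the spectral gap, the following sketch (numpy assumed; the couplings, fields, and schedule are arbitrary toy values, not instances studied in the thesis) constructs H(t) = A(t)H_D + B(t)H_P for a three-spin Ising problem with a transverse-field driver and tracks the gap between the two lowest eigenvalues along a linear schedule.

```python
# Sketch of the annealing Hamiltonian for a tiny Ising problem, assuming numpy.
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def op_on(site_op, site, n):
    """Embed a single-site operator at position `site` of an n-spin system."""
    ops = [I2] * n
    ops[site] = site_op
    out = ops[0]
    for o in ops[1:]:
        out = np.kron(out, o)
    return out

n = 3
J = {(0, 1): 1.0, (1, 2): -1.0}   # illustrative couplings J_ss'
h = [0.5, 0.0, -0.5]              # illustrative local fields h_s
gamma = 1.0

# Problem Hamiltonian of Eq. (1.1) and transverse-field driver H_D.
H_P = sum(Jv * op_on(sz, s, n) @ op_on(sz, sp, n) for (s, sp), Jv in J.items())
H_P -= sum(h[s] * op_on(sz, s, n) for s in range(n))
H_D = -gamma * sum(op_on(sx, s, n) for s in range(n))

# Gap between the two lowest eigenvalues along the linear schedule.
gaps = []
for s in np.linspace(0.0, 1.0, 51):              # s = t/T
    H = (1 - s) * H_D + s * H_P
    evals = np.linalg.eigvalsh(H)
    gaps.append(evals[1] - evals[0])
print("minimal spectral gap:", min(gaps))
```

The location and size of the minimal gap found this way is what limits how fast the schedule may be traversed adiabatically.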

Quantum annealing is in spirit closely related to its classical counterpart, thermal annealing. Thermal annealing consists of keeping a system in thermal equilibrium with a heat bath while the temperature is slowly decreased to almost zero. If the annealing process is slow enough and the final temperature low enough, the system is forced into its ground state. The idea to simulate this process to solve optimization problems using Markov chain Monte Carlo methods was introduced by Kirkpatrick, Gelatt and Vecchi in 1983 [21, 22]. Whether thermal excitations or quantum fluctuations (tunneling) are more effective at driving the system out of local minima while exploring the problem space depends on the energy landscape or cost function of the problem [23–25].
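The following bare-bones sketch of simulated (thermal) annealing illustrates the Metropolis-based procedure referred to above; it assumes numpy, uses a random fully connected Ising instance rather than any problem from this thesis, and its schedule and parameters are illustrative only.

```python
# Simulated annealing sketch for a random Ising cost function, assuming numpy.
import numpy as np

rng = np.random.default_rng(0)
n = 32
J = rng.choice([-1.0, 1.0], size=(n, n))        # random couplings
J = np.triu(J, 1); J = J + J.T                  # symmetric, zero diagonal

def energy(s):
    return -0.5 * s @ J @ s                     # H = -sum_{i<j} J_ij s_i s_j

s = rng.choice([-1, 1], size=n)                 # random initial spin configuration
E = energy(s)
for T in np.geomspace(3.0, 0.01, 2000):         # slowly decreasing temperature
    i = rng.integers(n)
    dE = 2.0 * s[i] * (J[i] @ s)                # energy change of flipping spin i
    if dE <= 0 or rng.random() < np.exp(-dE / T):
        s[i] = -s[i]                            # accept the Metropolis update
        E += dE
print("final energy:", E)
```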

1.3 digital quantum computing

Gradually tuning a set of couplings and field strengths to continuously evolve the state of a computation stands in contrast to how digital devices perform computations. Digital quantum computers – much like classical computers – operate by implementing a discrete set of instructions.

These are combined to approximate arbitrary transformations of the quantum state during program execution. At the lowest level, quantum algorithms are built from a handful of primitive quantum instructions, just like classical algorithms at the lowest level are made of primitives like AND, NOT, and OR gates. Within the context of this thesis, we will call an instruction that has no effect on the program state other than manipulating the quantum state a quantum gate. A sequence of gates operating on a set of qubits is traditionally called a quantum circuit. In this section, we will briefly introduce the most basic quantum primitives, before elaborating how this historic point of view relates to expressing quantum programs at a similar abstraction level as we are used to in classical computing. We will then cover this topic in much more depth in Chapters 6 to 8.

From the perspective of quantum algorithms research, how we think and reason about quantum algorithms is rooted in the mathematical concepts that are at the foundation of quantum computing. Within this model, a computation corresponds to a sequence of mathematical transformations applied to the state vector. These transformations are described by 2^n × 2^n complex unitary matrices or measurements. A unitary matrix is a matrix whose inverse is given by its conjugate transpose, also referred to as its adjoint. For example, a single-qubit state can be transformed from |0⟩ to |1⟩ and back via the X operator, represented by the unitary matrix

X := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.    (1.2)

The operator X can be seen as a coherent version of a classical NOT gate, mapping an arbitrary superposition α|0⟩ + β|1⟩ to α|1⟩ + β|0⟩. That transformation applied to a particular qubit within a multi-qubit system is obtained by taking the tensor product of X with the identity operator 1 on the remaining qubits. The effect of applying X to the second qubit within a three-qubit system, for example, is represented by the unitary operator 1 ⊗ X ⊗ 1. We adopt the convention of specifying transformations by their actions on the transformed subsystem and will leave the extension to the entire system implicit. The X operator is one of three Pauli operators that together with the identity form a basis for all unitary transformations on a single qubit. The matrix representations for the other two operators Y and Z are

Y := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad Z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.    (1.3)

Any single-qubit unitary transformation can be expressed as a linear combination of these operators. In order to transition from this mathematical perspective to a formulation that is more aligned with how computer programs are commonly expressed, it is convenient to introduce the single-qubit operators H and T given by

H := \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad T := \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}.    (1.4)

The Hadamard operator H is the transformation between the eigenbases of X and Z, while T rotates a qubit around the Z-axis. Its square S := T^2 is the transformation between the X and Y eigenbases. From the given matrix representation we can see that the Hadamard operator H maps |0⟩ → (|0⟩ + |1⟩)/√2 and |1⟩ → (|0⟩ − |1⟩)/√2, whereas up to a global phase, T rotates a qubit around the Z-axis by an angle π/8. In order to obtain a finite set of discrete instructions that is universal for arbitrary multi-qubit unitary transformations, only one additional instruction is needed. A common choice is the CNOT operator acting on two qubits. It maps |x, y⟩ → |x, x ⊕ y⟩. This operator is represented by the unitary matrix

CNOT := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix},    (1.5)

and is often referred to as the controlled-NOT, or CX, gate. Conditioned on the state of the first qubit, the transformation X is applied to the second qubit. Such a conditional transformation can in a sense be seen as the "quantum version" of a conditional branching, where both branches can execute simultaneously if the condition given by the state of the first qubit being |1⟩ is a superposition of being satisfied and not satisfied [26]. The concept of this coherent branching can be extended to performing arbitrary multi-qubit transformations conditioned on the state of multiple qubits.

Just like any unitary transformation, such controlled transformations can be broken down into a sequence of simpler one- and two-qubit transformations [27, 28]. While the ambiguity in the description of a quantum state up to a global phase is merely an artifact of our choice to work with the simplified description of the quantum state as a vector instead of the full density matrix, it is worth pointing out that an irrelevant global phase can quickly become local and thus relevant when a gate is executed conditionally on the state of other qubits.

The introduced operators T, Hadamard, and CNOT form a universal gate set, i.e., any target unitary transformation can be approximated to arbitrary precision over this gate set. Along with measurement, they can thus be used to express any quantum computation [29–31]. The corresponding gate set is often called the Clifford+T gate set. Examples for which such approximations are needed include rotations around Pauli axes, which are common in many quantum algorithms. The corresponding operator is R_P(θ) = e^(−iθP/2), where P ∈ {X, Y, Z} denotes one of the Pauli matrices and θ ∈ [0, 2π) is a rotation angle. For further reading about approximations over the Clifford+T and other gate sets, we refer to [32–34] and the references cited therein. Besides efficient algorithms for the approximation of gates, the Clifford+T gate set also has several benefits when it comes to circuit optimization and rewriting, see e.g. [35]. When implemented fault-tolerantly, gates can have widely different costs. In typical fault-tolerant architectures, T gates are not provided natively but rather have to be created by a special distillation process [36]. This process can create large overheads, depending on the underlying noise level and other architectural parameters. In the Clifford+T gate set, the cost of a T gate is several orders of magnitude higher than the cost of H, Z, X, S, and CNOT, which are all elements of the so-called Clifford group. This motivates counting just the total number of T gates required by a quantum algorithm implementation, as well as the total circuit T-depth when trying to parallelize as many T gates into stages as possible. While the Clifford+T gate set is sufficient to approximate any unitary transformation, it is often convenient to express certain transformations in terms of larger multi-qubit gates. An example of a popular three-qubit gate is the Toffoli gate, also called the CCNOT gate. It maps |x, y, z⟩ → |x, y, xy ⊕ z⟩. A possible origin of its popularity lies in the close relation between quantum computing and classical reversible computing; any unitary operator is by definition reversible since for a unitary matrix U it holds that UU† = I, where † represents the complex conjugate transpose. The mapping defined by the Toffoli gate makes it universal for reversible classical computing, since it is capable of implementing any reversible Boolean function given enough zero-initialized ancillary bits.
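The gates introduced in this section are small enough to write down explicitly; the sketch below (numpy assumed, names illustrative) defines the Pauli operators, H, T, S = T^2, and CNOT, verifies that they are unitary, and prepares the Bell state (|00⟩ + |11⟩)/√2 by applying H to the first qubit followed by CNOT.

```python
# Matrix-level sketch of the basic gates, assuming numpy.
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])
S = T @ T                                        # S = T^2
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Every gate is unitary: U†U = 1.
for U in (X, Y, Z, H, T, S, CNOT):
    assert np.allclose(U.conj().T @ U, np.eye(U.shape[0]))

# Prepare a Bell state: (H ⊗ 1) on |00>, then CNOT.
ket00 = np.zeros(4, dtype=complex); ket00[0] = 1
bell = CNOT @ np.kron(H, np.eye(2)) @ ket00      # (|00> + |11>)/sqrt(2)
print(np.round(bell, 3))
```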

So far we have covered how to express the reversible parts of a quantum computation. While large parts of a quantum algorithm can be described by unitary transformations, ultimately classical information needs to be extracted for use by non-quantum hardware. Such an extraction is achieved by projecting the state onto the eigenspaces of a unitary operator, which yields the corresponding eigenvalue as measurement result. In principle, a quantum state can be projected onto an eigenspace of an arbitrary unitary operator. In practice, only a limited set of measurements is available as hardware instructions, and any other projection is achieved by suitable transformations beforehand. For example, to project a single-qubit state onto the eigenstates of the X operator, one would apply the basis transformation H, perform the measurement with respect to the computational basis, and reverse the basis switch by applying H again. Similarly, to project onto the eigenstates of the Y operator, the basis transformation is given by (SH)† = HS†, with the reversion SH. Despite being non-unitary, projective measurements can be used to manipulate quantum states [37] and are an integral part of a wide class of quantum algorithms. A simple CNOT gate, for example, can be performed by leveraging entanglement, measurements, and single-qubit operations applied conditionally on measurement outcomes [38].
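A minimal sketch of the basis-change recipe described above, assuming numpy and an idealized single-qubit state vector (the function name and interface are hypothetical): an X-basis measurement is realized by applying H, sampling in the computational basis according to Born's rule, and mapping the collapsed state back with H.

```python
# X-basis measurement via a basis change, assuming numpy.
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def measure_x(psi, rng=np.random.default_rng()):
    """Projective X-basis measurement of a single-qubit state vector."""
    rotated = H @ psi                        # map the X eigenbasis to the Z eigenbasis
    probs = np.abs(rotated) ** 2             # Born's rule in the rotated basis
    outcome = rng.choice([+1, -1], p=probs)  # +1 -> |+>, -1 -> |->
    collapsed = np.zeros(2, dtype=complex)
    collapsed[0 if outcome == +1 else 1] = 1.0
    return outcome, H @ collapsed            # rotate back to the original basis

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
print(measure_x(plus))                       # always (+1, |+>)
```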

A quantum program can be seen as a sequence of primitive instructions such as the elements of the Clifford+T gate set. The sequence is herein generated by a classical algorithm, where the generating algorithm contains classical control flow that potentially depends probabilistically on the execution of preceding transformations. Such dependencies arise if the program continuation is conditioned on the outcome of measurement results, and are widely used in particular in the form of repeat-until-success patterns [39–41] and in iterative phase estimation based algorithms [42–44] used in applications such as Hamiltonian simulation [45].

2 TRAVELING SALESMAN PROBLEM

This chapter contains content from Publication IV.

With progress in quantum technology, more sophisticated quantum annealing devices are becoming available. Quantum technology is maturing to the point where, for specially selected problems, it can compete with classical computers. Particularly, quantum annealing (QA) devices – performing quantum optimizations by slowly evolving toward a target Hamiltonian – and their potential have been a recent source of controversy. While they offer new possibilities for solving optimization problems, their true potential is still an open question. For a fair assessment of their potential it is necessary to take a close look at the real-world problems they strive to solve, and how they can be implemented on a given device. Moreover, how to design such algorithms is becoming increasingly relevant as more and more sophisticated models are starting to become available [46]. As the optimal design of adiabatic algorithms plays an important role in their assessment, we illustrate the aspects and challenges to consider when implementing optimization problems on quantum annealing hardware based on the example of the traveling salesman problem (TSP).

In this chapter we address factors that determine the performance of quantum annealing algorithms and formulate guidelines for their development. We discuss the issues that need to be considered when designing specialized quantum hardware and illuminate the challenges and pitfalls of adiabatic quantum computing by examining the case of the traveling salesman problem. We will see that tunneling between local minima can be exponentially suppressed if the quantum dynamics are not carefully tailored to the problem. Furthermore, we demonstrate that inequality constraints, in particular, present a major hurdle for the implementation on analog quantum annealers. Programmable digital quantum annealers can overcome many of these obstacles and can – once large enough quantum computers exist – provide an interesting route to using quantum annealing on a large class of problems.


2.1 mapping the tsp to an annealing problem

In order to solve an optimization problem by annealing, its solution needs to be encoded into the ground state of the target Hamiltonian. With quantum annealing being an approximate solver, it is preferable that in fact all low energy states correspond to solutions that are close to optimal – and only to those. Since the commutation relation between the target and driver Hamiltonian determines the dynamics during evolution, the chosen encoding additionally has to permit the use of a driver that is simple enough to implement and allows for fast transitions between potential solutions. While in principle it is possible to solve an arbitrary problem on an annealing device, its quantum nature as well as architectural limitations impose restrictions on the cost functions and possible constraining conditions that can be realized. Optimally implementing a given problem thus requires a well chosen mapping onto a suitable target Hamiltonian. The choice of this mapping significantly influences the performance of the algorithm and its scaling with problem size. Whether or not a problem can be solved efficiently by annealing thus depends on both the available hardware and the chosen algorithm.

Given N cities and distances d_ij between them, the task of the traveling salesman problem is to find the shortest possible roundtrip that visits each city exactly once. Since current devices provide only local fields and tunable two-site couplings between adjacent qubits, any target Hamiltonian has to correspond to an Ising spin glass.

2.1.1 Transition probabilities in quantum annealing

The ground state of an Ising spin glass can be found by quantum annealing, where the system evolves according to a time dependent Hamiltonian

H(t) = \left(1 - \frac{t}{T}\right) H_D + \frac{t}{T} H_P \qquad \text{for } 0 \le t \le T.    (2.1)

Consider the transition probability for transitions between two states ψ_0 and ψ_1 that each represent a valid tour. Both states are then eigenstates of H_P. Assume the system is in the state ψ_0 at a time t_0 ≪ T.

The probability that the state at a time t_0 + ∆t ≪ T is ψ_1, if we choose H_D = −Γ ∑_e σ^x_e and set ħ = 1, can be approximated by

P_{t_0}(\Delta t) = \prod_{j \in I_1} \sin^2\!\left(\frac{\Gamma}{T}\, |a(\omega_j)|\right) \prod_{k \in I_2} \cos^2\!\left(\frac{\Gamma}{T}\, |a(\omega_k)|\right) + O(|I_1| + 2r)

where r ≪ N, and

|a(\omega_k)|^2 = \left(\frac{4}{\omega_k^4} + \frac{4 t_0 (t_0 + \Delta t)}{\omega_k^2}\right) \sin^2\!\left(\frac{\omega_k}{2}\, \Delta t\right) + \frac{\Delta t^2}{\omega_k^2} - \frac{4 \Delta t}{\omega_k^3} \sin\!\left(\frac{\omega_k}{2}\, \Delta t\right) \cos\!\left(\frac{\omega_k}{2}\, \Delta t\right)    (2.2)

The set I_1 contains all spins whose state differs between the two valid tours, I_2 all remaining spins. The terms ω_k are the frequencies belonging to the transition given by flipping spin k. While they depend on the state of the surrounding spins, we can neglect this dependency for terms up to order O(|I_1| + 2r) for r ≪ N. We thus make the approximation that the frequencies ω_k are independent of the transition path from ψ_0 to ψ_1 in Eq. (2.2).

In the limit T → ∞ with ∆t/T = const ≪ 1, the lowest order contribution to the transition amplitude simplifies to

A^{(s)}(\Delta t) = \left(\frac{\Gamma \Delta t}{T}\right)^{\!s} \sum_{\substack{\text{shortest} \\ \text{paths } \gamma}} \; \prod_{i=1}^{s} \frac{1}{E_0 - E^{\perp}_{\gamma_i}}    (2.3)

where s is the minimal path length to transition between the two states. The energies E^⊥_{γ_i} are the eigenvalues of the intermediate states along the transition path, and thus path dependent.

2.1.2 Encoding as permutation matrix

To formulate a TSP as an annealing problem, we need to represent every possible valid roundtrip as a spin configuration. The straightforward encoding is to associate each roundtrip with a permutation matrix a_ik, where a_ik = 1 if the i-th city is visited at time k of the tour, and zero otherwise [11]. With the mapping a_ik = (1 − σ^z_ik)/2 the Hamiltonian can be formulated in terms of quantum spin variables. We then need to ensure that the ground state corresponds to the encoding of the shortest roundtrip. Minimizing the tour length given by the Hamiltonian

H_l = \sum_{i,j,k} d_{ij}\, a_{ik}\, a_{j,k+1} \qquad \text{for } a_{j,N+1} \equiv a_{j,1}    (2.4)

subject to the constraints ∑_i a_ik = 1 ∀k and ∑_k a_ik = 1 ∀i accomplishes our goal. These two requirements guarantee that (a_ik) is indeed a permutation matrix. They can be implemented by constraint terms

H_c = \sum_i \left[ \left(1 - \sum_j a_{ij}\right)^{\!2} + \left(1 - \sum_j a_{ji}\right)^{\!2} \right],    (2.5)

which add an energy penalty to states violating them. The ground state of the Hamiltonian

H_P^{(\mathrm{perm})} := H_l + \eta H_c = \frac{1}{4} \sum_{i,j,k} d_{ij}\, \sigma^z_{ik} \sigma^z_{j,k+1} + \sum_{i,k} \left(\frac{\tilde d_i}{2} + \eta (N-3)\right) \sigma^z_{ik} + \frac{\eta}{4} \sum_{\langle s,s' \rangle} \sigma^z_s \sigma^z_{s'} + \mathrm{const}

with \tilde d_i := \frac{1}{2} \sum_{j \ne i} (d_{ij} + d_{ji})

therefore provides the desired TSP solution. The last sum is over all neighboring spins, where we consider two spins to be neighbors if they either represent the same city or the same time during the tour. The factor η has to be chosen large enough to ensure that the ground state indeed corresponds to a tour configuration; η ≥ max{d_ij/2}. Such a formulation requires (N − 1)^2 spins, as we can fix city N to be the last city in the tour, each of which has 2(N − 2) neighbors. For an N-city TSP we hence in principle require (N − 1)^2 qubits and 3(N − 2)(N − 1)^2 couplers. Given a typical QA architecture with a small bounded number of couplers per qubit, one will rather need O(N^3) qubits.
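The permutation-matrix encoding can be checked numerically on small instances. The sketch below (numpy assumed; it uses the full N × N matrix of variables rather than fixing city N, and the penalty weight is illustrative) evaluates the tour-length term of Eq. (2.4) together with the constraint penalty of Eq. (2.5) for a valid permutation and for a configuration that violates the constraints.

```python
# Cost of the permutation-matrix encoding for a small random TSP, assuming numpy.
import numpy as np

rng = np.random.default_rng(1)
N = 5
cities = rng.random((N, 2))
d = np.linalg.norm(cities[:, None, :] - cities[None, :, :], axis=-1)

def cost(a, eta):
    """Tour-length term of Eq. (2.4) plus eta times the penalty of Eq. (2.5)."""
    length = sum(d[i, j] * a[i, k] * a[j, (k + 1) % N]
                 for i in range(N) for j in range(N) for k in range(N))
    penalty = np.sum((1 - a.sum(axis=0)) ** 2) + np.sum((1 - a.sum(axis=1)) ** 2)
    return length + eta * penalty

perm = rng.permutation(N)                 # a valid roundtrip ...
a = np.zeros((N, N)); a[perm, np.arange(N)] = 1
print(cost(a, eta=d.max()))               # ... incurs no penalty, only tour length

a[0, :] = 0                               # violating the constraints ...
print(cost(a, eta=d.max()))               # ... adds an energy penalty
```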

The quantum driver Hamiltonian H_D determines the dynamics of the annealing process and should provide an efficient near-adiabatic evolution towards H_P without ending up in an excited state. The usual choice is a transverse field term H_x = −Γ ∑_{i,j} σ^x_{ij}, which induces single spin flips.

Before contemplating more complex alternatives, it is useful to understand the influence of H_D on the annealing efficiency.

Figure 2.1 A) A crossing requiring up to 4⌊N/4⌋ single spin flips to resolve for a permutation mapping, and only 4 for a symmetric TSP represented by a graph mapping. B) Worst case for N = 18, r = 3: Using a permutation mapping, resolving r crossings requires up to 2N − ⌈(N − (r − 1))/(r + 1)⌉ single spin flips.

Consider the probability to transition between two tours of similar length, as shown in Figure 2.1A. This transition can be performed by a so-called 2-opt update [47, 48], which is a common and very efficient primitive move in classical heuristics. Using the above mapping, this however requires updating m = O(N) variables, since we have to change the order in which half of the cities are visited. The probability to transition between these two tours towards the end of the annealing process is thus – in leading order – proportional to (Γ/∆)^m, see Eq. (2.3), where ∆ is the scale associated with the barriers between the two solutions. A simple crossing, as shown in Figure 2.1A, is therefore difficult to resolve since the transition probability is exponentially suppressed (in the problem size N) compared to classical heuristics that can directly implement a 2-opt update.
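The asymmetry between the two representations can be seen directly in a classical 2-opt move. In the pure-Python sketch below (illustrative only), reversing a segment of the tour changes only four undirected edges, while the number of affected positions in the visiting order grows with the segment length and hence with N.

```python
# A classical 2-opt move: cheap in an edge representation, expensive in the
# permutation (visiting-order) representation. Illustrative sketch only.

def two_opt(tour, p, q):
    """Reverse the segment tour[p+1 .. q] of a tour given as a list of cities."""
    return tour[: p + 1] + tour[p + 1 : q + 1][::-1] + tour[q + 1 :]

def edges(tour):
    """Set of undirected edges travelled by the tour."""
    return {frozenset((tour[k], tour[(k + 1) % len(tour)])) for k in range(len(tour))}

tour = list(range(10))
new = two_opt(tour, 0, 7)

changed_edges = edges(tour) ^ edges(new)          # symmetric difference
changed_slots = sum(a != b for a, b in zip(tour, new))
# Edge flips stay at 4; the number of reordered tour positions grows with
# the reversed segment length (here 4 versus 6).
print(len(changed_edges), changed_slots)
```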

We thus see that the choices of mapping H_P and H_D affect which updates to a configuration are efficiently realized during quantum annealing, and this directly and significantly impacts performance. The above exponential slowdown might be avoided by a better choice of H_D or H_P. Following the first route, we could opt to permute several cities using multi-qubit couplers. While resolving a crossing may still entail O(N) steps and the exponential suppression remains, this may nevertheless significantly improve transition probabilities by avoiding high energy intermediate configurations that violate constraints. In fact, such kinetics could allow sampling of only viable TSP solutions, which would render the constraints of Eq. (2.5) unnecessary, and thereby simplify the energy landscape that needs to be explored [49]. However, the pairwise exchange of all two-city pairs requires O(N^4) four-spin couplers, which is infeasible for all but the smallest problems.

2.1.3 An improved encoding

Independent of the exact dynamics induced by H_D, the transition probability declines exponentially with the required number of moves for a transition between two given states. In order to design a mapping that allows for an efficient realization of 2-opt (or more generally k-opt) moves in the quantum annealer, we use N(N − 1) spins to represent not the permutation, but the travelled connections between cities, leading to a cost function

H'_l = \sum_{i,j} d_{ij}\, a_{ij}.    (2.6)

While the required number of moves for a transition between two TSP solutions is approximately the same for both mappings in the asymmetric case, the second map has a clear advantage in the symmetric case. Focusing on symmetric TSPs, a 2-opt update using such an encoding as connection graph only requires the flipping of m = 4 spins. A more general k-opt move requires just m = 2k flips, independent of the problem size. Such a mapping thus avoids the exponential slowdown of the previous one. Table 2.1 gives an overview of the number of single spin flips required to resolve r crossings in both mappings.

                                          permutation mapping              graph mapping

minimal number of required spin flips     4⌊(r+1)/2⌋                       4r

maximal number of required spin flips     2N − ⌈(N−(r−1))/(r+1)⌉           4r

Table 2.1 Required number of spin flips to resolve r crossings in the symmetric TSP, compared for both mappings

In the case of a symmetric TSP, i.e. d_ij = d_ji ∀ i, j, we can associate a_ij = a_ji with the undirected edges between cities i and j, and the number of required spins reduces to N(N − 1)/2.

While the number of required qubits seems to be comparable at N(N − 1)/2, this number can be substantially reduced by truncating the set of considered edges. Along the optimal tour, cities are connected almost exclusively to nearby cities. In fact, the probability of connecting to the l-th farthest city decreases exponentially with l for random problem instances. We can thus truncate the set of considered edges originating at a city to a small number L of closest cities. This substantially reduces the number of required qubits to NL/2 = O(N).

Implementing the constraints

TSP solutions are subject to the constraint that the set of edges with a_ij = 1 forms a valid tour. We expect each city to be connected to exactly two other cities. Closed tours can be enforced by adding a constraint term

H'_c = \sum_i \left( 2 - \sum_{j \ne i} a_{ij} \right)^{\!2} \qquad \text{with } a_{ij} = a_{ji}.    (2.7)

These constraint terms require O(NL^2) two-qubit couplers, substantially less than the O(N^3) terms required for the first mapping. While this term enforces a configuration consisting of closed loops where each city is visited exactly once, it does not in fact enforce that all visited cities belong to the same loop: the tour can break up into disjoint subtours instead of one tour connecting all cities. Depending on the specific variant of the TSP this may or may not be desired – one may, for example, want to know if using multiple salesmen is preferred. In that case the target Hamiltonian

H_P^{(\mathrm{graph})} = \sum_e \left( \frac{d_e}{2} + \eta (N - 5) \right) \sigma^z_e + \frac{\eta}{4} \sum_{\langle e,e' \rangle} \sigma^z_e \sigma^z_{e'} + \mathrm{const}

will serve our purpose. The last sum over neighboring spin pairs includes all pairs of spins representing connections with one common start or end point.

However, for randomly generated problems many of the subtours are not particularly interesting. Evaluating 100 random problems with N = 12 and uniformly distributed cities in a 2D-plane using CPLEX [50, 51] shows that the ground state of 75% of the instances splits into subtours and a majority of these subtours contain only three cities. For larger problem sizes, it is likely that here too, we will frequently obtain solutions consisting of a large number of subtours containing only a small number of cities. 2.2 numerical results 17

If we rather want to insist on exactly one closed loop, additional precautions will have to be taken. Directly enforcing a single closed tour would require N-qubit coupling terms and is unrealistic. The standard procedure to avoid such undesired states is to iteratively add terms that penalize the specific subtour breakups encountered during the optimization. Given a breakup into, e.g., two sets of cities A and B, we add an inequality constraint of the form

\sum_{i \in A} \sum_{j \in B} a_{ij} > 0.    (2.8)

Unfortunately, such an inequality constraint is hard to implement with two-qubit couplings in an Ising model quantum annealer. Approximating the step function of an inequality by a k-th order polynomial requires implementing O(N^{2k}) k-spin couplings. Luckily, an evaluation using CPLEX [50] shows that for the ground states of our instances there are very few required connections; around 94% of disconnected subtours should have merely two connections with each other and the remaining 6% should form four connections. For these a simple quadratic energy penalty

\eta' \left( C - \sum_{i \in A} \sum_{j \in B} a_{ij} \right)^{\!2}    (2.9)

with a constant C = 2 to favor two connections, or C = 3 to equally favor two and four connections, would be sufficient. Such constraint terms increase the number of couplings to O(N^2 L^2), which is still a better scaling than in the original mapping. The algorithm to obtain an estimate for the TSP solution then consists of first annealing the minimally constrained system described by H_P = H'_l + η H'_c. If the best solution found splits into subtours, we add additional constraints as given in Eq. (2.9) before repeating the annealing. This procedure is repeated until a solution consisting of a single closed tour is found.
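The sketch below illustrates the bookkeeping behind this iterative procedure, assuming numpy; the edge configuration is a hand-built stand-in for the best state returned by an annealing run, and all names and the penalty weight are illustrative. It detects disjoint subtours as connected components and accumulates the quadratic penalty of Eq. (2.9) with C = 2 for the detected breakup, which would then be added to the cost function of the next annealing round.

```python
# Subtour detection and penalty bookkeeping for the iterative scheme, assuming numpy.
import numpy as np

N = 6
edge_list = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]  # two disjoint triangles
adj = np.zeros((N, N), dtype=int)
for i, j in edge_list:
    adj[i, j] = adj[j, i] = 1

def subtours(adj):
    """Connected components of the undirected 0/1 adjacency matrix."""
    remaining, comps = set(range(len(adj))), []
    while remaining:
        stack, comp = [remaining.pop()], set()
        while stack:
            v = stack.pop()
            comp.add(v)
            stack.extend(j for j in np.flatnonzero(adj[v]) if j not in comp)
        remaining -= comp
        comps.append(comp)
    return comps

penalties = []                        # constraints accumulated across iterations
comps = subtours(adj)
if len(comps) > 1:                    # breakup detected: penalize this split
    A = comps[0]
    B = set(range(N)) - A
    penalties.append((A, B, 2))       # C = 2 favors two connections between A and B

def penalty_energy(adj, eta_prime):
    """Energy contribution of Eq. (2.9) for all accumulated penalties."""
    return eta_prime * sum(
        (C - adj[np.ix_(sorted(A), sorted(B))].sum()) ** 2 for A, B, C in penalties)

print(len(comps), penalty_energy(adj, eta_prime=1.0))   # 2 subtours, penalty 4.0
```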

2.2 numerical results

We analyzed the effectiveness of this algorithm by numerical simulations on problems with N = 8, 12 and 16 cities. We focus our discussion here on the main results for the case N = 12. We investigated 100 random TSPs with the cities uniformly distributed on a square. We start by testing the subtour suppression strategy using the MIQP solver of CPLEX [50]. To avoid any complications due to competing constraints, we first analyze the performance of the outlined algorithm when choosing C = 2 for all iterations. This should enforce the correct behavior for the majority of instances where only two connections between subtours are required. Indeed, after one iteration almost all subtours require merely two connections with only one needing four, and after just two iterations the optimal TSP solution is found for 95% of these systems.

[Figure 2.2: two panels, "1st Iteration" and "4th Iteration"; x-axis: number of connections in shortest roundtrip; y-axis: probability in %; legend: number of Monte Carlo steps (MCS).]

Figure 2.2 Distribution of the number of connections that a subtour found by annealing should have during the first and fourth iteration in order to be consistent with the TSP solution. As the state after annealing is generally an excited state, the number of connections can be quite high even for problems where the ground state subtours require only very few connections – even more so the farther we are from the ground state. The legend denotes the number of Monte Carlo steps (MCS) used for annealing in both panels. The inset in the "1st Iteration" panel shows the distribution for the ground state subtours obtained by an exact solver.

Even though iteratively adding constraints works reasonably well with exact solvers, we found that it fails with heuristic solvers, such as QA, simulated QA (SQA), or classical simulated annealing (SA). We show SA results but expect the observations to carry over to SQA and QA. The algorithm succeeds in finding the TSP solution only in very few cases. The reason for this failure is the limited probability of finding the absolute minimum. Unfortunately, this does not simply translate into a larger number of repetitions before the algorithm terminates. Contrary to the ground states, a significant number of the subtours found by annealing should have more than two connections in the TSP solution.

As can be seen in Figure 2.2, a poor annealing performance significantly reduces the chance of introducing an appropriate set of constraints. Since enforcing the wrong number of connections – that is, one inconsistent with the TSP solution – during any one repetition implies that the roundtrip obtained at the end of our algorithm is not of minimal length, the success probability of our algorithm decreases exponentially with the number of iterations. In an effort to mitigate the detrimental effects resulting from the uncertainty about the required number of connections one could pursue several strategies. Adding a penalty function that has multiple minima, e.g., at C = 2 and C = 4, requires O(N^8) four-spin couplings and is thus not likely to be implementable in the near future. Instead, one might try to choose C = 3 in order to equally favor two or four connections, given that an even number of connections is enforced. As the obtained subtours can contain a similar set of cities for several iterations, the ratio η/η' then needs to be successively increased with each iteration; otherwise the ground state configuration corresponds to broken tours with three connections between subsets of cities. This creates an unfavorable and very rough energy landscape, where an annealer has barely any chance of finding the ground state. A potential alternative is to use slack variables s_1, ..., s_m with s_k ∈ {0, 1} for each subset A of m cities forming a subtour. One can then implement soft constraints by introducing energy penalties

\eta' \left( \sum_{i \in A} \sum_{j \notin A} a_{ij} - \sum_{k=1}^{m} 2k\, s_k \right)^{\!2} + \eta'' \left( \sum_{k=1}^{m} s_k - 1 \right)^{\!2}.

Engineering a suitable energy landscape, however, poses similar challenges, and transitions between solutions with a different number of connections can be heavily suppressed.

We thus conclude that analog quantum annealing devices are unlikely to be of interest as TSP solvers in the near future. The traveling salesman problem demonstrates many important aspects to consider in the design of both adiabatic quantum algorithms and specialized hardware. A so far underappreciated aspect is that quantum dynamics has to be an important consideration in designing the mapping of an application problem to Ising spin variables. Using transverse fields (or any other local term) for the quantum dynamics incurs an exponential slowdown in the standard faithful mapping of TSP to Ising spins, compared to efficient 2-opt updates. An alternative mapping, which avoids this slowdown and improves the dynamics, comes at the cost of requiring additional constraints to prevent a breakup into subtours. The limitation to quadratic penalty functions in current analog annealing devices constitutes a major problem. In particular the need for inequality constraints presents a major hurdle for the implementation on such devices.

2.3 digital quantum annealing

Virtually all of the above mentioned issues can be remedied by a "digital" implementation on a gate-model quantum computer that simulates the time evolution of quantum annealing by splitting the propagation into m discrete time steps ∆t [52, 53]. Implementing a constraint ∑_i x_i = a over m binary variables as a quadratic function (∑_i x_i − a)^2 requires m^2/2 couplers, which results in O(m^2) qubits assuming limited connectivity. The same constraint can be implemented in a digital simulation as a phase rotation conditioned on whether the constraint is satisfied or not. Using just O(m) qubits this can be implemented in time O(log m) (see Figure 2.3a). With this approach the constraint (2.7) requires only O(N^2) instead of O(N^3) qubits, and the cost for the constraint in Eq. (2.8) is O(N^2). The scaling of the required number of qubits is thus quadratically improved from O(N^4) to O(N^2) (or from O(N^2 L^2) to O(NL) when using a cutoff L for the number of neighboring cities considered).
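As a sketch of how such a conditional phase rotation acts on a state vector (numpy assumed; the constraint, schedule value, and penalty strength are illustrative), the snippet below imposes the inequality ∑_i x_i > 0 on m qubits by multiplying every violating computational basis amplitude with exp(−iB(t)η'∆t); in an actual circuit this conditioning would be realized with the Toffoli/ancilla construction of Figure 2.3a.

```python
# Constraint as a conditional phase rotation on a state vector, assuming numpy.
import numpy as np

m = 4                                   # qubits x_1 ... x_m
dim = 2 ** m
psi = np.full(dim, 1 / np.sqrt(dim), dtype=complex)   # uniform superposition

def violates(basis_index):
    """True if the computational basis state encodes sum_i x_i = 0."""
    return basis_index == 0             # only the all-zero string violates > 0

def constraint_phase(psi, B_t, eta, dt):
    """Apply exp(-i * B(t) * eta' * dt) to every violating basis state."""
    out = psi.copy()
    for idx in range(len(psi)):
        if violates(idx):
            out[idx] *= np.exp(-1j * B_t * eta * dt)
    return out

psi = constraint_phase(psi, B_t=0.5, eta=2.0, dt=0.1)
print(np.round(psi[:2], 4))             # only the |0...0> amplitude picked up a phase
```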

A digital implementation of quantum annealing on a universal quantum computer has several other advantages:

• Embedding the program into a specific hardware graph imposes at most linear overhead in runtime, as opposed to the potentially exponential slowdown of quantum tunneling due to embedding into a system with low connectivity in an analog approach.

• The programmability of the digital computer allows efficient implementation of a large class of cost functions and penalty terms; the flexibility offered by a universal digital quantum computer offers more choices of quantum dynamics, including 2-opt moves.

• All penalty terms in the cost function can be implemented much more efficiently, reducing the scaling of the number of qubits with problem size.

• The inequality constraint in Eq. (2.8) can now be implemented without heavy approximations.

• A more even energy landscape allows for better annealing performance.

• Quantum error correction removes calibration errors.

Figure 2.3 a) The left side shows the circuit implementing an energy penalty if a certain subset of cities is disjoint from the rest (inequality constraint in Eq. (2.8)). The qubits x_1 ... x_m represent all possible connections between the subset and the other cities, the qubits e_1 ... e_{m−2} are additional ancilla qubits initialized to |0⟩ (the graphic shows m = 6). The 2(m − 2) Toffoli gates can be executed in O(log m) time. Open circles denote conditioning on the connections x_i not being part of the current tour configuration. The unitary U is a phase gate that implements the propagator corresponding to an energy penalty η' during one step of the annealing process by adding a phase exp(−iB(t)η'∆t/ħ) if the qubit is set. b) As a thought experiment, we could toy with the idea of whether it would theoretically be possible to eliminate all subtours at once. Albeit certainly not of practical value, the circuit on the right illustrates the recursion step for such a circuit. The circuit results in the ancilla qubit a_1 being set if and only if there is a closed subtour of length k + 2. The red gates are necessary only to minimize the number of qubits. While this circuit illustrates the principle, using more ancillas allows one to reduce the scaling of the number of gates with system size.

We thus see digital quantum annealers as a promising route to quantum optimization, also because they allow more tailored types of quantum dynamics to be programmed and – with error correction – solve the calibration and error problems of analog devices. We will further explore this option in Chapter 5.

3 SAT FILTERS

This chapter contains content from Publication V.

Set membership testing, i.e. determining whether a specific element is a member of a given set in a fast and memory efficient way, is important for many applications, such as searches. The usage of filters, such as the Bloom filter [54], can give tremendous improvements in terms of resource costs. In 2012, Weaver and collaborators introduced a new type of filter, called the satisfiability filter [55], which is based on a specific type of Boolean satisfiability problem (SAT), namely random k-SAT problems, and can achieve a higher efficiency than other popular filters [55–57]. The creation of a satisfiability filter relies on finding solutions to SAT problems; to construct a high-quality filter it is necessary to find disparate solutions to hard random k-SAT problems. There are many methods which can solve these SAT problems, but recently, with the advent of quantum annealing (QA), a fundamentally different method was introduced that can potentially be leveraged for the construction of such filters. A D-Wave Two device has recently been used to construct SAT filters [58] by constructing and solving SAT problems which can be realized with two-qubit couplings. However, the Chimera graph of the D-Wave device does not allow arbitrary connections and thus restricts the types of problems that can be implemented without embedding.

In this chapter, we investigate whether quantum annealing (QA) on an arbitrarily connected and completely coherent device can be advantageous for the creation of SAT filters. We compare simulated annealing, simulated quantum annealing and WalkSAT, an open-source SAT solver, in terms of their ability to find suitable k-SAT solutions. Our simulations use an efficient implementation of simulated quantum annealing (SQA), which has been shown to be indicative of the performance of stoquastic quantum annealing devices [23, 25, 59]. We compare the obtained SQA results to those obtained with simulated annealing (SA) and WalkSAT (WS) [60]. Section 3.3 gives a brief introduction to SA and SQA applied to SAT instances and explains the measurement set-up.


3.1 set membership problem and filter construction

The set membership problem is defined as follows: Given an element x ∈ D and a set Y ⊆ D, determine whether x ∈ Y. For typical applications, |Y| is very large. Y could, for example, contain all words used on a specific domain and x could be a key-word entered into a search engine. Such queries can be sped up using filters, which are mathematical objects created from the set Y. Such a filter can be queried with an element x and returns one of the following answers: Either the element is definitely not in the set or the element might be in the set and further checks have to be performed to obtain a definite answer. These additional tests may be computationally more costly. Hence, a useful filter should have a low rate of indecisive answers for elements not being in the set. Furthermore, a filter should ideally require little storage.

Let Y ⊆ D be the set that should be encoded by the filter and let m denote the cardinality of Y. To construct a filter encoding the set Y, a set of k hash functions is chosen which map each element of D uniformly at random into a set of k distinct literals such that the literals form a clause of the form

C_i = l_{i,1} ∨ l_{i,2} ∨ ⋯ ∨ l_{i,k}

when they are combined by logical disjunction (indicated by ‘∨’). Here, the l_{i,j} are non-complementary literals given by a Boolean variable (x_f) or its negation (x̄_f), chosen from a set of n variables x_1, x_2, …, x_n. The number k of literals within a clause is called the width of the clause. Since the literals cannot be complementary, k is also the number of variables in the clause. The set of hash functions is used to map the m elements of Y to m clauses of width k. Using logical conjunctions (indicated by ‘∧’), the resulting m clauses are then combined to a random k-SAT problem of the form

C1 ∧ C2 ∧ · · · ∧ Cm

This is known as the conjunctive normal form (CNF) of the Boolean satisfiability problem. The problem of deciding whether a given Boolean formula can be evaluated to true by consistently assigning the values true or false to the appearing variables is also known as the k-SAT problem. We talk about a random k-SAT problem when each of the m clauses is drawn uniformly, independently and with replacement from the set of all width-k clauses [61]. The problem of deciding whether a random k-SAT problem is satisfiable was the first problem proven to be NP-complete [62] for k ≥ 3.

A SAT solver is used to find s different solutions to the random k-SAT problem constructed from the set Y. These solutions are stored and constitute the filter. To query the filter with an element, the same k hash functions are used to map this element to a clause of width k. If at least one of the stored solutions does not satisfy the newly created clause, the element cannot be in the set from which the solutions were created. The hash functions, however, may map to a clause that is satisfied by all solutions by chance alone, such that a positive result does not give any further information about the set membership. One can see that more calculated solutions lead to a lower probability for an element which is not in Y to pass the filter. This constitutes a trade-off because a filter should also require little storage. Thus, to make good use of the required storage, the solutions should be as independent as possible. Two solutions are considered independent if the probability that a randomly generated clause is satisfied by one solution is uncorrelated with the probability that this clause is satisfied by the other one. This implies for example that the mean pairwise Hamming distance of a set of independent solutions with n variables each is given by n/2. Finding solutions to a random k-SAT problem which qualify as independent is time consuming and sometimes might not be possible at all. Nonetheless, this calculation is a one-time effort which only has to be done when the filter is constructed.
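To make the query procedure concrete, the following Python sketch checks an element against a set of stored solutions. The helper element_to_clause is a hypothetical stand-in for the k hash functions, and the literal encoding (variable index, negated?) is our own choice for illustration, not the construction of Ref. [55].

```python
import random

def element_to_clause(element, k, n):
    """Hypothetical stand-in for the k hash functions: maps an element to a
    clause of k non-complementary literals, each encoded as
    (variable_index, negated?). A real filter would use stable hash functions."""
    rng = random.Random(hash((element, n, k)))
    variables = rng.sample(range(n), k)          # k distinct variables
    return [(v, rng.random() < 0.5) for v in variables]

def clause_satisfied(clause, assignment):
    """A clause is satisfied if at least one of its literals evaluates to true."""
    return any(assignment[v] != negated for v, negated in clause)

def query_filter(element, solutions, k, n):
    """False: the element is definitely not in the encoded set.
    True: the element might be in the set ('maybe')."""
    clause = element_to_clause(element, k, n)
    return all(clause_satisfied(clause, s) for s in solutions)

# Toy usage with n = 8 variables, k = 3, and two stored assignments.
stored = [[True] * 8, [False, True] * 4]
print(query_filter("some keyword", stored, k=3, n=8))
```

The more stored solutions violate a random query clause, the smaller the probability that an element outside of Y passes the filter, which is exactly the trade-off discussed above.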

To ensure that the stored solutions are independent, it is also possible to use s different sets of hash functions to construct s different random k-SAT problems out of the set Y. In this case, only one solution per problem has to be found. Similar to the case described before, the s solutions constitute the filter. If such a filter is queried, the same s sets of hash functions are used to map the element to s different clauses. If the clauses are satisfied by the respective solution, the element might be in the set Y. The advantage of constructing a filter this way is that each stored solution stems from a different random k-SAT problem. Therefore, they are indeed independent of each other. The disadvantage is that more hash functions and clauses have to be evaluated every time the filter is queried, which increases the ongoing costs. Since we want to investigate the effectiveness of quantum annealing for finding multiple disparate solutions to random k-SAT problems, we won't discuss this approach any further and instead focus on the first approach that uses multiple solutions to the same random k-SAT problem.

For a more thorough introduction of satisfiability filters, we refer the reader to Ref. [55].

3.2 quality metrics for filters

For our investigation, we assess the quality of the constructed filters by measuring two important quantities: the false positive rate and the filter efficiency. We briefly introduce those two metrics here, and direct the reader to Ref. [55] for a more detailed discussion.

3.2.1 False positive rate

The false positive rate (FPR) of a SAT filter is the probability of the filter returning an inconclusive answer (indicated by “maybe” in the equation below) when queried with an element x ∈ D, which is not in Y, i.e.

p_FP = P[F(x) = maybe | x ∈ D \ Y]

where F(x) is the return of the SAT filter when queried with x. Assuming that |Y| ≪ |D| – which is usually the case – and that all stored solutions are fairly independent, it can be approximated by

p_FP ≈ P[F(x) = maybe | x ∈ D] ≈ (1 − 2^{−k})^s    (3.1)

where s is the number of solutions stored and k the width of the clauses of the SAT instance. While the assumption that all used solutions are independent is reasonable when the solutions stem from different SAT instances, it is less trivial when they are solutions to the same instance. However, there is experimental evidence that the approach of using different solutions to the same problem can lead to similar FPRs as truly independent solutions, provided their average Hamming distances are of similar magnitude. Finding such solutions can be costly, such that the construction of the filter generally takes longer. The benefit of filters based on different solutions to the same problem is faster query times, compared to filters based on solutions to multiple instances.

3.2.2 Filter efficiency

A SAT filter needs n · s bits to store s solutions with n variables each. For a filter encoding a set Y with cardinality |Y| = m that is based on s solutions to a k-SAT problem involving n variables, the filter efficiency is defined as

E ≡ (−log_2 p_FP) / (sn/m) ≤ 1

where p_FP is the false positive rate of the filter. Using the approximation in Eq. (3.1), we can approximate the efficiency by

E ≈ (−log_2(1 − 2^{−k})) / (n/m) = −α_χ log_2(1 − 2^{−k})

where we have defined the clauses-to-variables-ratio α_χ of the SAT instance from which the solutions were generated:

α_χ := m/n = (number of clauses)/(number of variables)

Asymptotically, the satisfiability of a random k-SAT problem is determined with high probability by the clauses-to-variables-ratio; for every value of k, there exists a threshold α_0(k), such that for a large number of variables, SAT instances with a clauses-to-variables-ratio α_χ < α_0(k) are almost certainly solvable whereas those for which α_χ > α_0(k) are almost surely not solvable. The threshold α_0(k) is called the satisfiability threshold, and in Ref. [57] was proven to be

α_0(k) = 2^k ln(2) − O(k)

Looking at that threshold value, we see that the information theoretical limit (E = 1) can be reached by increasing the width k of the clauses, as was shown by Weaver and collaborators [55]. Increasing k also increases the complexity of the problem, but even for a relatively small value of k = 3, a better performance compared to the constant efficiency of Bloom filters [54] (E ≤ ln(2)) can be achieved.

3.3 obtaining independent solutions

Even though it is a one-time effort to find solutions as independent as possible during the construction of SAT filters, quantities such as the false positive rate and query time heavily depend on the success of this step. We will therefore focus our effort on evaluating and comparing different methods for obtaining independent solutions for a single k-SAT instance.

3.3.1 Using annealing

For a random k-SAT instance with m clauses involving n variables, the configuration of the system is defined by an n-bit value x = (x_1, …, x_n), where x_i ∈ {true, false} for 1 ≤ i ≤ n. We can define a cost function

E(x) = |{C_i | C_i(x) = false}|

i.e. the cost of a configuration is given by the number of unsatisfied clauses. Our goal is to minimize this cost function, and a configuration x with minimal energy E(x) = 0 is a solution to the SAT problem. Annealing methods are widely used optimization heuristics for minimizing cost functions. Their effectiveness depends on the shape of the cost function. Depending on whether local minima are separated by tall and narrow peaks, or flat and wide ones, the quantum fluctuations leveraged by quantum annealing or the thermal fluctuations used by classical annealing may be more suitable to escape local minima while exploring the problem space [63–65].
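As a minimal illustration, using the same hypothetical clause encoding as in the sketch of Section 3.1, the cost function simply counts unsatisfied clauses:

```python
def energy(assignment, clauses):
    """Number of unsatisfied clauses; a clause, given as a list of
    (variable_index, negated?) literals, is unsatisfied if every literal
    evaluates to false under the assignment."""
    return sum(
        all(assignment[v] == negated for v, negated in clause)
        for clause in clauses
    )

# A configuration x with energy(x, clauses) == 0 solves the SAT instance.
```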

Solution on Quantum Annealing Devices

Since filters are mainly used in big data contexts, their size can be quite large, depending on the size of the set |Y| and the required FPR. To illustrate this, we give an example for a SAT filter encoding a set Y with |Y| = 2^16 elements: The 2^16 elements result in m = 2^16 clauses. If the required FPR is p_FP = 1/4, a 4-SAT filter needs to store s ≈ 22 (independent) solutions [55]. If the targeted efficiency is given by E = 0.75, each solution involves n ≈ 8136 variables. The filter therefore stores ≈ 22 · 8136 bits. A 5-SAT filter achieving the same efficiency and FPR needs to store s ≈ 44 (independent) solutions with n ≈ 4002 variables.
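These filter sizes follow directly from the approximations of Section 3.2; a small sketch that reproduces them (the rounding conventions are ours):

```python
import math

def filter_parameters(num_elements, k, target_fpr, target_efficiency):
    """Approximate number of stored solutions s and variables n per solution,
    from p_FP ~ (1 - 2**-k)**s and E ~ -alpha_chi * log2(1 - 2**-k)."""
    log_term = math.log2(1 - 2.0 ** -k)              # negative
    s = math.ceil(math.log2(target_fpr) / log_term)  # solutions to store
    alpha = target_efficiency / (-log_term)          # clauses-to-variables ratio
    n = round(num_elements / alpha)                  # m = num_elements clauses
    return s, n

print(filter_parameters(2 ** 16, k=4, target_fpr=0.25, target_efficiency=0.75))
# roughly (22, 8136)
print(filter_parameters(2 ** 16, k=5, target_fpr=0.25, target_efficiency=0.75))
# roughly (44, 4002)
```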

To solve a SAT problem by quantum annealing, each variable is identified with a logical qubit. The couplings in the target Hamiltonian are chosen so that its energy for a given assignment is equal to the number of unsatisfied clauses. The ground state of the target Hamiltonian will then provide a solution to the SAT instance. An implementation on a quantum annealing device needs more physical qubits (referred to as spins in the context of spin glass systems such as the D-Wave device) than there are Boolean variables, due to the embedding needed to accommodate the limited connectivity of physical devices. An implementation on current hardware is therefore not yet possible. However,

the rapid development in this field might lead to physical implementations in the near future.

The way in which the problem space is explored when using quantum annealing is fundamentally different from classical heuristics. An open question is hence whether the solutions found by quantum annealing are significantly different from the solutions produced by classical means. If this is the case, then quantum annealing could be leveraged to construct high quality SAT filters, potentially in combination with other methods. Combining different methods may be desirable to obtain a broader variety of solutions, since there is evidence that quantum annealing only finds few of the ground states for problems with high degeneracy (see Ref. [66] for a theoretical prediction based on quantum Monte Carlo simulations and Ref. [67] for a recent experimental verification on a D-Wave 2X quantum annealing device). In order to explore these possibilities, we resort to numerical methods to gain some insight.

Solution by Simulated Quantum Annealing

Quantum annealing of spin glasses in a transverse magnetic field can be mimicked on a classical computer using quantum Monte Carlo (QMC) simulations. Previous work [25] shows a similar scaling of characteristic tunneling rates for SQA and unitary evolution, and suggests that SQA can indeed, to a certain extent, predict the performance of a physical quantum annealer. Additionally, D-Wave quantum annealers were empirically shown to scale similarly to SQA for random spin glass problems [59] and a numerical partitioning problem [23]. Even though a discretization with ∆τ = 1 might produce better optimization results, in order to simulate the real quantum behavior, computations should be performed in the continuous time limit ∆τ → 0 [68]. An alternative is to use an algorithm directly working with an infinite number of time slices, as in [69], providing the same results as QMC using sufficiently small time steps. As Ref. [68] shows, previous expectations of quantum speedup for QA in two-dimensional spin glasses [70, 71] were due to performing simulated quantum annealing (SQA) in the non-physical limit, which elegantly explains why such a speed-up could not be observed in experiments [72, 73]. For our analysis, SQA is performed using discrete time steps close enough to the continuous time limit to guarantee results similar to those of a continuous time algorithm and a physical quantum annealer.

For the purpose of our investigation, we allow for arbitrary couplings between two or more spins without requiring any additional embedding. This gives us the flexibility to probe whether there could be benefits to leveraging quantum annealing without any adverse effects due to hardware limitations. We perform simulated quantum annealing on an Ising spin glass with n distinct spins, where up to k of them couple together. The used target Hamiltonian is

H_P = − ∑_{i_1<⋯<i_k} J_{i_1,…,i_k} σ^z_{i_1} ⋯ σ^z_{i_k} − ⋯ − ∑_{i_1<i_2} J_{i_1,i_2} σ^z_{i_1} σ^z_{i_2} − ∑_{i_1} J_{i_1} σ^z_{i_1} − C

where C is a constant with no physical relevance, and σ^r_{i_j} is the Pauli r-operator, for r ∈ {x, y, z}. For the driver Hamiltonian, we use a transversal field Γ(t) ≥ 0, which induces single-spin flips, i.e. transitions between the states |↑⟩ and |↓⟩ for each spin individually:

H_D(t) = −Γ(t) ∑_{i_1} σ^x_{i_1} .    (3.2)

Annealing parameters

For each problem specification and each given number of attempted updates (Monte Carlo steps, abbreviated as MCS), the annealing parameters for SA (the initial inverse temperature β_0 and the final inverse temperature β_1) are optimized to maximize the average number of different results for a set of instances when running multiple times on each instance. The number of MCS is chosen such that an increase would not lead to a significantly larger number of different solutions found by any of the tested solvers. This value is chosen the same for SA and SQA. In the case of SQA, one MCS corresponds to one attempted update per physical spin (i.e. a certain site on all replicas). The SQA parameters (β_0 = β_1 as well as the initial and final transversal fields Γ_0 and Γ_1) are optimized the same way, ensuring that the number of replicas M is big enough to be close to the physical (i.e. the continuous time) limit. At the beginning of the annealing schedule Γ(0) ≫ |J_{i_1,…,i_k}|, |J_{i_1,i_2}|, so that H_QA is dominated by H_D and the system will be in the easily attainable ground state of H_D with all spins pointing along the x-direction, that is, each variable is in a superposition between true and false. Then, the transverse magnetic field is slowly decreased, reducing the tunneling rates while the system

explores the configuration space. For our simulations, the transverse field is decreased linearly between Monte Carlo steps. At the end of the annealing schedule Γ(t_end) = 0 and H_QA(t_end) = H_P, such that the system freezes in a configuration that provides a guess for a solution of the SAT instance. The temperature is held constant during the annealing process. Open boundary conditions in imaginary time are used since they better reproduce the scaling with system size of a coherent quantum annealer [25, 74] and also improve convergence. When the scaling in problem size is considered, the annealing parameters are optimized to minimize the computational effort needed for finding a solution with a 99% chance.

3.3.2 Using SAT-solvers

WalkSAT (WS) [60] is an efficient open source SAT solver, also used in [55]. For our analysis version 51 from Ref. [75] was used. It was run with the flags -printonlysol=TRUE, -out output_filename, -seed SEED. To obtain multiple solutions, SEED is set to a different positive natural number for every run on the same SAT instance.

3.4 numerical results

In order to construct a filter from solutions to a single k-SAT instance, it is necessary to find multiple and disparate solutions to that instance (see Section 3.1). The number of existing solutions and how different they are depends on how close to the satisfiability threshold the SAT instance is. We investigate both aspects for a fixed filter efficiency. We focus on what kinds of solutions are found by each solver rather than considering the runtime of each solver. We consider at least 20 randomly generated SAT instances for each data set, i.e. for each value of k and clauses-to-variables-ratio. All solvers are run multiple times on each instance, after optimizing both SQA and SA parameters. The relevant quantities are calculated for each instance and then averaged over all instances to get the estimate for the entire problem class. Error bars indicate the standard error of the mean.

3.4.1 Diversity of solutions

We first investigate how many different solutions are found for a given number of repetitions of the various solvers. Figure 3.1 shows the average number of different solutions found when running each solver 2000 times with optimized annealing parameters on a randomly generated 4-SAT instance with n = 50 variables and m = 403 clauses. If independent solutions could be assumed, this clauses-to-variables-ratio would lead to an efficiency of E ≈ 0.75. The number of found solutions is averaged over 20 randomly generated instances, where the value for each instance is reweighted by dividing it by the number of solutions found by SA within 2000 runs. From Figure 3.1 it can be seen that the effort needed for a new solution increases with the number of already found solutions. This assumes that no further techniques such as blocking clauses (see Section 3.4.3) are employed to prevent repeated solutions. The number of different solutions found by SA is bigger than the number of different solutions found by SQA. Used naively, by just providing different random seeds, WS finds the smallest number of different solutions.

[Figure 3.1: fraction of SA-obtained solutions (y-axis) versus number of runs (x-axis) for SQA, SA, and WS.]

Figure 3.1 Average number of unique solutions obtained when running each solver 2000 times on each instance. The number of different solutions obtained is normalized by the maximum number of different solutions found by SA within 2000 runs (about 1300). The left panel shows the average over 20 randomly generated 4-SAT instances with n = 50 variables and a clauses-to-variables-ratio of α_{χ_4} ≈ 8.06. The right panel shows the average over 20 randomly generated 5-SAT instances with n = 50 variables and a clauses-to-variables-ratio of α_{χ_5} ≈ 16.36 (leading to the same efficiency of E ≈ 0.75).

We performed the same measurements for 3-SAT and 5-SAT instances at a clauses-to-variables-ratio which would lead to the same efficiency of E ≈ 0.75, if independent solutions could be assumed. In the k = 3 case, the ranking among the solvers is similar. For k = 5, SQA finds the smallest number of different solutions, especially close to the satisfiability threshold.

3.4.2 Quality of solutions

The next question to answer is whether the solutions found by the different solvers are of the same quality for SAT filter construction. The quality of found solutions is determined by the FPR of a filter constructed from these solutions, which we show in Figure 3.2. In Figure 3.2a) we choose the solutions randomly. Figure 3.2b) shows a subset of solutions with particularly large Hamming distance between each other.¹


Figure 3.2 Average FPR of a filter constructed from solutions found by each solver within 2000 runs. The average is taken over 20 randomly generated 4-SAT instances with n = 50 variables and a clauses-to-variables-ratio of α_{χ_4} ≈ 8.06. a) The solutions used to construct the filter are chosen randomly from the found solutions. b) The solutions used to construct the filter are chosen so that they maximize the average pairwise Hamming distance between the chosen solutions. The dotted cyan line shows the theoretical FPR for a filter constructed from truly independent solutions.

¹ In order to obtain this subset, a random solution was chosen initially. Afterwards, solution by solution, the solution with the highest average pairwise Hamming distance to the already chosen solutions was added to the set.
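The greedy selection described in the footnote can be written down in a few lines; the sketch below is our own minimal version of that procedure (solutions are assumed to be equal-length sequences of truth values), not the exact code used to produce the figures.

```python
import random

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def select_diverse(solutions, num_selected, seed=0):
    """Greedily pick solutions maximizing the average pairwise Hamming
    distance to the already chosen ones, starting from a random solution."""
    rng = random.Random(seed)
    remaining = list(solutions)
    chosen = [remaining.pop(rng.randrange(len(remaining)))]
    while remaining and len(chosen) < num_selected:
        # Maximizing the sum over chosen solutions is equivalent to
        # maximizing the average, since the divisor is the same for all.
        best = max(remaining, key=lambda s: sum(hamming(s, c) for c in chosen))
        remaining.remove(best)
        chosen.append(best)
    return chosen
```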

The results show that the solutions found by SQA lead to a higher FPR than the solutions found by SA and WS. If more than ten solutions are chosen, all solvers provide a significantly worse FPR than truly independent solutions.

The corresponding filter efficiencies are shown in Figure 3.3. Consistent with the higher FPR, the efficiency of a filter constructed using SQA is lower than the efficiency of a filter constructed using SA or WS. Figure 3.3 also shows that only for a small number of solutions is the efficiency of the constructed filter comparable to the efficiency resulting from independent solutions, and only if they are specifically selected. The maximum number of solutions that still leads to an efficiency comparable to truly independent solutions is about five for SQA and between seven and ten for SA and WS. This result indicates that combining the two techniques suggested in [55] might be useful: Instead of using either multiple sets of hash functions and one solution per instance, or one set of hash functions and many solutions of the resulting instance, it might be preferable to take multiple sets of hash functions and still find multiple solutions per instance.


Figure 3.3 Average efficiency of a filter constructed from solutions found by each solver within 2000 runs. The average is taken over 20 randomly generated 4-SAT instances with n = 50 variables and a clauses-to-variables-ratio of α_{χ_4} ≈ 8.06. a) The solutions used to construct the filter are chosen randomly from the found solutions. b) The solutions used to construct the filter are chosen so that they maximize the average pairwise Hamming distance between the chosen solutions. The dotted cyan line shows the efficiency of a filter constructed from truly independent solutions.

It is worth pointing out that the closer to the satisfiability threshold we are, the fewer solutions exist, and the less likely the found solutions are to be independent. As long as no further techniques like blocking clauses are employed (see Section 3.4.3), the approach of looking for many solutions of one instance is generally not useful when targeting a high efficiency. In order to obtain a high efficiency, the instances have to be generated too close to the satisfiability threshold for many sufficiently disparate solutions to be found. An FPR as predicted by assuming independent solutions is hence not achievable. The worse performance of SQA in terms of finding disparate solutions becomes more apparent for these instances, which only have a small number of solutions.


Figure 3.4 Pairwise Hamming distance between different solutions randomly picked from the solutions obtained by each solver within 2000 runs. The Hamming distance was calculated for up to 200 different solutions found for each of 20 randomly generated 4-SAT instances with n = 50 variables and a clauses-to-variables-ratio of α_{χ_4} ≈ 8.06. An average was taken over the instances. a) Shows the average pairwise Hamming distance between the picked solutions. b) Shows the maximum pairwise Hamming distance between the picked solutions.

The reason for the differences in the FPRs can be further investigated by looking at the average or maximum Hamming distance between solutions. The average and maximum Hamming distance, taken pairwise between different solutions found by each solver, is shown in Figure 3.4, averaged over 20 instances. The result shows that the solutions found by SQA have a much higher overlap than the solutions found by SA and WS, resulting in a higher FPR for the SQA solutions. Without imposing further requirements, the approximation in Eq. (3.1) is worse for SQA than for the other solvers.

Furthermore, we did not find evidence that SQA finds solutions that feature a particularly large Hamming distance to the set of solutions found by the other solvers.

SQA, SQA     SA, SA      WS, WS      SQA, SA     SQA, WS     SA, WS
15.6 ± 1.4   18.9 ± 0.6  20.2 ± 0.2  17.9 ± 1.1  19.1 ± 0.3  19.7 ± 0.4

Table 3.1 Average Hamming distance between two solutions of a randomly generated 4-SAT problem, found by different solvers. The value provided is the average taken over 20 randomly generated instances with n = 50 and α_{χ_4} ≈ 8.06. The solutions were randomly chosen among the solutions found within 800 runs by the respective solvers. Truly independent solutions would have an average Hamming distance of 25.

Analogous plots and evaluations for k = 3 and k = 5 lead to the same rankings and conclusions.


Figure 3.5 Probability for a) WS and b) SA to find the SQA-hardest and SQA-easiest solutions. The x-axis indicates the percentage of easiest and hardest SQA-found solutions, running SQA 2000 times on 20 randomly generated 4-SAT instances with n = 50 variables and a clauses-to-variables-ratio of α_{χ_4} ≈ 8.06. An average was taken over the instances.

Figure 3.6 Probability for a) WS and b) SA to find the SQA-hardest and SQA-easiest solutions, running SQA 1000 times on 20 randomly generated 5-SAT instances with n = 50 variables and a clauses-to-variables-ratio of α_{χ_5} ≈ 16.36 (leading to the same efficiency of E ≈ 0.75).


Figure 3.7 Probability for a) WS and b) SA to find the SQA-hardest and SQA-easiest solutions, running SQA 1000 times on 20 randomly generated 5-SAT instances with n = 48 variables and a clauses-to-variables-ratio of α_{χ_5} ≈ 19.6 (leading to a higher efficiency of E ≈ 0.9). One instance, for which SQA did not find any solution, was taken out of the statistics. This clauses-to-variables-ratio is close to the satisfiability threshold α_0(5) ≈ 21.11 of 5-SAT [76]. Hence, many of the randomly generated instances only feature a small number of existing solutions. Of our 20 instances, the mean number of existing solutions was 40, and two instances had fewer than ten.

Finally, we investigate whether the solutions which are easily found by SQA are particularly unlikely to be found by SA and WS. If SQA could easily find solutions that are hard to find for the other solvers, it could be used to contribute complementary solutions. As can be seen in Figures 3.5–3.7, we find no evidence for this.

3.4.3 Scaling with problem size

Lastly, we compare the solvers' abilities to find one solution to a given SAT problem. This is relevant if the filter is constructed by finding one solution to several different instances, and might also be indicative of the solvers' performance when blocking clauses [77] are used. A blocking clause is a clause which evaluates to true for all variable assignments except for some specific assignments, which are not desired. Adding a blocking clause to a random k-SAT problem removes all solutions with the undesired assignments of variables from the set of solutions. A blocking clause gives a positive contribution to the energy for configurations corresponding to solutions that have already been found. Therefore, any solution found to the new problem is necessarily different from the solutions previously obtained. It is also possible to add clauses which not only block a certain solution, but certain variable assignments. By blocking certain variable assignments it can be ensured that any solution to the new problem has a certain Hamming distance to the solutions found previously. Such procedures, on the other hand, make it less likely for a solver to still be able to find solutions.
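In the clause encoding used in the earlier sketches, a blocking clause for a previously found solution is obtained by taking, for every variable, the literal that is false under that solution; the resulting clause is then violated by exactly that assignment. A minimal, purely illustrative sketch:

```python
def blocking_clause(solution):
    """Clause of width n that evaluates to false only for `solution`:
    for each variable, pick the literal that the solution makes false."""
    return [(v, bool(value)) for v, value in enumerate(solution)]

# Appending blocking_clause(found) to the instance removes exactly the
# assignment `found` from the solution set of the extended problem.
```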

We compare SQA to SA for the task of finding any one solution to a SAT instance, since the natural similarity between SA and SQA makes it easy to compare their performance. Looking at the average computational effort needed to find a solution to a random 4-SAT problem with 99% probability, we see no evidence that SQA requires a lower computational effort for any of the tested problem sizes, consistent with the findings of Refs. [78–81]. There is no indication either that SQA scales better with the number of variables than SA for random SAT instances when keeping the clauses-to-variables-ratio constant. We therefore conclude that SQA, for the tested instances and using a linear schedule with a stoquastic driver Hamiltonian, is not superior to SA, even when it comes to finding one solution to a SAT instance.

3.5 possible improvements

Our results indicate that SQA finds a smaller number of different solutions than SA and that the found solutions are generally less disparate than the solutions found by the other solvers, which leads to a higher FPR and lower efficiency of filters constructed from them. We find no evidence that the solutions found by SQA are particularly hard to find for the other solvers tested. Furthermore, our results show no scaling advantage of SQA over SA regarding the computational effort required to find one solution with fixed probability. SQA performs worse than the other solvers on every metric that was investigated. Based on prior work, our expectation is that the observed behaviors and properties would also apply to annealing on analog quantum computing devices, such that we do not expect any benefit of quantum annealing for the construction of SAT filters. The presented simulations are best-case scenarios for physical devices in the sense that, even though we restricted our simulations to a simple enough driver Hamiltonian, we still assumed hardware with arbitrary connectivity and multi-spin couplings along the z-direction. We expect the performance to further decrease when the problem is mapped to a specific hardware graph and few-qubit couplers.

We did not investigate nonlinear annealing schedules, which might improve the efficiency of QA and SQA. Improvements to annealing by constructing an adaptive schedule are discussed in the next chapter. Similarly, a non-stoquastic driver Hamiltonian may be beneficial. As theoretically discussed in Ref. [66] and indicated by recent experimental results [67] using a D-Wave 2X quantum annealing machine, QA needs more complex (non-stoquastic) driving Hamiltonians in order to find degenerate ground states with equal probability. Furthermore, Ref. [66] predicts that QA can indeed find all degenerate ground states, if quantum transitions between them are possible. This can be done by modifying the driver Hamiltonian in Eq. (3.2), so that it induces quantum transitions between all states:

H_D^new(t) = −Γ(t) ( ∑_{i_1} σ^x_{i_1} + ∑_{i_1,i_2} σ^x_{i_1} σ^x_{i_2} + ∑_{i_1,i_2,i_3} σ^x_{i_1} σ^x_{i_2} σ^x_{i_3} + … )

Unfortunately, such terms are difficult to implement both in simulations and in experimental devices. On a digital quantum computer, on the other hand, the implementation of more complex driving Hamiltonians should be easier.

4 ANNEALING SCHEDULES

This chapter contains content from Publication VI.

Numerical comparisons between the performance of quantum and classical annealing for two-dimensional (2D) Ising spin glass systems have not shown any quantum speedup [68, 72]. The lack of speedup in this case may be due to the energy landscape of 2D spin glasses, with shallow but broad barriers that are easier to thermally surmount than to tunnel through [73]. Extending such simulations to three dimensions, we find that whether classical or quantum annealing exhibits better performance depends on the specific choice of annealing schedules. An apparent advantage of one method may simply be due to a bad or suboptimal choice of annealing schedule for the other method. To achieve a fair comparison between classical annealing, quantum annealing, and other optimization methods, it is hence paramount that the parameters and schedules are properly optimized, otherwise wrong conclusions regarding the efficiency are drawn.

In this chapter, we therefore introduce a heuristic approach for the optimization of annealing schedules for quantum annealing by generalizing heuristics used to optimize classical annealing, and apply it to 3D Ising spin glass problems. We demonstrate that the constructed nonlinear schedule is resistant against sub-optimally chosen initial values for the transverse field Γ_0 on random 3D Ising spin glasses. The proposed heuristic furthermore leads to improvements over naïve schedules and allows for a fair comparison of classical versus quantum annealing.

4.1 adaptive schedule

In classical annealing the control parameter is the temperature, whose change over time is given by an annealing schedule. If the change is constant we call it a linear schedule and the update rule for the temperature is given by β(t) = β_0 + λ̃t,


or in a discretized form

β_k = β_0 + λk = β_{k−1} + λ,

where β_k is the inverse temperature at the k-th update sweep. For such a linear schedule, only the initial and final values of β may need to be optimized, with intermediate values that are obtained by linear interpolation. Instead of guessing a schedule, one can determine optimized adaptive schedules by using a heuristic algorithm to optimize the schedule in such a way that interesting regions where large changes to configurations may occur (e.g. close to phase transitions) are passed through more slowly. An indicator for the size of a temperature step can be the specific heat

C_V = k_B β² (d²/dβ²) log Z = −k_B β² (d/dβ)⟨E⟩,    (4.1)

where Z is the partition function. C_V can be calculated from the fluctuation-dissipation theorem as

β^{−2} C_V ≡ σ ≡ −(d/dβ)⟨E⟩ = ⟨E²⟩ − ⟨E⟩².    (4.2)

In order to achieve a constant decrease in energy at each step, one aims for the rate of change of the energy to be constant throughout the annealing process:

d⟨E⟩/dt = −λ,

where the scale factor λ sets the targeted change in energy. The quantity ⟨E⟩ is only implicitly dependent on time through its dependence on the temperature, such that one can apply the chain rule

(d⟨E⟩/dβ)(dβ/dt) = −λ.

Inserting Eq. (4.2) one obtains

σ(β) (dβ/dt) = λ,

which can be discretized to obtain an update rule for the adaptive schedule

β_{k+1} = β_k + λ/σ_k.
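A sketch of the resulting classical adaptive schedule: the energy variance σ_k is estimated at the current inverse temperature from Monte Carlo samples and β is then advanced by λ/σ_k. The routines sweep and measure_energy are placeholders for whatever sampler is used (e.g. single-spin-flip Metropolis updates); this is a generic illustration rather than the code used for our simulations.

```python
def adaptive_classical_schedule(state, beta0, beta1, lam,
                                sweep, measure_energy, samples_per_step=100):
    """Advance beta by lam / sigma_k per step, where sigma_k is the measured
    energy variance at the current inverse temperature (cf. Eq. (4.2))."""
    beta = beta0
    schedule = [beta]
    while beta < beta1:
        energies = []
        for _ in range(samples_per_step):
            state = sweep(state, beta)            # one Monte Carlo sweep at beta
            energies.append(measure_energy(state))
        mean = sum(energies) / len(energies)
        var = sum((e - mean) ** 2 for e in energies) / len(energies)
        beta += lam / max(var, 1e-12)             # guard against zero variance
        schedule.append(beta)
    return schedule, state
```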

It can be observed that this schedule proceeds more slowly in regions close to the phase transition, where C_V is large.

In order to derive a similarly optimized schedule for quantum annealing we use a quantity akin to the specific heat in classical systems. Starting from the quantum mechanical partition function Z(s) = tr[e^{−βH(s)}] we first derive an adaptive schedule for the Hamiltonian

H = s H_P + (1 − s) H_D.    (4.3)

Substituting the derivative with respect to β by one with respect to the quantum control parameter s in Eq. (4.1) we define

C(s) = (1/β) (d²/ds²) log(Z),

which will determine the annealing schedule. Performing the derivatives we obtain

C(s) = −(d/ds)⟨H_P − H_D⟩_s = β ( ⟨(H_P − H_D)²⟩_s − ⟨H_P − H_D⟩_s² )

Note that the expectation values in this equation are dependent on the parameter s. Similar to the classical procedure, we aim for a constant change in time: C(s) (ds/dt) = λ. Thus we obtain an update rule for the quantum annealing schedule:

s_{k+1} = s_k + (1/β) · λ / √( ⟨(H_P − H_D)²⟩_s − ⟨H_P − H_D⟩_s² )

Instead of following the “cross-over” schedule given in Eq. (4.3), one could also keep HP fixed and only vary the transverse field Γ:

H = H_P + (1 − s) H_D.    (4.4)

In that case, a similar derivation leads to the simpler rule

s_{k+1} = s_k + (1/(βΓ_0)) · λ / √(1 − ⟨σ^x⟩_s²).

Γ_0 corresponds to the initial transverse field of the annealing schedule and an appropriate value, depending on the couplings and field strengths in the target Hamiltonian, usually needs to be determined beforehand for the annealing to be effective. Using an adaptive schedule, our numerical simulations presented in the next section indicate resilience towards suboptimal choices of Γ_0.
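The transverse-field-only rule can be sketched analogously to the classical one above; run_sweeps and measure_sx are placeholders for the SQA update routine and for the ⟨σ^x⟩ estimator of Section 4.2 (a generic illustration under those assumptions):

```python
import math

def adaptive_quantum_schedule(beta, gamma0, lam, run_sweeps, measure_sx):
    """Advance the control parameter s, with Gamma(s) = (1 - s) * gamma0,
    by lam / (beta * gamma0 * sqrt(1 - <sigma_x>**2)) per step."""
    s = 0.0
    schedule = [s]
    while s < 1.0:
        run_sweeps(s)                              # SQA sweeps at the current s
        sx = measure_sx()                          # current estimate of <sigma^x>
        denom = beta * gamma0 * math.sqrt(max(1.0 - sx * sx, 1e-12))
        s = min(1.0, s + lam / denom)
        schedule.append(s)
    return schedule
```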

Note that for classes of random instances, the idea is not to optimize the schedule for each individual instance, but instead to average the measured quantities over a limited set of instances and use this average to define an adaptive schedule for the entire class.

4.2 calculation of observables

To evaluate the expectation value of the SQA counterpart of the specific heat C_q, we need to compute the value of ⟨σ^x⟩. A derivation of how to calculate this observable using the path integral can be found in [82] and is reproduced in the following. Generally, an observable can be evaluated as

⟨O⟩ = (1/Z) tr[O e^{−βH}].

Thus the expectation value of σ_i^x evaluates to

⟨σ_i^x⟩ = ∑_σ ⟨σ| σ_i^x e^{−βH} |σ⟩ / ∑_σ ⟨σ| e^{−βH} |σ⟩.

Introducing the Trotter decomposition e^{−βH} = (e^{−(β/M)H})^M with τ = β/M and inserting identity operators between the factors yields an estimate ⟨σ_i^x⟩_k for every Trotter slice k. Since there is freedom in choosing for which Trotter slice to evaluate σ_i^x, one can also take the average of all these choices:

⟨σ_i^x⟩ = (1/M) ∑_{k=1}^{M} ⟨σ_i^x⟩_k

Now, with the introduction of a small error of order O(τ²), one can split

e^{−τ(H_p + Γ∑_i σ_i^x)} = e^{−τH_p} e^{−τΓ∑_i σ_i^x}

and then find for the expectation value

⟨σ_i^x⟩ = (1/M) ∑_{k=1}^{M} ⟨σ_i^{k+1}| σ_i^x e^{−τΓ∑_i σ_i^x} |σ_i^k⟩ / ⟨σ_i^{k+1}| e^{−τΓ∑_i σ_i^x} |σ_i^k⟩

Using ⟨↑| e^{aσ^x} |↑⟩ = cosh(a) and ⟨↑| e^{aσ^x} |↓⟩ = sinh(a), one can obtain the expectation value of the Pauli spin-x operator:

⟨σ_i^x⟩ = (1/M) ∑_{k=1}^{M} tanh(−τΓ)^{s_i^k s_i^{k+1}}

For the more specific schedule relying on the Hamiltonian

H = H_p + Γ(t) ∑_{i=0}^{N} σ_i^x

one can get an expectation value for C_q by the following formula:

C_q = (1/β) (d²/dΓ²) log(Z) = βΓ² (1 − ⟨σ^x⟩²)

Finally, one should note that any constant pre-factor in the measurement of C_q will cancel later on when λ is determined by fixing the number of MCS.
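A sketch of this estimator for a path-integral configuration stored as an M × n array of ±1 spins (replicas[k, i] is spin i on Trotter slice k; periodic boundaries in imaginary time are assumed here purely for brevity, and the overall prefactor of C_q is irrelevant as noted above):

```python
import numpy as np

def sigma_x_estimate(replicas, tau, gamma):
    """<sigma^x> averaged over sites and Trotter slices, using
    <sigma_i^x>_k = tanh(-tau * Gamma) ** (s_i^k * s_i^{k+1})."""
    t = np.tanh(-tau * gamma)
    aligned = replicas == np.roll(replicas, -1, axis=0)   # s_i^k == s_i^{k+1}
    # exponent +1 for aligned neighbouring slices, -1 for anti-aligned
    return np.mean(np.where(aligned, t, 1.0 / t))

def c_q(beta, gamma, sx):
    """SQA counterpart of the specific heat, up to a constant prefactor."""
    return beta * gamma ** 2 * (1.0 - sx ** 2)
```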

4.3 numerical results

While the presented adaptive schedule is derived using the system Hamiltonian given in Eq. (4.3) for the sake of generality, a more common annealing procedure [68, 71] is the one given by Eq. (4.4), where the problem Hamiltonian is kept at constant strength while the transverse field is decreased from Γ_0 to 0. For the remainder of this chapter, we will hence use the latter schedule, obtained by an ensemble average of ⟨σ^x⟩_s over a set of 1000 random, uniformly distributed 3D Ising spin glass instances on a simple cubic lattice. We investigate 3D spin glasses because of the argument [73] that random 3D Ising spin glasses are more likely to profit from quantum tunneling since they exhibit a non-zero temperature phase transition [83].

In Figure 4.1a we show the expectation value of the denominator of the step size, averaged over 1000 instances. In the beginning, when the

transverse field is strong, the system is close to an eigenstate of σ^x and the problem Hamiltonian does not influence the dynamics much. Thus, the step size is large. Figure 4.1b shows the linear and the adaptive schedules for optimized and unoptimized initial field Γ_0. Determining the best choice of the starting value Γ_0 for a linear schedule yields Γ_0 = 1.5. The final value of the transverse field needs to be Γ = 0 in order to recover the problem Hamiltonian at the end of the annealing process. Consistent with Figure 4.1a, one can see that the adaptive schedules are slower in regions where the transverse field is weak.


Figure 4.1 a) The left plot depicts the expectation value of the step size denominator as a function of s. It can be seen that initially, for small values of s, annealing proceeds fast and slows down towards the end. Measurements were performed at β = 32 and Γ_0 = 10. The expectation value for this plot was obtained using a set of 100 random 3D Ising instances with couplings chosen uniformly from [−1, 1]. The transverse field relates to the control parameter via Γ(s) = (1 − s) Γ_0. b) The right plot shows a linear schedule and adaptive schedules with optimized and unoptimized initial transverse field Γ_0. Close to the optimal value (Γ_0 ≈ 1.5) one can see only minor differences in the two schedules, but for unoptimized starting values (Γ_0 = 7) the adaptive schedule anneals much faster at the beginning such that less time is spent in unimportant regions.

Since neither SA nor SQA is guaranteed to find the exact ground state, we investigate the residual energy E_res = E − E_0 as a metric to compare the efficiency of different annealing schemes. It is defined as the difference between the energy of the (local) minimum E found in an annealing run and

the ground state energy E_0. The residual energy is a function of the annealing time t_a, which in our simulations is given in units of Monte Carlo sweeps. The residual energy for different schedules is shown in Figure 4.2a, which demonstrates improved performance of the adaptive schedules over the linear ones. The improvement is especially large if one starts with unoptimized, large initial transverse fields Γ_0. In those cases the adaptive schedule quickly reduces the transverse field and then slows down when entering the interesting parameter region. Conversely, one can see in Figure 4.2b that, for an optimized Γ_0, the adaptive schedule is close to a linear one, with similar performance. As observed in Ref. [68] for 2D Ising spin glasses, SQA for short annealing times can initially lower the energy much faster than SA but tends to get stuck in local minima. Figure 4.2b shows that this initial fast convergence happens earlier at higher temperature. Yet, for every choice of inverse temperature, SQA suffers from getting stuck in local minima. SA does not display this problem, which is why SA outperforms any SQA run after sufficient annealing time.


Figure 4.2 a) Comparison of the median residual energy between linear and adaptive schedules at different starting values Γ_0. The linear schedule steadily degrades with increasingly suboptimal Γ_0 whereas the adaptive schedule degrades only slightly. b) Comparison of the median residual energy for different values of β between adaptive and linear SQA schedules and linear SA. One can see almost no qualitative difference between the two schedules when only optimized parameters are used.

The performance of the adaptive schedule has also been investigated on a 3D ferromagnet whose ground-state degeneracy was lifted by a local

field in the σ^z direction. The schedule's performance degraded slightly with unoptimized parameters, but the overall performance compared to unoptimized linear schedules was still better and the same qualitative behavior was observed.

Apart from the proposed adaptive schedule, we performed a comparison of different nonlinear parameterizations of annealing schedules for SQA, but found no speedup for optimized parameters compared to the linear schedule for any of them. Specifically, in addition to the quantum annealing schedule with constant temperature, we tested the schedule described by Eq. (4.3), mixing both classical and quantum annealing, where the transverse field was linearly decreased while the inverse temperature was linearly increased. For optimized parameters, we did not see any significant difference in performance compared to the schedule following Eq. (4.4). Finally, we would like to remark that the comparison between SA and SQA for 3D Ising spin glasses in general gives similar results as in 2D. The existence of a finite temperature phase transition does not have a major influence on the relative performance between SA and SQA, and we see the same behavior as observed in Ref. [68] for 2D Ising spin glasses. While SQA rapidly finds a low energy local minimum, increasing the annealing time leads to SA finding lower energy states. A similar conclusion is drawn in Ref. [84], where quantum speed-up is obtained in the random ferromagnetic Ising chain model, i.e. a system for which SA encounters no phase transition at any finite temperature, while (S)QA does.

5 QUANTUM WALKS

This chapter contains content from Publication II.

Markov chain Monte Carlo (MCMC) methods are a cornerstone of modern computation, with applications ranging from computational science to machine learning. The key idea is to sample a distribution π_x by constructing a random walk W which reaches this distribution at equilibrium, Wπ = π. One important characteristic of a Markov chain is its mixing time, the time it requires to reach equilibrium. This mixing time is governed by the inverse spectral gap of W, where the spectral gap ∆ is defined as the difference between its two largest eigenvalues. The runtime of a MCMC algorithm is thus determined by the product of the mixing time and the time required to implement a single step of the walk. Szegedy [85] presented a general method to quantize reversible walks, resulting in a unitary transformation U_W. The eigenvalues e^{iθ_j} of a unitary matrix all lie on the complex unit circle, and we choose 0 = θ_0 ≤ θ_1 ≤ θ_2 ≤ …. The steady state |π⟩ of the quantum walk is essentially a coherent version |π⟩ = ∑_x √π_x |x⟩ of the classical equilibrium distribution π. The main feature of the quantum walk is that its spectral gap δ := θ_1 ≥ √∆ is quadratically larger than its classical counterpart. Combined with the quantum adiabatic algorithm [86–88], this yields a quantum algorithm to reach the steady state that scales quadratically faster with ∆ than the classical MCMC algorithm [89].

While at first glance this is an important advantage with far-reaching applications, additional considerations must be taken into account to determine if quantum walks offer a significant speedup for any specific application. One of the reasons is that it could take significantly longer to implement a single step U_W of the quantum walk than to implement a step W of the classical walk. Thus, quantum walks are more likely to offer advantages in situations with extremely long equilibration times. Moreover, we must address the fact that classical walks are often used heuristically out of equilibrium. When training a neural network for instance, where a stochastic method such as stochastic gradient descent is used to minimize a cost function, it is in practice often not necessary to reach the true minimum, and

thus the MCMC runs in time less than its mixing time. Similarly, simulated annealing is typically used heuristically with cooling schedules far faster than prescribed by provable bounds – and combined with repeated restarts. Such heuristic applications further motivate the construction of efficient implementations of U_W, and the development of heuristic methods for quantum computers.

In this chapter, we present a detailed realization and cost analysis of the quantum walk operator for the special case of a Metropolis-Hastings walk [90, 91]. This is a widespread reversible walk, whose implementation only requires knowledge of the relative populations π_x/π_y of the equilibrium distribution. While Szegedy's formulation of the quantum walk builds on a classical walk oracle, a direct implementation of this oracle requires costly arithmetic operations. We thus reformulate the quantum walk, circumventing the implementation of such an oracle by closely following the classical Metropolis-Hastings walk. We construct a related but different quantum unitary walk operator with an effort to minimize circuit depth. We look at heuristic quantum algorithms inspired by the adiabatic algorithm that use the quantum walk in the context of discrete optimization problems and numerically study their performances.

5.1 szegedy’s quantum walk

We define a classical walk on a d-dimensional state space X = {x} by a d × d transition matrix W where the transition probability x → y is given by the matrix element W_yx. Thus, the walk maps the distribution p to the distribution p′ = Wp, where p′_y = ∑_x W_yx p_x. A walk is irreducible if every state in X is accessible from every other state in X; an irreducible aperiodic walk has a unique equilibrium distribution π = Wπ. Finally, a walk is reversible if it obeys the detailed balance condition

W_yx π_x = W_xy π_y.    (5.1)

Szegedy’s quantum walk [85] is a quantization of a reversible classical walk W and formulated in an oracle setting. For a classical walk W, it assumes a unitary transformation W acting on C^d ⊗ C^d with the following action

W|x⟩ ⊗ |0⟩ = |w_x⟩ ⊗ |x⟩ =: |φ_x⟩,    (5.2)

where |w_x⟩ := ∑_y √(W_yx) |y⟩. Define Π_0 as the projector onto the subspace E_0 spanned by the states {|x⟩ ⊗ |0⟩}_{x=1}^{d}. Combining W with the reflection R = 2Π_0 − I and the swap operator Λ, we can construct the quantum walk defined by

U_W := R W†ΛW = (2Π_0 − 1) W†ΛW.

Szegedy's walk is defined as ΛW(RW†ΛW)RW†, so it is essentially the square of the operator U_W we have defined, but this will have no consequence on what follows aside from a minor simplification.

5.1.1 Eigenvectors and eigenvalues

To analyze the quantum walk U_W, let us define the state |ψ_x⟩ := Λ|φ_x⟩ = |x⟩ ⊗ |w_x⟩ and consider the operator

X := Π_0 W†ΛW Π_0 = ∑_{xy} ⟨φ_y|ψ_x⟩ |y⟩⟨x| ⊗ |0⟩⟨0| = ∑_{xy} √(W_xy W_yx) |y⟩⟨x| ⊗ |0⟩⟨0|

At this point, in order to use the detailed balance condition of Eq. (5.1), we need to assume that the walk is reversible to obtain

X = ∑_{xy} W_yx √(π_x/π_y) |y⟩⟨x| ⊗ |0⟩⟨0|

or, if we restrict the operator X to its support E_0, we get in matrix notation X = diag(π^{−1/2}) W diag(π^{1/2}). The matrices X and W are thus similar, so they have the same eigenvalues. Define its eigenvectors

X|γ̃_k⟩ = λ_k|γ̃_k⟩

where λ_k are the eigenvalues of W. Because the operator X is obtained by projecting the operator W†ΛW onto the subspace E_0, its eigenvectors with non-zero eigenvalues in the full Hilbert space must have the form |γ_k⟩ = |γ̃_k⟩ ⊗ |0⟩. If we consider the action of W†ΛW without those projections, we get

W†ΛW|γ_k⟩ = λ_k|γ_k⟩ − β_k|γ_k^⊥⟩    (5.3)

where |γ_k^⊥⟩ is orthogonal to the subspace E_0, so in particular it is orthogonal to all the vectors |γ_{k′}⟩. Finally, because W†ΛW is a unitary, we also obtain

that the |γ_k^⊥⟩ are orthogonal to each other and that β_k = √(1 − |λ_k|²). This implies that the vectors {|γ_k⟩, |γ_k^⊥⟩} are all mutually orthogonal and that W†ΛW is block diagonal in that basis. Given the above observations, it is straightforward to verify that

U_W |γ_k⟩ = λ_k|γ_k⟩ + √(1 − |λ_k|²) |γ_k^⊥⟩
U_W |γ_k^⊥⟩ = √(1 − |λ_k|²) |γ_k⟩ − λ_k|γ_k^⊥⟩

so the eigenvalues of U_W on the subspace spanned by {|γ_k⟩, |γ_k^⊥⟩} are e^{±iθ_k} where cos θ_k = λ_k, with corresponding eigenvectors

|γ_k^±⟩ = (1/√2) (|γ_k⟩ ± i|γ_k^⊥⟩)
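The eigenphase relation θ_k = arccos(λ_k), and hence the quadratically enlarged gap, can be checked numerically for a small Metropolis chain by building a dense U_W with an arbitrary unitary completion of the oracle. The sketch below is purely illustrative (dense matrices, a toy 4-state chain); the eigenvalues contributed by the completion depend on how the oracle is extended, but the phases ±arccos(λ_k) always appear.

```python
import numpy as np

def metropolis_chain(energies, beta):
    """Column-stochastic Metropolis matrix W[y, x] with uniform proposals."""
    d = len(energies)
    W = np.zeros((d, d))
    for x in range(d):
        for y in range(d):
            if y != x:
                W[y, x] = min(1.0, np.exp(-beta * (energies[y] - energies[x]))) / (d - 1)
        W[x, x] = 1.0 - W[:, x].sum()
    return W

def szegedy_walk(W):
    """Dense U_W = (2 Pi_0 - 1) W_orc^T Lambda W_orc, with the oracle
    W_orc |x>|0> = |w_x>|x> completed to a unitary via an orthonormal
    basis of the orthogonal complement."""
    d = W.shape[0]
    D = d * d
    idx = lambda a, b: a * d + b                   # basis ordering |a>|b>
    Phi = np.zeros((D, d))                         # columns |phi_x>
    for x in range(d):
        for y in range(d):
            Phi[idx(y, x), x] = np.sqrt(W[y, x])
    U_full = np.linalg.svd(Phi, full_matrices=True)[0]
    complement = U_full[:, d:]                     # orthonormal, orthogonal to Phi
    W_orc = np.zeros((D, D))
    W_orc[:, [idx(x, 0) for x in range(d)]] = Phi
    W_orc[:, [idx(a, b) for a in range(d) for b in range(1, d)]] = complement
    Lam = np.zeros((D, D))                         # swap |a>|b> -> |b>|a>
    for a in range(d):
        for b in range(d):
            Lam[idx(b, a), idx(a, b)] = 1.0
    Pi0 = np.zeros((D, D))
    for x in range(d):
        Pi0[idx(x, 0), idx(x, 0)] = 1.0
    return (2 * Pi0 - np.eye(D)) @ W_orc.T @ Lam @ W_orc

W = metropolis_chain(np.array([0.0, 0.3, 0.7, 1.0]), beta=2.0)
lam = np.sort(np.linalg.eigvals(W).real)[::-1]
theta = np.arccos(np.clip(lam, -1.0, 1.0))         # classical eigenvalues -> phases
phases = np.abs(np.angle(np.linalg.eigvals(szegedy_walk(W))))
print("arccos(lambda_k):", np.round(theta, 6))
print("present in U_W  :", [bool(np.min(np.abs(phases - t)) < 1e-7) for t in theta])
```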

5.1.2 Adiabatic state preparation

We can use quantum phase estimation [92] to measure the eigenvalues of U_W. In particular, we want this measurement to be sufficiently accurate to resolve the eigenvalue θ = 0, or equivalently λ_k = 1, from the rest of the spectrum. Assuming that the initial state is supported on the subspace E_0, the spectral gap of U_W is δ = θ_1 = arccos(λ_1) = arccos(1 − ∆) ∼ √∆, so we only need about 1/√∆ applications of U_W to realize that measurement. This is quadratically faster than the classical mixing time 1/∆, which is the origin of the quadratic quantum speed-up. A measurement outcome corresponding to θ = 0 would produce the coherent stationary distribution |π⟩ ⊗ |0⟩ := ∑_x √π_x |x⟩ ⊗ |0⟩. Indeed, first note that for any |ψ⟩ such that X(|ψ⟩ ⊗ |0⟩) = |ψ⟩ ⊗ |0⟩, Eq. (5.3) implies that U_W(|ψ⟩ ⊗ |0⟩) = |ψ⟩ ⊗ |0⟩. We can verify that this condition holds for |ψ⟩ = |π⟩:

X ∑_x √π_x |x⟩ ⊗ |0⟩ = ∑_{xy} W_yx √(π_x/π_y) √π_x |y⟩ ⊗ |0⟩ = ∑_{xy} W_xy √π_y |y⟩ ⊗ |0⟩ = ∑_y √π_y |y⟩ ⊗ |0⟩

where we have used detailed balance Eq. (5.1) in the second step and ∑_x W_xy = 1 in the last step.

From an initial state |ψ⟩ ⊗ |0⟩ = ∑_k α_k|γ_k⟩, the probability of that measurement outcome is |⟨ψ|π⟩|² = |α_0|². Therefore, the initial state |ψ⟩ must be chosen with a large overlap with the fixed point to ensure that this measurement outcome has a non-negligible chance of success. If no such state can be efficiently prepared, one can use adiabatic state preparation [86, 87] to increase the success probability. In its discrete formulation [89] inspired by the quantum Zeno effect, we can choose a sequence of random walks W^(0), W^(1), …, W^(L) = W with coherent stationary distributions |π^j⟩. The walks are chosen such that |π^0⟩ is easy to prepare and consecutive walks are nearly identical, so that |⟨π^j|π^{j+1}⟩|² ≥ 1 − 1/L [89]. Thus, the sequence of L measurements of the eigenstate of the corresponding quantum walk operators U_{W^(j)} all yield the outcomes θ = 0 with probability (1 − 1/L)^L ∼ 1/e, which results in the desired state. The overall complexity of this algorithm is

C ∑_{j=1}^{L} δ_j^{−1}

where δ_j is the spectral gap of the j-th quantized walk W^(j) and C is the time required to implement a single quantum walk operator.

5.2 quantization for metropolis-hastings algorithm

The Metropolis-Hastings algorithm [90, 91] is widely used to generate a Boltzmann distribution, with applications in statistical physics and machine learning. It constructs a Markov chain obeying the detailed balance condition in Eq. (5.1), and can be applied to quantum mechanical Hamiltonians [93] to benefit from a quadratic speed-up using Szegedy's quantization procedure [94]. The basic idea is to break the calculation of the transition probability $x \to y$ into two steps. First, a transition from $x$ to $y \neq x$ is proposed with probability $T_{yx}$. Then, this transition is accepted with probability $A_{yx}$ and otherwise rejected, in which case the state remains $x$. The overall transition probability is thus
$$W_{yx} = \begin{cases} T_{yx}A_{yx} & \text{if } y \neq x \\ 1 - \sum_{y} T_{yx}A_{yx} & \text{if } y = x.\end{cases}$$
The detailed balance condition Eq. (5.1) becomes
$$R_{xy} := \frac{A_{yx}}{A_{xy}} = \frac{\pi_y T_{xy}}{\pi_x T_{yx}},$$

which in the Metropolis-Hastings algorithm is solved with the choice $A_{yx} = \min\{1, R_{xy}\}$. We note that our quantum algorithm can also be applied to the Glauber choice, also known as the heat-bath choice [95, 96],
$$A_{yx} = \frac{1}{1 + R_{yx}}.$$
Given a real energy function $E(x)$ on the configuration space $X$, the Boltzmann distribution at inverse temperature $\beta$ is defined as $\pi_x^\beta = \frac{1}{Z(\beta)}e^{-\beta E(x)}$, where the partition function $Z(\beta)$ ensures normalization. In this setting, it is common practice to choose a symmetric proposed transition probability $T_{yx} = T_{xy}$, so the acceptance probability depends only on the energy difference,
$$A_{yx} = \min\left\{1, e^{\beta[E(x) - E(y)]}\right\}.$$
For concreteness, we will assume a $(k, d)$-local Ising model with $X = \{+1, -1\}^n$, where interactions involve at most $k$ spins and each spin interacts with at most $2d$ other spins. The energy function takes the simple form
$$E(x) = \sum_\ell J_\ell \prod_{s \in \Omega_\ell} x_s \qquad (5.4)$$

where the $\Omega_\ell$ are subsets of at most $k$ Ising spins, and the $J_\ell$ are real coupling constants where $\ell$ ranges over all the possible couplings. For $k = 2$ and $d \geq 3$, finding the ground state is an NP-hard problem [97]. As is always the case for Ising models, we will assume that the proposed transitions of the Metropolis-Hastings walk are obtained by choosing a random set of spins and inverting their signs. In other words, $T_{yx} = f(x \cdot y)$ where the product is taken bit by bit and where $f(z)$ is some simple probability distribution on $X - \{1^n\}$ (it does not contain a trivial move), so $T_{yx}$ is clearly symmetric. The distribution $f(z)$ is sparse, in the sense that it has only $N \in O(n)$ non-zero entries. Furthermore, we will suppose that $f$ is uniform over some set $M$ of moves, with $|M| = N$:

$$T_{xy} = \begin{cases} \frac{1}{N} & \text{if } z = x \cdot y \in M \\ 0 & \text{otherwise.}\end{cases}$$
The most common example consists of single-spin moves, where a single spin is chosen uniformly at random to be flipped. More generally, we will

suppose that moves are sparse in the sense that each move $z_j \in M$ flips a constant-bounded number of spins and that each spin belongs to a constant-bounded number of different moves. With an abuse of notation, we view $z_j \in M$ both as Ising spin configurations and as subsets of $[n]$, where the correspondence is given by the locations of $-1$ spins in $z_j$.
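For reference, the classical walk that this chapter quantizes can be written down in a few lines. The following Python sketch implements the single-spin-flip Metropolis-Hastings walk for a 2-local Ising model; the function name, the dictionary-based coupling format and the fixed seed are illustrative choices rather than anything prescribed by the text.

import math
import random

def metropolis_hastings(couplings, n, beta, steps, seed=0):
    """Single-spin-flip Metropolis-Hastings walk on a 2-local Ising model.

    couplings: dict mapping frozenset({i, j}) -> J_ij.
    Returns the final spin configuration as a list of +/-1 values."""
    rng = random.Random(seed)
    x = [rng.choice((+1, -1)) for _ in range(n)]
    # For each spin, collect the couplings it participates in.
    neighbors = {i: [] for i in range(n)}
    for pair, J in couplings.items():
        i, j = tuple(pair)
        neighbors[i].append((j, J))
        neighbors[j].append((i, J))
    for _ in range(steps):
        i = rng.randrange(n)                        # uniform proposal: T_yx = 1/N with N = n
        # Energy difference Delta = E(x.z_i) - E(x); only terms containing spin i change.
        delta = -2 * x[i] * sum(J * x[j] for j, J in neighbors[i])
        # Metropolis acceptance A_yx = min(1, exp(-beta * Delta)).
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            x[i] = -x[i]
    return x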

5.2.1 Construction of Szegedy’s walk oracle

Quantum algorithms built from quantization of classical walks [85, 89, 98, 99] usually assume an oracle formulation of the walk operator, where the ability to implement the transformation W of Eq. (5.2) is taken for granted. Since the transition matrix elements Wxy can be computed efficiently, we know that W can be implemented in polynomial time. However, a direct implementation of the unitary W generally requires costly quantum circuits involving arithmetic operations. The complexity arises from the need to uncompute a move register and a Boltzmann coin when implementing W. This turns out to be non-trivial and costly if a move is rejected.

To see how this complexity arises, consider the following implementation of $W$. The computer comprises two copies of the system register, which we label Left and Right. It also comprises a Move register and a Coin register. Begin with the Left register in state $x$ and all other registers in state $|0\rangle$. Use the transformation
$$V : |0\rangle_M \to |f\rangle_M = \sum_{z_j \in M} \sqrt{f(z_j)}\,|z_j\rangle_M \qquad (5.5)$$
to prepare the state $|x\rangle_L \otimes |0\rangle_R \otimes |f\rangle_M \otimes |0\rangle_C$. Using $n$ CNOTs, copy the state of the Left register onto the Right register, resulting in $|x\rangle_L \otimes |x\rangle_R \otimes |f\rangle_M \otimes |0\rangle_C$. Apply the move $z_j$ proposed by the Move register to the Right register. If the Move register is encoded in unary representation, this requires $O(N)$ CNOTs, and results in the state
$$|x\rangle_L \otimes \sum_{z_j \in M} \sqrt{f(z_j)}\,|x \cdot z_j\rangle_R \otimes |z_j\rangle_M \otimes |0\rangle_C.$$

Using a version of the Boltzmann coin transformation on the Left, Right and Coin registers yields

$$|x\rangle_L \sum_j \sqrt{f(z_j)}\,|x \cdot z_j\rangle_R\, |z_j\rangle_M \otimes \left(\sqrt{1 - A_{(x \cdot z_j)x}}\,|0\rangle + \sqrt{A_{(x \cdot z_j)x}}\,|1\rangle\right)_C$$
$$= |x\rangle_L \sum_{y \neq x} \sqrt{W_{yx}}\,|y\rangle_R\, |x \cdot y\rangle_M \otimes \left(\sqrt{A_{yx}^{-1} - 1}\,|0\rangle + |1\rangle\right)_C$$
We reset the Move register to $|0\rangle$ using $2N$ CNOTs with controls from the Left and Right registers. At this point, the Move register is disentangled and discarded. Finally, we swap the Left and Right registers conditioned on the Coin qubit being in state $|1\rangle$, resulting in the state
$$\sum_{y \neq x} \sqrt{W_{yx}}\,|y\rangle_L|x\rangle_R|1\rangle_C + \sum_{y \neq x} \sqrt{f(x \cdot y)\,(1 - A_{yx})}\,|x\rangle_L|y\rangle_R|0\rangle_C.$$
The relative weights of the two branches are the same as in the classical MCMC method. This is quite similar to the state that would result from the quantum walk operator $W$ of Eq. (5.2), save for one detail. When the Coin register is in state $|0\rangle$, the state of the Left and Right registers needs to be mapped to the state $\sqrt{W_{xx}}\,|x\rangle_L|x\rangle_R$. Such a rotation clearly depends on all the coefficients $A_{yx}$, and all implementations we could envision used arithmetic operations that compute $A_{xy}$.

5.2.2 Alternative walk

Our implementation of the walk does not make use of the unitary $W$ of Eq. (5.2). One of the key innovations we present is to provide a complete and simplified implementation for a quantization of the Metropolis-Hastings walk that circumvents the use of $W$ altogether. We present a circuit which is isometric to $U_W$ and that avoids the problems discussed in Section 5.2.1. Concretely, we construct a circuit for $\tilde U_W := Y U_W Y^\dagger$ where $Y$ maps
$$Y : |x\rangle \otimes |y\rangle \to \begin{cases} |x\rangle \otimes |x \cdot y\rangle & \text{if } x \cdot y \in M \\ 0 & \text{otherwise.}\end{cases} \qquad (5.6)$$
In addition to these two registers, the circuit acts on an additional coin qubit. Thus, we will denote the System, Move, and Coin registers with corresponding subscripts $|x\rangle_S|z\rangle_M|b\rangle_C$.

Our implementation of the walk operator combines four components:
$$\tilde U_W = R\, V^\dagger B^\dagger F B\, V \qquad (5.7)$$
where $V$ is defined in Eq. (5.5), and

$$F : |x\rangle_S|z_j\rangle_M|b\rangle_C \to |x \cdot z_j^b\rangle_S|z_j\rangle_M|b\rangle_C \qquad (5.8)$$
$$R : |0\rangle_M|0\rangle_C \to -|0\rangle_M|0\rangle_C$$
$$\phantom{R :}\ \ |z_j\rangle_M|b\rangle_C \to |z_j\rangle_M|b\rangle_C \quad \text{for } (z_j, b) \neq (0, 0) \qquad (5.9)$$
$$B : |x\rangle_S|z_j\rangle_M|0\rangle_C \to |x\rangle_S|z_j\rangle_M\left(\sqrt{1 - A_{x \cdot z_j, x}}\,|0\rangle + \sqrt{A_{x \cdot z_j, x}}\,|1\rangle\right)_C \qquad (5.10)$$
While these definitions differ slightly from the ones of Sec. 5.1, it can be verified straightforwardly that they realize the desired walk operator, similar to our discussion in Sec. 5.1.

In what follows, we provide a complete description of each of these components; their complexity is summarized in Table 5.1. To minimize circuit depth, the second register above is encoded in a unary representation, so it contains $N \in O(n)$ qubits and a move $|z_j\rangle$ is encoded as $|j\rangle := |00\ldots 0100\ldots\rangle$ with a 1 at the $j$-th position. Since the state is already encoded in $n$ qubits, unary encoding adds only a small multiplicative number of qubits compared to binary encoding. The System, Move, and Coin registers then contain $n$, $N$, and 1 qubit(s) respectively.

Gate | 3L depth | 3L count | Total depth | Qubits
V | $\lceil\log_2 N\rceil$ | $2N-1$ | $\lceil\log_2 N\rceil$ | $2N-1$
F | 1 | $N$ | $2\lceil\log_2 N\rceil$ | $2N+n$
R | $2\lceil\log_2(N+1)\rceil$ | $2N$ | $2\lceil\log_2(N+1)\rceil$ | $2N$
B | $O(2^{kd}\log_2\frac{1}{\epsilon})$ | $O(N\,2^{kd}\log_2\frac{1}{\epsilon})$ | $O(2^{kd}\log_2\frac{1}{\epsilon}) + 2\lceil\log_2 N\rceil$ | $2N+n+kd$

Table 5.1 Upper bound on the complexity of each component of the walk operator. The cost is measured in terms of the number of gates in the 3rd level of the Clifford hierarchy (indicated by 3L), which is equivalent to the T depth up to a small multiplicative factor; the circuits are optimized for minimal depth. The total depth additionally accounts for the Clifford gates, such as the CNOT fan-outs. The given numbers are evaluated for a (k, d)-local Ising model with moves consisting of single-spin flips, meaning N = n. In general, the costs may increase by a constant multiplicative amount determined by k, d and the sparsity of the moves in M. Note that B depends exponentially on the sparsity of moves.

Move preparation V

For a general distribution $f$, the method of Ref. [100] can be adapted to realize the transformation Eq. (5.5). Here, we focus on the case of a uniform distribution. To begin, suppose that $N$ is a power of 2. Starting in the state $|000\ldots 01\rangle_M$, the state $\sum_j |j\rangle_M/\sqrt{N}$ is obtained by applying $N-1$ $\sqrt{\mathrm{SWAP}}$ gates in a binary-tree fashion. To see this, recall that
$$\sqrt{\mathrm{SWAP}}\,|10\rangle = (|01\rangle + |10\rangle)/\sqrt{2}.$$
The gate $\sqrt{\mathrm{SWAP}}$ is in the third level of the Clifford hierarchy [38, 101], so it can be implemented exactly using a constant number of T gates. This represents a substantial saving compared to the method of Ref. [100] for a general distribution, which requires arbitrary rotations obtained from costly gate synthesis. When $N$ is not a power of 2, in order to avoid costly rotations, we choose to pad the distribution with additional states and prepare a uniform distribution over $2^{\lceil\log_2 N\rceil}$ states. The first $N$ states encode the moves $M$ of the classical walk $x \to y = x \cdot z_j$, while the additional states correspond to trivial moves $x \to x$. This padding has the effect of slowing down the classical walk by a factor $2^{\lceil\log_2 N\rceil}/N < 2$, and hence the quantum walk by a factor less than $\sqrt{2}$, which is less than the additional cost of preparing a uniform distribution over a range which is not a power of 2.
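The effect of this binary tree is easy to check at the level of amplitudes. The short Python sketch below only tracks how probability mass is split among the N unary positions (it does not model the two-qubit gate itself, and the helper name is made up); it confirms that N - 1 splits arranged in log2(N) layers produce the uniform superposition.

import math

def uniform_unary_superposition(m):
    """Amplitude bookkeeping for preparing a uniform superposition over
    N = 2**m unary positions with a binary tree of 'split' steps.  Each
    split stands for a sqrt-SWAP between an occupied and an empty unary
    position; since the target is always empty, it simply moves half of
    the probability mass there."""
    N = 2 ** m
    amp = [0.0] * N
    amp[0] = 1.0                       # start in |00...01>, i.e. unary index 0
    layers, splits, stride = 0, 0, N
    while stride > 1:                  # one layer of parallel splits per halving
        stride //= 2
        for a in range(0, N, 2 * stride):
            b = a + stride             # split |a> into (|a> + |b>)/sqrt(2)
            amp[a], amp[b] = amp[a] / math.sqrt(2), amp[a] / math.sqrt(2)
            splits += 1
        layers += 1
    return amp, splits, layers

amp, splits, layers = uniform_unary_superposition(3)        # N = 8 moves
assert splits == 7 and layers == 3                          # N - 1 gates, log2(N) layers
assert all(abs(a - 1 / math.sqrt(8)) < 1e-12 for a in amp)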

Spin flip F

The transformation $F$ in Eq. (5.8) flips a set of system spins $z_j$ conditioned on the Coin qubit and on the $j$-th qubit of the Move register being in state $|1\rangle$. This can be implemented with at most $cN$ Toffoli gates, where the constant $c$ upper-bounds the number of spins that are flipped by a single move in $M$. The Coin register acts as one control for each gate, the $j$-th bit of the Move register acts as the other control, and the targets are the system register qubits that are in $z_j$, for $j = 1, 2, \ldots, N$. No gate is applied to the padding qubits $j > N$. This implementation has the disadvantage of being purely sequential. An alternative implementation uses $O(N)$ additional scratchpad qubits but is entirely parallel. The details of the implementation depend on the sparsity of the moves $M$, and in general there is a tradeoff between the scratchpad size and the circuit depth. When the moves consist of single-spin flips, for instance, this uses $N$ CNOTs in a binary-tree fashion (depth $\log_2 N$) to make $N$ copies of the Coin qubit. The Toffoli gates can then be applied in parallel for each move, before undoing the CNOTs.

Reflection R

The transformation $R$ in Eq. (5.9) is a reflection about the state $|0\ldots 0\rangle_M|0\rangle_C$. Using standard phase kickback methods, it can be implemented with a single additional qubit in state $(|0\rangle - |1\rangle)/\sqrt{2}$ and a NOT gate controlled on $N+1$ qubits being in the zero state. The latter can be realized from $4(N-1)$ serial Toffoli gates [27], i.e. in linear depth. Since our goal is to minimize circuit depth, we use a different circuit layout that uses at most $N$ ancillary qubits and $2N$ Toffoli gates to realize the $(N+1)$-fold controlled NOT. The circuit once again proceeds in a binary-tree fashion, dividing the set of $N+1$ qubits into pairs and applying a Toffoli gate between every pair with a fresh ancilla in state $|0\rangle$ as the target. The ancillary qubit associated to a given pair is in state $|0\rangle$ if and only if both qubits of the pair are in state $|0\rangle$. The procedure is repeated for the $\lfloor(N+1)/2\rfloor$ ancillary qubits in addition to the left-over qubit, if any, from the previous iteration and so on, until a single bit indicates if all qubits are in state $|0\rangle$. The ancillary qubits are then uncomputed. Thus, the total depth in terms of gates in the 3rd level of the Clifford hierarchy is $2\log_2 N$.
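The binary-tree construction of the multi-controlled NOT can be mirrored by a purely classical bookkeeping routine. The sketch below, with invented names and under the assumption that each pairwise reduction corresponds to one Toffoli gate plus its later uncomputation, reproduces the gate count and depth quoted above.

import math

def controlled_not_via_tree(controls):
    """Classical bookkeeping for the (N+1)-fold controlled NOT built as a
    binary tree of Toffoli gates (compute the AND flag, then uncompute).
    `controls` holds one bit per control qubit, set to 1 when that qubit
    is in the state the gate triggers on.  Returns the flag bit together
    with the total Toffoli count and depth (compute plus uncompute)."""
    layer = list(controls)
    toffolis, depth = 0, 0
    while len(layer) > 1:
        nxt = []
        for i in range(0, len(layer) - 1, 2):
            nxt.append(layer[i] & layer[i + 1])   # Toffoli writes the AND onto a fresh ancilla
            toffolis += 1
        if len(layer) % 2 == 1:                   # left-over qubit joins the next layer
            nxt.append(layer[-1])
        layer = nxt
        depth += 1
    return layer[0], 2 * toffolis, 2 * depth      # uncomputation doubles both counts

flag, count, depth = controlled_not_via_tree([1] * 8)        # N + 1 = 8 controls
assert flag == 1 and count == 2 * 7 and depth == 2 * math.ceil(math.log2(8))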

Boltzmann coin B

The Boltzmann coin in Eq. (5.10) is the most expensive component of the algorithm, simply because it is the only component which requires rotations by arbitrary angles. It consists of the conditional application of $N$ single-qubit gates $R_j = \exp(i\theta_j\sigma_y)$, $j = 1 \ldots N$. Specifically, conditioned on the $j$-th qubit in the Move register being $|1\rangle$ and the system register being in state $|x\rangle$, the Coin register undergoes a rotation by an angle
$$\theta_j(x) = \arcsin\!\left(\sqrt{\min\{e^{-\beta\Delta_j(x)}, 1\}}\right) \qquad (5.11)$$
for Metropolis-Hastings or
$$\theta_j(x) = \arcsin\!\left(\sqrt{\frac{1}{1 + e^{\beta\Delta_j(x)}}}\right) \qquad (5.12)$$
for Glauber dynamics, where $\Delta_j = E(x \cdot z_j) - E(x)$. Given the sparsity constraints of the function $E$ and of the moves $z_j \in M$, the quantity $\Delta_j$ can

actually be evaluated from a subset of qubits of the system register, namely the set $N_j = \{k \,|\, k \in \Omega_\ell,\ z_j \cap \Omega_\ell \neq \emptyset,\ \forall\ell\}$, such that the rotation by $\theta_j$ only depends on the state of the qubits in the set $N_j$. To implement the rotation for a move $z_j$ when the $j$-th qubit in the Move register is $|1\rangle$, we can apply a sequence of rotations conditioned on the bits in $N_j$ taking some fixed value, and use a (classical) look-up table to determine the rotation angle, in order to avoid any quantum arithmetic. Using a single qubit for the Coin register requires that all $R_j$ are applied sequentially. An alternative is to use a separate qubit for each move in the set of moves for which the rotation angles can be computed independently. In that case, any set of gates $R_j$ with non-overlapping $N_j$ can be executed in parallel, at the expense of at most $N$ additional qubits. For single-spin flips on a $(k, d)$-local Hamiltonian, $|N_j| \leq 2kd$ by definition. For multi-spin flips $z_j$, we get $|N_j| \leq |z_j|\,2kd$. Each single-qubit rotation can be realized using $O(\log(1/\epsilon))$ T gates, where $\epsilon$ is the desired accuracy for the synthesis. The multi-controlled rotations require $O(|N_j|)$ Toffoli gates along with $O(\log(1/\epsilon))$ T gates, leading to an overall depth for a circuit realizing $R_j$ of

$$\mathrm{depth}(R_j) = O\!\left(2^{|N_j|}\,|N_j|\,\log\frac{1}{\epsilon}\right).$$
While the depth depends exponentially on the sparsity parameters of the model, for a given model each $N_j$ is of constant-bounded size, such that the complexity for the entire Boltzmann coin scales with $O(N\log(1/\epsilon))$. It is likely that a high precision is needed to ensure the detailed balance condition. We leave the numerical investigation of how low the precision can be without causing significant errors for future research.
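To make the look-up-table idea concrete, the following Python sketch tabulates the rotation angles $\theta_j(x)$ of Eqs. (5.11) and (5.12) for a single-spin-flip move on a 2-local Ising model; the function signature and the coupling format are illustrative assumptions.

import math
from itertools import product

def boltzmann_angle_table(neighbors_j, beta, glauber=False):
    """Rotation angles theta_j(x) for the Boltzmann coin of a single-spin
    flip of spin j, tabulated over all assignments of the neighborhood N_j.

    neighbors_j: list of couplings J_jk of the 2-local terms containing j.
    Returns a dict mapping (x_j, x_k1, x_k2, ...) -> theta_j(x)."""
    table = {}
    for bits in product((+1, -1), repeat=len(neighbors_j) + 1):
        xj, rest = bits[0], bits[1:]
        # Delta_j = E(x.z_j) - E(x): only the terms containing spin j change sign.
        delta = -2 * xj * sum(J * xk for J, xk in zip(neighbors_j, rest))
        if glauber:
            accept = 1.0 / (1.0 + math.exp(beta * delta))            # Eq. (5.12)
        else:
            accept = 1.0 if delta <= 0 else math.exp(-beta * delta)  # Eq. (5.11)
        table[bits] = math.asin(math.sqrt(accept))
    return table

# Example: spin j coupled to two neighbors with J = +1 at inverse temperature 0.5.
angles = boltzmann_angle_table([1.0, 1.0], beta=0.5)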

Perhaps a more efficient way to realize the Boltzmann coin is to use quantum signal processing methods [102–105]. The complexity of quantum signal processing depends on the targeted accuracy. More precisely, it scales with the number of Fourier coefficients required to approximate the desired function to a certain accuracy on the relevant interval. Quantum signal processing, or alternative methods, will offer an advantage on some models, for example when there are many different couplings and a high number of many-body interactions. The scaling of these methods is case dependent; indeed, it will highly depend on these couplings and on the number of spin flips in each move $z_j$.

5.3 optimization heuristics

The Metropolis-Hastings algorithm is widely used to heuristically solve minimization problems using simulated annealing or related algorithms [106]. The objective function is the energy $E(x)$. Starting from a random configuration or an informed guess, the random walk is applied until some low-energy configuration $x$ is reached. The inverse temperature $\beta$ is gradually and monotonically increased to mimic a cooling of the system, with an initial low value enabling large energy fluctuations to prevent the algorithm from getting trapped in local minima, and a large final value $\beta_{\max}$ to reach a good (perhaps local) minimum. In this section, we discuss how to use a quantum walk in the context of a minimization problem, and present possible approaches. For the presented heuristics we choose a schedule with a linearly increasing value of $\beta$ up to a fixed final value, but we expect our considerations and the results in the next section to hold independently of the chosen schedule. An optimized schedule is possible in each case, but the choice of a fixed schedule whose only parameter is the number of steps $L$ facilitates a systematic comparison.

To benchmark and compare our heuristics we will look at the minimum total time to solution [72]. Suppose we have an optimization problem for which it is easy to compare the quality of two solutions. If the probability for a certain optimization heuristic to yield an optimal (or an acceptable) solution is $p$, then running the same heuristic $r$ times allows one to find such a solution with probability $1 - (1-p)^r$. Boosting the probability to a constant value $p_0$ hence requires $r = \log(1-p_0)/\log(1-p)$ repetitions of the algorithm. Correspondingly, the total time to solution is defined as the time $T$ that each execution takes times the number of repetitions needed to achieve the desired success probability. It is important to realize that the success probability $p$ for annealing-based algorithms depends on how slowly the process evolves, which in turn is directly proportional to the execution cost $T$. The total time to solution can hence be expressed as

$$\mathrm{TTS}(T) := T\,\frac{\log(1 - p_0)}{\log(1 - p(T))}.$$
For our heuristics and schedule, the cost $T \sim L$, and there is a compromise to be reached between the duration of the walk and the corresponding success probability; longer walks can reach a higher success probability and therefore need to be repeated fewer times, but increasing the duration of the walk beyond a certain point has a negligible impact on its success probability.

To make a meaningful comparison between heuristics, we hence need to minimize over L and compare the minimum total time to solution:

$$\min(\mathrm{TTS}) := \min_L\, \mathrm{TTS}(L)$$
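The benchmark quantity used in the remainder of this chapter translates directly into code. A minimal Python sketch, assuming a target success probability of 0.99 and an empirical estimate of p(L), is:

import math

def tts(T, p, p0=0.99):
    # TTS(T) = T * log(1 - p0) / log(1 - p(T))
    return T * math.log(1 - p0) / math.log(1 - p)

def min_tts(success_prob, lengths, p0=0.99):
    # Minimize over the walk length L; success_prob(L) is an estimate of p(L).
    return min(tts(L, success_prob(L), p0) for L in lengths)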

5.3.1 Algorithm based on the quantum Zeno effect

Section 5.1.2 explains how to prepare the eigenstate of $U_W$ with eigenvalue 1 using a sequence of walks $W^{(0)}, W^{(1)}, \ldots, W^{(L)} = W$. For our choice of a linear schedule in $\beta$, $W^{(j)} = W(\beta_j)$ for $\beta_j = \beta_{\max}\, j/L$. Recalling the definitions in Section 5.1.2, for $|\pi^j\rangle$ the coherent stationary distribution of $W^{(j)}$, we define the projectors
$$\{Q_j, Q_j^\perp\} := \{|\pi^j\rangle\langle\pi^j|,\ I - |\pi^j\rangle\langle\pi^j|\}.$$
Starting from the state $|\pi^0\rangle$, the Zeno algorithm consists of performing the sequence of projective measurements onto the corresponding subspaces for increasing values of $j$. A projection of $|\pi^{j-1}\rangle$ onto $Q_j$ occurs with probability

$$F_j^2 := |\langle\pi^{j-1}|\pi^j\rangle|^2.$$
The sequence of measurements succeeds if they all yield this outcome, which occurs with probability $\prod_j F_j^2$. Each projection requires $1/\delta_j$ applications of $U_{W^{(j)}}$ to resolve the 0 eigenvalue via quantum phase estimation, where $\delta_j$ denotes the spectral gap of $U_{W^{(j)}}$. Finally, the stationary distribution $|\pi^L\rangle$ is projected onto a classical state to obtain a solution to the optimization problem. This final measurement yields the optimal outcome $x^*$ with probability $\pi^L(x^*)$. The total time to solution for the entire algorithm to get the optimal solution $x^*$ with probability $1-\delta$ is therefore

$$\mathrm{TTS}(L) = \frac{\log(1-\delta)}{\log\!\left(1 - \pi^L(x^*)\prod_{j=1}^{L}F_j^2\right)}\ \sum_{j=1}^{L}\frac{1}{\delta_j}. \qquad (5.13)$$
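Eq. (5.13) is a one-liner once the spectral gaps, the squared overlaps and the final weight of the optimum are known, e.g. from a classical simulation of small instances; names in the sketch below are illustrative.

import math

def zeno_tts(gaps, fidelities_sq, pi_L_opt, delta=0.99):
    """Total time to solution of Eq. (5.13) for the Zeno protocol without rewind.
    gaps: spectral gaps delta_j; fidelities_sq: overlaps F_j^2; pi_L_opt: pi^L(x*)."""
    cost = sum(1.0 / d for d in gaps)                 # phase estimation cost, sum_j 1/delta_j
    p_success = pi_L_opt * math.prod(fidelities_sq)   # all projections and the readout succeed
    return math.log(1 - delta) / math.log(1 - p_success) * cost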

In the outlined method, a complete restart of the algorithm is required if during any step $j$ the state is projected onto $Q_j^\perp$. There exists an alternative to a complete restart which we call rewind. It was first described in the context of Zeno state preparation in Ref. [107], but originates from Refs. [93, 108]. For a failed projection in step $j$, it consists of iterating between the projections of steps $j-1$ and $j$, until the desired projection is achieved. With the probability to transition between $Q_{j-1}$ and $Q_j$ being $F_j^2$, it is easy to see that this is also the probability for transitioning between the corresponding

two orthogonal spaces, leading to a transition probability of $1 - F_j^2$ for the other two transitions. Given the cost $1/\delta_j$ of each of these measurements, we obtain a simple recursion relation for the expected cost of a successful $|\pi^{j-1}\rangle \to |\pi^j\rangle$ transition with rewind, and thus for the total time to solution for the Zeno protocol with rewind. In Ref. [107], it was found that rewinding yields substantial savings compared to the regular Zeno strategy for the preparation of quantum many-body ground states.
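The expected cost with rewind can also be estimated by simply sampling the measurement record. The Monte Carlo sketch below implements the bookkeeping implied by the transition probabilities quoted above (the branch is preserved with probability F_j^2 and flipped otherwise); it is a sketch of the classical accounting, not of the quantum dynamics, and it does not reproduce the closed-form recursion.

import random

def rewind_cost(gaps, fidelities_sq, seed=1):
    """Cost, in applications of the walk operators, of one successful pass of
    the Zeno protocol with rewind; lists are indexed by level j = 0..L and the
    entry 0 of fidelities_sq is unused."""
    rng = random.Random(seed)
    cost = 0.0
    for j in range(1, len(gaps)):
        perp = False                          # start in Q_{j-1} after the previous level
        while True:
            cost += 1.0 / gaps[j]             # projective measurement at level j
            if rng.random() > fidelities_sq[j]:
                perp = not perp               # branch flips with probability 1 - F_j^2
            if not perp:
                break                         # outcome Q_j: proceed to the next level
            cost += 1.0 / gaps[j - 1]         # rewind: measure at level j - 1
            if rng.random() > fidelities_sq[j]:
                perp = not perp
    return cost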

5.3.2 Algorithm based on a unitary walk

One might wonder whether the projections in each step are strictly necessary. Suppose we start in the state $|\pi^0\rangle$ and apply the sequence of walk operators $U_{W^{(j)}}$; the resulting state is

$$|\psi(L)\rangle = U_{W^{(L)}} \cdots U_{W^{(2)}} U_{W^{(1)}} |\pi^0\rangle.$$

If we were to simply perform this unitary walk, terminating in a measurement with respect to the computational basis to extract an approximate solution to our optimization problem, then the total time to solution for that algorithm is

$$\mathrm{TTS}(L) = \frac{\log(1 - \delta)}{\log\!\left(1 - |\langle x^*| U_{W^{(L)}} \cdots U_{W^{(2)}} U_{W^{(1)}} |\pi^0\rangle|^2\right)}.$$
While we do not have a solid justification for this heuristic, a similar protocol was proposed in Ref. [109]. The motivation for the approach in Ref. [109] was to randomize the eigenphase of the instantaneous unitary operator, such that in contrast to our approach each unitary in a sequence of walk operators is applied a random number of times. When the spectral gap of a unitary operator is $\delta_j$ and that unitary is applied a random number of times in the interval $[0, 1/\delta_j]$, then the relative phase between the eigenstate with eigenvalue 1 and the other eigenstates is randomized over the unit circle, thus mimicking the effect of a measurement where the outcome is ignored. From this analogy, we might expect that our proposed unitary implementation yields a minimal total time to solution roughly equal to the Zeno-based algorithm with no rewind. But as we will see in the next section, its behavior is much better than anticipated: this method is more efficient than the Zeno algorithm with rewind, which itself is more efficient than Zeno without rewind.

5.4 numerical results

For our numerical study we compare three algorithms: the classical walk, the Zeno-based algorithm with rewind from Section 5.3.1, and the unitary algorithm from Section 5.3.2. We look at a one-dimensional Ising model as well as an Ising model with random sparse couplings. The random graph model has Gaussian couplings $J_\ell$ with variance 1, and the interaction sets $\Omega_\ell$ (see Eq. (5.4)) consist of a random subset of $3.5n$ of all possible two-spin couplings. For both systems we use a linear schedule in the inverse temperature starting at $\beta = 0$ and ending with $\beta = 2$. For our quantum walks, the initial configuration is a uniform distribution.

[Figure 5.1: quantum versus classical min(TTS) on a log-log scale for the Zeno and unitary algorithms; the reference line y = x and the power-law fits 6.2x^0.3939 and 2.16x^0.4244 are shown.]

Figure 5.1 Quantum versus classical minimum total time to solution for a one-dimensional Ising model of length ranging from n = 3 to 12. The line y = x is shown as a reference for a quantum speedup.

Figure 5.1 shows the quantum versus classical minimal total time to solution for the one-dimensional Ising model. The results indicate a polynomial advantage of the quantum algorithms over the classical algorithm. Surprisingly, both quantum approaches show a similar improvement over the classical approach that exceeds the expected quadratic speedup, with a power-law fit of 0.42 for the unitary algorithm and 0.39 for the Zeno-based algorithm.

[Figure 5.2: quantum versus classical min(TTS) on a log-log scale for the Zeno and unitary algorithms; the reference line y = x and the power-law fits 1x^0.92 and 1.65x^0.7549 are shown.]

Figure 5.2 Quantum versus classical minimum total time to solution for an Ising model with interaction terms consisting of 3.5n randomly chosen two-spin couplings. 100 random problem instances are chosen for each size, ranging from n = 4 to 14.

Figure 5.2 shows the quantum versus classical minimum total time to solution for a random ensemble of 100 systems for each size n = 4 to 14. We observe that the unitary algorithm is consistently faster than the classical algorithm, with an average polynomial speedup of degree 0.75, which is less than the expected quadratic gain. The different problem instances are all quite clustered around this average behavior, suggesting that the quantum speedup is fairly general and consistent. In contrast, the Zeno-based algorithm shows large fluctuations about its average, particularly on very small problem instances. Its average polynomial speedup is of degree 0.92, which is far worse than the unitary algorithm. Overall, the results

indicate a polynomial advantage of the quantum methods over the classical method, but these advantages are much less pronounced than for the one-dimensional case. It remains an interesting question to understand more broadly what type of problems can benefit from what range of speed-up and why. In both the one-dimensional and the random graph Ising model, the unitary quantum algorithm achieves very similar and sometimes superior scaling to the Zeno-based algorithm with rewind. This is surprising given the observed improvement obtained from rewind in Ref. [107] and our expectation that the unitary algorithm behaves essentially like Zeno without rewind. Our observations furthermore lead to the conclusion that even though the quantum walk is traditionally defined with the help of a walk oracle, its circuit implementation does not necessarily require it, and this can lead to substantial savings.

With these crude estimates in hand we can already look into the achievability of a quantum speed-up on realistic devices. We will compare performances to the special-purpose supercomputer "Janus" [110, 111], which consists of a massively parallel field-programmable gate array (FPGA). This system is capable of performing $10^{12}$ Markov chain spin updates per second on a three-dimensional Ising spin glass of size $n = 80^3$. A calculation that lasts a bit less than a month will thus realize $10^{18}$ Monte Carlo steps. On the one hand, assuming that the theoretically predicted quadratic speed-up holds and since the numerics show a constant factor around one, the quantum computer must realize at least $10^9$ steps per month in order to keep up with the classical computer. This requires that a single step of the quantum walk be realized in a few milliseconds. On the other hand, the super-quadratic speed-up we have observed would allow almost a tenth of a second to realize a single quantum step, while the sub-quadratic speed-up would require that a single step be realized within 0.1 microseconds. Taking the circuit depth reported in Table 5.1 as reference for a three-dimensional lattice leads to a circuit depth of $\log(80^3) \times 2^6 \approx 1000$. To avoid harmful error accumulation, the gate synthesis accuracy $\epsilon$ should be chosen as the inverse volume (circuit depth times the number of qubits) of the quantum circuit, roughly $\epsilon^{-1} \approx 80^3 \times \log(80^3) \times 10^9 \approx 10^{16}$, so on the order of $4\log\frac{1}{\epsilon} \approx 200$ logical T gates are required per fine-tuned rotation [31, 112], for a total logical circuit depth of 200,000. With these estimates, the three scenarios described above require logical gate speeds ranging from an unrealistically short 0.5 picoseconds (sub-quadratic speed-up), to an extremely challenging 1 nanosecond (quadratic speed-up), and allow 0.5 microseconds (super-quadratic speed-up). We could instead compile the rotations offline and teleport them into the computation [113], which requires at least a factor of $4\log\frac{1}{\epsilon} \approx 200$ more qubits, but increases the time available for a logical gate by the same factor. Under this scenario, the time required for each logical gate would range from 0.1 nanoseconds (sub-quadratic speed-up), to 20 microseconds (quadratic speed-up), and to 1 millisecond (super-quadratic speed-up). These estimates are summarized in Table 5.2. The latter is a realistic logical gate time for many qubit architectures, while there is no current path to achieve nanosecond logical gate times.

Quantum speedup | Synthesis online | Synthesis offline
Sub-quadratic $x^{0.75}$ | 0.5 ps | 0.1 ns
Quadratic $x^{0.5}$ | 1 ns | 20 µs
Super-quadratic $x^{0.42}$ | 0.5 µs | 1 ms

Table 5.2 Logical gate time required to outperform a supercomputer capable of realizing $10^{12}$ Monte Carlo updates per second in a computation that lasts one month. Arbitrary single-qubit rotations can be synthesized online or offline at an additional qubit cost.
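The walk-step budgets quoted above follow from elementary arithmetic. The short sketch below reproduces them under the stated assumptions (one month at 10^12 classical updates per second, i.e. roughly 10^18 Monte Carlo steps); the per-gate requirements of Table 5.2 then follow by further dividing by the logical circuit depth.

month_seconds = 30 * 24 * 3600          # ~2.6e6 s
classical_steps = 1e18                  # Monte Carlo steps realized by Janus in one month

for label, a in [("sub-quadratic", 0.75), ("quadratic", 0.5), ("super-quadratic", 0.42)]:
    quantum_steps = classical_steps ** a            # walk steps needed to keep up
    print(f"{label:>15}: {month_seconds / quantum_steps:.1e} s per quantum walk step")
# prints roughly 1e-7 s, a few milliseconds, and ~0.1 s respectively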

Given the above analysis, if a quantum computer is to offer a practical speed-up, we conclude that a better understanding of the class of problems for which heuristic super-quadratic speed-ups can be achieved is required, and that we need to optimize circuit implementations even further.

5.5 irreversible parallel walk

Lastly, we present an improved parallelized classical walk for discrete sparse optimization problems which could potentially lead to significant improvements on a quantum computer. Unfortunately this walk is not reversible, which motivates a future generalization of Szegedy's quantization to include irreversible walks.

For simplicity, suppose that the set of moves $z_i \in M$ consists of single-spin flips, but note that our algorithm can be extended to more general moves.

We define a parallel classical walk with transition matrix

$$W_{yx} = \prod_{i=1}^{N} \left(q B_i(x)\right)^{(1 - x_i y_i)/2} \left(1 - q B_i(x)\right)^{(1 + x_i y_i)/2}$$
$$\text{with}\quad B_i(x) = \min\left\{1, e^{\beta[E(x) - E(x \cdot z_i)]}\right\}$$

where $z_i$ is the move which consists of flipping the $i$-th spin. One of the primary advantages of this walk is that each step is fully parallelizable. However, in order to understand the suggested walk it is helpful to think of a step in terms of sequential updates: a single step of this walk can be decomposed into a sequence of updates consisting of proposing to flip spin $i$ with probability $q$ and accepting the update with probability $B_i(x)$. In contrast to the Metropolis-Hastings algorithm, the acceptance probability is always evaluated relative to the state at the beginning of the step, even though other spins could have been flipped during the sequence. To match the Metropolis-Hastings algorithm, we would instead have to evaluate the acceptance probability conditioned on all previously accepted moves in the step, i.e. we would have to substitute

$$B_i^{(\mathrm{MH})}(x) = \min\left\{1, e^{\beta[E(x \cdot z_i^{\mathrm{tot}}) - E(x \cdot z_i^{\mathrm{tot}} \cdot z_i)]}\right\}$$
where $z_i^{\mathrm{tot}}$ is the bit mask indicating all accepted moves prior to suggesting the update for spin $i$. Note that for a local spin model with, e.g., nearest-neighbor interactions, the two acceptance probabilities only differ if a neighbor of site $i$ has been flipped prior to attempting to flip spin $i$. The variable $0 \leq q \leq 1$ is a tunable parameter of the walk. Because a transition on each spin is proposed with probability $q$, the probability of having two neighboring spins flipped is $O(q^2)$. Thus, we essentially expect a single step of this modified walk to behave like $qn$ steps of the original Metropolis-Hastings walk, with a systematic error that scales like $nq^2$. Moreover, this systematic error is expected to decrease over time since, once the walk settles in a low-energy configuration, very few spin transitions will turn out to be accepted, thus further decreasing the probability of a neighboring pair of spin flips.
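A single step of the parallel walk is only a few lines of code. The Python sketch below uses the same 2-local coupling format as the earlier sketch and evaluates every acceptance probability with respect to the configuration frozen at the beginning of the step, as described above; names are illustrative.

import math

def parallel_walk_step(x, neighbors, beta, q, rng):
    """One step of the irreversible parallel walk on a 2-local Ising model.

    x: list of +/-1 spins (modified in place); neighbors[i]: list of (j, J_ij)."""
    x_old = list(x)                                  # reference configuration for B_i
    for i in range(len(x)):
        if rng.random() >= q:
            continue                                 # no flip proposed for spin i
        # Delta = E(x_old . z_i) - E(x_old), evaluated at the start of the step.
        delta = -2 * x_old[i] * sum(J * x_old[j] for j, J in neighbors[i])
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            x[i] = -x[i]                             # accept, independently of the other spins
    return x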

To verify the above expectation, we have performed numerical simulations on an Ising model $H = \sum_{i,j} J_{ij} x_i x_j$ where the $J_{ij}$ were randomly chosen from $\{+1, -1\}$. Results are shown in Figure 5.3. What we observe is that for an equal amount of computational resources, the parallelized walk outperforms the original walk. This is true both in terms of reaching a pseudo-minimum configuration quickly at short times and in terms of reaching the true minimum at longer times. Thus, even without any quantization, such a heuristic appears to be of interest on its own.


Figure 5.3 Energy above the ground state of an Ising model on a complete graph with n = 500 vertices with random binary couplings, as a function of the number of Monte Carlo steps. Results are shown for the regular Metropolis-Hastings walk and the parallelized walk with different values of q = 1, 1/2, 1/4, 1/8 and 1/16. The inverse temperature was set to β = 3, so the fixed point is a low-energy state. Since each step of the parallelized classical walk requires n = 500 times as many gates as the original walk, the time label of the parallel walk has been multiplied by n to adequately represent the amount of computational resources. The parallel walk with q < 1 outperforms the original walk at long times (see inset with the first 150,000 steps) and achieves similar performance at short times as q approaches 1.

In terms of quantization, consider how to implement a quantization à la Szegedy. The operator $W$ can easily be applied. First, use CNOT gates to copy the Right register onto the Left register, yielding the state $|x\rangle_L \otimes |x\rangle_R$. Then, sequentially over all spins $i$, apply a rotation to spin $i$ of the Left register conditioned on the state of spin $i$ and its neighbors on the Right

register. This rotation transforms $|x_i\rangle \to \sqrt{1 - qB_i(x)}\,|x_i\rangle + \sqrt{qB_i(x)}\,|{-x_i}\rangle$. Note that the function $B_i(x)$ only depends on the bits of $x$ that are adjacent to site $i$, so this rotation acts on a constant number of spins and therefore requires a constant number of gates. Thus, the costs of the classical and the quantum parallel walks have the same scaling in $n$ even for a non-sparse model. Combined with its observed advantages over the original classical walk, the parallel walk thus appears as the ideal version for a quantum implementation.

Unfortunately, the parallel walk is not reversible; it does not obey the detailed-balance condition Eq. (5.1). Thus, it is not directly suitable for quantization à la Szegedy. While quantizations of non-reversible walks were considered in Ref. [99], they require an implementation of the time-reversed Markov chain $W^*$ defined from $W$ and its fixed point $\pi$ as

$$W^*_{xy}\,\pi_y = W_{yx}\,\pi_x. \qquad (5.14)$$

We do not know how to efficiently implement a quantum circuit for the time-reversed walk $W^*$, so at present we are unable to quantize this parallel walk.

6 quantum programming

This chapter contains some content from Publication II, as well as content that has been submitted to Nature Reviews Physics.

Taking a step back and looking at the available quantum hardware today, clearly we face the challenge of figuring out how to gradually build and scale both hardware and software in a way that allows us to realize large-scale applications such as the ones discussed in the previous chapter. Quantum hardware today has relatively few qubits compared to the number of logical qubits one would need to execute even a small molecule simulation that can't easily be done on classical hardware. Current noise levels furthermore would make such a computation infeasible in the sense that the executed computation would likely be incorrect or imprecise. Handcrafting a thoroughly optimized implementation for a particular problem instance and hardware target can yield better results than automatically optimized user code containing more abstractions. This requires that the language and compiler expose a high degree of control over how code is translated into hardware instructions. Conversely, code that is written specifically for a particular hardware target is less reusable. Enabling code reuse may be less of a priority early on, with current coherence times limiting the size of programs. Nonetheless, it is important not to lose sight of how we ultimately envision a quantum architecture to grow and what that means for building its stack and programming it.

As we have seen in the last chapter, beyond scaling both hardware and software we need just as many advances in terms of algorithm and application development. To achieve this, we need to examine carefully how we develop quantum applications and what tools we need to gather accurate insight into how theoretical concepts translate into realistic applications. This chapter therefore highlights important aspects of quantum programming and how it differs from conventional programming.


6.1 domain-specific concepts

There are several aspects that are common to both conventional and quantum programs, and correspondingly several areas in language design and compiler heuristics where the quantum computing community can profit immensely from the experience gained from conventional computing over the past decades. There are also unique concepts and possibilities that have no classical analog and are germane to quantum computing. We review some of these domain-specific concepts in the following.

commuting operations

When determining dependencies between quantum instructions, one should ideally account for the fact that even if two instructions act on the same block of quantum memory, the order in which they are applied does not matter as long as they commute. Even if two quantum instructions do not commute in the mathematical sense, it is possible that the error accumulated due to the non-zero commutator merely constitutes a trade-off rather than a decrease in the overall quality of the computation. While determining bounds on accumulated errors during the computation is certainly within the domain of quantum algorithms research, it stands to reason that for practical purposes heuristics for determining the required precision with which each program piece needs to be executed might perform reasonably well [114]. One might therefore wonder whether carefully chosen language constructs to express these properties could aid in developing optimizations that exploit such quantum-specific phenomena.

clean and borrowed qubits

A particular use of quantum memory is referred to as "borrowed qubits": qubits that are already in use and may be entangled with other parts of the quantum computer, but are simultaneously used for intermediate computations [27, 115]. A quantum subroutine that makes use of such qubits must guarantee that, after its completion, the borrowed qubits are in the same state as prior to its execution. Even for "clean qubits", which refers to qubits that are allocated and returned in a known state, the process of releasing allocated quantum memory is not as simple as overwriting it with new content. While it is physically possible to reset individual qubits by measuring their state, quantum algorithms frequently harness quantum entanglement for computations. In an entangled state, observing part of the state has an immediate impact on the state of the remaining system. Due to this particular quantum correlation, releasing allocated memory generally requires disentangling the corresponding qubits from the remaining system. Since the entangling computation corresponds to a unitary transformation, applying its adjoint has the desired effect.

adjoint operations

A common pattern in quantum programming is hence the need to apply such an adjoint transformation to reverse the effects of a computation. For optimization purposes, it is useful to preserve the association between a transformation and its adjoint, i.e. the transformation that reverses its effects. That information can be used, for example, to determine an execution order that reduces computation costs by omitting transformations that are immediately followed by their adjoint, since their effects cancel. While an adjoint can be defined for any unitary transformation, this is usually not the case for transformations involving measurements. Even though the effects of a measurement involving several qubits cannot easily be undone, it is possible to combine different measurements in a way that the overall transformation of the state is unitary [116]. It is barely possible to detect such a pattern automatically short of performing a complete symbolic evaluation of the program, and suitable abstractions and language tools are necessary for exploitation in this case [117]. Since, at a physical level, measurements are one of the most expensive instructions, reducing the number of measurements by detecting and optimizing such patterns can save on precious coherence time budgets.

controlled operations

Another frequently occurring and potentially computationally intensive pattern is the use of controlled unitary transformations. Executing a controlled transformation means that the transformation is applied conditional on the control qubits being in a certain state. They are, so to speak, the quantum version of an if-statement: if the control qubits are in a superposition of states, then the result is a superposition of both applying and not applying the operation. For larger transformations, the controlled version is usually composed incrementally based on the contained sub-transformations and their controlled versions. Reordering of parts may allow for building a more favorable composition. Preserving the information about the occurrences of such

patterns as abstractions in a programming language makes code more understandable and maintainable, allows for automation, and may facilitate optimizations on future large-scale processors. Similar considerations may hold for other constructs, such as, for example, applying the same transformation multiple times.
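As a toy illustration of why preserving the operation/adjoint association pays off, the following sketch implements a peephole pass over a made-up instruction list and removes a unitary that is immediately followed by its adjoint; a real compiler would of course also track operands and commutation relations.

def cancel_adjoint_pairs(ops):
    """Remove adjacent (U, Adjoint U) pairs from a list of (name, is_adjoint) tuples."""
    out = []
    for op in ops:
        if out and out[-1][0] == op[0] and out[-1][1] != op[1]:
            out.pop()                 # U followed by Adjoint U (or vice versa) cancels
        else:
            out.append(op)
    return out

# Example: the state preparation and its adjoint cancel, the T gate survives.
assert cancel_adjoint_pairs([("T", False), ("Prep", False), ("Prep", True)]) == [("T", False)]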

6.2 quantum algorithms

The fact that $2^n$ complex amplitudes are required to describe a pure state of $n$ qubits lies at the heart of many quantum algorithms. The sheer amount of information contained in a quantum system is incredible; the number of amplitudes required to describe a system of less than 300 qubits is larger than the currently estimated number of atoms in the visible universe! While that amount of information of course cannot easily be extracted, or converted to and from classical data, the idea behind many quantum algorithms is to leverage an innate massive parallelism for the actual computation, thanks to quantum operations acting on $n$ qubits being capable of simultaneously transforming all $2^n$ amplitudes.

To achieve dramatic quantum speedups, the basic idea of many quantum algorithms is to identify and construct a solution for problems that

1. require only a small amount of input and output, but involve computations over a large configuration space,
2. can take advantage of subroutines with a large quantum speedup over the best-known classical algorithm, and
3. allow one to leverage interference to sample from a target output distribution.

Quantum algorithm design is both challenging and rewarding, with certain quantum algorithms even achieving exponential speedups over their classical counterparts [118]. Common quantum subroutines to achieve speedups include amplitude amplification for increasing the amplitude of a desired state in a superposition, quantum phase estimation for estimating eigenvalues of a unitary operator, and the quantum Fourier transform for performing a change of basis analogous to the classical discrete Fourier transform. The efficiency of the quantum Fourier transform (QFT) far surpasses what is possible on a classical machine, making it one of the tools of choice when designing a quantum algorithm.

Many algorithms require the implementation of a quantum oracle for function evaluation. Reading and storing classical input data requires time linear in the amount of data and is often done through an oracle. If a quantum algorithm achieves a quadratic computational speedup over its classical counterpart, the speedup may be lost once the oracle implementation is accounted for due to the required linear input time; the implementation and optimization of the oracle is therefore especially important. Classically, an oracle is a boolean function mapping an n-dimensional boolean input to an m-dimensional boolean output. A classical boolean oracle can be converted into a quantum oracle by extending the input and output spaces from n and m bits respectively to a register of n + m qubits, enabling a representation as a unitary matrix. Example oracles include arithmetic functions, graph functions, and lookup tables.
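As an illustration of the conversion just described, the sketch below builds the permutation matrix of the standard XOR oracle U_f |x>|y> = |x>|y xor f(x)> for a small boolean function; real implementations compile f into reversible gates instead of materializing the matrix, and the helper name is made up.

import numpy as np

def boolean_oracle_unitary(f, n, m):
    """Permutation matrix of U_f acting on n input and m output qubits,
    for a classical boolean function f given as int -> int."""
    dim = 2 ** (n + m)
    U = np.zeros((dim, dim))
    for x in range(2 ** n):
        for y in range(2 ** m):
            col = (x << m) | y
            row = (x << m) | (y ^ f(x))
            U[row, col] = 1.0
    return U

U = boolean_oracle_unitary(lambda x: x & 1, n=2, m=1)    # parity-of-last-bit oracle
assert np.allclose(U @ U.T, np.eye(U.shape[0]))          # permutation matrices are unitary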

Quantum algorithms by design are built by combining and adapting elementary concepts such as the ones mentioned in the previous section. Language design for quantum programs should therefore address the common building blocks of interest for automation and optimization purposes. One might argue that it is sufficient to consider intrinsic hardware instructions as common building blocks, and that answering the question of how to best combine those should remain within the field of quantum algorithms research. This perspective is particularly tempting considering that, in order to optimize approximation errors, the same mathematical transformation of a quantum state may need to be translated into different hardware instructions depending on its context. However, this case-by-case approach for each application and targeted hardware seems like a rather daunting task, even if it is to be expected that the range of quantum applications will probably be more narrow than in conventional computing. One may wonder whether reasoning about more abstract concepts would allow for a significant advantage in scalability, both in terms of program size and development effort. Looking at ubiquitous quantum subroutines such as amplitude amplification [119], phase estimation [120], and the quantum Fourier transformation [121], the same functionality can often be executed in different ways that are particularly performant with regards to, e.g., memory requirements, runtime, or robustness to errors. Moving from the primitive instructions defined in Chapter 1 to higher-level concepts in terms of which a quantum program is formulated makes it possible to dynamically adapt the executed instructions depending on the context in which a subroutine

is invoked.

The above observations lead to a more general principle: designing a quantum programming language that captures sufficient information to optimize at different levels of abstraction can open the gate to harnessing a wealth of knowledge accumulated on quantum algorithms well before large-scale hardware is available. In combination with suitable tools to evaluate various performance metrics, this facilitates and fast-tracks the development of compiler heuristics that will extend the range of applications that can be run using a given amount of resources.

6.3 quantum model of computation

Since only some hard computational parts of a program are quantum mechanical in nature, a natural model for quantum computation is to treat a quantum computer not as a full-fledged computation device in its own right (replete with mechanisms for storage, network communication, user interaction and so on), but rather as an external, adjunct co-processor to a classical machine. Treating the quantum computer as a coprocessor is similar to how GPUs, FPGAs, and other adjunct processors operate. In such a model, the quantum computation happens on the co-processor in a cleanly separated domain, and indeed becomes a domain covered by a variety of domain-specific quantum programming languages (DSLs). A host program calling into such a DSL can be executed against a simulator which provides an implementation for each of the primitive quantum gates, or translated to primitive physical operations which then run against appropriate hardware. The primary control logic runs classical code on a classical "host" computer. When appropriate and necessary, the host program can invoke a sub-program that runs on the adjunct quantum processor. When the sub-program completes, the host program gets access to the sub-program's results. In this model, there are three levels of computation:

• Classical computation that reads input data, sets up the quantum computation, triggers the quantum computation, processes the results of the computation, and presents the results to the user.

• Quantum computation that happens directly in the quantum device and implements a quantum algorithm.

• Classical computation that is required by the quantum algorithm during its execution. 6.3 quantum model of computation 75

There is no intrinsic requirement that these three levels all be written in the same language. Indeed, quantum computation has somewhat different control structures and resource management needs than classical computation, and using a custom programming language allows common patterns in quantum algorithms to be expressed more naturally. This is particularly appealing as many quantum algorithms require intermediate classical computations during the execution of the algorithm. Alternatively, an embedded language brings with it all of the functionality of the host language. While this has some clear advantages, it also has disadvantages when attempting to automatically perform meta operations, such as reversing a quantum transformation: without symbolic computing, the only way to produce the adjoint of a transformation is to explicitly write it as part of the code base. Combined with control flow, its symbolic computation on the other hand requires dealing with the full complexity of the host language.

The dominant approach to programming quantum computers is to provide an existing high-level language with libraries that allow for the expression of quantum programs. This approach can permit computations that are meaningless in a quantum context; prohibits succinct expression of the interaction between classical and quantum computations; and does not provide important constructs that are required for quantum programming. It is hence helpful to think about and distinguish three possible levels of integration between classical and quantum computations:

integration at the code level

Embedding into an existing classical language in a sense can be seen as an integration at the code level. A host program generates a sequence of instructions that subsequently are to be executed by the quantum processor. All classical computations are evaluated ahead of time such that the handed-off sequence can be expressed as a single quantum circuit. The evaluation of classical computations ahead of time is only possible if it does not depend on computations that are executed by the quantum processor. Within this model of integration, communication between the classical host computer and the quantum coprocessor is limited to a one-time exchange during which the instructions are handed off to the coprocessor, which then responds back to the host computer with the results. Hence, only a limited set of quantum applications can be executed in this way. In particular, programs that would require computations on

the host computer while the quantum state remains coherent can't be executed within this model.

integration at the compiler level

An integration at the compiler level is achieved via transformation into an intermediate language (IL) that represents both quantum instructions and classical control flow. Only a subset of classical computations are executed ahead of time, and any control flow that depends on measurement outcomes of (part of) the quantum state is expressed in terms of that IL format. This requires that the coprocessor provides limited classical compute resources in addition to quantum resources. Within this model of computation, it is possible for the quantum processor to execute or call out to perform classical computations while the quantum state remains coherent. The quantum instructions that are to be executed may depend on the evaluation of classical computations. However, the computation to execute is fully compiled and optimized prior to the hand-off to the coprocessor.

integration at the runtime level

Rather than handing off the complete program to execute to the quantum processor, an integration at the runtime level allows for an ongoing communication between the host computer and the coprocessor during execution. The communication takes place while a potentially complex classical program state persists and qubits remain live, i.e. available and containing information related to the ongoing computation. Repeated back-and-forth communication takes place, as opposed to the largely one-time communication for the other integration levels. Within this model of integration, it is possible that for each branching based on intermediate results of the quantum computation, classical computations generate the next sequence of quantum instructions and just-in-time compilation is performed to optimize them prior to continuing the execution.

A quantum programming language ultimately needs to achieve ease of expression, compilation, and optimization of quantum algorithms working in concert with classical computation.

6.4 quantum software frameworks today

Quantum programming languages and software frameworks today serve a variety of purposes. The intended purpose heavily impacts the design of the language, along with other common properties known from conventional programming languages such as their paradigm or computation model.

A variety of quantum programming languages has started to emerge over the last couple of years, ranging from imperative to functional and low-level to high-level [122]. Most of these languages are intended as circuit description languages. They commonly offer powerful and extensible facilities for quantum circuit description and manipulation, including gate composition and decomposition, circuit optimization, and exporting of quantum circuits for rendering or resource costing purposes. However, many quantum algorithms may employ patterns and constructs that are not readily expressible as circuits. Constructs such as classical control predicated on the result of a quantum measurement are difficult to express as circuits, while recursion and unbounded iteration are often impossible. Several software frameworks such as LIQUi|⟩ [123] and ProjectQ [124] support higher-level functions. They are implemented as circuit transformations, i.e. functions that take one or more input circuit(s) and produce an output circuit. This prevents modeling, for example, repeat-until-success algorithms and other algorithms with non-trivial branching. This limitation can be partially mitigated by including some kinds of classical feedback as circuit elements, as is done at a high level by LIQUi|⟩ or ProjectQ and at a low level by OpenQASM 2.0 [125]. Even so, including classical feedback directly in circuit representations is often impractical for integrating robust classical algorithms with quantum processing. With most of these software frameworks being embedded into a rather expressive and unwieldy classical language, optimizing the program structure beyond what is expressed as a pure quantum circuit can be challenging. Such an approach hence limits the utility for adaptive characterization or for hybrid quantum–classical computations.

In this section, we illustrate different use cases of today's quantum programming languages and how they are typically supported by tools such as simulators and resource estimators, as well as ecosystems such as libraries, documentation, and learning materials. In Chapter 7, we then illustrate several concrete quantum programming languages and explain how they relate to these principles.

6.4.1 Use cases

Given the steady evolution of quantum hardware, one of the use cases of quantum programming languages is to express programs intended for execution on a variety of different quantum computing devices, ideally independent of the underlying technology, such as superconducting qubits [126, 127], quantum dots [128], ion traps [129], and topological qubits [130]. As a result of this variety of platforms and goals, there is similarly a difference in the applications targeted by the different hardware efforts. For instance, some efforts are focused more on short-term applications, while other efforts are focused on building fully scalable solutions.

Near-term applications, often termed noisy intermediate-scale quantum computing (NISQ, [131]), can typically not afford quantum error correction. As a result, optimizations favor single-qubit operations over two-qubit operations and do not distinguish between Clifford and non-Clifford operations. As an example, a T gate has the same cost as a Z gate, whereas a controlled-NOT operation is significantly more costly. This is in stark contrast to error-corrected devices that distill T gates to increase their accuracy [36], causing them to be much more expensive. Further, the resources in near-term quantum computers are severely limited, i.e., only a few qubits are available, and the noise due to the lack of error correction limits the number of gates that can be executed before measurement results are essentially random. Every gate and every qubit counts, and quantum programs targeting this level are therefore expressed in terms of explicit gate-level operations, as a higher level of abstraction may lead to an unaffordable cost overhead.

In contrast to the NISQ regime, scalable applications assume fault-tolerant quantum computers with a large number of qubits. Due to the need to distill non-Clifford operations [36], the set of basic operations is typically limited to just a few Clifford gates and one or two non-Clifford gates. The large number of qubits makes it infeasible to address the quantum computer at the gate level and therefore enables higher levels of abstraction. These are particularly useful since quantum programs targeting fault-tolerant quantum computers are expressed in terms of logical qubits, which requires a further (automated) compilation step to the actual physical operations and interaction with physical qubits on the quantum computer.

Classical feedback while the quantum state remains coherent can be considered a critical requirement for a scalable quantum computer, but is challenging to implement on experimental platforms. We therefore currently see a dichotomy between languages such as PyQuil/Quil [132] and Q# [133], which both include classical feedback as a fundamental primitive, and languages such as OpenQASM [134, 135] and Cirq [136], which include limited or no classical feedback.

6.4.2 Tools

Having tools that facilitate the discovery and advancement of quantum algorithms even before hardware exists that is capable of executing them is paramount for the success of quantum computing. Beyond enabling execution of quantum programs, quantum programming languages frequently serve as tools for algorithm development, verification, and debugging. For that purpose, being able to analyze or simulate a quantum program on classical hardware is often far more convenient and informative than the actual execution on quantum hardware would be. Programming languages primarily geared towards enabling the discovery and analysis of new quantum algorithms hence cater to a very different set of requirements than languages that are primarily intended for running known algorithms on quantum hardware.

Simulation addresses the task of simulating a physical quantum computer on a classical computer. This is a challenging task, particularly when the full state of the quantum computation needs to be captured, which requires memory resources that are exponential in the number of qubits. If one is interested in simulating a computation without accounting for hardware-related errors during execution, the simulation can readily be performed on a standard computer, such as a laptop, for small computations involving up to around 30 qubits. Current supercomputers may be able to simulate up to around 50 qubits [137], or even more if the computation doesn't require that the full state vector is tracked throughout the execution. Things look much more dire, on the other hand, if the goal is to simulate the execution under general noise, described by Kraus operators, which generally requires tracking not only the state vector but in fact the full density matrix. Wave-function Monte Carlo techniques [138] provide a less memory-intensive alternative, at the cost of increasing the computational effort by tracking ensemble trajectories. Simulations are a valuable tool for validating whether the algorithm and implementation work as expected. Depending on the application and quantum algorithm, some simulation tasks can be much simpler. A quantum operation that describes a permutation of the state vector, e.g., in reversible operations that describe quantum arithmetic, can be simulated using a Toffoli simulator requiring memory resources that are linear in the number of qubits. Similarly, a CHP simulator, sometimes also known as a stabilizer simulator, may benefit quantum error correction studies [139].

When targeting a classical simulator that records the full state vector, it may be useful to allow quantum algorithms access to PostBQP resources by allowing quantum programs to directly query counterfactual measurement probabilities. Simply put, it may be useful to allow the user to programmatically access amplitudes for algorithm research purposes. Programs making use of any such capabilities cannot be executed on quantum hardware. Short of allowing user programs full access to postselection resources, a simulation framework can offer other capabilities to user programs that can be more effectively stripped out when targeting hardware. For instance, ProjectQ [140] allows user programs to specify emulation logic, allowing larger examples such as Shor's algorithm [121] to run efficiently even with modest classical hardware. Similarly, Q# allows user programs to specify unit testing assertions in terms of strong classical simulation resources; these assertions can then be safely removed during execution on targets which do not provide strong classical simulation.
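The memory wall mentioned above is easy to make concrete: a full state vector over n qubits holds 2^n complex amplitudes. The short Python estimate below assumes 16 bytes per amplitude (double-precision complex numbers); the chosen qubit counts are merely illustrative.

def state_vector_memory_bytes(num_qubits, bytes_per_amplitude=16):
    """Memory needed to store the full state vector of num_qubits qubits."""
    return (2 ** num_qubits) * bytes_per_amplitude

for n in (30, 40, 50):
    # Convert to GiB for readability; 30 qubits already require 16 GiB.
    print(n, "qubits:", state_vector_memory_bytes(n) / 2**30, "GiB")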

Resource estimation is the task of costing a quantum operation in terms of the number of primitive gate operations. This task is similar to simulation, but requires significantly fewer memory resources. A simulator for resource estimation simply needs to update corresponding counters when executing the quantum program. Further, since the semantics of the program is irrelevant for resource estimation, program transformations may significantly speed up this process. Recent analyses have shown the importance of quantum programming languages to cost estimation [141], as much more accurate numbers can be obtained by explicitly counting the instructions invoked by a particular quantum program than by relying on first-principles bounds or on asymptotic analysis of costs. Moreover, cost estimates derived from runtime observation of quantum programs can also incorporate information from an entire runtime stack, including error correction and layout.
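The counter-based approach described above can be sketched in a few lines of Python; the class and gate names are hypothetical and only illustrate the idea of a "simulator" that records counts instead of amplitudes.

from collections import Counter

class ResourceEstimator:
    """Records gate counts and the set of touched qubits instead of simulating."""

    def __init__(self):
        self.gate_counts = Counter()
        self.qubits = set()

    def apply(self, gate, *qubits):
        # Update the counters; no quantum state is tracked at all.
        self.gate_counts[gate] += 1
        self.qubits.update(qubits)

    def report(self):
        return {"gate counts": dict(self.gate_counts), "qubits": len(self.qubits)}

estimator = ResourceEstimator()
estimator.apply("h", 0)
estimator.apply("cx", 0, 1)
estimator.apply("t", 1)
print(estimator.report())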

Verification addresses the problem of proving the correctness of a quantum program using static analysis, i.e., without executing the program, e.g., with a simulator. One can either verify whether the program behaves correctly according to a specification, or whether two different programs for the same algorithm are semantically equivalent. The advantage of verification over simulation is that it may require fewer memory resources, by proceeding analytically rather than explicitly representing the computation in terms of the full state vector, and that it can be applied to parameterized quantum programs, e.g., parameterized in the number of qubits, and therefore is not limited by it. Quantum programming languages for verification require a well-formalized semantics that can be understood by a theorem prover. Most quantum programming languages targeting verification are based on an interactive theorem prover. Here, the developer is required to prove the correctness in terms of a sequence of proof steps that can be verified by the theorem prover. Automatic theorem provers to verify the correctness of a quantum program exist; however, these are often not scalable due to the high underlying computational complexity of the verification problem. There are efforts to integrate quantum programming with such theorem provers; QWire [142] for example is embedded in the proof system Coq. In Section 7.2.4 we will look in more detail into the Quipper programming language that has served as inspiration for many such efforts. However, we won't discuss quantum languages purely intended for verification purposes, and refer the reader to the literature for more on quantum program verification (see, e.g., [143–145]).
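For contrast with the static techniques described above, the simplest (and exponentially scaling) way to check that two programs for the same algorithm agree is to compare their explicit matrices. The tiny Python sketch below does this for two single-qubit "programs" using standard gate definitions; it is purely illustrative and not a substitute for the analytical methods discussed here.

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

def unitary_of(gates):
    """Multiply out a single-qubit circuit whose gates are applied left to right."""
    u = np.eye(2)
    for gate in gates:
        u = gate @ u
    return u

# Two programs for the same operation: X conjugated by Hadamards equals Z.
program_a = unitary_of([H, X, H])
program_b = unitary_of([Z])
print(np.allclose(program_a, program_b))  # prints True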

6.4.3 Ecosystems

A programming language is but one element of a broader ecosystem that enables developers to construct and express concepts and algorithms programmatically. Programming in any language or context can be a challenging task. Much of the skill, art, and trade of modern classical programming is obtained from using tools that facilitate development. A developer may rely on features such as autocomplete or signature help to avoid frequent references to documentation, or may rely on pre-existing libraries to avoid re-implementing common logic. To further ease the developer's role and task, appropriate tooling and integration is required for a seamless execution and debugging experience. Therefore, another important aspect of quantum programming is the ecosystem, which includes libraries, development environments, learning materials, and online access to physical and simulated quantum computers through the cloud. Most of these aim to ease the introduction to both quantum computing and quantum programming, since one of the primary difficulties in getting involved with quantum computing research or development can be gaining a working understanding of quantum computing concepts.

Libraries enable high levels of abstraction by providing generic implementations of often used quantum algorithms such as Grover search or quantum phase estimation, as well as application specific libraries, e.g., for quantum chemistry or machine learning. In addition, they make it possible to create larger applications through composition. A thriving community developing such libraries provides a support network that fosters novel ideas that combine the knowledge of experts across different areas. Open source code can also serve as a source of learning material by means of reference implementations for common functionality.

Other learning material includes static documentation [136, 146, 147] or interactive coding tutorials, e.g., by means of Jupyter Notebooks [148] or Katas [149]. Jupyter Notebooks are interesting for learning quantum programming languages since they make it possible to visualize concepts through graphics and plots, which are essential to understanding quantum algorithms and their execution results. Another advantage of Jupyter Notebooks is that they can be executed in a browser and therefore relieve the user from the requirement to install additional software. Code Katas are an alternative in which the learner is provided with a sequence of small exercises, each adding just a little more complexity. In the context of quantum programming, Katas can teach quantum computing concepts by expressing them in a quantum programming language, such that the learner picks up both quantum computing and how to express it in a program. Katas are implemented in a way such that they can be used in a self-assessment setting, where the learner gets immediate feedback on whether the exercise was solved successfully or not. In the latter case, hints may be provided to master the exercise.

Integrated development environments are an important tool for programmers. Standard features include syntax highlighting and code completion, which can simplify the editing of code. Further, warning and error messages are not only useful to debug code, but also help in learning a new programming language. Finally, more advanced concepts such as code actions can provide hints to the developer and introduce unknown capabilities of the programming language.

Access to physical quantum computers is provided through the cloud via a multitude of providers, see [150–156] to give just a few examples. Developers can access these through online editors, REST APIs, or APIs integrated in the programming language. The limited number of available quantum computing devices and their early stage of development reinforces the merit of tooling to analyze and verify a quantum program prior to execution on quantum hardware.

7 QUANTUM PROGRAMMING LANGUAGES

The content of this chapter has been submitted to Nature Reviews Physics and is currently under review.

Quantum programming languages are used for controlling existing physical devices, for estimating the execution costs of quantum algorithms on future devices, for teaching quantum computing concepts, or for verifying quantum algorithms and their implementations, to name just a few among the multitude of purposes. They serve newcomers as well as seasoned practitioners, researchers, and developers working on the next groundbreaking discovery or applying known concepts to real-world problems. This great variety in purpose and target audience is reflected in the design and ecosystem of existing quantum programming languages, depending on which factors a language prioritizes.

This chapter reviews a selection of state-of-the-art quantum programming languages, highlighting their salient features, and provides code samples for each of the languages.

7.1 purpose of programming languages

As the effort to produce robust, enterprise-scale quantum computing hardware advances [157, 158], and as new quantum algorithms are being developed [159–161], the quantum computing community is faced with a challenging task. In order to realize the potential offered by quantum computing, we must be able to program and control quantum devices such that they execute algorithms to solve valuable problems [162, 163]. Making programming efficient and seamless to the developer requires automating tasks like circuit layout and gate synthesis [34, 164] that are currently often done manually on a case-by-case basis. Classical and quantum computations are often interspersed in quantum applications [165–167], and the inherently asynchronous nature of quantum subroutines can make the task of coordinating between the two a challenging endeavor. Moreover, after decomposition into an instruction set that is formally supported by the targeted hardware, a complex microarchitecture is responsible for execution.


Providing the required information for automation and execution requires the development of suitable programming languages and compilation procedures. Tackling real-world problems of interest requires combining different quantum subroutines and algorithms into an application that solves the problem end-to-end. Transitioning from developing a toolset of quantum algorithms and subroutines to enabling the construction of applications is predicated on the availability of suitable programming languages and frameworks.

For the design of quantum applications, the choice of programming language has implications far beyond what tools are available for expressing a quantum program. More than a matter of convenience, language design choices can have a significant impact on the costs of developing, running, and maintaining an application. To illustrate the consequences and evolution of language design choices over time, consider for example the introduction of null in classical languages. It has been criticized to the point of having been called a "billion dollar mistake" for entailing the need for null-checks throughout the codebase, thus increasing engineering costs [168]. The ever ongoing effort to improve usability and productivity of popular classical languages has hence prompted the adoption of features to revise some of the detrimental consequences in languages as disparate as C#, Python, Rust, and JavaScript.

The constructs that are available within a language determine how we think and reason about the implemented algorithms. Therefore, the choice of programming language affects how we conceptualize a problem [169], ideally promoting expressing programs in a way that facilitates an efficient execution, rather than encouraging unnecessarily convoluted implementations that are hard to optimize. This is especially important in quantum programming, where even the largest of devices in the medium term will be heavily constrained in terms of logical qubit count and coherence times, and may place heavy demands on timescales for classical feedback. At the same time, some of the conceptual algorithmic techniques such as amplitude amplification or phase kickback are ideally supported by the language, so that the programmer can think in terms of the techniques rather than the underlying elementary quantum operations. This in turn can significantly expand the range of applications for quantum computing. The design of quantum programming languages thus not only must draw from the wealth of experience we have in designing classical languages, but must address unique challenges posed in the quantum domain.

In this chapter, we review some of the quantum programming languages that are currently available, exploring different purposes, use cases, and the assets of each language. Before talking about different approaches and about the capabilities of each language and supporting tools and ecosystems, it is helpful to expand on the fundamental question regarding how we want to reason about quantum algorithms. We thus first proceed to discuss some commonly used patterns in quantum programs, and illustrate some concepts and challenges that are unique to quantum computing.

7.2 languages and ecosystems overview

We discuss some of the quantum programming languages, along with the tools and ecosystems around them, that are currently in use and/or have served as a foundation for current and future work. We do not aspire to be exhaustive in our analysis but instead cover a representative set of languages with distinct priorities and a focus on different paradigms. Our selection serves as an illustration of the implications and distinguishing features that follow from prioritizing certain use cases when designing quantum programming languages. These choices include decisions such as prioritizing execution on currently available hardware versus future fault-tolerant devices, enabling extensive program verification, or striving to facilitate the creation and use of performant simulation tools on classical computers.

We choose to elaborate on these topics by example, including the languages Q#, Qiskit, Cirq, Quipper, and Scaffold. Our selection is not a judgement on the importance or popularity of individual languages; some widely used languages that are referenced below are not discussed in detail. Instead, we discuss a selection of languages that we hope captures unique facets we believe are of interest to the reader. The selected languages demonstrate the difference in design and ecosystem depending on the intended purpose and target audience. Without diminishing the multi-faceted capabilities of individual languages, we chose these five languages for the following reasons: Q# is focused on supporting large scale quantum applications, while Cirq is primarily geared towards NISQ devices. OpenQASM has served as an intermediate representation that is supported by a variety of hardware and software providers. Quipper has served as a foundation in the field of quantum programming language research and inspired the development of formal verification techniques, while Scaffold with its LLVM-based compiler ScaffCC has spawned discussions about quantum programming within the classical compiler research community.

A quick overview of some of the discussed features is assembled in Table 7.1. It should be noted that all languages and associated tools in this selection are open source. For further reading about quantum programming languages we point the reader to the review article [170], and to conference and workshop report series such as [171].

Invocation. Q#: Standalone, usable from Python, C#, F#; Qiskit: Embedded into Python; Cirq: Embedded into Python; Quipper: Embedded into Haskell (1); Scaffold: Standalone.
Classical feedback. Q#: yes; Qiskit: yes (2); Cirq: no; Quipper: yes; Scaffold: yes (3).
Adjoint generation. Q#: yes; Qiskit: yes; Cirq: yes; Quipper: yes; Scaffold: no.
Resource estimation. Q#: gate counts, number of qubits, depth & width, call graph profiling; Qiskit: gate counts, number of qubits, depth & width; Cirq: gate counts, number of qubits; Quipper: gate counts, number of qubits, depth & width; Scaffold: gate counts, number of qubits, depth (4).
Libraries. Q#: standard, chemistry, numerics, ML; Qiskit: standard, chemistry, numerics, optimization, finance, QCVV, ML; Cirq: standard, chemistry, ML; Quipper: standard; Scaffold: standard (5).
Learning materials. Q#: docs, tutorials, katas; Qiskit: docs, tutorials, textbook; Cirq: docs, tutorials; Quipper: docs (6), tutorials; Scaffold: tutorials (7).

Table 7.1 Overview of some of the discussed features of the languages surveyed in this chapter.

(1) Standalone versions such as Proto-Quipper-S and Proto-Quipper-M are proposed or under development, see also Section 7.2.4.
(2) Some restrictions apply regarding allowed types and language constructs in OpenQASM branching statements.
(3) However, see the relevant GitHub issue [172] regarding code generation for classical feedback.
(4) Resource estimation includes different flavors of error correction, see e.g. [173].
(5) See [174] for the current selection of implemented algorithms.
(6) Online API documentation available at [175].
(7) Tutorials and manual available at [176, 177].

As a small, yet illustrative, example we express in each language a protocol in which one sends the state of a qubit to another qubit by using pre-shared entanglement and classical communication as resources [178, 179]. While teleportation is too simple to fully capture the strengths and weaknesses of each software framework, it nonetheless serves as a first impression to introduce the language. As part of our discussion, we provide in this chapter the code that implements the sample in each language. We refer to the corresponding documentation (listed as references) for instructions on how to install the tools for each framework and how to execute the given pieces of source code.

A detailed discussion of a broader spectrum of quantum programming languages and open source software frameworks such as Forest/PyQuil [132, 180], ProjectQ [124, 181], QWIRE [142, 182], staq [183, 184], Strawberry Fields [185, 186], t|ket⟩ [187, 188], XACC [189, 190], or QuTiP [191–193] is beyond the scope of this thesis. However, many of the strengths of the languages that we highlight in this chapter also apply to these languages and software frameworks.

7.2.1 Q#

Q# is a hardware-agnostic quantum programming language developed with the goal of enabling the execution of large-scale applications on future quantum hardware [133]. Correspondingly, Q# is focused on providing high-level abstractions that facilitate reasoning about the intended functionality rather than following the imperative style encouraged by assembly-like languages.

In contrast to, e.g., Python-based languages, Q# is strongly and statically typed. All types follow value semantics, including arrays [194]. It supports the constructs commonly provided for immutable data types, such as the means to construct a new array based on an existing one [146]. This restriction has the benefit that all side effects (8) of a computation have to be of quantum nature, meaning they can only impact the quantum state; there are no language constructs in Q# that can modify classical values which are accessible outside the operation or function in which they are declared. The only way for side effects to impact the program flow is hence to have a branching based on the result of a measurement outcome. In combination with the fact that measurement outcomes are represented by a dedicated type, this allows restricting how local computations can impact the program flow via the type system if needed, e.g., to accommodate current hardware limitations.

(8) A side effect in programming is an effect that modifies the program state outside the local environment.

A salient feature of Q# is that it supports expressing arbitrary classical control flow [195]. This is in contrast to other quantum programming languages where this capability is often provided by a classical host language. The representation within the quantum programming language itself permits developers to reason about the program structure at the application level. This makes it possible to integrate, e.g., the computation of precision requirements for rotation synthesis [32], or routing and layout on the quantum chip for future large-scale applications.

Q# distinguishes between operations and functions. Both are first-class values that can be freely assigned, passed as arguments to, or returned from other operations and functions [133]. Functions are purely classical and deterministic in nature. As a consequence, functions can be fully evaluated as soon as their input is known. Operations, on the other hand, may contain arbitrarily interleaved classical and quantum computations, including allocations and deallocations of quantum memory.

Unlike other quantum programming languages geared towards supporting formal verification, qubits are treated like any other data type in Q#. Some languages, such as Proto-Quipper-M [196], use a linear type system to enforce the no-cloning property of quantum states. By contrast, Q# treats qubits as virtual entities of quantum memory. It thus takes a purely operational perspective, and has no notion of a quantum state within the language itself [197].

In addition to abstractions such as type parameterization and user defined types that primarily serve the purpose of user convenience and code robustness, Q# defines constructs that facilitate representing and leveraging certain quantum-specific patterns for optimization. Examples of such constructs are the borrowing of qubits [27], conjugations representing patterns of the form UVU† [146], and functors [146]. Functors can be seen as higher order bijective meta-functions that associate quantum transformations that have a certain relation to each other. The Adjoint functor for example maps a unitary quantum transformation to its inverse. Which set of functors an operation supports is reflected in its type. The captured relations can be exploited for optimization purposes.
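The Adjoint functor and the UVU† conjugation pattern can be pictured in language-agnostic terms with the small Python sketch below; the gate names and the inverse table are illustrative assumptions and do not reflect Q#'s actual implementation.

# Circuits are modeled as lists of (gate, qubits) pairs; the inverse table is assumed.
INVERSE = {"h": "h", "x": "x", "cx": "cx", "t": "tdg", "tdg": "t"}

def adjoint(circuit):
    """The inverse circuit: reverse the gate order and invert every gate."""
    return [(INVERSE[gate], qubits) for gate, qubits in reversed(circuit)]

def conjugate(outer, inner):
    """The pattern U V U-dagger for circuits outer (U) and inner (V)."""
    return outer + inner + adjoint(outer)

bell = [("h", (0,)), ("cx", (0, 1))]
print(conjugate(bell, [("t", (1,))]))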

Q# is compiled in a stand-alone manner, making whole program analysis more tractable. Q# offers interoperability with Python and .NET languages such as C# and F#. Its nature as a stand-alone programming language makes it easier to define a natural representation for quantum programs, as it is not constrained by choices made in a host language. This comes at the cost of not being able to leverage the rich set of existing tools for popular classical languages such as Python. Q# comes with its own set of tools, including, e.g., support for Jupyter Notebooks [146, 148], and an implementation of the Language Server Protocol [198] for providing semantic information to editors. Microsoft provides two Integrated Development Environment (IDE) extensions for Q#: one for Visual Studio Code, which is supported on macOS, Linux, and Windows, and one for Visual Studio on Windows.

A rich set of samples, libraries, tutorials, and katas exists around Q# [146, 199]. In addition to domain-specific libraries on chemistry, machine learning, and quantum arithmetic, the standard libraries offer an arsenal of tools. Each contained callable and type is extensively documented in the code. That information is displayed to the user via IDE tools. The corresponding API documentation is generated for each release, and complements the documentation on the language, tools, and quantum computing concepts. Katas and other teaching materials [199], designed to teach both quantum computing and Q#, facilitate entering the field of quantum computing. The associated libraries, the Q# compiler, and all other components of the Quantum Development Kit are open source. Several NuGet packages are currently distributed, including a package containing tools for simulation and resource estimation. Microsoft has partnered with hardware providers to offer a service for executing Q# code on quantum hardware as part of the cloud-based Azure Quantum service [151]. This comes with the caveat that only a limited subset of Q# programs can be executed on quantum hardware, since current devices do not provide sufficient control flow capabilities to execute everything that can be expressed in Q#. At the time of writing, Honeywell, IonQ, and QCI are announced hardware partners [151].

Program 7.1 contains a Q# implementation of quantum teleportation. Our example is written as a unit test. Executing the command "dotnet test" in the application folder will run the test.

Program 7.1 Q# code implementing teleportation of an arbitrary quantum state.

1  namespace Microsoft.Quantum.Samples {
2
3      open Microsoft.Quantum.Arrays;
4      open Microsoft.Quantum.Canon;
5      open Microsoft.Quantum.Diagnostics;
6      open Microsoft.Quantum.Intrinsic;
7
8      /// # Summary
9      /// Prepares a Bell state given two qubits in a computational basis state.
10     operation PrepareBellPair(left : Qubit, right : Qubit) : Unit is Adj + Ctl {
11         H(left);
12         CNOT(left, right);
13     }
14
15     /// # Summary
16     /// Teleports the state of the 'msg' qubit to the given 'target' qubit,
17     /// by temporarily using a 'helper' qubit as a resource.
18     operation Teleport(msg : Qubit, target : Qubit) : Unit {
19         // Q# supports local allocations of qubits.
20         using (helper = Qubit()) {
21             PrepareBellPair(helper, target);
22             Adjoint PrepareBellPair(msg, helper);
23             // We apply a Pauli correction conditional on the outcomes of
24             // single-qubit measurements in computational basis.
25             (M(msg) == One ? Z | I)(target);
26             (M(helper) == One ? X | I)(target);
27         }
28     }
29
30     /// # Summary
31     /// Executes a teleportation experiment.
32     /// If it succeeds, then the returned measurement result is Zero.
33     operation TeleportationExperiment(prep : (Qubit => Unit is Adj + Ctl)) : Result {
34         // We allocate new qubits for the duration of the block.
35         using ((msg, target) = (Qubit(), Qubit())) {
36             // We prepare the message to teleport using the given
37             // preparation routine and teleport the message.
38             prep(msg);
39             Teleport(msg, target);
40             // If the target qubit is in the intended state, this will map it to |0⟩.
41             Adjoint prep(target);
42             return M(target);
43         }
44     }
45
46     /// # Summary
47     /// Unit test to check that various states are teleported correctly.
48     // The Test attribute defines the target on which the test will be executed.
49     // "QuantumSimulator" indicates that the test will be executed on the full
50     // state simulator.
51     @Test("QuantumSimulator")
52     operation TeleportTest() : Unit {
53         // We want to execute the teleportation experiment for various messages.
54         let messages = [H, X, T];
55         for (rep in 1 .. 100) {
56             // We run the teleportation experiment for each message.
57             let results = ForEach(TeleportationExperiment, messages);
58             // We check that each run returned Zero using unit testing tools.
59             // 'Fact' will fail and print the given string if success is false.
60             let success = All(IsResultZero, results);
61             Fact(success, "Teleportation failed.");
62         }
63     }
64 }

The state preparation in Line 10 returns Unit, as quantum operations are modeled as side effects. Its adjoint, as invoked in Line 22, is automatically generated by the compiler. The teleportation is executed on a full state simulator as part of a unit test defined via the Test attribute.

7.2.2 OpenQASM and Qiskit

The Open Quantum Assembly Language (OpenQASM, [135]) is a gate-based intermediate representation for quantum programs. It expresses quantum programs as lists of instructions, often intended to be consumed by a quantum processor without further compilation. OpenQASM allows for abstractions in the form of quantum gates, which can be composed in a hierarchical manner based on a set of intrinsic primitives that are assumed to be available on the targeted processor. An example is a Toffoli gate composed of CNOT gates, T gates, and Hadamard gates. OpenQASM also supports single-qubit measurement and basic classical control operations.
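As a rough illustration of such hierarchical gate definitions (a sketch assuming a standard Qiskit installation; the exact gate set produced depends on the library version), a Toffoli gate can be expanded into more primitive gates programmatically:

from qiskit import QuantumCircuit

# A circuit containing a single Toffoli (CCX) gate on three qubits.
circuit = QuantumCircuit(3)
circuit.ccx(0, 1, 2)

# Expanding the Toffoli shows its definition in terms of more primitive gates
# (CNOT, T/Tdg, and Hadamard in the standard construction).
print(circuit.decompose())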

Qiskit [200, 201] provides a Python-based programming environment that allows one to generate and manipulate OpenQASM programs. It provides powerful abstraction capabilities such as the ability to synthesize gate decompositions for arbitrary isometries and certain unitary transformations. The ecosystem of Qiskit comprises four software frameworks:

Terra provides fundamental data structures for quantum computing;
Aer provides various simulation backends for executing circuits compiled in Qiskit Terra, as well as tools for noise modeling;
Aqua provides generalized and customizable quantum algorithms, including domain application support for chemistry, finance, machine learning, and optimization;
Ignis is a framework to understand and mitigate noise in quantum programs using quantum characterization, verification, and validation (QCVV) protocols.

Qiskit can be used to access physical quantum computers through the cloud. An online service called IBM Quantum Experience [150] allows users to write quantum programs either through a visual interactive quantum circuit editor or using the Python-based libraries within Jupyter Notebooks. The IBM Quantum Experience also includes the OpenPulse framework for pulse-level control, which allows users to construct their own schedules of pulses and execute them on IBM's quantum hardware. At the time of writing, Qiskit can target IBM's quantum computers in addition to devices offered by Alpine Quantum Technologies [202] and Honeywell [203]. Layout and routing stages during compilation permit targeting limited-connectivity architectures without manually tailoring programs to hardware restrictions. The compilation infrastructure furthermore includes several general purpose optimization passes for quantum circuit optimization. Besides its support for execution on quantum processors, Qiskit comes with extensive simulation capabilities, including statevector and density matrix simulators that can be executed on both CPUs and GPUs. It thus provides support for simulating the effects of noise defined by any custom model, including arbitrary Kraus operators. Qiskit furthermore contains an efficient Clifford stabilizer state simulator as well as a tensor-network statevector simulator that uses a matrix product state representation for the state.
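As a small sketch of the layout and routing step mentioned above (assuming a standard Qiskit installation; the linear coupling map and the basis gates are made-up illustrations, not a real device description), a circuit can be transpiled onto a limited-connectivity architecture as follows:

from qiskit import QuantumCircuit, transpile

# A three-qubit circuit with a two-qubit gate between qubits 0 and 2,
# which are not adjacent on the assumed linear coupling map below.
circuit = QuantumCircuit(3)
circuit.h(0)
circuit.cx(0, 2)

# Transpiling for a hypothetical device where only neighboring qubits interact;
# the compiler inserts SWAPs and rewrites gates into the given basis as needed.
mapped = transpile(circuit,
                   coupling_map=[[0, 1], [1, 2]],
                   basis_gates=["cx", "u3"],
                   optimization_level=1)
print(mapped)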

Program 7.2 shows an OpenQASM implementation of quantum teleportation. It can be compiled using Qiskit Terra, and then dispatched to a simulator or to hardware using Qiskit Aer, as shown in Program 7.3. Alternatively, the circuit can be constructed completely using the Python API, as illustrated in Program 7.4.

Program 7.2 OpenQASM code implementing teleportation of an arbitrary quantum state.

// All OpenQASM programs begin with a header indicating the language version.
OPENQASM 2.0;
include "qelib1.inc";

// We need to declare all quantum and classical registers that are going to be used.
// We will use qubits[0] as the qubit containing the message ('msg'), qubits[1] as
// the helper qubit ('helper'), and qubits[2] as the target qubit ('target').
qreg qubits[3];
// Branching conditional on measurement outcomes requires passing in a classical
// register, hence we construct a separate register for the x- and z-correction.
creg corrz[1];
creg corrx[1];
// We also need a classical register to store the measurement outcome that
// indicates whether the teleportation succeeded.
creg final[1];

// We prepare the state |+⟩ to teleport.
h qubits[0];

// Prepares a Bell state between the helper and the target qubit.
h qubits[1];
cx qubits[1], qubits[2];

// Perform the inverse operation on the message and helper qubit.
cx qubits[0], qubits[1];
h qubits[0];

// We measure the message and the helper qubit in the computational basis
// and store the results in the classical registers corrz and corrx.
measure qubits[0] -> corrz[0];
measure qubits[1] -> corrx[0];

// We apply a Pauli correction conditional on the stored outcomes of
// the single-qubit measurements in computational basis.
if (corrz==1) z qubits[2];
if (corrx==1) x qubits[2];

// If the target qubit is indeed in a |+⟩ state, this will map it to |0⟩.
h qubits[2];

// If the message is teleported successfully this measurement should always be 0.
measure qubits[2] -> final[0];

Program 7.3 Python code to execute OpenQASM programs using Qiskit.

from qiskit import QuantumCircuit, BasicAer, execute
import argparse  # used to process command line arguments
from teleportation_circuit import *  # custom teleportation implementation in Qiskit

"""
Executes the given QuantumCircuit on the qasm_simulator and returns the histogram.
"""
def run_experiment(circuit):
    # Qiskit Aer allows to simulate circuits using one of several different backends.
    backend = BasicAer.get_backend('qasm_simulator')

    # We can send the constructed circuit to an Aer backend to get an object
    # that represents the asynchronous execution of our circuit.
    # We repeat the simulation 100 times, and return the histogram.
    job = execute(circuit, backend, shots=100)

    # The result of a job tells us whether the job completed successfully,
    # and all measurement results obtained during the job.
    result = job.result()
    return result.get_counts(circuit)

if __name__ == "__main__":
    # We do some minimal command line parsing to allow giving a qasm file as input.
    parser = argparse.ArgumentParser()
    parser.add_argument("--qasm", dest="qasm")
    args = parser.parse_args()

    if args.qasm:
        # Qiskit Terra allows to load an OpenQASM file and compile it to a circuit.
        print("Loading circuit from file " + args.qasm + ".")
        circuit = QuantumCircuit.from_qasm_file(args.qasm)
    else:
        # Alternatively, we can also construct the circuit in Qiskit.
        print("Running the circuit returned by teleportation_experiment.")
        # We choose to teleport a |+⟩ state,
        # but any function for the preparation routine could be passed in here.
        circuit = teleportation_experiment(lambda prep, q: prep.h(q))

    data = run_experiment(circuit)

    # We check whether the teleportation succeeded for all shots,
    # i.e. we check if the final measurement was always 0.
    is_correct = lambda key: key.startswith('0')
    success = all(map(is_correct, data.keys()))

    if success: print("\nTeleportation succeeded!")
    else: print("\nTeleportation failed.")

    # We additionally print the full histogram of the measured values for all shots
    # and emit the executed circuit. A matrix representation cannot be printed
    # since the circuit is not unitary.
    print("\nFull histogram:")
    print(data)
    print("\nCircuit:")
    print(circuit)

Program 7.4 Qiskit code to generate the OpenQASM circuit for teleportation

from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister, BasicAer, execute

"""
Adds the gates to the given circuit to prepare a Bell state between the two qubits.
"""
def prepare_bell_pair(circuit, left, right):
    circuit.h(left)
    circuit.cx(left, right)

"""
Adds the gates to the given circuit that implement the adjoint of the sequence of
gates added by prepare_bell_pair.
"""
def adjoint_prepare_bell_pair(circuit, left, right):
    # Qiskit supports generating the adjoint for QuantumCircuit instances, but
    # in this case it is more convenient to just implement it as a Python function.
    circuit.cx(left, right)
    circuit.h(left)

"""
Adds the gates to the given circuit that teleport the state of the 'msg' qubit
to the given 'target' qubit using the 'helper' qubit as a resource, and
using the classical registers corrz and corrx to store the measurement results.
"""
def teleport(circuit, msg, helper, target, corrz, corrx):
    prepare_bell_pair(circuit, helper, target)
    adjoint_prepare_bell_pair(circuit, msg, helper)
    # Add the gates to measure the message and the helper qubit in the computational
    # basis and store the results in the classical registers corrz and corrx.
    circuit.measure(msg, corrz[0])
    circuit.measure(helper, corrx[0])
    # Add the gates to apply a Pauli correction conditional on the stored outcomes
    # of the single-qubit measurements in computational basis.
    # Note that c_if expects a ClassicalRegister, which is why we need two of them.
    circuit.z(target).c_if(corrz, 1)
    circuit.x(target).c_if(corrx, 1)

"""
Constructs and returns a QuantumCircuit instance that applies the given state
preparation routine to a single qubit.
"""
def prepare_message(prep):
    qubits = QuantumRegister(1, "qubits")
    circuit = QuantumCircuit(qubits, name="prep")
    prep(circuit, qubits[0])
    return circuit

"""
Constructs and returns a QuantumCircuit instance containing a teleportation experiment.
If the experiment succeeds, then the last classical bit in the circuit will be 0.
"""
def teleportation_experiment(prep):
    # We need to declare all quantum and classical registers that are going to be
    # used as part of the circuit before we can construct the circuit instance.
    qubits = QuantumRegister(3, "qubits")
    msg, helper, target = map(lambda idx: qubits[idx], range(3))
    # Branching conditional on measurement outcomes requires passing in a classical
    # register, hence we construct a separate register for the x- and z-correction.
    corrz = ClassicalRegister(1, "corrz")
    corrx = ClassicalRegister(1, "corrx")
    # We also need a classical register to store the measurement outcome that
    # indicates whether the teleportation succeeded.
    final = ClassicalRegister(1, "final")

    # We construct a separate subcircuit implementing the passed in state preparation
    # function such that we can easily invert it later.
    prep = prepare_message(prep)

    # We create a QuantumCircuit instance to which we then add gates.
    circuit = QuantumCircuit(qubits, corrz, corrx, final, name="teleport")
    # We plug the constructed subcircuit for preparing the message to teleport
    # into the circuit and teleport the message.
    circuit.append(prep.to_instruction(), [msg])
    teleport(circuit, msg, helper, target, corrz, corrx)
    # We apply the inverse of the message preparation circuit to the target qubit.
    # If the target qubit is in the intended state, this will map it to |0⟩.
    circuit.append(prep.inverse().to_instruction(), [target])
    # If the message is teleported successfully this measurement should always be 0.
    circuit.measure(target, final[0])
    return circuit

The online documentation [147] gives a good overview of the full spectrum of capabilities included in Qiskit, includes tutorials, and is generated for each release. In terms of learning resources, an interactive textbook [204] based on the new Jupyter Book platform [205] teaches beginners quantum computing by means of interactive code examples. These examples can easily be run via direct links to notebooks on the IBM Quantum Experience.

7.2.3 Cirq

Cirq is a quantum programming library for Python with a strong focus on supporting near-term quantum hardware. Cirq’s primary goal is to ease the development of quantum programs that are capable of running on quantum computers available now or in the near future. As a result, Cirq provides mechanisms for fine-tuning exactly how a quantum program executes on the quantum hardware, as well as tools for simulating hardware constraints, such as limitations due to noise or the physical layout of qubits [136].

The types of qubits available to the programmer demonstrate Cirq's focus on NISQ hardware. The physical layout of qubits in a quantum computer can be modeled by a GridQubit for hardware with a two-dimensional lattice, or a LineQubit for hardware with a one-dimensional lattice [136]. Cirq also provides qubit types that do not impose any physical layout, either for developing quantum programs that are intended only to be simulated, or to use as part of the definition of a custom layout. Once a qubit type is chosen, device constraints can be specified programmatically and Cirq will validate that a particular circuit adheres to all of the constraints [136]. For example, a common hardware constraint is that two-qubit gates may only operate on adjacent qubits. In contrast to other languages where qubits may be allocated dynamically, layout is performed manually in Cirq. Qubits can only be allocated by providing their position (e.g., row and column for a GridQubit) or another globally-unique identifier (e.g., a string for a NamedQubit). This means the programmer must decide which physical qubits to use for each part of an algorithm, but as a result, they have the most control over how a NISQ computer's limited number of qubits are being used.

Similarly, the programmer has several options when it comes to scheduling each quantum operation. A circuit in Cirq is divided into "moments," which are discrete units of time in which all operations in the same moment execute simultaneously [136]. Only a single operation can affect a particular qubit at any given moment. When operations are added to the circuit, they can be added as part of a new moment (increasing the total length of time of the program), or instead can "slide" back to an earlier moment if the affected qubits are not already being used at that time. This has trade-offs similar to those that come from the manual allocation of qubits: it requires the programmer to make more decisions, but gives the most flexibility in how the hardware is being used at every point in time while the program is running.

Circuits in Cirq are defined declaratively as a sequence of moments, where each moment contains a set of gates to apply. Since Cirq is embedded in Python, it is easy to manipulate circuits, as they behave similarly to other Python sequences. For example, higher-level quantum operations can be created by defining Python functions which return a sequence of gates that can be appended to a circuit. It is also possible to iterate over, transform, or filter the moments in a circuit. Furthermore, the operations in each moment can be inspected or transformed as well.
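A minimal sketch of these concepts, assuming a standard Cirq installation: qubits are created by naming their position on a line, and the insertion strategy controls whether an appended operation starts a new moment or slides back into an earlier one.

import cirq

# Qubits are identified by their fixed position rather than allocated dynamically.
a, b, c = cirq.LineQubit.range(3)

circuit = cirq.Circuit()
# EARLIEST lets operations slide back to the first moment whose qubits are free;
# NEW_THEN_INLINE forces the first appended operation to start a new moment.
circuit.append([cirq.H(a), cirq.H(b)], strategy=cirq.InsertStrategy.EARLIEST)
circuit.append(cirq.CNOT(a, c), strategy=cirq.InsertStrategy.NEW_THEN_INLINE)

# Each moment groups the operations that are scheduled to act simultaneously.
for index, moment in enumerate(circuit):
    print(index, moment)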

Cirq is embedded in Python. The control flow constructs provided by Python, such as if and while statements, can be used to construct a circuit before executing it. There is no direct support for control flow based on measurement results. When targeting simulators, it is possible to emulate control flow by retrieving the quantum state at the end of a simulation, performing arbitrary classical logic on any measurement results obtained, and initializing a new simulation to start in the previous state. When targeting quantum devices, on the other hand, no similar scheme is possible, so that quantum algorithms that use non-deterministic circuits, i.e. circuits in which different operations are applied depending on measurements, cannot be run. This includes, e.g., the class of repeat-until-success algorithms [39–41, 206], iterative phase estimation, and potentially other primitives that require non-trivial control flow. This limitation of Cirq is reasonable to have, seeing how it mirrors the capabilities of NISQ hardware.

It is also possible to use Cirq to model the effect of noise on a circuit by creating noisy quantum channels defined by Kraus operators [136]. Common channels are included with Cirq, such as channels that introduce bit flip or phase flip errors with probability p.

Libraries for Cirq include OpenFermion-Cirq and TensorFlow Quantum [207]. Both are adaptations of existing libraries to provide interoperability with Cirq. OpenFermion-Cirq is based on OpenFermion, a quantum chemistry library [208]. TensorFlow Quantum [207] is based on TensorFlow [209], a machine learning library, adapted for use with Cirq.
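To illustrate the noise channels mentioned above, the following sketch (assuming a standard Cirq installation; the error probability is an arbitrary choice) appends a bit-flip channel to a circuit and samples it on the local simulator:

import cirq

q = cirq.LineQubit(0)
circuit = cirq.Circuit([
    cirq.H(q),
    cirq.bit_flip(p=0.1).on(q),  # applies X with probability 0.1 (illustrative value)
    cirq.measure(q, key="m"),
])

# Sampling the noisy circuit; the histogram reflects both the H gate and the channel.
result = cirq.Simulator().run(circuit, repetitions=100)
print(result.histogram(key="m"))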

It is possible to run Cirq programs against Google's Quantum Cloud Service, but this requires access, which according to a note in the source code is granted by invitation only [210]. Google plans to grant public access to cloud-based quantum simulators and actual quantum hardware in the future [152]; currently Cirq supports running programs on a local simulator. In terms of third-party offerings, Alpine Quantum Technologies, in collaboration with the University of Innsbruck, supports both Cirq and Qiskit on ion-trap quantum computers in Innsbruck [154].

Cirq comes with device models for many of Google's quantum processors, including Bristlecone and Sycamore [136]. The models can be used to determine whether a circuit is suitable to run on a particular device, validating circuit characteristics such as gate set, dimensions of qudits (qubit, qutrit, etc.), and locality of multi-qubit operations. Device support can be extended to custom devices with their own set of constraints as well. In addition to simply validating whether a circuit conforms to a particular gate set, Cirq can transform circuits into the gate sets of Google devices, but transforming circuits into arbitrary gate sets is not currently supported [136].

Program 7.5 Cirq code implementing teleportation of an arbitrary quantum state.

import cirq
from cirq.ops import *

"""
Prepares a Bell state given two qubits in a computational basis state.
"""
def prepare_bell_pair(left, right):
    yield H(left)
    yield CNOT(left, right)

"""
Teleports the state of the 'msg' qubit to the given 'target' qubit
using the 'helper' qubit as a resource.
"""
def teleport(msg, helper, target):
    yield prepare_bell_pair(helper, target)
    yield cirq.inverse(prepare_bell_pair(msg, helper))
    # We apply a Pauli correction in a coherent manner, as Cirq does not
    # support applying operations conditional on a measurement outcome.
    yield CZ(msg, target)
    yield CNOT(helper, target)

"""
Constructs and returns a Circuit instance containing a teleportation experiment.
If the experiment succeeds, then the final measurement of the target qubit will be 0.
"""
def teleportation_experiment(prep):
    # We will use three qubits on a line.
    msg, helper, target = map(cirq.LineQubit, range(3))

    # We create a Circuit instance to which we then add gates.
    circuit = cirq.Circuit()
    # We add the gates to prepare the message to teleport using the given
    # preparation routine and teleport the message.
    circuit.append(prep(msg))
    circuit.append(teleport(msg, helper, target))
    # If the target qubit is in the intended state, this will map it to |0⟩.
    circuit.append(cirq.inverse(prep(target)))
    circuit.append(measure(target, key="final"))
    return circuit

"""
Executes the given Circuit on the simulator and returns the histogram.
"""
def run_experiment(circuit):
    # We simulate circuits using the local simulator.
    backend = cirq.Simulator()
    # We repeat the simulation 100 times, and return the histogram.
    # The returned histogram contains the result of all measurements in all runs.
    result = backend.run(circuit, repetitions=100)
    return result.histogram(key="final")

if __name__ == "__main__":
    # We choose to teleport a |+⟩ state,
    # but any adjointable state preparation routine could be passed in here.
    circuit = teleportation_experiment(H)
    data = run_experiment(circuit)

    # We check whether the teleportation succeeded for all shots,
    # i.e. we check if the final measurement was always 0.
    is_correct = lambda key: key == 0
    success = all(map(is_correct, data.keys()))

    if success: print("\nTeleportation succeeded!")
    else: print("\nTeleportation failed.")

    # We additionally print the histogram of the measured value for all shots,
    # emit the executed circuit, and print the matrix representation of the circuit.
    print("\nFull histogram:")
    print(data)
    print("\nCircuit:")
    print(circuit)
    print("\nMatrix representation:")
    print(cirq.unitary(circuit))

Program 7.5 shows a Cirq implementation of quantum teleportation. Cirq does not support applying quantum gates conditioned on a classical control bit. Rather than branching based on a classical control bit, the example instead makes use of a controlled Z-gate (CZ) and a controlled X-gate (CNOT), thus replacing the classical control bit with a qubit. When run, the program prints both a graphical representation of the circuit and its representation as a unitary matrix, as well as the measurement results from simulating the circuit. It can be executed directly by the Python interpreter.

Thorough documentation for Cirq is available online [136], including an installation guide and a tutorial to help get new users familiar with Cirq. The documentation is generated for each release. In-depth sections are also available for more detailed descriptions of Cirq’s features, including a complete API reference. Since Cirq is a Python framework, it can also leverage the large variety of existing tools available for developing Python applications.

7.2.4 Quipper

Quipper [30] is a functional quantum programming language, embedded into Haskell. It addresses the problem of describing quantum computations at a practical scale, and has been demonstrated by describing quantum circuit representations with up to trillions of quantum gates. Quipper uses a computation model in which a classical computer controls a quantum device, and it is not dependent on any particular quantum hardware. In addition to being used to estimate the costs of quantum computations [211], the language provides a foundation for quantum programming language research. Quipper has served as inspiration for leveraging existing tools for formal verification for quantum programs [212, 213], and subsequent work [196, 214, 215] building on the ideas of Quipper has a distinct focus on how to facilitate reasoning about the semantics of quantum programs.

Quipper is a circuit description language, i.e., the language can be used in a structured way to construct circuits by applying gates to qubits. The circuits themselves are data that can be passed to functions in the host language Haskell, e.g., to perform circuit optimization, resource estimation, or error correction. As in many other embedded programming languages, the mismatch between the type system of the host language and the type system of the quantum programming language allows the developer to write programs that are not well-defined and lead to run-time errors. In particular, Haskell is not able to enforce linearity, such that the type system cannot guarantee that multiple operations do not simultaneously act on the same qubit.

Stand-alone prototypical implementations of Quipper-like languages, such as Proto-Quipper-S [214], Proto-Quipper-M [196], and Proto-Quipper-D [215], have emerged with the goal to enforce quantum specific properties such as the no-cloning theorem of quantum information. They are based on a linear type system and take other quantum programming related concepts explicitly into account. For example, Proto-Quipper-M distinguishes between parameters, which are data known at circuit generation time (e.g., the bitwidth of an arithmetic operation), and states, which are data known at circuit execution time (e.g., a quantum state or measurement result). While Quipper allows to describe families of quantum circuits, Proto-Quipper-M is more general, in that it describes families of morphisms in any symmetric monoidal category [196]. The latest addition, Proto-Quipper-D, introduces linear dependent types that can be used to express program invariants and constraints, and allow for type-safe uncomputation of garbage qubits [215].

Program 7.6 Quipper code implementing teleportation of an arbitrary quantum state.

module Teleport where
import Quipper

-- Prepares a Bell state given two qubits in a computational basis state.
prepareEntangledPair :: (Qubit, Qubit) -> Circ (Qubit, Qubit)
prepareEntangledPair (left, right) = do
  gate_H_at left
  right <- qnot right `controlled` left
  return (left, right)

-- Teleports the state of the 'msg' qubit to the given 'target' qubit
-- using the 'helper' qubit as a resource.
teleport :: (Qubit, Qubit, Qubit) -> Circ (Qubit)
teleport (msg, helper, target) = do
  (helper, target) <- prepareEntangledPair (helper, target)
  (msg, helper) <- reverse_simple prepareEntangledPair (msg, helper)
  -- We apply a Pauli correction based on the Bell measurement.
  (z, x) <- measure (msg, helper) -- measuring two qubits in computational basis
  target <- gate_X target `controlled` x -- X classically controlled on bit x
  target <- gate_Z target `controlled` z -- Z classically controlled on bit z
  cdiscard (z, x) -- discarding classical bits
  return target

teleportTest :: Circ Qubit
teleportTest = do
  with_ancilla $ \reference -> do
    (msg, helper, target) <- qinit (False, False, False)
    (reference, msg) <- prepareEntangledPair (reference, msg)
    target <- teleport (msg, helper, target)
    -- If the teleportation circuit is correct, then the joint state
    -- of `reference` and `target` must be a Bell pair.
    (reference, target) <-
      reverse_simple prepareEntangledPair (reference, target)
    -- with_ancilla asserts that `reference` is in a zero-state.
    return target

Program 7.7 Haskell code that instantiates the target machine for Program 7.6.

import System.Random
import Teleport
-- The core Quipper module provides functions like print_simple.
import Quipper
-- This module provides simulators for use with Quipper.
import Quipper.Libraries.Simulation  -- simulation functions including run_generic

main :: IO ()
main = do

    -- Print out the circuit using the print_simple function.
    print_simple ASCII teleportTest

    -- Show the circuit in the previewer.
    -- The following line can be uncommented outside the Docker container.
    -- print_simple Preview teleportTest

    -- Show gate counts for the circuit.
    print_simple GateCount teleportTest

    -- Efficiently simulate the circuit using the Clifford simulator.
    run_clifford_generic teleportTest >>= print

    -- Simulate the circuit using the full state-vector simulator.
    random_number_generator <- newStdGen
    print $ run_generic random_number_generator (0.0 :: Double) teleportTest

Quipper represents a quantum program as the accumulation of quantum side effects in a monad, called Circ. The side effects accumulated in a Circ can be run by using a function in the Quipper.Libraries.Simulation module. Quipper supports high-level circuit combinators, e.g., for circuit reversal and iteration, and has the capability to compute gate counts for a particular circuit [216]. Part of the resource estimation framework is the concept of “boxing” of subroutines, which allows gate counts and other metrics to be obtained in a scalable way. The monadic setup allows Quipper to leverage patterns of the form UVU† for controlled applications, similar to, e.g., Q#.

Program 7.7 demonstrates how to use the teleport function defined in Program 7.6 with functions such as “Quipper.print_simple ASCII” and “Quipper.Libraries.Simulation.run_generic random_number_generator” to export or run Quipper programs against simulation resources.

7.2.5 Scaffold

Scaffold is designed for expressing quantum algorithms in a high-level format that can be compiled into low-level implementations whose properties can be studied [176]. Analyzing these properties can help one understand what hardware capabilities are needed to feasibly execute different kinds of quantum algorithms. Scaffold is a stand-alone language. It is designed to be similar to existing classical programming languages, particularly C: Scaffold adopts C's imperative programming model and many of its familiar features, such as functions (known as modules in Scaffold), if statements, loops, structures, and preprocessor directives [176]. In addition, classical functions in Scaffold programs can automatically be converted into reversible logic, implemented using quantum gates, so that they can be embedded as an oracle in a larger quantum algorithm [176, 217].

ScaffCC, the Scaffold compiler, can analyze programs by estimating quantum resource usage and critical path length [217], and is designed to scale to trillions of gates. ScaffCC obtains more accurate resource estimates by applying classical and quantum optimizations and configurable quantum gate decompositions to programs. To estimate the critical path length, ScaffCC creates a schedule of the order and timing of quantum operations, and optimizes that schedule using properties of the quantum circuit, accounting for gate dependencies. ScaffCC is open source and based on LLVM [217]. It

integrates with other quantum computing tools; Scaffold supports RevKit for Quantum Computation (RKQC) [177], an adaptation of the reversible circuit design toolkit RevKit [218] for use with quantum circuits. RKQC is used to compile oracles embedded in Scaffold programs. In addition to creating resource estimates, ScaffCC supports generating output in three different variants of QASM: “hierarchical” QASM, “flattened” QASM, and OpenQASM, where hierarchical QASM preserves Scaffold modules, while flattened QASM and OpenQASM do not. Lastly, ScaffCC can also generate input files for the QX Simulator [177, 219], a third-party simulator.

Program 7.8 shows a Scaffold implementation of quantum teleportation. It can be compiled into QASM by invoking the ScaffCC compiler. It should be noted that while the example compiles without errors or warnings, the generated QASM code is missing the classical control flow (namely, the if statements needed for the correction of the teleported qubit). We refer to the corresponding issues on the ScaffCC GitHub repository [172] for further details.

Program 7.8 Scaffold code implementing teleportation of an arbitrary quantum state.

// Prepares a Bell state given two qubits in a computational basis state.
module PrepareBellPair(qbit left, qbit right) {
    H(left);
    CNOT(left, right);
}

// Implements the adjoint of the sequence of gates added by PrepareBellPair.
module AdjointPrepareBellPair(qbit left, qbit right) {
    CNOT(left, right);
    H(left);
}

// Teleports the state of the 'msg' qubit to the given 'target' qubit
// using the 'helper' qubit as a resource.
module Teleport(qbit msg, qbit helper, qbit target) {
    PrepareBellPair(helper, target);
    AdjointPrepareBellPair(msg, helper);
    if (MeasZ(msg)) { Z(target); }
    if (MeasZ(helper)) { X(target); }
}

int main() {
    // We will use qubits[0] as the qubit containing the message ('msg'), qubits[1]
    // as the helper qubit ('helper'), and qubits[2] as the target qubit ('target').
    // We assume that newly allocated qubits are in the |0⟩ state.
    qbit qubits[3];

    // We choose to teleport a |+⟩ state.
    // We initialize the message and teleport it to the target qubit.
    H(qubits[0]);
    Teleport(qubits[0], qubits[1], qubits[2]);

    // We measure the target qubit in the X basis to verify that the teleportation worked.
    // If the teleportation succeeded, we return 1, and we return -1 otherwise.
    if (MeasX(qubits[2])) { return 1; }
    else { return -1; }
}

Some learning resources are available for Scaffold. The Scaffold language report describes the initial language's features with code examples [176]. The ScaffCC GitHub repository [174] includes a user guide for installing and using the compiler, as well as several examples of common quantum algorithms implemented in Scaffold, such as the quantum Fourier transformation, Shor's algorithm, and a variational quantum eigensolver.

8 DOMAIN-SPECIFIC LANGUAGE Q#

This chapter contains some minimal content from Publication III.

As we have seen, the variety of areas in quantum computing where having a domain-specific language is desirable would lead to contradictory requirements if one language were to attempt to cover all of them. Designing a language for quantum computing hence necessitates a clear understanding of what the language is supposed to be used for. In the case of Q#, our goal is to enable the development of quantum applications that have a lasting benefit to society - applications with solutions that cannot be achieved by other means. This is of course a very ambitious and long-term vision, and there will be many small steps that hopefully progress in that direction. As of the time of writing, Q# is still in preview, and this chapter merely reflects the first steps on hopefully a long path of continuous improvements.

To give a more precise description of the mission we set out to pursue, we will start with a set of design principles that govern the development of the Q# programming language. Q# is part of Microsoft's Quantum Development Kit, available at http://www.microsoft.com/quantum. Detailed documentation for Q#, including the standard library reference, is available at http://docs.microsoft.com/quantum.

8.1 design principles

A multitude of aspects ultimately factor into the decision to pursue a certain design direction. Given the early stages of quantum computing and the uncertainty around the architecture of future quantum hardware, designing a high-level language is challenging. Nonetheless, we believe it is worth the effort. The following list may give some insight into the principles guiding the Q# language design:

q# is hardware agnostic.
We strive to design a language that provides the means to express and leverage powerful quantum computing concepts independent of how hardware evolves in the future.

q# is designed to scale to the full range of quantum applications.
To be usable across a wide range of applications, Q# allows building reusable components and layers of abstraction. To achieve performance with growing quantum hardware size we need automation. We want to ensure the scalability of both applications and development effort.

q# is meant to make quantum solutions accessible and shareable across disciplines.
We are designing Q# to enable people to collaborate across disciplines, to make it easy to build on knowledge and ideas, independent of background or education.

q# is focused on expressing information to optimize execution.
Our goal is to ensure an efficient execution of quantum components, independent of the context within which they are invoked. Q# allows the developer to communicate their knowledge about a computation so that the compiler can make an informed decision regarding how to translate it into instructions, leveraging information about the end-to-end application that is not available to the developer.

q# is a living body of work that will grow and evolve over time.
We share a vision of how quantum devices will revolutionize computing in the future. We also believe that the quantum stack of the future will go beyond our current imagination. Correspondingly, our vision for Q# will adapt and change as the technology advances.

Let's take a look at how these principles are reflected in the current version of our language, as well as possible paths for improvement.

8.2 program structure and execution

Q# is an algorithm description language, and as such naturally represents the composition of classical and quantum algorithms alike. It is a stand-alone language offering a high level of abstraction; there is no notion of

a quantum state or a circuit in Q#. Instead, programs are implemented in terms of statements and expressions, much like in classical programming languages. They are enriched with distinct quantum capabilities such as support for functors, tracking of quantum-specific information, and control-flow constructs that are commonly used in quantum algorithms, like, e.g., repeat-until-success loops. Such loops cannot be represented as a circuit without introducing new and specialized gates and are more easily treated as hybrid quantum–classical constructs. The type system provides a tightly constrained environment to safely interleave classical and quantum computations. Specialized syntax, symbolic code manipulation to automatically generate quantum transformations, and powerful functional constructs furthermore aid composition. Q# thus makes it easy to express, for instance, phase estimation [44, 220, 221] and quantum chemistry [222] algorithms, both of which require rich quantum–classical interactions. Q# enables working with clean and borrowed quantum memory for resource optimization in quantum algorithms.

Program 8.1 gives a first glimpse at how a Q# command line application is implemented. Line 28 indicates that the operation Main is the entry point of the application, i.e. the operation that is invoked when running the application from the command line. The corresponding project file is shown in Listing 8.1. To build the application, follow the installation instructions in Ref. [146]. Then put both files in the same folder and run “dotnet build <project file>”, where <project file> is to be replaced with the name of the file containing Listing 8.1.

Listing 8.1 Project file for Program 8.1.

1 <Project Sdk="Microsoft.Quantum.Sdk/<version>">
2
3   <PropertyGroup>
4     <OutputType>Exe</OutputType>
5     <TargetFramework>netcoreapp3.1</TargetFramework>
6   </PropertyGroup>
7
8 </Project>

Line 4 in Listing 8.1 indicates that the project is executable and contains an entry point. Line 1 specifies the version number of the kit used to build the application. To execute the program after having built it, run the command

dotnet run --no-build --vector 1. 0. 0. 0.

Line 40 in Program 8.1 initializes a quantum state where the amplitudes for each basis state correspond to the normalized entries of the specified vector. In Line 44, a quantum Fourier transformation (QFT) is then applied to that state. See Refs. [223, 224] for more background on this operation.

Program 8.1 Q# command line application executing a quantum Fourier transformation.

1 namespace Microsoft.Quantum.Samples {

2

3 open Microsoft.Quantum.Arithmetic;

4 open Microsoft.Quantum.Arrays as Array;

5 open Microsoft.Quantum.Canon;

6 open Microsoft.Quantum.Convert;

7 open Microsoft.Quantum.Diagnostics as Diagnostics;

8 open Microsoft.Quantum.Intrinsic;

9 open Microsoft.Quantum.Math;

10 open Microsoft.Quantum.Preparation;

11

12 operation ApproximateQFT (a : Int, reg : LittleEndian) : Unit

13 is Adj + Ctl {

14

15 let qs = reg!;

16 SwapReverseRegister(qs);

17

18 for (i in Array.IndexRange(qs)) {

19 for (j in 0..(i-1)) {

20 if ( (i-j) < a ) {

21 Controlled R1Frac([qs[i]], (1, i - j, qs[j]));

22 }

23 }

24 H(qs[i]);

25 }

26 }

28 @EntryPoint()

29 operation Main(vector : Double[]) : Unit {

30

31 let n = Floor(Log(IntAsDouble(Length(vector))) / LogOf2());

32 if (1 <<< n != Length(vector)) {

33 fail "Length(vector) needs to be a power of two.";

34 }

35

36 let amps = Array.Mapped(ComplexPolar(_,0.), vector);

37 using (qs = Qubit[n]) {

38 let reg = LittleEndian(qs);

39

40 PrepareArbitraryState(amps, reg);

41 Message("Before QFT:");

42 Diagnostics.DumpRegister((), qs);

43

44 ApproximateQFT(n, reg);

45 Message("After QFT:");

46 Diagnostics.DumpRegister((), qs);

47

48 ResetAll(qs);

49 }

50 }

51 }

As expected, the invocation above will output that the amplitudes of the quantum state after application of the QFT are evenly distributed and real. Of course, the reason that we can so readily output the amplitudes of the state vector is that the above program is by default executed on a full state simulator, which supports outputting the tracked quantum state via DumpRegister. If we were to execute it, e.g., on the resource estimation tool with the command

dotnet run --no-build \
    --simulator=ResourcesEstimator \
    --vector 1. 0. 0. 0.

we see that the two calls to DumpRegister don't do anything. The same is true when the program is executed on quantum hardware. This can be seen by targeting the application to a particular hardware platform by adding the project property

<ExecutionTarget>honeywell.qpu</ExecutionTarget>

after Line 4, then building and executing it by running the command “dotnet run --vector 1. 0. 0. 0.”. More details on debugging and testing tools can be found in Section 8.7.

8.3 global constructs

As we can see in Program 8.1, declarations are grouped together in namespaces. Namespaces are in fact the only top-level elements, and anything else needs to be contained in a namespace. Q# does not support nested namespaces. Namespaces can span multiple files. By default, everything declared within the same namespace can be accessed without further qualification, whereas declarations in a different namespace can only be used either by qualifying their name with the name of the namespace they belong to, or by opening that namespace before use, as is done in Lines 3-10. Such open directives need to precede any other namespace elements, and are valid throughout the namespace piece in that file only. It is possible to define an alternative, usually shorter, name for a particular namespace to avoid having to type out the full name while still distinguishing where certain elements came from. This is done, e.g., in Line 4 and Line 7. Defining namespace aliases is particularly helpful in combination with the code completion functionality provided by the Q# extensions available for Visual Studio Code and Visual Studio; if the extension is installed, typing the namespace alias followed by a dot will show a list of all available elements in that namespace that can be used at the current location. Aside from open directives, namespaces can also contain operation, function, and type declarations. These may occur in any order and are recursive by default, meaning they can be used in any order and may call themselves.
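To illustrate, the following sketch (an assumed example, not taken from the thesis sources; the namespace and function names are hypothetical) declares a small namespace that opens Microsoft.Quantum.Math under an alias and accesses one of its functions through that alias:

namespace Demo.Namespaces {

    open Microsoft.Quantum.Math as Math;   // alias for shorter qualification

    // Returns the x-coordinate of a point on the unit circle;
    // Cos is resolved via the namespace alias.
    function UnitCircleX (angle : Double) : Double {
        return Math.Cos(angle);
    }
}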

8.3.1 Type declarations

Q# has minimal support for custom types. Custom types are similar to record types in F#; they are immutable but support a copy-and-update construct, see Section 8.5.4 for more details. Such custom types may contain both named and anonymous items. Section 8.5.5 describes how the contained items can be accessed.

The following declaration within a namespace for instance defines a type Complex which has two named items Real and Imaginary, both of type Double:

newtype Complex = (Real : Double, Imaginary : Double);

User defined types are particularly useful for two reasons. For one, as long as the libraries and programs that use the defined types access items via their name rather than by deconstruction, the type can be extended to contain additional items later on without breaking any of the library code. Accessing items via deconstruction is hence generally discouraged. Furthermore, they make it possible to clearly convey the intent and expectations for a certain data type. For instance, the arithmetic library includes quantum arithmetic operations for both big-endian and little-endian quantum integers. It hence defines two types, BigEndian and LittleEndian, both of which contain a single anonymous item of type Qubit[]:

newtype BigEndian = Qubit[];
newtype LittleEndian = Qubit[];

This allows operations to specify whether they are written for big-endian or little-endian representations, and leverages the type system to ensure at compile time that mismatched operands are rejected.
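As a brief sketch of how this looks in practice (an assumed example; the namespace and operation names are hypothetical), wrapping a raw register in one of these types documents the intended encoding and lets the compiler reject a value of the other type:

namespace Demo.Endianness {

    open Microsoft.Quantum.Arithmetic;

    operation AddOneLittleEndian (qs : Qubit[]) : Unit is Adj + Ctl {
        // Wrap the raw register to make the encoding explicit; passing a
        // BigEndian value to IncrementByInteger would not compile.
        let register = LittleEndian(qs);
        IncrementByInteger(1, register);
    }
}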

Types may not have circular dependencies in Q#; defining something like a directly or indirectly recursive type is not possible, i.e. the following construct will give a compilation error:

newtype Foo = (Foo, Int); // gives an error
newtype Bar = Baz;        // gives an error
newtype Baz = Bar;        // gives an error

UDT constructors are automatically generated by the compiler. Currently, it is not yet possible to define a custom constructor, though this would certainly be a desirable addition to the language in the future.
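As a small sketch (an assumed example, not from the thesis; the namespace and function names are hypothetical), the generated constructor is simply called like a function, and named items can then be accessed by name:

namespace Demo.Types {

    newtype Complex = (Real : Double, Imaginary : Double);

    function SquaredNorm (c : Complex) : Double {
        // Accessing the named items rather than deconstructing the tuple
        // keeps this code working if Complex is later extended.
        return c::Real * c::Real + c::Imaginary * c::Imaginary;
    }

    function SquaredNormExample () : Double {
        let c = Complex(3., 4.);   // constructor generated by the compiler
        return SquaredNorm(c);     // evaluates to 25.
    }
}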

8.3.2 Callable declarations

Q# supports two kinds of callables: operations and functions, see Section 8.6.4 regarding the distinction between the two. Moreover, Q# in fact supports defining templates, i.e. type-parametrized implementations for a certain callable. Type parameterizations are described in more detail in Section 8.6.6.

Naturally, such type-parametrized implementations may not use any language constructs that rely on particular properties of the type arguments; there is currently no way to express type constraints in Q#. However, it is conceivable to introduce a suitable mechanism, similar to, e.g., type classes in Haskell, to allow for more expressiveness in the future. Such a mechanism could also be combined with permitting specialized implementations for particular type arguments, similar to what, e.g., C++ allows, or for type arguments that belong to certain type classes. While this is not yet supported either, Q# does already have the notion of specializing implementations for certain purposes; operations in Q# can implicitly or explicitly define support for certain functors, and along with it the specialized implementations that are to be invoked when a certain functor is applied to that callable.

A functor in a sense is a factory that defines a new callable implementation that has a certain relation to the callable it was applied to. Functors are more than traditional higher-order functions, since they require access to the implementation details of the callable they have been applied to. In that sense, they are similar to other factories, such as templates. Correspondingly, they can be applied not just to callables, but in fact to templates as well. Program 8.1 for instance defines the two operations ApproximateQFT and Main, the latter of which is used as the entry point. ApproximateQFT takes a tuple-valued argument containing an integer and a value of type LittleEndian, and returns a value of type Unit. The annotation is Adj + Ctl in the declaration of ApproximateQFT indicates that the operation supports both the Adjoint and the Controlled functor, see also Section 8.6.5. If Unitary is an operation that has an adjoint and a controlled specialization, the expression Adjoint Unitary accesses the specialization that implements the adjoint of Unitary, and Controlled Unitary the one that implements the controlled version of Unitary. The controlled version of an operation takes an array of control qubits in addition to the argument of the original operation, and applies the original operation conditional on all of these control qubits being in a |1⟩ state. While in theory an operation for which an adjoint version can be defined should also have a controlled version and vice versa, in practice it may be hard to come up with an implementation for one or the other, especially for probabilistic implementations following a repeat-until-success pattern [206]. For that reason, Q# allows declaring support for each functor individually. However, since the two functors commute, an operation that declares

support for both must necessarily also have an implementation – usually implicitly defined, meaning compiler-generated – for the case when both functors are applied to the operation.
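The following sketch (an assumed example using the intrinsic S gate, which supports both functors; the namespace and operation names are hypothetical) shows how functor-derived specializations are accessed at a call site:

namespace Demo.Functors {

    open Microsoft.Quantum.Intrinsic;

    operation ApplyFunctorExamples (cs : Qubit[], target : Qubit) : Unit {
        Adjoint S(target);                  // applies S†
        Controlled S(cs, target);           // applies S if all qubits in cs are in |1⟩
        Controlled Adjoint S(cs, target);   // both functors combined
    }
}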

There are no functors that can be applied to functions, such that functions currently have exactly one body implementation and no further specializations. The declaration

function Hello (name : String) : String {
    return $"Hello, {name}!";
}

is equivalent to

function Hello (name : String) : String {
    body (...) {
        return $"Hello, {name}!";
    }
}

Here, body specifies that the given implementation applies to the default body of the function Hello, meaning the implementation that is invoked when no functors or other factory mechanisms have been applied prior to invocation. The three dots in body (...) correspond to a compiler directive indicating that the argument items in the function declaration should be copy-pasted into this spot. The reasoning behind explicitly indicating where the arguments of the parent callable declaration are to be copy-pasted is that, for one, it is unnecessary to repeat the argument declaration, but more importantly it ensures that functors that require additional arguments, like the Controlled functor, can be introduced in a consistent manner. The same applies to operations; when there is exactly one specialization defining the implementation of the default body, the additional wrapping of the form body (...) { } may be omitted.

8.3.3 Specialization declarations

As explained in the previous section, there is currently no reason to explicitly declare specializations for functions. This may change in the future if we decide to introduce type classes and/or type specializations. For now, this section applies to operations and elaborates on how to declare the necessary specializations to support certain functors.

As detailed in Section 6.1, it is quite a common problem in quantum computing to require the adjoint of a given transformation. Many quantum algorithms require both an operation and its adjoint in order to perform a computation. Q# is able to employ symbolic computation to automatically generate the corresponding adjoint¹ implementation for a particular body implementation. That generation is possible even for implementations that freely mix classical and quantum computations. There are, however, a couple of restrictions that apply in that case. For instance, auto-generation is not supported for performance reasons if the implementation makes use of mutable variables. Moreover, each operation called within the body for which to generate the corresponding adjoint needs to support the Adjoint functor itself. As explained in Section 6.1, even though measurements cannot easily be undone in the multi-qubit case, it is possible to combine measurements in such a way that the applied transformation is unitary. In that case this means that even though the body implementation contains measurements, each of which on its own does not support the Adjoint functor, the body in its entirety is adjointable. Nonetheless, auto-generating the adjoint implementation will fail in this case. For that reason, it is possible to manually specify that implementation. The correctness of such a manually specified implementation is not verified by the compiler. Similarly, it is possible to specify the controlled² implementation manually if desired. This is more common since in some cases, it may be possible to define a more optimized version by hand. It is expected that the benefit of hand-optimized implementations will decline over time as Q# becomes more expressive, capturing more and more optimization-relevant patterns, as is the case, e.g., for conjugations (see Section 8.4.7).

The declaration of the operation SWAP in Listing 8.2, which exchanges the state of two qubits q1 and q2, for example declares an explicit specialization for its adjoint version (in Line 10) and its controlled version (in Line 14). While the implementations for Adjoint SWAP and Controlled SWAP are thus user defined, the compiler still needs to generate the implementation for the combination of both functors (Controlled Adjoint SWAP, which is the same as Adjoint Controlled SWAP).

¹ See “adjoint operations” in Section 6.1.
² See “controlled operations” in Section 6.1.

When determining how to generate a certain specialization, the compiler will prioritize user-defined implementations, meaning that if an adjoint specialization is user defined and a controlled specialization is auto-generated, then the controlled adjoint specialization is generated based on the user-defined adjoint, and vice versa. In this case, both specializations are user defined. As the auto-generation of an adjoint implementation is subject to more limitations, the controlled adjoint specialization in this case will default to generating the controlled specialization of the explicitly defined implementation of the adjoint specialization.

1 operation SWAP (q1 : Qubit, q2 : Qubit): Unit

2 is Adj + Ctl {

3

4 body (...) {

5 CNOT(q1, q2);

6 CNOT(q2, q1);

7 CNOT(q1, q2);

8 }

9

10 adjoint (...) {

11 SWAP(q1, q2);

12 }

13

14 controlled (cs, ...) {

15 CNOT(q1, q2);

16 Controlled CNOT(cs, (q2, q1));

17 CNOT(q1, q2);

18 }

19 }

Listing 8.2 Upon declaration, each operation defines which functors can be applied to it, and how the resulting transformation is to be implemented. The implementation of particular specializations to support certain functors can either be specified explicitly, as is done for the adjoint and controlled specialization here, or generated by the compiler based on a suitable directive that can be specified explicitly or inferred by the compiler, as is the case for the controlled adjoint specialization here.

In the case of the SWAP implementation, the better option, however, is to take the adjoint of the controlled specialization, to avoid unnecessarily conditioning the execution of the first and the last CNOT on the state of the control qubits. We can force the compiler to generate the controlled adjoint specialization based on the manually specified implementation of the controlled version by adding an explicit declaration for the controlled adjoint version that specifies a generation directive. Such an explicit declaration of a specialization that is to be generated by the compiler takes the form

controlled adjoint invert;

and is to be inserted inside the declaration of SWAP (e.g., after Line 18). Inserting the line

controlled adjoint distribute;

on the other hand would force the compiler to generate the specialization based on the defined (or generated) adjoint specialization. For the operation SWAP, there is a better option. As we can see, the user-defined implementation of the adjoint merely calls the body of SWAP; the operation SWAP is self-adjoint, i.e. it is its own inverse. This can be expressed with the directive

adjoint self;

Declaring the adjoint specialization in that manner ensures that the controlled adjoint specialization that is automatically inserted by the compiler will also merely invoke the controlled specialization. That information is furthermore relevant for optimization; two subsequent invocations of SWAP or Controlled SWAP with the same arguments can simply be removed from the program, see also Section 8.6.5 for further elaborations on this topic.
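Putting these pieces together, a variant of Listing 8.2 that uses this directive might look as follows (a sketch for illustration only; the operation is renamed to ApplySWAP and wrapped in a hypothetical namespace to avoid clashing with the intrinsic SWAP, the explicit controlled specialization is kept, and the controlled adjoint specialization is then generated from it by the compiler):

namespace Demo.Specializations {

    open Microsoft.Quantum.Intrinsic;

    operation ApplySWAP (q1 : Qubit, q2 : Qubit) : Unit
    is Adj + Ctl {

        body (...) {
            CNOT(q1, q2);
            CNOT(q2, q1);
            CNOT(q1, q2);
        }

        // ApplySWAP is its own inverse, so no separate adjoint implementation is needed.
        adjoint self;

        // Only the middle CNOT needs to be conditioned on the control qubits;
        // if it is omitted, the two outer CNOTs cancel.
        controlled (cs, ...) {
            CNOT(q1, q2);
            Controlled CNOT(cs, (q2, q1));
            CNOT(q1, q2);
        }
    }
}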

The following generation directives exist and are valid:

body specialization: -
adjoint specialization: self, invert
controlled specialization: distribute
controlled adjoint specialization: self, invert, distribute

That all generation directives are valid for a controlled adjoint specialization is not a coincidence; as long as functors commute, the set of valid generation directives for implementing the specialization for a combination of functors

is always the union of the set of valid generators for each individual one.

In addition to the above listed directives, the directive auto is always valid; it indicates that the compiler should automatically pick a suitable generation directive. The declaration

operation DoNothing() : Unit {
    body (...) { }
    adjoint auto;
    controlled auto;
    controlled adjoint auto;
}

is equivalent to

operation DoNothing() : Unit is Adj + Ctl {}

The annotation is Adj + Ctl here specifies the operation characteristics, which contain the information about which functors a certain operation supports (see Section 8.6.5). While for readability's sake it is recommended that each operation is annotated with a complete description of its characteristics, the compiler will automatically insert or complete the annotation based on explicitly declared specializations. Conversely, the compiler also generates specializations that haven't been declared explicitly but need to exist based on the annotated characteristics. We say these specializations have been implicitly declared by the given annotation. The compiler automatically generates the necessary specializations if it is able to, picking a suitable directive. Q# thus supports inference of both operation characteristics and existing specializations based on (partial) annotations as well as explicitly defined specializations.

In the future, it may be possible to extend callables declared in a referenced assembly with additional specializations. This would mean that specializations in general could be declared outside the callable declaration at the global scope. They would thus look much more like individual overloads for the same callable, with the caveat that certain restrictions to what overloads can be declared would apply. This, however, still requires significant engineering work.

8.4 statements

As we have seen in the code examples in the previous section, the implementation for a certain specialization of a callable consists of a mixture of classical and quantum computations and looks much like in any other classical programming language. As opposed to some other functional languages, such as F#, which is entirely expression based, Q# distinguishes between statements and expressions. Visually, it currently resembles C# or C++ much more, with its curly braces and semicolons - a decision which may be worth revisiting in the future. Some statements, such as the let and mutable bindings, are well known from classical languages, while others, such as conjugations or qubit allocations, are unique to the quantum domain. The following statements are currently available in Q#:

expression statement
An expression statement consists of an operation or function call returning Unit. The invoked callable needs to satisfy the requirements imposed by the current context. See Section 8.4.6 for more details.

return statement
A return statement terminates the execution within the current callable context and returns control to the caller. Any finalizing tasks are executed after the return value is evaluated but before control is returned. See Section 8.4.3 for more details.

fail statement
A fail statement aborts the execution of the entire program, collecting information about the current program state before terminating in an error. It aggregates the collected information and presents it to the user along with the message specified as part of the statement. See Section 8.4.3 for more details.

variable declaration
Defines one or more local variables that will be valid for the remainder of the current scope, and binds them to the specified values. Variables can be permanently bound or declared to be reassignable later on. See Section 8.4.2 for more details.

value update
Variables that have been declared as being reassignable can be rebound to contain different values. See Section 8.4.2 for more details.

iteration
An iteration is a loop-like statement that during each iteration assigns the declared loop variables to the next item in a sequence (a value of array or Range type) and executes a specified block of statements. See Section 8.4.5 for more details.

while statement
If a specified condition evaluates to true, a block of statements is executed. The execution is repeated until the condition evaluates to false. See Section 8.4.5 for more details.

repeat statement
A quantum-specific loop that breaks based on a condition. The statement consists of an initial block of statements that is executed before a specified condition is evaluated. If the condition evaluates to false, a subsequent fixup-block is executed before entering the next iteration of the loop. The loop terminates only once the condition evaluates to true. See Section 8.4.5 for more details.

if statement
The statement consists of one or more blocks of statements, each preceded by a boolean expression. The first block for which the boolean expression evaluates to true is executed. Optionally, a block of statements can be specified that is executed if none of the conditions evaluates to true. See Section 8.4.4 for more details.

conjugation
A conjugation is a special quantum-specific statement, where a block of statements that applies a unitary transformation to the quantum state is executed, followed by another statement block, before the transformation applied by the first block is reverted again. In mathematical notation, conjugations describe transformations of the form U†VU applied to the quantum state. See Section 8.4.7 for more details.

qubit allocation
Instantiates and initializes qubits and/or arrays of qubits, and binds them to the declared variables. Executes a block of statements. The instantiated qubits are available for the duration of the block, and will be automatically released when the statement terminates. See Section 8.4.1 for more details.

8.4.1 Quantum memory management

A program always starts with no qubits, meaning that values of type Qubit cannot be passed as entry point arguments. This restriction is intentional, since a purpose of Q# is to express and reason about a program in its entirety. Instead, a program allocates and releases quantum memory as it goes. In this regard, Q# models the quantum computer as a qubit heap. Rather than supporting separate allocate and release statements or functions, Q# has two statements to instantiate qubit values, arrays of qubits, or any combination thereof. Both of these statements gather the instantiated qubit values, bind them to the variable(s) specified in the statement, and then execute a block of statements. At the end of the block, the bound variables go out of scope and are no longer defined. These statements are thus block statements – meaning their body contains a block of statements – and the instantiated qubit values can only be accessed within their body. Forcing that qubits cannot escape their scope greatly facilitates reasoning about quantum dependencies and how the quantum parts of the computation can impact the continuation of the program. An additional benefit of this setup is that qubits cannot get allocated and never freed, which avoids a class of common bugs in manual memory management languages without the overhead of qubit garbage collection.

Q# distinguishes between the allocation of “clean” qubits, meaning qubits that are unentangled and are not used by another part of the computation, and “dirty” qubits, described in more detail below. Clean qubits are allocated by the using-statement, and are guaranteed to be in a |0⟩ state upon allocation. They are released at the end of the scope and are required to either be in a |0⟩ state upon release, or to have been measured right beforehand. Of course, this requirement cannot be compiler-enforced in general, since this would require a symbolic evaluation that quickly gets prohibitively expensive. Execution on a special simulator for validation purposes, however, allows for some decent checks of whether that requirement is satisfied. Some quantum algorithms are capable of using qubits without relying on their exact state - or even on them being unentangled with the rest of the system. That is, they require extra qubits temporarily, but they can ensure that those qubits are returned exactly to their original state, independent of which state that was. This means that if there are qubits that are in use but not touched during the execution of a subroutine, those qubits can be borrowed for use by such an algorithm instead of having to allocate additional quantum memory. Borrowing instead of allocating can significantly reduce the overall quantum memory requirements of an algorithm, and is a quantum example of a typical space-time tradeoff. An example of this and how it is achieved is given in Figure 8.1. Q# has a dedicated borrowing-statement for such qubit use, where the qubits are returned at the end of the allocation scope so that they can no longer be accessed.

Figure 8.1 Working with clean versus borrowed qubits, as described in Ref. [27]. The decomposition on the left side relies on the helper qubits 6-8 being in a zero-state and unentangled with any other qubits. The implementation on the right, in contrast, doesn't have any such requirement, and guarantees the correct execution of the multi-controlled NOT gate independent of the state of those qubits. This comes at the cost of having to execute additional gates. The helper qubits 6-8 are guaranteed to be in the same state they were in initially after executing the depicted gate sequence.

For the using-statement, the qubits are allocated from the quantum computer's free qubit heap, and then returned to the heap. For the borrowing-statement, the qubits are allocated from in-use qubits that are guaranteed not to be used during the body of the statement, and left in their original state at the end. If there aren't enough qubits available to borrow, then qubits will be allocated from and returned to the heap.
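A minimal sketch of both allocation statements in the pre-1.0 syntax used throughout this chapter (an assumed example; the namespace and operation names are hypothetical):

namespace Demo.Allocation {

    open Microsoft.Quantum.Intrinsic;

    operation AllocationExample () : Result {
        mutable outcome = Zero;

        using (clean = Qubit()) {
            // 'clean' is guaranteed to start in |0⟩ and must be reset
            // (or have just been measured) before it is released.
            H(clean);
            set outcome = M(clean);
            Reset(clean);
        }

        borrowing (borrowed = Qubit()) {
            // 'borrowed' may be entangled with and in use by other parts of the
            // computation; applying X twice returns it to its original state.
            X(borrowed);
            X(borrowed);
        }

        return outcome;
    }
}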

8.4.2 Variable declarations and updates

Values can be bound to symbols via let- and mutable-statements. Such bindings provide a convenient way to access a value via the defined handle. Despite the somewhat misleading terminology, borrowed from other languages, we will call handles that are declared on a local scope and contain values variables. The reason that this may be somewhat misleading is that let-statements define “single-assignment handles”, i.e. handles that for the duration of their validity will always be bound to the same value. Variables that can be re-bound to different values at different points in the code need to be explicitly declared as such, as specified by the mutable-statement.

let var1 = 3;
mutable var2 = 3;
set var2 = var2 + 1;

Line 1 declares a variable named var1 that cannot be reassigned and will always contain the value 3. Line 2 on the other hand defines a variable var2 that is temporarily bound to the value 3, but can be reassigned to a different value later on. Such a reassignment can be done via a set-statement, as shown in Line 3. The same could have been expressed with the shorter version set var2 += 1; as explained further below, such statements are common in other languages as well. For all three statements, the left hand side consists of a symbol or a symbol tuple. It may contain nested symbols and/or omitted symbols, indicated by an underscore. This is in fact the case for all assignments in Q#, including, e.g., qubit allocations and loop-variable assignments. To summarize:

• let is used to create an immutable binding.
• mutable is used to create a mutable binding.
• set is used to change the value of a mutable binding.

For both kinds of binding, the types of the variables are inferred from the right-hand side of the binding. The type of a variable always remains the same and a set-statement cannot change it. Local variables can be declared as either being mutable or immutable, with some exceptions like loop variables in for-loops, for which the behavior is predefined and cannot be specified. Function and operation arguments are always immutably bound; in combination with the lack of reference types, as discussed in Section 8.6.2, that means that a called function or operation can never change any values on the caller side. Since the states of Qubit values are not defined or observable from within Q#, this does not preclude the accumulation of quantum side effects, which are observable (only) via measurements (see Section 8.6.3).

Independent of how a value is bound, the values themselves are immutable. It is worth pointing out explicitly that this in particular also holds for arrays and array items. In contrast to popular classical languages, where arrays often are reference types, arrays - like all types - are value types in Q# and always immutable; they cannot be modified after initialization. Changing the values accessed by variables of array type thus requires explicitly constructing a new array and reassigning it to the same symbol, see also Section 8.6.2 and Section 8.5.4 for more details.

Evaluate-and-Reassign Statements

Statements of the form set intValue += 1; are common in many other languages. Here, intValue needs to be a mutably bound variable of type Int. Similar statements in fact exist for a wide range of operators, see Section 8.5.1 for a full list of all available operators. More precisely, such evaluate-and-reassign statements exist for all operators where the type of the left-most sub-expression matches the expression type. This is the case for copy-and-update expressions (see Section 8.5.4), for binary logical and bitwise operators including right and left shift, for arithmetic expressions including exponentiation and modulus, as well as for concatenations. The set keyword in this case needs to be followed by a single mutable variable, which is inserted as the left-most sub-expression by the compiler. Section 8.5.6 contains other examples where expressions can be omitted in a certain context when a suitable expression can be inferred by the compiler.
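A brief sketch of a few of these evaluate-and-reassign forms (an assumed example; the namespace and function names are hypothetical):

namespace Demo.Updates {

    function EvaluateAndReassign () : Int[] {
        mutable count = 0;
        set count += 1;                // arithmetic update

        mutable flags = [false, true];
        set flags w/= 1 <- false;      // copy-and-update of an array item

        mutable values = new Int[0];
        set values += [count];         // concatenation
        return values;
    }
}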

8.4.3 Returns and termination

There are two statements available that conclude the execution of the current subroutine or the program: the return- and the fail-statement. The return-statement exits from the current callable and returns control to the caller. It changes the context of the execution by popping a stack frame. The statement always returns a value back to the context of the caller. The return value is evaluated before any terminating actions are performed and control is returned. Such terminating actions include, e.g., cleaning up and releasing qubits that have been allocated within the context of the callable. When executing on a simulator or validator, terminating actions often also include checks related to the state of those qubits, like, e.g., whether they are properly disentangled from all qubits that remain live. The return-statement at the end of a callable that returns a Unit value may be omitted. In that case, control is returned automatically when all statements have been executed and all terminating actions were performed.

Callables may contain multiple return-statements – one for each possible execution path – albeit operations containing multiple return-statements cannot be automatically inverted. The fail-statement on the other hand aborts the computation entirely. It corresponds to a fatal error that was not expected to happen as part of normal execution. Ideally, a fail-statement should collect and permit retrieving information about the program state that facilitates diagnosing and remedying the source of the error. Of course, this requires support from the executing runtime and firmware.
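A small sketch combining both statements (an assumed example; the namespace and function names are hypothetical):

namespace Demo.Termination {

    open Microsoft.Quantum.Math;

    function PositiveSqrt (value : Double) : Double {
        if (value < 0.) {
            // Aborts the entire program with the given message.
            fail $"Cannot take the square root of {value}.";
        }
        // Returns control to the caller with the computed value.
        return Sqrt(value);
    }
}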

8.4.4 Conditional branching

In contrast to embedded languages that predominantly leverage a classical host language to provide expressiveness for control flow constructs, Q# integrates these constructs seamlessly with quantum computations. As of the time of writing, conditional branching is expressed in the form of if-statements, which may optionally contain zero or more elif-clauses and an else-block that is executed if none of the conditions evaluate to true. From an execution perspective, the same constructs that represent such an if-statement could equally well be leveraged to express match-statements as they exist, e.g., in F# in the future. Especially in combination with additional types like discriminated unions, this could significantly enhance expressiveness and ease of use. Additionally, Q# even allows expressing simple branching in the form of a conditional expression, see Section 8.5.2 for more details.
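As a quick sketch of the syntax (an assumed example; the namespace and function names are hypothetical):

namespace Demo.Branching {

    function Sign (value : Int) : Int {
        if (value > 0) {
            return 1;
        }
        elif (value < 0) {
            return -1;
        }
        else {
            return 0;
        }
    }
}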

A tight integration between control-flow constructs and quantum computations of course poses a challenge for current hardware (see also Section 6.3). This is where Q#'s rigorous type system has its benefits; it allows for fine-grained control over how information that depends on measurement outcomes on the quantum device may propagate and impact the program continuation. Measurement results are represented by their own dedicated Result type within Q#. Since there are no automatic or even explicit casts between values of type Result and any other data types, the only way the program continuation can depend on quantum computations is when a Result value is compared for equality or inequality against another Result. Restricting when and where such comparisons may happen thus gives

the means to impose exactly the adequate restrictions to precisely match current hardware capabilities. IonQ's trapped-ion-based quantum processors [225], for example, currently do not yet support branching based on measurement outcomes - a restriction that may well be lifted in the near future. For the time being, however, comparisons of values of type Result will hence always result in a compilation error for Q# programs that are targeted to execute on that hardware. Honeywell's quantum processors [226] support certain kinds of branching based on measurement outcomes. More specifically, they support the kind of branching that could also be expressed in OpenQASM. Q# allows expressing more general constructs. However, they can largely be translated into suitable calls of nested primitives. The imposed restrictions are that values of type Result may only be compared as part of the condition within if-statements in operations. The conditionally executed blocks furthermore cannot contain any return statements or update mutable variables that are declared outside that block. Such restrictions can even be enforced at design time, meaning whether or not a certain comparison or call is supported by the targeted hardware platform can be determined and displayed live while editing source code in an IDE.

Let's look at an example of how if-statements are translated when targeting devices with limited control flow capabilities. For example, if res is the result of a measurement and q is a qubit, a conditional statement of the form

if (M(q) == res or res == Zero) { H(q); }

ApplyConditionally(
    [M(q)], [res],
    (H, q),
    (ApplyIfZeroCA(_, (H, _)), (res, q))
);

Upon execution, the first two arguments in the call to ApplyConditionally will be compared for equality. If they are equal, then the first item in the third argument, H, is invoked with the second item in the third argument (the qubit q). If they are unequal, then the first item in the fourth argument,

ApplyIfZeroCA(_, (H, _)), is invoked with the second item in the fourth argument. That invocation applies H to q if res is Zero. If the conditional block contains more than a single operation (H in the given example), then the content of that block is lifted upon compilation; a new operation is generated that contains the statements in that block and takes the captured values as arguments. The characteristics of the lifted code block are correctly inferred and preserved. Similar code transformations are done for more involved examples in order to generate suitable instructions that are easier to process for the targeted quantum hardware. These compiler capabilities make it possible to execute even the repeat-until-success-based algorithm described in Ref. [206] on current quantum processors.

8.4.5 Loops and iterations

In terms of execution, loops that break based on a condition can be a huge challenge to process on quantum hardware if the condition depends on measurement outcomes, since the length of the instruction sequence to execute is not known ahead of time. Despite that, Q# supports such constructs. The repeat-until-success pattern is a vital ingredient of a lot of quantum algorithms. Q# hence has its own dedicated statement for these: the repeat-statement. The statement consists of a first block to execute, after which a condition is evaluated. If the condition evaluates to true, the loop exits. If the condition evaluates to false, an additional block of statements defined as part of an optional fixup-block is executed prior to entering the next loop iteration. A pattern like this is used, e.g., in Refs. [39, 41, 206]. Despite their common presence in particular classes of quantum algorithms, current hardware does not yet provide native support for these kinds of control flow constructs. Execution on a quantum processor hence currently requires imposing a maximum recursion depth, and comes at a heavy cost to the size of compiled binaries; the loop in a sense is unrolled. It is translated and handled in much the same way as a recursion containing branchings based on measurement results would be. This means that the instructions for every possible execution path need to be explicitly represented in the compilation, hence the exponential increase in the size of the binary.
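A minimal sketch of the repeat-statement syntax (an assumed example, not an algorithm from this thesis; the namespace and operation names are hypothetical): assuming the qubit starts in |0⟩, the loop keeps preparing and measuring a |+⟩ state until the measurement yields Zero, resetting the qubit in the fixup-block after each failed attempt.

namespace Demo.RepeatUntil {

    open Microsoft.Quantum.Intrinsic;

    operation RepeatUntilZero (q : Qubit) : Int {
        mutable attempts = 0;
        repeat {
            set attempts += 1;
            H(q);
            let outcome = M(q);
        }
        until (outcome == Zero)
        fixup {
            // The measurement left the qubit in |1⟩, so flip it back to |0⟩
            // before the next attempt.
            X(q);
        }
        return attempts;
    }
}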

In an effort to provide a more familiar looking statement for classical computations, a traditional while-loop is also supported, albeit only within functions, to discourage the use of loops that break based on a condition when dealing with quantum computation, unless they are needed. With more sophisticated tracking regarding when a value depends on the quantum parts of a computation, it will be possible to allow the use of while-loops within operations as well, as long as the condition does not depend on quantum instructions within the body of the loop. There is also no reason not to support other kinds of commonly available loop constructs within functions in the future.

Much more benign, on the other hand, are loops that merely iterate over a sequence of values. Q# hence carefully distinguishes between repeat-statements and for-loops. A for-loop in Q# does not break based on a condition, but instead corresponds to what is often expressed as foreach or iter in other languages. There are furthermore no break- or continue-primitives in Q#, such that the length of the loop is perfectly predictable as soon as the value to iterate over is known. Of course, the length of the iteration may still depend on runtime information. However, at least in principle it is possible to make estimates regarding that length, facilitated by the fact that there is no concept of standard input during quantum execution, and programs are compiled for a given set of inputs. There are currently two data types in Q# that support iteration: arrays and ranges. The same deconstruction rules apply to the defined loop variable(s) as to any other variable assignment, such as bindings in let-, mutable-, set-, using- and borrowing-statements. The loop variables themselves are immutably bound, cannot be reassigned within the body of the loop, and go out of scope when the loop terminates.
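The following sketch (an assumed example; the namespace and operation names are hypothetical) iterates over an array of pairs, deconstructing each item into two loop variables:

namespace Demo.Iteration {

    open Microsoft.Quantum.Arrays;
    open Microsoft.Quantum.Intrinsic;

    operation ApplyCNOTChain (qs : Qubit[]) : Unit is Adj + Ctl {
        // The loop length is fixed as soon as 'qs' is known; each tuple in the
        // zipped array is deconstructed into the two loop variables.
        for ((control, target) in Zip(Most(qs), Rest(qs))) {
            CNOT(control, target);
        }
    }
}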

8.4.6 Call statements

Call statements are a very important part of any programming language. While operation and function calls can be used as an expression anywhere as long as the returned value is of a suitable type, they can also be used as statements if they return Unit. The usefulness of calling functions in this form primarily lies in debugging, as explained in Section 8.6.4, whereas such operation calls are one of the most common constructs in any Q# program. At the same time, operations can only be called from within other operations and not from within functions (see Section 8.6.3).

With callables being first-class values, call statements in a sense are the generic way of supporting patterns that aren't common enough to merit their own dedicated language construct, or for which dedicated syntax has not (yet) been introduced for other reasons. Examples of library methods that serve exactly that purpose are ApplyIf, which invokes an operation conditional on a classical bit being set, ApplyToEach, which applies a given operation to each element in an array, and ApplyWithInputTransformation, shown below, to give just a few.

operation ApplyWithInputTransformation<'TArg, 'TIn> (
    fn : ('TIn -> 'TArg),
    op : ('TArg => Unit),
    input : 'TIn
) : Unit {

    op(fn(input));
}

ApplyWithInputTransformation takes a function fn, an operation op, and an input value as arguments, and applies the given function to the input before invoking the given operation with the value returned from the function (see the call statement op(fn(input)) in its body).

We have also already mentioned functors: factories that give access to particular specialization implementations of a callable. For an operation U that defines a unitary transformation of the quantum state, Adjoint U accesses the implementation of U† and Controlled U accesses the implementation that applies U conditional on all qubits in an array of control qubits being in the state |1⟩. Concretely, if cs contains an array of qubits, and q1 and q2 are two qubits, then the operation call Controlled SWAP(cs, (q1, q2)), with SWAP as defined in Listing 8.2, exchanges the state of q1 and q2 if all qubits in cs are in a |1⟩ state. As mentioned in Section 8.3.3, for the compiler to be able to auto-generate the specializations to support particular functors usually requires that the called operations support those functors as well. The exceptions are calls in outer blocks of conjugations, which always need to support the Adjoint functor but never need to support the Controlled functor (see Section 8.4.7), and self-adjoint operations, which support the Adjoint functor without imposing any additional requirements on the individual calls. More sophisticated generation directives should make it possible to relax that requirement in further contexts.

8.4.7 Other quantum-specific patterns

Another quantum-specific pattern that is worth highlighting due to its omnipresence in quantum computations is the conjugation. Conjugations in the mathematical sense are patterns of the form U†VU for unitary transformations U and V. That pattern is especially relevant due to the particularities of quantum memory: To leverage the unique assets of quantum computing, computations build up quantum correlations – i.e. entanglement. However, that also means that once qubits are no longer needed for a particular subroutine, they cannot easily be reset and released, since observing their state would impact the rest of the system. For that reason, the effects of a previous computation commonly need to be reversed prior to being able to release and reuse quantum memory. What's more, there is a certain flexibility to when exactly to perform such cleanup, giving room for optimizations, similar to pebbling games [117]. Additionally, it is useful to recognize the pattern when auto-generating a controlled version of an operation, since rather than having to control all three transformations it is sufficient to merely condition the execution of V on the state of the control qubits. This can easily be seen by remembering that if V is not applied, then U†U = 1 evaluates to the identity and no transformation is applied. Having a dedicated representation for expressing conjugations makes sense not just from an optimization perspective but certainly also for user convenience, saving the trouble of having to explicitly express the cleanup in source code and making code more concise.
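The general shape of the construct is shown in the minimal sketch below (assuming an allocated qubit q); the adjoint of the within-block is applied automatically after the apply-block, so the net effect here is H·Z·H = X:

within {
    H(q);
} apply {
    Z(q);
}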

The example in Listing 8.3 originates in the arithmetic library and demonstrates the usage of such a conjugation in practice. The statements in the within-block starting in Line 11 are applied first, followed by the statements in the apply-block starting in Line 16, and finally the automatically generated adjoint of the within-block is applied to clean up the temporarily used helper qubit anc. The example doesn't illustrate that both blocks may contain arbitrary classical computations as well. The only exception is that mutably bound variables that are used as part of the within-block may not be reassigned as part of the apply-block. The reason for this restriction is less a technical one than to prevent confusion regarding the expected behavior in this case. What is not yet supported for technical reasons is to permit to return from within the apply-block. It should be possible to support this in the future. The expected behavior in this case is to evaluate the returned value before the adjoint of the within-block is executed, any qubits going out of scope are released (anc in this case), and the control is returned to the caller. In short, the statement should behave similarly to a try-finally pattern in C#. However, the necessary functionality is not yet implemented.

1  operation ApplyXOrIfGreater(
2      lhs : LittleEndian,
3      rhs : LittleEndian,
4      res : Qubit
5  ) : Unit is Adj + Ctl {
6
7      let (x, y) = (lhs!, rhs!);
8      let shuffled = Zip3(Most(x), Rest(y), Rest(x));
9
10     using (anc = Qubit()) {
11         within {
12             ApplyToEachCA(X, x + [anc]);
13             ApplyMajorityInPlace(x[0], [y[0], anc]);
14             ApplyToEachCA(MAJ, shuffled);
15         }
16         apply {
17             X(res);
18             CNOT(Tail(x), res);
19         }
20     }
21 }

Listing 8.3 The operation ApplyXOrIfGreater is defined in the Q# arithmetic library. It maps |lhs⟩|rhs⟩|res⟩ → |lhs⟩|rhs⟩|res ⊕ (lhs > rhs)⟩, i.e. it coherently applies an XOR to the given qubit res if the quantum integer represented by lhs is greater than the one in rhs. The two integers are expected to be represented in little endian encoding, as indicated by the usage of the corresponding data type. The temporarily used storage qubit anc needs to be cleaned up before it can be released at the end of the using-block. Correspondingly, the transformation defined by the statements in the within-block needs to be inverted after executing the ones in the apply-block. This is indicated by the use of the within-apply statement (a.k.a. conjugation) and done automatically.

Currently, Q# requires all allocations of quantum memory to be explicit. One might wonder whether allowing for implicit qubit allocations and deallocations would make sense in the future, in cases where additional scratch space is temporarily needed and readily cleaned up again. An example where this is the case is the construction of quantum oracles based on reversible classical functions. Introducing the concept of implicit quantum memory management requires further thorough considerations to ensure that there are no adverse effects for optimizing the quantum execution, and that the integration into the language is done in a consistent and holistic manner rather than in an ad-hoc way for selective cases only.

8.5 expressions

Section 8.5.1 gives a good overview over what expressions exist in Q#, and the subsequent subsections discuss some of the most interesting ones in more detail. In addition to the expressions listed in Section 8.5.1, identifiers, value literals for all types except qubits, and new array expressions, which instantiate an array of the given length, are used within Q# programs, but won’t be discussed any further.

8.5.1 Operators, modifiers, and combinators

Operators in a sense are nothing but dedicated syntax for particular functions. Even though Q# is not yet expressive enough to formally capture the capabilities of each operator in the form of a backing function declaration, that should be remedied in the future (see also Section 8.6.6). There are currently no operators that correspond to operations; i.e. all Q# operators at the time of writing are fully deterministic and do not have any side effects.

Precedence and associativity define the order in which operators are applied. Operators with higher precedence will be bound to their arguments (operands) first, while operators with the same precedence will be bound in the direction of their associativity. For example, the expression 1+2*3 according to the precedence for addition and multiplication is equivalent to 1+(2*3), and 2^3^4 equals 2^(3^4) since exponentiation is right-associative. Table 8.1 lists the available operators, as well as their precedence and associativity.

Description                  Syntax   Operator   Associativity   Precedence
copy-and-update operator 3   w/ <-    ternary    left            1
range operator 4             ..       infix      left            2
conditional operator 5       ?|       ternary    right           5
logical OR                   or       infix      left            10
logical AND                  and      infix      left            11
bitwise OR                   |||      infix      left            12
bitwise XOR                  ^^^      infix      left            13
bitwise AND                  &&&      infix      left            14
equality                     ==       infix      left            20
inequality                   !=       infix      left            20
less-than-or-equal           <=       infix      left            25
less-than                    <        infix      left            25
greater-than-or-equal        >=       infix      left            25
greater-than                 >        infix      left            25
right shift                  >>>      infix      left            28
left shift                   <<<      infix      left            28
addition or concatenation    +        infix      left            30
subtraction                  -        infix      left            30
multiplication               *        infix      left            35
division                     /        infix      left            35
modulus                      %        infix      left            35
exponentiation               ^        infix      right           40
bitwise NOT                  ~~~      prefix     right           45
logical NOT                  not      prefix     right           45
negative                     -        prefix     right           45

Table 8.1 Overview over available operators, their precedence and associativity. Additional modifiers and combinators are listed in Table 8.2 and bind tighter than any of these operators.

3 See Section 8.5.4 for more details.
4 See Section 8.5.6 for more details.
5 See Section 8.5.2 for more details.

Copy-and-update expressions necessarily need to have the lowest precedence to ensure a consistent behavior of the corresponding evaluate-and-reassign statement (see Section 8.4.2). Similar considerations hold for the range operator to ensure a consistent behavior of the corresponding contextual expression explained in Section 8.5.6.

Logical Operators

Logical operators are expressed as keywords. Q# supports the standard logical operators for AND, OR, and NOT. There is currently no operator for a logical XOR. All of these operators act on operands of type Bool, and result in an expression of type Bool as well. As is common in most languages, the evaluation of AND and OR short-circuits, meaning if the first expression of OR evaluates to true, the second expression is not evaluated, and the same holds if the first expression of AND evaluates to false. The behavior of conditional expressions in a sense is similar, in that only ever the condition and one of the two expressions are evaluated.
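For instance, in the sketch below (assuming arr is a previously bound Int array), the item access in the second operand is only evaluated when the first operand already determines that it is safe to do so, so both expressions are well-behaved even for an empty array:

let firstIsOne = Length(arr) > 0 and arr[0] == 1;
let emptyOrOne = Length(arr) == 0 or arr[0] == 1;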

Bitwise Operators

Bitwise operators are expressed as three non-letter characters. In addition to bitwise versions for AND, OR, and NOT, a bitwise XOR exists as well. They expect operands of type Int or BigInt, and for binary operators, the type of both operands has to match. The type of the entire expression equals the type of the operand(s). Additionally, left- and right-shift operators exist, multiplying or dividing the given left-hand-side (lhs) expression by powers of two. The expression lhs <<< 3 shifts the bit representation of lhs by three, meaning lhs is multiplied by 2^3, provided that is still within the valid range for the data type of lhs. The lhs may be of type Int or BigInt. The right-hand-side expression always has to be of type Int. The resulting expression will be of the same type as the lhs operand.
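To illustrate (with all operands of type Int):

let a = 5 &&& 3;    // evaluates to 1
let b = 5 ||| 3;    // evaluates to 7
let c = 5 ^^^ 3;    // evaluates to 6
let d = ~~~5;       // evaluates to -6 (two's complement)
let e = 1 <<< 4;    // evaluates to 16
let f = 16 >>> 2;   // evaluates to 4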

Arithmetic Operators

Arithmetic operators are addition, subtraction, multiplication, division, negation, and exponentiation. They can be applied to operands of type Int, BigInt, or Double. Additionally, for integral types (Int and BigInt) an operator computing the modulus is available. For binary operators, the type of both operands has to match, except for exponentiation; an exponent for a value of type BigInt always has to be of type Int. The type of the entire expression matches the type of the left operand. Currently, Q# does not support any automatic conversions between arithmetic data types – or any other data types for that matter. This has the benefit of avoiding accidental errors, but also constitutes an inconvenience. While certain data types such as Result are used to restrict how runtime information can propagate and hence likely will never be automatically cast, we may revise the casting and conversion behavior between other data types such as, e.g., between Int and Double in the future.
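To illustrate the matching requirement, a short sketch; the commented-out line would be rejected since the operand types differ, and an explicit conversion (e.g. via IntAsDouble from the standard conversion library) is needed instead:

let q = 7 / 2;                   // integer division, evaluates to 3
let r = 7 % 2;                   // modulus, evaluates to 1
let p = 2 ^ 10;                  // evaluates to 1024
let x = 7.0 / 2.0;               // evaluates to 3.5
// let y = 7 / 2.0;              // invalid: Int and Double operands cannot be mixed
let y = IntAsDouble(7) / 2.0;    // evaluates to 3.5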

Quantitative Comparison

The operators less-than, less-than-or-equal, greater-than, and greater-than-or-equal define quantitative comparisons. They can only be applied to data types that support such comparisons. As of the time of writing, these are the same data types that also support arithmetics.

Equality Comparison

Equality and inequality comparison is currently limited to the following data types: Int, BigInt, Double, String, Bool, Result, Pauli, and Qubit. The comparison for equality of arrays, tuples, ranges, user defined types, or callables is currently not supported. There are no fundamental issues with allowing comparisons of ranges, as well as for arrays, tuples, and user defined types, provided their items support comparison; it is merely a matter of not yet having been implemented. For all types, the comparison is by value, meaning two values are considered equal if all of their items are, i.e. for

let arr1 = [0,0,0];
let arr2 = new Int[3];

the expression arr1 == arr2 should evaluate to true since the default value for an integer is 0 and both arrays thus contain the same items. The same should hold for values of user defined type, with the caveat that their type also needs to match. Supporting the comparison of values of type Range follows the same logic; they should be equal as long as they produce the same sequence of integers, meaning the two ranges

let r1 = 0..2..5; // generates the sequence 0,2,4
let r2 = 0..2..4; // generates the sequence 0,2,4

should be considered equal.

Conversely, there is a good reason not to allow the comparison of callables, as the behavior would be ill-defined. Suppose we were to introduce the capability to define functions locally via a possible syntax

let f1 = (x -> Bar(x)); // not yet supported
let f2 = Bar;

for some globally declared function Bar. The first line defines a new anonymous function that takes an argument x and invokes the function Bar with it, and assigns it to the variable f1. The second line assigns the function Bar to f2. Since invoking f1 and invoking f2 will do the same thing, it should be possible to replace those with each other without changing the behavior of the program. This wouldn't be the case if the equality comparison for functions were supported and f1 == f2 evaluated to false. If conversely f1 == f2 were to evaluate to true, then this leads to the question of determining whether two callables will have the same side effects and evaluate to the same value for all inputs. Clearly, it is not possible to reliably determine that. Hence, if we would like to be able to replace f1 with f2, we can't allow equality comparisons for callables.

Concatenation

Concatenations are supported for arrays and values of type String. Even though a common base type for all array items is determined when constructing an array literal, concatenating two arrays requires that both operands are of the exact same type - the same as for Strings. This is due to the fact that arrays are treated as invariant (see Section 8.6.7). The type of the entire expression matches the type of the operands.
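For example, assuming the operand types match exactly:

let joined = [1, 2, 3] + [4, 5];       // evaluates to [1,2,3,4,5]
let greeting = "Hello, " + "world!";   // evaluates to "Hello, world!"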

Modifiers and combinators

In addition to the operators listed in Table 8.1, there are other constructs that can be applied to certain expressions only. We can assign them an artificial precedence to capture their behavior. We'll refer to these constructs as modifiers. One or more modifiers can be applied to expressions that are either identifiers, array item access expressions, named item access expressions, or an expression within parentheses, which is the same as a single item tuple (see Section 8.6.1). They can either precede (prefix) the expression or follow (postfix) the expression. They are thus special unary

operators that bind tighter than function or operation calls, but less tight than any kind of item access. In a sense, function calls and item access can also be seen as a special kind of operator; we refer to them as combinators. Functors are treated as prefix modifiers. Additionally, the unwrap operator ('!') is treated as a postfix modifier, the purpose of which is explained in Section 8.5.5. The artificial precedence of these operators is listed in Table 8.2, which also shows how the precedence of operators and modifiers relates to how tight item access combinators ('[',']' and '::' respectively) and call combinators ('(', ')') bind.

Description              Syntax       Operator   Associativity   Precedence
Call combinator 6        ()           n/a        right           900
Adjoint functor 7        Adjoint      prefix     right           950
Controlled functor 8     Controlled   prefix     right           950
Unwrap application 9     !            postfix    left            1000
Array item access        []           n/a        left            1100
Named item access 10     ::           n/a        left            1100

Table 8.2 Overview over expression modifiers and combinators, as well as their attributed precedence and associativity.

Suppose we have a unitary operation DoNothing as defined in Section 8.3.3, a callable GetStatePrep that returns a unitary operation, and an array algorithms containing items of type Algorithm defined as follows

newtype Algorithm = (
    Register : LittleEndian,
    Initialize : Transformation,
    Apply : Transformation
);

6 See Section 8.5.3 and Section 8.4.6 for more details.
7 See also Section 8.4.6.
8 See also Section 8.4.6.
9 See Section 8.5.5 for more details.
10 See Section 8.5.5 for more details.

newtype Transformation = ( LittleEndian => Unit is Adj + Ctl );

where LittleEndian is defined in Section 8.3.1. Then the following expressions are all valid:

(GetStatePrep())(arg)
(Transformation(GetStatePrep()))!(arg)
Adjoint DoNothing()
Controlled Adjoint DoNothing(cs, ())
Controlled algorithms[0]::Apply!(cs, _)
algorithms[0]::Register![i]

Looking at the precedences defined in Table 8.2, we see that the parentheses around (Transformation(GetStatePrep())) are necessary for the subsequent unwrap operator to be applied to the Transformation value rather than the returned operation. Similarly, the syntax GetStatePrep()(arg) doesn't lead to a valid expression; parentheses are required around the GetStatePrep call in order to invoke the returned callable. Functor applications on the other hand don't require parentheses around them in order to invoke the corresponding specialization. Neither do array or named item access expressions, such that an expression arr2D[i][j] is perfectly valid, just like algorithms[0]::Register![i] is.

8.5.2 Conditional expressions

Conditional expressions consist of three sub-expressions, where the left-most one is of type Bool and determines which one of the two other sub-expressions is evaluated. They are of the form

cond ? ifTrue | ifFalse

The types of the ifTrue and the ifFalse expression have to have a common base type. Independent of which one of the two ultimately yields the value to which the expression evaluates, its type will always match the determined base type.
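For example, assuming a and b are previously bound Int values, the following binds the larger of the two:

let larger = a > b ? a | b;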

While such expressions are rather convenient to use, they also require a rather non-trivial translation upon compilation, since there is no corresponding construct in the planned intermediate representation that a Q# program is intended to compile into. Conditional expressions hence need to be

converted to full-fledged if-statements during compilation. Naively, the use of a conditional expression could be expressed as an assignment to a mutable variable that is then substituted where the conditional expression was used. With CBT standing for the determined common base type, the naive translation would look like

mutable __tempVar1__ = Default<CBT>();
if (cond) {
    set __tempVar1__ = ifTrue;
} else {
    set __tempVar1__ = ifFalse;
}

followed by the original code with the conditional expression replaced by __tempVar1__. Remembering the restrictions that apply to if-statements when targeting a program for execution on current quantum hardware described in Section 8.4.4, this translation is clearly suboptimal: it violates these restrictions if the condition contains a comparison of a Result value even if the targeted hardware fundamentally is capable of processing the necessary branching based on measurement results.

Let's look at whether, when and how we can do better by considering how conditional expressions can be used. We will start by looking at how they can be used within expressions. Section 8.5.1 gives an overview over all possible expressions in Q#. Considering any unary prefix operator or modifier ⟨#⟩, it is easy to see that ⟨#⟩ (cond ? ifTrue | ifFalse) is the same as cond ? (⟨#⟩ ifTrue) | (⟨#⟩ ifFalse). Clearly, also any unary postfix operator or modifier can be pulled into the conditional expression in a similar fashion. Thanks to the fact that only ever one "side" of the conditional expression is evaluated, the same also holds for all binary operators that do not short-circuit 11, as well as for copy-and-update expressions (which are built using the only other ternary operator) and all combinators. More care needs to be taken for binary operators that short-circuit, i.e. for the logical AND and OR. In the case where the conditional expression is used on the left-hand-side, they can be pulled in as well. In the case where the conditional is on the right-hand-side, however, we need to preserve the original expression. Nested conditionals need to be translated into nested if-statements (unless the nesting merely

11 See Section 8.5.1.

occurs as part of the condition), with the corresponding options regarding flattening them into an if-statement with several elif-clauses. To summarize these considerations in EBNF notation, any expression can be brought into the following form:

conditional = "(", expr, "?", expr, "|", expr, ")" expr = "branchFree" | [expr ("and"|"or")], conditional

where "branchFree" is an expression that does not contain any conditionals. Nested conditionals translate into nested if-statements in a straightforward manner. For the rest of this section we will hence limit the discussion to considering expressions of the form cond ? ifTrue | ifFalse which we will abbreviate with condEx, leaving it up to the reader to generalize the translation to include arbitrary expressions containing one or more condi- tionals.

We proceed to look at each statement to determine whether we can translate any contained conditional expressions in a way that would permit to execute it on quantum hardware with limited support for branching based on measurement results. It should be stated explicitly that the naive translation given at the beginning of this section will do perfectly fine as long as the condition does not contain any Result values. Expression statements don't require further considerations beyond what has already been discussed regarding conditionals within expressions, and neither do conjugations as they only contain statement blocks and no additional expressions. In the case of a return-statement, there is not much we can do without the firmware/hardware support that would allow to evaluate all expressions that make use of the returned value. Of course, we could attempt to propagate the condition into the caller by adding the corresponding branching to the program continuation, but this quickly becomes infeasible. A fail-statement on the other hand aborts all further computation and we can pull the fail-statement into the generated if-statement. The statement fail condEx; then simply becomes

if (cond) {
    fail ifTrue;
} else {
    fail ifFalse;
}

If a value assigned as part of a let-, mutable-, or set-statement depends on a conditional expression, we potentially face the same challenges as for a return-statement. As long as no return- or set-statements depend on the bound variable(s) and the assignment is not part of a repeat-loop, we can enclose all subsequent computations that make use of the bound variable(s) into a suitable if-statement. If this is not possible or impractical, supporting such occurrences then requires that all subsequent computations that depend on the bound variable(s) and impact the executed quantum transformations can be evaluated while qubits remain live. The same holds for the loop variable(s) of a for-loop iterating through a sequence that depends on the evaluation of a conditional expression. Conditionals in all other statements on the other hand can be supported as long as the statement itself is executable; any value resulting from evaluating a conditional expression is never assigned such that it is only ever used within the statement itself and not as part of the statement body or subsequent statements. The idea here is to convert the expression to a suitable if-statement and pull the assignment of each sub-expression into the corresponding branch – if such an assignment is even necessary. In the case of a conditional expression within an existing if-statement, or a using- or borrowing-statement, this translation is straightforward. A using-statement of the form using (qs = Qubit[condEx]) { ... } for example is translated into

if (cond) {
    using (qs = Qubit[ifTrue]) {
        // some code
    }
} else {
    using (qs = Qubit[ifFalse]) {
        // some code
    }
}

A mutable assignment is required to deal with loops that break based on a condition. Nonetheless, in contrast to for-loops, as well as let-, mutable-, set-, and return-statements, we can be sure that the bound variable cannot impact the program flow beyond determining whether to enter the next iteration. Currently, such loops need to be unrolled in order to be executable, which requires imposing a (fairly small) limit on the maximal number of iterations (see also Section 8.4.5).

8.5.3 Partial applications

Currently, callables can only be declared at a global scope. By default, they are publicly visible, i.e. they can be used anywhere in the same project and in a project that references the assembly in which they are declared. Access modifiers allow to restrict their visibility to the current assembly only, such that implementation details can be changed later on without breaking code that relies on a certain library. However, often there is a need to construct a callable for one-time use only, meaning no other piece of code will make use of it. Having to declare it on a global scope is hence inconvenient and limits the flexibility in how to structure and organize code.

Q# currently provides one rather powerful mechanism to construct new callables on the fly: partial applications. Partial application refers to providing some of the argument items to a callable while others are still missing, as indicated by an underscore. The result is a new callable value that takes the remaining argument items, combines them with the already given ones, and invokes the original callable. Naturally, partial application preserves the characteristics of a callable, i.e. a callable constructed by partial application supports the same functors as the original callable. In contrast to other functional languages, Q# allows any subset of the parameters to be left unspecified, not just a final sequence, which ties in more naturally with the design to have each callable take and return exactly one value. For a function Foo whose argument type is (Int, (Double, Bool), Int) for instance, Foo(_, (1.0, _), 1) is a function that takes an argument of type (Int, (Bool)), which is the same as an argument of type (Int, Bool), see Section 8.6.1. Because partial application of an operation does not actually evaluate the operation, it has no impact on the quantum state. This means that building a new operation from existing operations and computed data may be done in a function; this is useful in many adaptive quantum algorithms and in defining new control flow constructs.
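As a sketch of how this composes with library operations (assuming the intrinsic rotation Rz and the library operation ApplyToEachCA; the operation name ApplyRzToAll is hypothetical), the partial application Rz(angle, _) constructs a single-qubit operation from the two-argument rotation, which preserves the Adj + Ctl characteristics and can be passed on directly:

operation ApplyRzToAll(angle : Double, qs : Qubit[]) : Unit is Adj + Ctl {
    ApplyToEachCA(Rz(angle, _), qs);
}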

Implementations such as the one below are at present quite common in the Q# libraries. It shows both the usefulness and limitations of partial applications. The operation ApplyBound defined in the Q# standard library takes an array of operations and one after another applies them to the given argument. Access to ApplyBound is limited to the compilation unit. Projects that have a reference to the standard library cannot access it, but instead access the function Bound, which given an array of operations returns a new operation that implements their sequential application.

internal operation ApplyBoundCA<'T> (
    ops : ('T => Unit is Adj + Ctl)[],
    arg : 'T
) : Unit is Adj + Ctl {
    for (op in ops) {
        op(arg);
    }
}

function BoundCA<'T> (
    ops : ('T => Unit is Adj + Ctl)[]
) : ('T => Unit is Adj + Ctl) {
    return ApplyBoundCA(ops, _);
}

We see that there is no reason to define ApplyBound on the global scope, and a local declaration within Bound would do fine. Some of the most anticipated future features are hence local declarations and lambda expressions. Local declarations, in contrast to globally defined callables, would have to be declared in order and wouldn't be recursive by default. There is no need for locally declared callables to explicitly specify their return type, though the type(s) of the argument (items) would still need to be annotated, and their characteristics (see Section 8.6.5) could also be inferred. Lambda expressions are even more convenient in a sense; for one, being expressions they can be used in almost any context, and furthermore, that context in certain cases additionally allows to infer the type(s) of the argument (items). In the meantime, partial applications allow to express the same functionality and provide a neat and compact way to cover some of the most common use cases.

8.5.4 Copy-and-update expressions

To reduce the need for mutable bindings, Q# supports copy-and-update expressions for value types with item access. Such expressions eliminate the need for one or several dedicated set-statements in certain cases. There are currently two such types available: user defined types, which allow to

access items via name, and arrays, which allow to access items via index. Such copy-and-update expressions consist of a ternary operator and are of the form

expression w/ itemAccess <- expression

with suitable restrictions regarding the type of the inner expressions. The use of the syntax w/ is rooted in the short notation commonly used for "with". For user defined types, itemAccess denotes the name of the item that diverges from the original value. The reason that this is not simply another expression of suitable type is that the ability to simply use the item name without any further qualification is limited to this context; it is one of two contextual expressions in Q#, see also Section 8.5.6. For arrays, itemAccess indeed is just any expression of a suitable type. In the interest of consistency, the same types that are valid for array slicing are valid in this context; more concretely, the itemAccess expression currently can be of type Int or Range, and in the future possibly also of type Int[], which is not yet supported in either this context or array slicing expressions. Copy-and-update expressions allow efficient creation of new arrays based on existing ones. The implementation for copy-and-update expressions avoids copying the entire array but merely duplicates the necessary parts to achieve the desired behavior, and performs an in-place modification if possible. Suitable means to initialize an array via, e.g., an initialization function or similar means are provided by the standard libraries. Array initialization via a call to such a core function does not incur additional overhead due to immutability.
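For arrays, for example, assuming the binding below, the expression leaves the original array untouched and yields a new one:

let arr = [10, 20, 30, 40];
let updated = arr w/ 1 <- 25;   // evaluates to [10, 25, 30, 40]; arr still contains [10, 20, 30, 40]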

In terms of precedence, the copy-and-update operator is left-associative and has lowest precedence, and in particular lower precedence than the range operator ('..') or the ternary conditional operator ('?|'). The chosen left associativity allows easy chaining of copy-and-update expressions:

let model = Default()
    w/ Structure <- ClassifierStructure()
    w/ Parameters <- parameters
    w/ Bias <- bias;

Like for any operator that constructs an expression that is of the same type as the left-most expression involved, the corresponding evaluate-and-reassign statement 12 is available. The two statements below for example achieve the following: The first statement declares a mutable variable arr and binds it to the default value of an integer array. The second statement then builds a new array with the first item (with index 0) set to 3, and reassigns it to arr.

mutable arr = new Int[3];   // arr contains [0,0,0]
set arr w/= 0 <- 3;         // arr contains [3,0,0]

The second statement is nothing but a short-hand for the more verbose syntax set arr = arr w/ 0 <- 3;.

8.5.5 Item access for user defined types

Section 8.3.1 describes how to define custom types, containing one or more named or anonymous items. The contained items can be accessed via their name or by deconstruction, illustrated by the following statements that may be used as part of an operation or function implementation (see also Section 8.4.2):

let complex = Complex(1., 0.);   // create a value of type Complex
let (re, _) = complex!;          // item access via deconstruction
let im = complex::Imaginary;     // item access via name

The item access operator ('::') retrieves named items. While named items can be accessed by their name or via deconstruction, anonymous items can only be accessed by the latter. Since deconstruction relies on all of the contained items, the usage of anonymous items is discouraged when these items need to be accessed outside the compilation unit in which the type is defined. Access via deconstruction makes use of the unwrap operator ('!'). That operator will return a tuple of all contained items, where a single item tuple is equivalent to the item itself (see Section 8.6.1).

8.5.6 Contextual and omitted expressions

We have already seen an example for an expression that is only valid in a certain context, namely the usage of item names in copy-and-update expressions without having to qualify them (see Section 8.5.4). Furthermore, in Section 8.4.2 we have seen that expressions can be omitted when they

12 See Section 8.4.2.

can be inferred and automatically inserted by the compiler, as is the case in evaluate-and-reassign statements. There is one more example for both; open-ended ranges are valid only within a certain context, and the compiler will translate them into normal Range expressions during compilation by inferring suitable boundaries. A value of type Range generates a sequence of integers, specified by a start, optionally a step, and an end value. For example, the Range literal expression 1..3 generates the sequence 1,2,3, and the expression 3..-1..1 generates the sequence 3,2,1. Ranges can be used for example to create a new array from an existing one by slicing:

let arr = [1,2,3,4];
let slice1 = arr[1..2..4];  // contains [2,4]
let slice2 = arr[2..-1..0]; // contains [3,2,1]

No infinite ranges exist in Q#, such that start and end value always need to be specified, except when a Range is used to slice an array. In that case, the start and/or end value of the range can reasonably be inferred. Looking at the array slicing expressions above, it is reasonable for the compiler to assume that the intended range end should be the index of the last element in the array if the step size is positive. If the step size on the other hand is negative, then the range end likely should be the index of the first element in the array, i.e. 0. The converse holds for the start of the range. Q# hence allows to use open-ended ranges within array slicing expressions:

let slice3 = arr[1..2...];  // contains [2,4]
let slice4 = arr[...-2..0]; // contains [4,2]
let slice5 = arr[...-1...]; // contains [4,3,2,1]

Of course, the information whether the range step is positive or negative is runtime information. The compiler hence inserts a suitable expression that will be evaluated at runtime. For omitted end values, the inserted expression is step < 0 ? 0 | Length(arr)-1, and for omitted start values it is step < 0 ? Length(arr)-1 | 0, where step is the expression given for the range step, or 1 if no step is specified.

8.6 type system

With the focus for quantum algorithms being more towards what should be achieved rather than on a problem representation in terms of data structures, taking a more functional perspective on language design is a natural choice. At the same time, the type system is a powerful mechanism that can be leveraged for program analysis and other compile-time checks that facilitate formulating robust code. All in all, the Q# type system is fairly minimalist, in the sense that there isn't an explicit notion of classes or interfaces as one might be used to from classical languages like C# or Java. We also took a somewhat pragmatic approach making incremental progress, such that certain constructs are not yet fully integrated into the type system. An example is functors, which can be used within expressions but don't yet have a representation in the type system. Correspondingly, they cannot currently be assigned or passed as arguments. There are similar loose ends related to type parametrized callables - a concept that is not commonly introduced as early as in the first version of a new language. While there is good reason for that, we strike a balance between making good design calls aligning with the principles outlined in Section 8.1 and user expectations that often compare against the functionalities available in well-established classical languages.

Rather than discussing the entire type system in detail, we focus on certain aspects that are particularly interesting.

8.6.1 Singleton tuple equivalence

To avoid any ambiguity between tuples and parentheses that group sub-expressions, a tuple with a single element is considered to be equivalent to the contained item. This includes its type; for instance, the types Int, (Int), and ((Int)) are treated as identical, as are the values 5, (5) and (((5))). Since there is no dynamic dispatch or reflection in Q# and all types in Q# are resolvable at compile-time, singleton tuple equivalence can be readily implemented during compilation.
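For illustration, in the hypothetical sketch below both calls are identical, since ((5)) denotes the same value, of the same type, as 5:

function Square(x : Int) : Int {
    return x * x;
}

function SingletonDemo() : Bool {
    let a = Square(5);
    let b = Square(((5)));   // ((Int)) is the same type as Int, so this is the same call
    return a == b;           // evaluates to true
}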

8.6.2 Immutability

All types in Q# are value types. Q# does not have a concept of a reference or pointer. Instead, it allows to reassign a new value to a previously declared variable via a set-statement. There is no distinction in behavior between reassignments for, e.g., variables of type Int or variables of type Int[]. To give an explicit illustration, consider the following sequence of statements:

mutable arr1 = new Int[3];
let arr2 = arr1;
set arr1 w/= 0 <- 3;

The first statement instantiates a new array of integers [0,0,0] and assigns it to arr1. Line 2 assigns that value to a variable with name arr2. Line 3 then creates a new array instance based on arr1 with the same values except for the value at index 0 which is set to 3. The newly created array is then assigned to the variable arr1. The last line makes use of the abbreviated syntax for evaluate-and-reassign statements (see Section 8.4.2), and could equivalently have been written as set arr1 = arr1 w/ 0 <- 3;. After executing the three statements, arr1 will contain the value [3,0,0] while arr2 remains unchanged and contains [0,0,0]. Q# thus clearly distinguishes the mutability of a handle and the behavior of a type. Mutability within Q# is a concept that applies to a symbol rather than a type or value; it applies to the handle that allows one to access a value rather than to the value itself. It is not represented in the type system, implicitly or explicitly.

Of course, this is merely a description of the formally defined behavior; under the hood, the actual implementation uses a reference counting scheme to avoid copying memory as much as possible. The modification is specifically done in-place as long as there is only one currently valid handle that accesses a certain value.

8.6.3 Quantum-specific data types

In addition to the Qubit type explained in detail below, there are two other types that are somewhat specific to the quantum domain: Pauli and Result. Values of type Pauli specify a single-qubit Pauli operator; the possibilities are PauliI, PauliX, PauliY, and PauliZ. Pauli values are used primarily to specify the basis for a measurement. The Result type specifies the result of a quantum measurement. Q# mirrors most quantum hardware by providing measurements in products of single-qubit Pauli operators; a Result of Zero indicates that the +1 eigenvalue was measured, and a Result of One indicates that the −1 eigenvalue was measured. That is, Q# represents eigenvalues by the power to which −1 is raised. This convention is more common in the quantum algorithms community, as it maps more closely to classical bits.
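As a brief sketch (assuming q is an allocated qubit), a measurement in the Pauli-Z basis yields a Result that can be compared against Zero or One:

let r = Measure([PauliZ], [q]);
if (r == One) {
    X(q);   // the −1 eigenvalue was observed; flip q back to |0⟩
}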

Qubits

Q# treats qubits as opaque items that can be passed to both functions and operations, but that can only be interacted with by passing them to instructions that are native to the targeted quantum processor. Such instructions are always defined in the form of operations, since their intent is indeed to modify the quantum state. That functions cannot modify the quantum state even though qubits can be passed to them as input arguments is enforced by the restriction that functions can only call other functions, and cannot call operations. The Q# libraries are compiled against a standard set of intrinsic operations, meaning operations which have no definition for their implementation within the language. Upon targeting, the implementations that express them in terms of the instructions that are native to the execution target are linked in by the compiler. A Q# program thus combines these operations as defined by a target machine to create new, higher-level operations to express quantum computation. In this way, Q# makes it very easy to express the logic underlying quantum and hybrid quantum–classical algorithms, while also being very general with respect to the structure of a target machine and its realization of quantum state. Within Q# itself, there is no type or construct that represents the quantum state. Instead, a qubit represents the smallest addressable physical unit in a quantum computer. As such, a qubit is a long-lived item, so Q# has no need for linear types. Importantly, we hence do not explicitly refer to the state within Q#, but rather describe how the state is transformed by the program, e.g., via application of operations such as X and H. Similar to how a graphics shader program accumulates a description of transformations to each vertex, a quantum program in Q# accumulates transformations to quantum states, represented as an entirely opaque reference to the internal structure of a target machine. A Q# program has no ability to introspect into the state of a qubit, and thus is entirely agnostic about what a quantum state is or how it is realized. Rather, a program can call operations such as Measure to learn information about the quantum state of the computation.

8.6.4 Callables

As elaborated in more detail in Section 8.6.3, quantum computations are executed in the form of side effects of operations that are natively supported on the targeted quantum processor. These are in fact the only side effects in Q#; since all types are immutable, there are no side effects that impact a value that is explicitly represented in Q#. Hence, as long as the implementation of a certain routine does not directly or indirectly call any of these natively

implemented operations, its execution will always produce the same output given the same input. Q# allows to explicitly split out such purely deterministic computations into functions. Since the set of natively supported instructions is not fixed and built into the language itself, but rather fully configurable and expressed as a Q# library, determinism is guaranteed by requiring that functions can only call other functions, but cannot call any operations. Additionally, native instructions that are not deterministic, e.g., because they impact the quantum state, are represented as operations. With these two restrictions, functions can be evaluated as soon as their input value is known, and in principle never need to be evaluated more than once for the same input. Q# therefore distinguishes between two types of callables: operations and functions. All callables take a single (potentially tuple-valued) argument as input and produce a single value (tuple) as output.

As of the time of writing, there is little difference between operations and functions besides this determinism guarantee. Both are first-class values that can be passed around freely; they can be used as return values or arguments to other callables, as illustrated by the example below.

function Pow<'T>(op : ('T => Unit), pow : Int) : ('T => Unit) {
    return PowImpl(op, pow, _);
}

They can be instantiated based on a type parametrized definition such as, e.g., the type parametrized function Pow above (see Section 8.6.6), and they can be partially applied as done in Line 2 in the example (see Section 8.5.3 for more details). However, splitting out computations that cannot possibly impact the quantum state allows to build out more powerful and expressive language constructs in the future; constructs that are commonly used in conventional programming languages but would be hard to support on quantum hardware.

8.6.5 Operation characteristics

In addition to the information about in- and output type, the operation type contains information about the characteristics of an operation. This information for example describes what functors are supported by the operation. Additionally, the internal representation also contains information that is inferred by the compiler but not exposed to the user. An example for this is whether or not an operation is self-adjoint. The type system is leveraged to propagate this information, such that it proliferates, e.g., when a local alias is defined or an array of self-adjoint operations is constructed. It could in principle be exposed to the user in the same way as the information whether an operation is adjointable or controllable is explicitly expressed in source code. The choice of when to allow to explicitly express certain information in source code and when to merely infer it if possible depends on how this would impact the ecosystem of Q# libraries; allowing the user to explicitly require, e.g., operation valued arguments to have certain properties can significantly impact how effectively the defined operations can be composed. Exposing properties that could be inferred promotes defining several implementations for the same functionality. If the implementations have the same name, then this can lead to a tricky dispatching problem upon optimization, where the compiler needs to make a decision regarding which concrete implementation is the most favorable to invoke for any given call. If on the other hand different implementations for the same functionality have different names, then this on one hand complicates things for a user of said libraries, and on the other hand also promotes hand-coding specialized solutions for each problem instance rather than composing general solutions for classes of problems based on existing libraries. In the case of an operation being self-adjoint, the fact that a self-adjoint operation also necessarily is adjointable could potentially be handled well via dispatching to different hand-optimized implementations with the same name, since the choice of when to use which implementation should always follow the strict hierarchy of preferring an implementation requiring a self-adjoint operation over one merely requiring an adjointable one in all cases. However, currently Q# doesn't yet support type specializations, such that for now, that information is merely included in the inferred characteristics of a callable.

The characteristics of an operation are a set of predefined and built-in labels. They are expressed in the form of a special expression that is part of the type signature. The expression consists either of one of the predefined sets of labels, or of a combination of characteristics expressions via a supported binary operator. There are two predefined sets, Adj and Ctl. Adj is the set that contains a single label indicating that an operation is adjointable, and Ctl is the set that contains a single label indicating that an operation is controllable. The two operators that are supported as part of characteristics expressions are the set union '+' and the set intersection '*'. In EBNF,

predefined = "Adj" | "Ctl";
characteristics = predefined
    | "(", characteristics, ")"
    | characteristics, ("+"|"*"), characteristics;

As one would expect, '*' has higher precedence than '+' and both are left-associative. The type of a unitary operation for example is expressed as (<TIn> => <TOut> is Adj + Ctl), where <TIn> should be replaced with the type of the operation argument, and <TOut> with the type of the returned value.

Indicating the characteristics of an operation in this form has two major advantages: for one, new labels can be introduced without having exponentially many language keywords for all combinations of labels. Perhaps more importantly, using expressions to indicate the characteristics of an operation also permits to support parameterizations over operation characteristics in the future. While this is still under active development, the basic idea is to have a placeholder indicating the set of labels. The set intersection then basically allows to impose requirements. Consider for example the following operation:

operation ApplyWith<'TIn>(
    outerOperation : ('TIn => Unit is Adj),
    innerOperation : ('TIn => Unit),
    target : 'TIn
) : Unit {
    outerOperation(target);
    innerOperation(target);
    Adjoint outerOperation(target);
}

We could have used a conjugation as described in Section 8.4.7 to express the body of the operation, but instead chose to make it more apparent that applying such an operation conditionally on the state of one or more control qubits merely requires applying the inner operation conditionally on their state, since the outer operation will cancel out with its adjoint if the inner operation is not applied. The same holds for constructing the adjoint version. Hence we see that which functors ApplyWith can support entirely depends on which functors the inner operation supports. Currently, there is no concise way of expressing that, such that the standard libraries in fact contain four different operations called ApplyWith, ApplyWithA, ApplyWithC, and ApplyWithCA - one for each combination of labels for the inner operation. We could, however, conceive expressing that as a single operation parametrized over operation characteristics #C, e.g., with the suggested syntax

operation ApplyWith<'TIn, #C>(
    outerOperation : ('TIn => Unit is Adj),
    innerOperation : ('TIn => Unit is #C),
    target : 'TIn
) : Unit is #C {
    outerOperation(target);
    innerOperation(target);
    Adjoint outerOperation(target);
}

where #C is a placeholder for a characteristics expression that evaluates to a set of labels. Technically, there is currently no notion of an empty set of labels in Q#, but introducing a way to represent such an empty set for the sake of supporting parametrizing over operation characteristics seems reasonable. The example above doesn't make it immediately clear why indicating operation characteristics as expressions is important to support this. However, consider the case of an operation ApplyMultiControlled. The operation ApplyMultiControlled takes an operation cOp as argument, as well as an array of control qubits and a target qubit. The passed operation cOp takes two qubits as arguments and transforms the second qubit conditional on the state of the first qubit. ApplyMultiControlled then transforms the given target qubit conditional on all control qubits cs being in a |1⟩ state. Whether or not ApplyMultiControlled is adjointable correspondingly depends on whether the given operation cOp is adjointable. However, by definition, the operation ApplyMultiControlled is always controllable. This could in the future be expressed in the following form:

operation ApplyMultiControlled<#C> (
    cOp : ((Qubit, Qubit) => Unit is #C),
    cs : Qubit[],
    target : Qubit
) : Unit is Ctl + #C {

    body (...) {
        if (Length(cs) == 0) {
            fail "need at least one control qubit";
        }
        elif (Length(cs) == 1) {
            cOp(cs[0], target);
        }
        else {
            using (anc = Qubit[Length(cs)-1]) {
                within {
                    // seed the cascade with the conjunction of the first two controls
                    CCNOT(cs[0], cs[1], anc[0]);
                    // accumulate the conjunction of all remaining controls into Tail(anc)
                    for (k in 1 .. Length(anc)-1) {
                        CCNOT(cs[k+1], anc[k-1], anc[k]);
                    }
                } apply {
                    cOp(Tail(anc), target);
                }
            }
        }
    }

    controlled (moreCs, ...) {
        ApplyMultiControlled(cOp, cs + moreCs, target);
    }
}

8.6.6 Type parameterizations

Q# supports type-parameterized operations and functions. Any operation or function declaration may specify one or more type parameters that can be used as the types or part of the types of the callable's input and/or output. The exception is entry points, which must be concrete and cannot be type parametrized. Type parameter names start with a tick (') and may appear multiple times in the input and output types. A type parametrized callable needs to be concretized before it can be assigned or passed as argument, meaning all type parameters need to be resolved to concrete types. A type is considered to be concrete if it is either one of the built-in types, a user defined type, or if it is concrete within the current scope. The following example illustrates what it means for a type to be concrete within the current scope, and is explained in more detail below.

1  function Mapped<'T1, 'T2> (
2      mapper : ('T1 -> 'T2),
3      array : 'T1[]
4  ) : 'T2[] {
5
6      mutable mapped = new 'T2[Length(array)];
7      for (i in IndexRange(array)) {
8          set mapped w/= i <- mapper(array[i]);
9      }
10     return mapped;
11 }
12
13 function AllCControlled<'T3> (
14     ops : ('T3 => Unit)[]
15 ) : ((Bool,'T3) => Unit)[] {
16
17     return Mapped(CControlled<'T3>, ops);
18 }

The function CControlled is defined in the Microsoft.Quantum.Canon namespace. It takes an operation op of type ('TIn => Unit) as argument and returns a new operation of type ((Bool, 'TIn) => Unit) that applies the original operation provided a classical bit (of type Bool) is set to true; this is often referred to as the classically controlled version of op. The function Mapped takes an array of an arbitrary item type 'T1 as argument, applies the given mapper function to each item and returns a new array of type 'T2[] containing the mapped items. It is defined in the Microsoft.Quantum.Arrays namespace. We intentionally chose to number the type parameters since giving the type parameters in both functions the same name might have made the discussion more confusing. This is not necessary, however; it is perfectly fine to give type parameters for different

callables the same name, and the chosen name is only visible and relevant within the definition of that callable. Our new function AllCControlled now takes an array of operations and returns a new array containing the classically controlled versions of these operations. The call in Line 17 resolves the type parameter 'T1 of Mapped to ('T3 => Unit), and the type parameter 'T2 to ((Bool,'T3) => Unit). The resolving type arguments are inferred by the compiler based on the type of the given argument. We say that they are implicitly defined by the argument of the call expression. Type arguments can also be specified explicitly, as is done for CControlled in the same line. The explicit concretization CControlled<'T3> is necessary when the type arguments cannot be inferred.

The type 'T3 is concrete within the context of AllCControlled, since it is known for each invocation of AllCControlled. That means that as soon as we know the entry point of our program - which cannot be type parametrized - we know what the concrete type 'T3 is for each call to AllCControlled. We can hence go ahead and generate a suitable implementation for that particular type resolution, similarly to how, e.g., C++ deals with templates. This way, once the entry point to a program is known, all usages of type parameters can be eliminated at compile-time. We refer to this process as monomorphization. It is worth pointing out that there are a couple of restrictions that are needed to ensure that this can indeed be done at compile-time as opposed to only at run time. Consider the following example:

operation Foo<'TArg> (
    op : ('TArg => Unit),
    arg : 'TArg
) : Unit {

    let cbit = RandomInt(2) == 0;
    Foo(CControlled(op), (cbit, arg));
}

Any invocation of Foo will, of course, result in an infinite loop, but let's ignore that for a minute. Foo invokes itself with the classically controlled version of the original operation op that has been passed in as well as a tuple containing a random classical bit in addition to the original argument. For each iteration in the recursion, the type parameter 'TArg of the next call is resolved to (Bool, 'TArg), where 'TArg is the type parameter of the current call. Concretely, say we invoke Foo with the operation H and an argument arg of type Qubit. Foo will invoke itself with a type argument (Bool, Qubit), which will then invoke Foo with a type argument (Bool,(Bool, Qubit)), and so on. Clearly, we see that in this case, Foo cannot be monomorphized at compile-time, or rather: Any attempt to monomorphize Foo will result in an infinite loop during compilation.

Without imposing additional restrictions, answering whether or not the monomorphization terminates is equivalent to solving the halting problem [227]. Since we want to guarantee that the compilation always terminates, we can make some pessimistic assumptions and generate errors for the cases that might result in such an infinite loop. Looking at the call graph, we can identify cycles that involve only type parametrized callables. Starting at an arbitrary point in such a cycle, we require that the callable chosen as the starting point is invoked with the same set of type arguments after traversing the cycle. With this restriction it is possible to monomorphize all callables and guarantee that this process terminates. It is in fact possible as long as for each callable in the cycle, there is a finite number of cycles after which it is invoked with the original set of type arguments (see the example of the function Bar below). We hence could be less restrictive and detect whether the more precise condition holds. The reason we don’t need to worry about cycles that involve at least one concrete callable without any type parameter is that such a callable will ensure that the type parametrized callables within that cycle are always called with a fixed set of type arguments.

function Bar<'T1, 'T2, 'T3> (a1 : 'T1, a2 : 'T2, a3 : 'T3) : Unit {
    Bar<'T2, 'T3, 'T1>(a2, a3, a1);
}

The Q# standard libraries make heavy use of type parametrized callables to provide a host of useful abstractions, including functions like Mapped and Fold that are familiar from functional languages.
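As an illustration of how these type parametrized library functions compose, here is a small usage sketch, assuming the signatures of Fold (Microsoft.Quantum.Arrays) and PlusI (Microsoft.Quantum.Math) as provided by the standard libraries; Sum and the namespace name are placeholders.

namespace Sketches {
    open Microsoft.Quantum.Arrays;   // provides Fold
    open Microsoft.Quantum.Math;     // provides PlusI

    function Sum (xs : Int[]) : Int {
        // Both type parameters of Fold resolve to Int for this call.
        return Fold(PlusI, 0, xs);
    }
}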

8.6.7 Subtyping and variance

Q# supports only very few conversion mechanisms. Implicit conversions can happen only when applying binary operators, when evaluating conditional expressions, and when constructing an array literal. In these cases, a common supertype is determined and the necessary conversions are performed automatically. Aside from such implicit conversions, explicit conversions via function calls are possible and often necessary. At present, the only subtyping relation that exists applies to operations. Intuitively it makes sense that one should be allowed to substitute an operation that supports more than the required set of functors. Concretely, for any two concrete types TIn and TOut, the subtyping relation is

(TIn => TOut) :> (TIn => TOut is Adj),
(TIn => TOut is Ctl) :> (TIn => TOut is Adj + Ctl),

where A :> B indicates that B is a subtype of A. Phrased differently, B is more restrictive than A, such that a value of type B can be used wherever a value of type A is required. If a callable relies on an argument (item) being of type A, then an argument of type B can safely be substituted, since it provides all the necessary capabilities. This kind of polymorphism extends to tuples in that a tuple type B is a subtype of a tuple type A if it contains the same number of items and the type of each item is a subtype of the corresponding item type in A. This is known as depth subtyping. It is worth pointing out that currently, there is no support for width subtyping whatsoever; there is no subtype relation between any two user defined types or between a user defined type and any built-in type. The existence of the unwrap operator, which allows extracting a tuple containing all named and anonymous items, prevents this. It might be worthwhile to consider whether a record-like data type containing named items only would make sense to support this in the future, or whether the introduction of certain casts opens up interesting options for coercive subtyping.
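As a small illustration of this substitution rule (a sketch; ApplyTwice, Demo, and the namespace name are placeholders), the operation H, whose type (Qubit => Unit is Adj + Ctl) is a subtype of (Qubit => Unit), can be passed where only the plain operation type is required:

namespace Sketches {
    open Microsoft.Quantum.Intrinsic;   // provides H

    // Requires only a plain (Qubit => Unit) argument.
    operation ApplyTwice (op : (Qubit => Unit), q : Qubit) : Unit {
        op(q);
        op(q);
    }

    operation Demo () : Unit {
        using (q = Qubit()) {
            // H supports Adj + Ctl, i.e., more than required, so it can safely be substituted.
            ApplyTwice(H, q);
        }
    }
}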

Looking at callables, we have established that if a callable processes an argument of type A, then it is also capable of processing an argument of type B. If a callable is passed as an argument to another callable, then it has to be capable of processing anything that the type signature requires. This means that if the callable needs to be able to process an argument of type B, any callable that is capable of processing a more general argument of type A can safely be passed. We say that the operation or function type is contravariant in its argument type. Conversely, we expect that if we require that the passed callable returns a value of type A, then the promise to return a value of type B is sufficient, since that value will provide all necessary capabilities. We say that the operation or function type is covariant in its return type. A :> B hence implies that for any concrete type T1,

(B → T1) :> (A → T1), and
(T1 → A) :> (T1 → B),

where "→" here can mean either a function or operation, and we omit any annotations for characteristics. Substituting A with (B → T2) and (T2 → A) respectively, and substituting B with (A → T2) and (T2 → B) respectively, leads to the conclusion that for any concrete type T2,

((A → T2) → T1) :> ((B → T2) → T1), and
((T2 → B) → T1) :> ((T2 → A) → T1), and
(T1 → (B → T2)) :> (T1 → (A → T2)), and
(T1 → (T2 → A)) :> (T1 → (T2 → B)).

By induction, it follows that every additional indirection reverses the variance of the argument type, and leaves the variance of the return type unchanged.

This also makes it clear what the variance behavior of arrays needs to be; retrieving items via an item access operator corresponds to invoking a function of type (Int -> TItem), where TItem is the type of the elements in the array. Since this function is implicitly passed when passing an array, it follows that arrays need to be covariant in their item type. The same considerations also hold for tuples, which are immutable and thus covariant with respect to each item type. If arrays weren't immutable, the existence of a construct that allows setting items in an array, and thus takes an argument of type TItem, would imply that arrays also need to be contravariant. The only option for data types that support getting and setting items is hence to be invariant, meaning there is no subtyping relation whatsoever; B[] is not a subtype of A[] even if B is a subtype of A. Purely based on the fact that arrays are immutable in Q#, it follows that they should be covariant. In reality, they are in fact invariant. This is a consequence of the current implementation rather than a strict necessity, and it may be possible to revise that in the future; it is an artifact stemming from the data type that arrays are compiled into, which supports in-place modification for optimization purposes.

8.7 debugging and testing

Debugging quantum programs is challenging due to the nature of quantum computation, probabilistic measurements on quantum hardware, and the exponential size of the quantum state space. Q# provides handy tools and functionality to enable detailed debugging of quantum programs. Since Q# functions are deterministic, a call to a function whose output type is Unit can never be observed from within a Q# program. That is, a target machine can choose not to execute any function which returns Unit, with the guarantee that this omission will not modify the behavior of any following Q# code. This consequence makes functions a useful tool for embedding debugging and testing logic. In particular, functions of this form can be used to represent diagnostic side effects. Consider a simple example:

function AssertPositive (value : Double) : Unit {
    if (value < 0.0) {
        fail "Expected a positive number.";
    }
}

The keyword fail indicates the computation should halt, raising an exception in the target machine running the Q# program. By definition, a failure of this kind cannot be observed from within Q#, as no further Q# code is run after a fail-statement is reached. Thus, if we proceed past a call to AssertPositive, we can be assured by the anthropic principle that its input was positive, even though we did not directly observe this fact. Similarly, the function Message has type (String -> Unit), and allows emitting diagnostic messages. As long as the argument to Message is evaluated, the target machine may elide calls to Message without any consequences regarding the correctness of the program. Evaluating the argument is important only in the case where it consists of an operation call, to ensure that any transformation to the quantum state is applied.
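A minimal usage sketch of Message (the operation name ReportMeasurement is a placeholder, and the snippet assumes M and Message from Microsoft.Quantum.Intrinsic are in scope):

operation ReportMeasurement (q : Qubit) : Unit {
    let r = M(q);
    // The target machine may elide this call; the program's behavior is unaffected.
    Message($"Measured {r}.");
}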

Building on these ideas, Q# offers two especially useful assertions: Assert and AssertProb. Assert ensures that measuring the given quantum register in the given Pauli basis produces the expected result. AssertProb asserts that such a measurement produces the expected result with the given probability.

Target machines which simulate the quantum state are not bound by the no-cloning theorem [noclo], which states that an arbitrary quantum state cannot be copied, and can therefore perform such assertions without disturbing the state. Such a simulator can then, similar to the AssertPositive function above, abort the computation if the hypothetical measurement outcome would not be observed in practice.

using (q = Qubit()) {
    H(q);
    // Assertion w.r.t. the {|+>, |->} basis.
    Assert([PauliX], [q], Zero, "Expected the qubit to be in the |+> state.");
    H(q);    // return the qubit to |0> before release
}
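A corresponding sketch for AssertProb, assuming its standard signature, which additionally takes the expected probability, a message, and a tolerance (the chosen tolerance value is arbitrary):

using (q = Qubit()) {
    H(q);
    // Measuring in the computational (PauliZ) basis yields Zero with probability 0.5.
    AssertProb([PauliZ], [q], Zero, 0.5, "Unexpected measurement distribution.", 1e-5);
    H(q);    // return the qubit to |0> before release
}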

On actual hardware, where we are constrained by physics, we of course cannot perform such assertions without measurements and thus (potentially) disturbing the state. There, Assert and AssertProb do nothing and simply return Unit.

9 into the future

Quantum computing is still in its infancy. In many ways, today's environment is similar to the state of classical computing in the early 1950s: each system is different, and capabilities and resources are highly limited. The quantum computing community is tiny compared to the broad computing and programming community. There are a small number of algorithms [118] that are relatively well understood, and a smaller number of practical applications. In the future, we expect quantum systems to grow in size and functionality. In the coming decades, systems hopefully will grow to millions of physical qubits, enabling the use of error correction and related techniques to present thousands of logical qubits that are sufficiently error-free to make large computations feasible. As the hardware grows, so too will the community and the algorithm corpus.

Quantum computing offers exciting promises regarding how it can benefit society. However, being in the early days of quantum computing, we face several major technical challenges throughout the stack that have yet to be overcome in order to fulfill that promise. The needed breakthroughs require a collaborative effort across disciplines and institutions. Beyond taking full advantage of the classical computing resources that we have today to facilitate these developments, we need to leverage the insights into how computing itself has evolved over the course of more than half a century. Computing has grown in ways that would have been impossible to predict back in the early days. Striving to emulate and accelerate a similar progression for quantum computing over the next couple of decades, we have the tremendous advantage of hindsight into what the essential success factors were that made that growth possible. The goal of quantum computing is to enable us to achieve what can't be done by other means of computing. Accomplishing this entails certain requirements on the scale not just of quantum hardware and applications, but also of the field itself. Moreover, this momentous task is a multi-faceted endeavor; it requires pursuing a multitude of ideas and approaches to discover a path that ultimately leads to success. It requires contributions to disparate elements from people with diverse backgrounds, perspectives, and areas of expertise. A tight collaboration between people working on hardware, firmware, control software, compilers, and languages is inevitable. A thriving community promotes the development of compilation techniques tailored to the unique nature and potential of quantum-specific concepts. Teaching materials introduce a new generation of researchers to the fundamentals of quantum computing. Making quantum computing succeed demands more than building a stack for executing quantum programs. We need the tools and frameworks that make it possible for individuals to contribute, communicate, and share their knowledge effectively.

The purpose of programming languages is to enable such a communication by representing an idea in a concise way that allows the reader to conceptualize how individual pieces can be composed effectively. We all bring our unique set of skills, insights, perspectives, and ideas. We need those unique assets that each one of us contributes to solve the world's greatest challenges. If we look at computers today, they in many cases can do far more than a single human can do. Part of the reason for that are modern programming languages. They are the bridges that allow us to share and leverage our collective knowledge. Quantum programming languages are essential to translate ideas into instructions that can be executed by a quantum computer. Not only do they serve as the interface between developers and quantum computers and as a facilitator for clearly expressing quantum algorithms in a way that allows execution to be optimized, but they are also indispensable to explore and develop applications and the hardware to support them in the first place. They are crucial tools to facilitate the discovery, development, and advancement of quantum algorithms even before hardware exists that is capable of executing them. Suitable tools can help to analyze, understand and mitigate noise in quantum programs using QCVV protocols, and to develop automatic calibration and tuning protocols for quantum devices. We have seen how quantum programming languages play a critical role for application development by enabling verification, simulation, and visualization of quantum algorithms and programs. Quantum programming languages today are often at the same level as the machine and assembler code that was the norm in classical computing before the introduction of higher-level languages in the mid-1950s. In order to be useful in the future of large-scale quantum computing, they will need to evolve just as classical programming languages did from the early era through today. As quantum technology improves, enabling larger computing devices with increasing capabilities, and as new quantum algorithms are being developed, the role of quantum programming languages becomes vital for enabling the programming of quantum computers at scale.

The field of quantum programming languages and compilers is still nascent. Quantum compilers are not yet as sophisticated as modern highly optimizing classical compilers. Current research and implementation efforts are focused on low-level circuit optimizations, concentrating on the purely quantum pieces rather than on optimizations that act across the entire program structure. This is in contrast to classical compilers, where a lot of optimizations are concerned with higher-level abstractions and control flow constructs such as, e.g., various kinds of loops. In quantum computing, it is currently often the case that a more general implementation of an algorithm performs worse than a custom implementation for a specific problem. The specific version can rely on characteristics and properties of the one problem that are not valid in general. This same dichotomy exists in classical high-performance computing and has led to decades of compiler optimization research and engineering, as well as work in language design to give more information to the compiler. While some well-known classical language patterns have been adopted by several quantum languages, more work needs to be done to understand how the quantum specific aspects of a general algorithm can be expressed so that they can be optimized well enough to compete with a custom implementation. Today, specialized solutions for each problem instance are often hand-coded and expressed as a sequence of gates that are natively implemented by a targeted processor. More advanced compilation techniques and automatic optimizations will make it possible to write general solutions to entire classes of problems by building layers of abstraction. A quantum programming language must enable developers to build layers of abstraction if they are to write general solutions to classes of problems, rather than hand-coding specialized solutions for each problem instance. Many quantum computing algorithms require some amount of classical computation and control flow interleaved with the quantum operations. For instance, the variational quantum eigensolver algorithm [228] requires significant classical computation after every cycle to compute the rotations to use in the next cycle. Most quantum programming languages do not provide a way to integrate such classical processing into a quantum algorithm, but instead rely on a host language to provide that functionality. This makes it harder to develop quantum compiler optimizations that act across the entire program rather than being limited to purely quantum pieces.

Today, quantum algorithms are often expressed in terms of a standard set of operations consisting of Pauli gates and other single-qubit Clifford gates, CNOT, rotations, and measurements. While this vocabulary is certainly sufficient, containing a universal set of gates, it is not clear that it is the best choice of primitives for developing and expressing algorithms. A primary value of a program is as a mechanism for communication between people. A program is a precise expression of an algorithm and can be a clear and effective way to communicate the details of that algorithm to readers. This value is lost if the programming primitives are so low-level as to lose the actual semantics, or if there is insufficient structure and visual distinction to allow the reader to identify the key components of the algorithm and how they fit together. I'd hence like to summarize a few thoughts about the long-term perspectives of quantum programming that may serve as a guiding north star when embarking towards the horizons of quantum computing.

horizons of quantum computing

• Growth in hardware scale will make programming individual qubits infeasible. Languages will need to provide developers with abstractions and constructs that allow groups of qubits of variable sizes to be manipulated. In some cases, the sizes may not be known until run time, so the full end-to-end system needs to be able to represent and manipulate collections of unknown sizes. Manipulating qubit collections will require language constructs such as loops and functional map and fold, familiar from classical programming languages.

• Growth in community scale will drive the need for sharable, reusable code libraries. This will radically improve productivity by allowing developers to build on components they or others have built in the past. Perhaps even more important, as the classical developer community has found, sharable code is incredibly effective as a communications and teaching medium. A well-written and testable implementation is the best specification of an algorithm.

• Growth in algorithms will contribute to the need for libraries. It will also put pressure on languages to provide more general and more useful forms of composition than the purely imperative model. It is reasonable to expect that quantum programming languages will follow the evolution of classical languages to richer and richer composition models such as object-oriented programming, functional programming, and logic programming, as well as the various hybrids that are common today in classical languages.

The field of quantum applications research is ever-evolving, and it is not obvious which future applications are likely to ultimately benefit society. Only as we gain more insight will it become more evident which optimizations are the most impactful ones. To advance the field of quantum programming and computation, we need to explore all areas discussed in this thesis, and no single software stack is likely to excel in all of them.

bibliography

1. Spin Glass Server https://software.cs.uni-koeln.de/spinglass/. 2. van Dam, W., Mosca, M. & Vazirani, U. How powerful is adiabatic quan- tum computation? in Proceedings 42nd IEEE Symposium on Foundations of Computer Science 340 (2001), 279. 3. Born, M. & Fock, V. Beweis des Adiabatensatzes. Zeitschrift fur Physik 51, 165 (1928). 4. Kato, T. On the Adiabatic Theorem of Quantum . Journal of the Physical Society of Japan 5, 435 (1950). 5. Aharonov, D., van Dam, W., Kempe, J., Landau, Z., Lloyd, S. & Regev, O. Adiabatic Quantum Computation Is Equivalent to Standard Quantum Computation. SIAM Review 50, 755 (2008). 6. Bunyk, P. I., Hoskinson, E. M., Johnson, M. W., Tolkacheva, E., Al- tomare, F., Berkley, A. J., Harris, R., Hilton, J. P., Lanting, T., Przybysz, A. J. & Whittaker, J. Architectural Considerations in the Design of a Superconducting Quantum Annealing Processor. IEEE Transactions on Applied 24, 1 (2014). 7. Harris, R., Johnson, M. W., Lanting, T., Berkley, A. J., Johansson, J., Bunyk, P., Tolkacheva, E., Ladizinsky, E., Ladizinsky, N., Oh, T., Cioata, F., Perminov, I., Spear, P., Enderud, C., Rich, C., Uchaikin, S., Thom, M. C., Chapple, E. M., Wang, J., Wilson, B., Amin, M. H. S., Dickson, N., Karimi, K., Macready, B., Truncik, C. J. S. & Rose, G. Experimental investigation of an eight-qubit unit cell in a supercon- ducting optimization processor. Phys. Rev. B 82, 024511 (2 2010). 8. Johnson, M. W., Amin, M. H. S., Gildert, S., Lanting, T., Hamze, F., Dickson, N., Harris, R., Berkley, a. J., Johansson, J., Bunyk, P., Chapple, E. M., Enderud, C., Hilton, J. P., Karimi, K., Ladizinsky, E., Ladizinsky, N., Oh, T., Perminov, I., Rich, C., Thom, M. C., Tolkacheva, E., Truncik, C. J. S., Uchaikin, S., Wang, J., Wilson, B. & Rose, G. Quantum annealing with manufactured spins. Nature 473, 194 (2011). 9. Barahona, F. On the computational complexity of Ising spin glass models. Journal of Physics A: Mathematical and General 15, 3241 (1982).


10. Glover, F., Kochenberger, G. & Du, Y. A Tutorial on Formulating and Using QUBO Models. arXiv:1811.11538 [cs.DS] (2018). 11. Lucas, A. Ising formulations of many NP problems. Frontiers in Physics 2 (2014). 12. Farhi, E., Goldstone, J., Gutmann, S., Lapan, J., Lundgren, A. & Preda, D. A Quantum Adiabatic Evolution Algorithm Applied to Random Instances of an NP-Complete Problem. Science 292, 472 (2001). 13. Ray, P., Chakrabarti, B. K. & Chakrabarti, A. Sherrington-Kirkpatrick model in a transverse field: Absence of replica symmetry breaking due to quantum fluctuations. Phys. Rev. B 39, 11828 (1989). 14. Finnila, A., Gomez, M., Sebenik, C., Stenson, C. & Doll, J. Quantum annealing: A new method for minimizing multidimensional functions. Chemical Physics Letters 219, 343 (1994). 15. Kadowaki, T. & Nishimori, H. Quantum annealing in the transverse Ising model. Physical Review E 58, 5355 (1998). 16. Das, A. & Chakrabarti, B. K. Colloquium: Quantum annealing and analog quantum computation. Reviews of Modern Physics 80, 1061 (2008). 17. Zener, C. Non-Adiabatic Crossing of Energy Levels. Proceedings of the Royal Society of London Series A 137, 696 (1932). 18. Jansen, S., Ruskai, M.-B. & Seiler, R. Bounds for the adiabatic ap- proximation with applications to quantum computation. Journal of Mathematical Physics 48, 102111 (2007). 19. Ozfidan, I., Deng, C., Smirnov, A., Lanting, T., Harris, R., Swenson, L., Whittaker, J., Altomare, F., Babcock, M., Baron, C. & et al. Demonstra- tion of a Nonstoquastic Hamiltonian in Coupled Superconducting Flux Qubits. Physical Review Applied 13 (2020). 20. Suzuki, M. Relationship between d-Dimensional Quantal Spin Sys- tems and (d+1)-Dimensional Ising Systems: Equivalence, Critical Exponents and Systematic Approximants of the Partition Function and Spin Correlations. Progress of Theoretical Physics 56, 1454 (1976). 21. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by Simu- lated Annealing. Science 220, 671 (1983). 22. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics 21, 1087 (1953). bibliography 171

23. Denchev, V. S., Boixo, S., Isakov, S. V., Ding, N., Babbush, R., Smelyan- skiy, V., Martinis, J. & Neven, H. What is the Computational Value of Finite-Range Tunneling? Physical Review X 6 (2016). 24. Jiang, Z., Smelyanskiy, V. N., Isakov, S. V., Boixo, S., Mazzola, G., Troyer, M. & Neven, H. Scaling analysis and instantons for thermally assisted tunneling and quantum Monte Carlo simulations. Physical Review A 95, 12322 (2017). 25. Isakov, S. V., Mazzola, G., Smelyanskiy, V. N., Jiang, Z., Boixo, S., Neven, H. & Troyer, M. Understanding Quantum Tunneling through Quantum Monte Carlo Simulations. Physical Review Letters 117, 180402 (2016). 26. Ying, M. Foundations of Quantum Programming 211 (Morgan Kaufmann, Boston, 2016). 27. Barenco, A., Bennett, C. H., Cleve, R., Divincenzo, D. P., Margolus, N., Shor, P., Sleator, T., Smolin, J. A. & Weinfurter, H. Elementary gates for quantum computation. Physical Review A 52, 3457 (1995). 28. Bergholm, V., Vartiainen, J. J., Möttönen, M. & Salomaa, M. M. Quan- tum circuits with uniformly controlled one-qubit gates. Phys. Rev. A 71, 052330 (5 2005). 29. Kliuchnikov, V., Maslov, D. & Mosca, M. Practical Approximation of Single-Qubit Unitaries by Single-Qubit Quantum Clifford and T Circuits. IEEE Transactions on Computers 65, 161 (2016). 30. Green, A. S., Lumsdaine, P. L., Ross, N. J., Selinger, P. & Valiron, B. Quipper: A Scalable Quantum Programming Language in Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (Association for Computing Machinery, 2013), 333. 31. Bocharov, A., Roetteler, M. & Svore, K. M. Efficient synthesis of prob- abilistic quantum circuits with fallback. Physical Review A - Atomic, Molecular, and Optical Physics 91, 052317 (2015). 32. Kliuchnikov, V., Maslov, D. & Mosca, M. Fast and efficient exact synthesis of single qubit unitaries generated by Clifford and T gates. Quantum Information & Computation 13, 607 (2013). 33. Ross, N. J. & Selinger, P. Optimal ancilla-free Clifford+T approxi- mation of z-rotations. Quantum Information & Computation 16, 901 (2016). 172 bibliography

34. Kliuchnikov, V., Bocharov, A., Roetteler, M. & Yard, J. A Frame- work for Approximating Qubit Unitaries. arXiv:1510.03888 [quant-ph] (2015). 35. Amy, M., Maslov, D., Mosca, M. & Roetteler, M. A Meet-in-the-Middle Algorithm for Fast Synthesis of Depth-Optimal Quantum Circuits. IEEE Trans. on CAD of Integrated Circuits and Systems 32, 818 (2013). 36. Bravyi, S. & Kitaev, A. Universal quantum computation with ideal Clifford gates and noisy ancillas. Physical Review A 71 (2005). 37. Jozsa, R. An introduction to measurement based quantum computa- tion. arXiv:0508124 [quant-ph] (2005). 38. Gottesman, D. & Chuang, I. L. Demonstrating the viability of uni- versal quantum computation using teleportation and single-qubit operations. Nature 402, 390 (1999). 39. Bocharov, A., Roetteler, M. & Svore, K. M. Efficient Synthesis of Universal Repeat-Until-Success Quantum Circuits. Phys. Rev. Lett. 114, 080502 (8 2015). 40. Bocharov, A., Roetteler, M. & Svore, K. M. Efficient synthesis of probabilistic quantum circuits with fallback. Phys. Rev. A 91, 052317 (5 2015). 41. Wiebe, N. & Roetteler, M. Quantum arithmetic and numerical analysis using Repeat-Until-Success circuits. arXiv:1406.2040 [quant-ph] (2014). 42. Granade, C., Ferrie, C., Wiebe, N. & Cory, D. Robust online Hamilto- nian learning. New Journal of Physics (2012). 43. Paesani, S., Gentile, A. A., Santagati, R., Wang, J., Wiebe, N., Tew, D. P., O’Brien, J. L. & Thompson, M. G. Experimental Bayesian Quantum Phase Estimation on a Silicon Photonic Chip. Physical Review Letters 118, 100503 (2017). 44. Wiebe, N. & Granade, C. Efficient Bayesian Phase Estimation. Physical Review Letters 117 (2016). 45. Kivlichan, I. D., Granade, C. E. & Wiebe, N. Phase Estimation with Randomized Hamiltonians. arXiv:1907.10070 [quant-ph] (2019). 46. Chen, H., Kong, X., Chong, B., Qin, G., Zhou, X., Peng, X. & Du, J. Experimental demonstration of a quantum annealing algorithm for the traveling salesman problem in a nuclear-magnetic-resonance quantum simulator. Phys. Rev. A 83, 032314 (3 2011). bibliography 173

47. Croes, G. A. A Method for Solving Traveling-Salesman Problems. 6, 791 (1958). 48. Martoˇnák,R., Santoro, G. E. & Tosatti, E. Quantum annealing of the traveling-salesman problem. Physical Review E 70, 057701 (2004). 49. Hen, I. & Sarandy, M. S. Driver Hamiltonians for constrained opti- mization in quantum annealing. Phys. Rev. A 93, 062312 (6 2016). 50. IBM ILOG CPLEX Optimizer https : / / www . . com / analytics / cplex-optimizer. 2010. 51. Dash, S. A note on qubo instances defined on chimera graphs. Optima 98, 2 (2015). 52. Barends, R., Shabani, A., Lamata, L., Kelly, J., Mezzacapo, A., Heras, U. L., Babbush, R., Fowler, A. G., Campbell, B., Chen, Y., Chen, Z., Chiaro, B., Dunsworth, A., Jeffrey, E., Lucero, E., Megrant, A., Mutus, J. Y., Neeley, M., Neill, C., O’Malley, P. J. J., Quintana, C., Roushan, P., Sank, D., Vainsencher, A., Wenner, J., White, T. C., Solano, E., Neven, H. & Martinis, J. M. Digitized adiabatic quantum computing with a superconducting circuit. Nature, 222 (2016). 53. Lloyd, S. Universal Quantum Simulators. Science 273, 1073 (1996). 54. Bloom, B. H. Space/Time Trade-Offs in Hash Coding with Allowable Errors. Commun. ACM 13, 422 (1970). 55. Weaver, S. A., Ray, K. J., Marek, V. W., Mayer, A. J. & Walker, A. K. Satisfiability-based set membership filters. Journal on Satisfiability, Boolean Modeling and Computation 8, 129 (2012). 56. Schaefer, T. J. The Complexity of Satisfiability Problems in Proceedings of the Tenth Annual ACM Symposium on Theory of Computing (Association for Computing Machinery, San Diego, California, USA, 1978), 216. 57. Achlioptas, D. The threshold for random k-SAT is 2k log 2 - O(k). Journal of the American Mathematical Society 17, 947 (2004). 58. Douglass, A., King, A. D. & Raymond, J. Constructing SAT Filters with a Quantum Annealer in International Conference on Theory and Applications of Satisfiability Testing (2015), 104. 59. Boixo, S., Rønnow, T. F., Isakov, S. V., Wang, Z., Wecker, D., Lidar, D. A., Martinis, J. M. & Troyer, M. Evidence for quantum annealing with more than one hundred qubits. Nature Physics 10, 218 (2014). 174 bibliography

60. Selman, B., Kautz, H. & Cohen, B. Local Search Strategies for Satis- fiability Testing. Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge 26 (1999). 61. Franco, J. & Paull, M. Probabilistic analysis of the Davis Putnam procedure for solving the satisfiability problem. Discrete Applied Math- ematics 5, 77 (1983). 62. Cook, S. A. The Complexity of Theorem-Proving Procedures in Proceedings of the Third Annual ACM Symposium on Theory of Computing (Associ- ation for Computing Machinery, Shaker Heights, Ohio, USA, 1971), 151. 63. Farhi, E., Goldstone, J. & Gutmann, S. Quantum Adiabatic Evolution Algorithms versus Simulated Annealing. arXiv:0201031 [quant-ph] (2002). 64. Reichardt, B. The quantum adiabatic optimization algorithm and local minima in Conference Proceedings of the Annual ACM Symposium on Theory of Computing (2004), 502. 65. Crosson, E. & Harrow, A. W. Simulated Quantum Annealing Can Be Exponentially Faster Than Classical Simulated Annealing. 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS) (2016). 66. Matsuda, Y., Nishimori, H. & Katzgraber, H. Quantum annealing for problems with ground-state degeneracy. Journal of Physics: Conference Series 143, 012003 (2009). 67. Mandrà, S., Zhu, Z. & Katzgraber, H. G. Exponentially Biased Ground- State Sampling of Quantum Annealing Machines with Transverse- Field Driving Hamiltonians. Physical Review Letters 118, 70502 (2017). 68. Heim, B., Rønnow, T. F., Isakov, S. V. & Troyer, M. Quantum versus classical annealing of Ising spin glasses. Science 348, 215 (2015). 69. Rieger, H. & Kawashima, N. Application of a continuous time cluster algorithm to the two-dimensional random quantum Ising ferromag- net. European Physical Journal B 9, 233 (1999). 70. Santoro, G. E., Martoˇnák,R., Tosatti, E. & Car, R. Theory of quantum annealing of an Ising spin glass. Science 295, 2427 (2002). 71. Martoˇnák,R., Santoro, G. E. & Tosatti, E. Quantum annealing by the path-integral Monte Carlo method: The two-dimensional random Ising model. Phys. Rev. B 66, 094203 (9 2002). bibliography 175

72. Rønnow, T. F., Wang, Z., Job, J., Boixo, S., Isakov, S. V., Wecker, D., Martinis, J. M., Lidar, D. A. & Troyer, M. Defining and detecting quantum speedup. Science 345, 420 (2014). 73. Katzgraber, H. G., Hamze, F. & Andrist, R. S. Glassy Chimeras Could Be Blind to Quantum Speedup: Designing Better Benchmarks for Quantum Annealing Machines. Phys. Rev. X 4, 021008 (2 2014). 74. Mazzola, G., Smelyanskiy, V. N. & Troyer, M. Quantum Monte Carlo tunneling from quantum chemistry to quantum annealing. Physical Review B 96 (2017). 75. Selman, B. & Kautz, H. Walksat home page https://www.cs.rochester. edu/u/kautz/walksat/. 76. Mertens, S., Mézard, M. & Zecchina, R. Threshold values of random K-SAT from the cavity method. Random Structures and Algorithms 28, 340 (2006). 77. Yu, Y., Subramanyan, P., Tsiskaridze, N. & Malik, S. All-SAT using min- imal blocking clauses in Proceedings of the IEEE International Conference on VLSI Design (2014), 86. 78. Mitchell, D. R., Northwestern State University Natchitoches, L. 7., Adami, C., Keck Graduate Institute Claremont, C. 9., Lue, W., Williams, C. P. & Stanford University Stanford, C. 9. Random matrix model of adiabatic quantum computing. Physical Review. A 71 (2005). 79. Znidaric, M. Scaling of the running time of the quantum adiabatic algorithm for propositional satisfiability. Physical Review. A 71 (2005). 80. Neuhaus, T., Peschina, M., Michielsen, K. & Raedt, H. D. Classical and quantum annealing in the median of three-satisfiability. Physical Review A 83 (2011). 81. Battaglia, D. A., Santoro, G. E. & Tosatti, E. Optimization by quantum annealing: lessons from hard satisfiability problems. Physical Review E 71, 66707 (2005). 82. Krzakala, F., Rosso, A., Semerjian, G. & Zamponi, F. Path-integral representation for quantum spin models: Application to the quantum cavity method and Monte Carlo simulations. Phys. Rev. B 78, 134428 (13 2008). 83. Bricmont, J. & Kupiainen, A. Phase transition in the 3d random field Ising model. Communications in Mathematical Physics 116, 539 (1988). 176 bibliography

84. Zanca, T. & Santoro, G. E. Quantum annealing speedup over simu- lated annealing on random Ising chains. Physical Review B 93, 224431 (2016). 85. Szegedy, M. Quantum speed-up of Markov Chain based algorithms in Pro- ceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS (2004), 32. 86. Farhi, E., Goldstone, J., Gutmann, S. & Sipser, M. Quantum Compu- tation by Adiabatic Evolution. arXiv:0001106 [quant-ph] (2000). 87. Aharonov, D. & Ta-Shma, A. Adiabatic quantum state generation and statistical zero knowledge in Proceedings of the thirty-fifth ACM symposium on Theory of computing - STOC ’03 (ACM Press, New York, New York, USA, 2003), 20. 88. Boixo, S., Knill, E. & Somma, R. D. Fast quantum algorithms for traversing paths of eigenstates. arXiv:1005.3034 [quant-ph] (2010). 89. Somma, R. D., Boixo, S., Barnum, H. & Knill, E. Quantum simulations of classical annealing processes. Physical Review Letters 101, 130504 (2008). 90. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of state calculations by fast computing machines. The Journal of Chemical Physics 21, 1087 (1953). 91. Hastings, W. K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97 (1970). 92. Kitaev, A. Y. Quantum measurements and the Abelian Stabilizer Problem. arXiv:9511026 [quant-ph] (1995). 93. Temme, K., Osborne, T. J., Vollbrecht, K. G., Poulin, D. & Verstraete, F. Quantum Metropolis sampling. Nature 471, 87 (2011). 94. Yung, M.-H. & Aspuru-Guzik, A. A quantum-quantum Metropolis algorithm. Proceedings of the National Academy of Sciences of the United States of America 109, 754 (2012). 95. Glauber, R. J. Time-dependent of the Ising model. Journal of Mathematical Physics 4, 294 (1963). 96. Vucelja, M. Lifting—A nonreversible Markov chain Monte Carlo algorithm. American Journal of Physics 84, 958 (2016). 97. Barahona, F. On the computational complexity of Ising spin glass models. Journal of Physics A: Mathematical and General 15, 3241 (1982). bibliography 177

98. Ambainis, A. Quantum walk algorithm for element distinctness in Pro- ceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS (2004), 22. 99. Magniez, F., Nayak, A., Roland, J. & Santha, M. Search via quantum walk. SIAM Journal on Computing 40, 142 (2011). 100. Rudolph, T. & Grover, L. A 2 rebit gate universal for quantum com- puting. arXiv:0210187 [quant-ph] (2002). 101. Pllaha, T., Rengaswamy, N., Tirkkonen, O. & Calderbank, R. Un-Weyl- ing the Clifford Hierarchy. arXiv:2006.14040 [quant-ph] (2020). 102. Low, G. H., Yoder, T. J. & Chuang, I. L. Methodology of resonant equiangular composite quantum gates. Physical Review X 6, 041067 (2016). 103. Low, G. H. & Chuang, I. L. Hamiltonian Simulation by Qubitization. Quantum 3, 163 (2019). 104. Low, G. H. & Chuang, I. L. Optimal Hamiltonian Simulation by Quantum Signal Processing. Physical Review Letters 118, 010501 (2017). 105. Haah, J. Product Decomposition of Periodic Functions in Quantum Signal Processing. Quantum 3, 190 (2019). 106. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simu- lated annealing. Science 220, 671 (1983). 107. Lemieux, J., Duclos-Cianci, G., Sénéchal, D. & Poulin, D. Resource estimate for quantum many-body ground state preparation on a quantum computer. arXiv:2006.04650 [quant-ph] (2020). 108. Marriott, C. & Watrous, J. Quantum Arthur-Merlin games in Computa- tional Complexity 14 (Springer, 2005), 122. 109. Boixo, S., Knill, E. & Somma, R. Eigenpath traversal by phase ran- domization. Quantum Information and Computation 9, 0833 (2009). 110. Janus Collaboration, Belletti, F., Cotallo, M., Cruz, A., Fernández, L. A., Gordillo, A., Guidetti, M., Maiorano, A., Mantovani, F., Mari- nari, E., Martín-Mayor, V., Muñoz-Sudupe, A., Navarro, D., Parisi, G., Pérez-Gaviro, S., Rossi, M., Ruiz-Lorenzo, J. J., Schifano, S. F., Sciretti, D., Tarancón, A., Tripiccione, R. & Velasco, J. L. JANUS: an FPGA- based System for High Performance Scientific Computing. Computing in Science & Engineering 11, 48 (2009). 178 bibliography

111. Janus Collaboration, Baity-Jesi, M., Banos, R. A., Cruz, A., Fernan- dez, L. A., Gil-Narvion, J. M., Gordillo-Guerrero, A., Guidetti, M., Iniguez, D., Maiorano, A., Mantovani, F., Marinari, E., Martin-Mayor, V., Monforte-Garcia, J., Sudupe, A. M., Navarro, D., Parisi, G., Pivanti, M., Perez-Gaviro, S., Ricci-Tersenghi, F., Ruiz-Lorenzo, J. J., Schifano, S. F., Seoane, B., Tarancon, A., Tellez, P., Tripiccione, R. & Yllanes, D. Reconfigurable computing for Monte Carlo simulations: results and prospects of the Janus project. The European Physical Journal Special Topics 210 (2012). 112. Ross, N. J. & Selinger, P. Optimal ancilla-free Clifford+T approxima- tion of Z-rotations. Quantum Information and Computation 16, 0901 (2016). 113. Cody Jones, N., Whitfield, J. D., McMahon, P. L., Yung, M.-H., Meter, R. V., Aspuru-Guzik, A. & Yamamoto, Y. Faster quantum chem- istry simulation on fault-tolerant quantum computers. New Journal of Physics 14, 115023 (2012). 114. Meuli, G., Soeken, M., Roetteler, M. & Häner, T. Automatic accuracy management of quantum programs via (near-) symbolic resource estimation. arXiv:2003.08408 [quant-ph] (2020). 115. Low, G. H., Kliuchnikov, V. & Schaeffer, L. Trading T-gates for dirty qubits in state preparation and unitary synthesis. arXiv:1812.00954 [quant-ph] (2018). 116. Gidney, C. Halving the cost of quantum addition. Quantum 2, 74 (2018). 117. Meuli, G., Soeken, M., Roetteler, M., Bjorner, N. & De Micheli, G. Reversible Pebbling Game for Quantum Memory Management in DATE (IEEE, 2019). 118. Quantum Algorithm Zoo http://quantumalgorithmzoo.org/. 119. Brassard, G., Høyer, P., Mosca, M. & Tapp, A. Quantum amplitude amplification and estimation. Quantum Computation and Information, 53 (2002). 120. Kitaev, A. Quantum measurements and the Abelian Stabilizer Prob- lem. arXiv preprint arXiv:quant-ph/9511026 (1995). 121. Shor, P. Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer. SIAM Journal on Com- puting 26, 1484 (1997). bibliography 179

122. Miszczak, J. Models of quantum computation and quantum program- ming languages. Bulletin of the Polish Academy of Sciences: Technical Sciences 59, 305 (2011). 123. Wecker, D., Svore, K. M. & Svore, K. M. LIQUi|>: A Architecture and Domain-Specific Language for Quantum Computing 2014. 124. Steiger, D. S., Häner, T. & Troyer, M. ProjectQ: an open source software framework for quantum computing. Quantum 2, 49 (2018). 125. Cross, A. W., Bishop, L. S., Smolin, J. A. & Gambetta, J. M. Open Quantum Assembly Language. arXiv:1707.03429 [quant-ph] (2017). 126. Houck, A. A., Koch, J., Devoret, M. H., Girvin, S. M. & Schoelkopf, R. J. Life after charge noise: recent results with qubits. Quantum Information Processing 8, 105 (2009). 127. Barends, R., Kelly, J., Megrant, A., Sank, D., Jeffrey, E., Chen, Y., Yin, Y., Chiaro, B., Mutus, J., Neill, C. & et al. Coherent Josephson Qubit Suitable for Scalable Quantum Integrated Circuits. Physical Review Letters 111 (2013). 128. Imamoglu,¯ A., Awschalom, D. D., Burkard, G., Divincenzo, D. P., Loss, D., Sherwin, M. & Small, A. Quantum Information Processing Using Quantum Dot Spins and Cavity QED. Phys. Rev. Lett. 83, 4204 (1999). 129. Cirac, J. I. & Zoller, P. Quantum Computations with Cold Trapped Ions. Phys. Rev. Lett. 74, 4091 (20 1995). 130. Nayak, C., Simon, S. H., Stern, A., Freedman, M. & Das Sarma, S. Non-Abelian anyons and topological quantum computation. Reviews of Modern Physics 80, 1083 (2008). 131. Preskill, J. Quantum Computing in the NISQ era and beyond. Quan- tum 2, 79 (2018). 132. Smith, R. S., Curtis, M. J. & Zeng, W. J. A Practical Quantum Instruc- tion Set Architecture. arXiv:1608.03355 [quant-ph] (2016). 133. Svore, K., Roetteler, M., Geller, A., Troyer, M., Azariah, J., Granade, C., Heim, B., Kliuchnikov, V., Mykhailova, M. & Paz, A. Q#: Enabling scal- able quantum computing and development with a high-level domain-specific language in Proceedings of the Real World Domain Specific Languages Workshop (ACM Press, 2018). 134. Svore, K. M., Aho, A. V., Cross, A. W., Chuang, I. & Markov, I. L. A Layered Software Architecture for Quantum Computing Design Tools. Computer 39, 74 (2006). 180 bibliography

135. Cross, A. W., Bishop, L. S., Smolin, J. A. & Gambetta, J. M. Open Quantum Assembly Language. arXiv:1707.03429 [quant-ph] (2017). 136. Cirq Documentation https://cirq.readthedocs.io/en/stable/. 137. Häner, T. & Steiger, D. S. 0.5 petabyte simulation of a 45-qubit quantum circuit in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (ACM, 2017). 138. Kornyik, M. & Vukics, A. The Monte Carlo wave-function method: A robust adaptive algorithm and a study in convergence. Computer Physics Communications 238, 88 (2019). 139. Aaronson, S. & Gottesman, D. Improved Simulation of Stabilizer Circuits. Physical Review A 70, 052328 (2004). 140. Steiger, D. S., Häner, T. & Troyer, M. ProjectQ: An Open Source Soft- ware Framework for Quantum Computing. arXiv:1612.08091 [quant- ph] (2016). 141. Reiher, M., Wiebe, N., Svore, K. M., Wecker, D. & Troyer, M. Elucidat- ing Reaction Mechanisms on Quantum Computers. Proceedings of the National Academy of Sciences, 201619152 (2017). 142. Paykin, J., Rand, R. & Zdancewic, S. QWIRE: A Core Language for Quantum Circuits in Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (Association for Computing Machinery, Paris, France, 2017), 846. 143. Rand, R., Paykin, J. & Zdancewic, S. QWIRE Practice: Formal Veri- fication of Quantum Circuits in Coq in Proceedings 14th International Conference on Quantum Physics and Logic, QPL 2017 266 (2017), 119. 144. Shi, Y., Li, X., Tao, R., Javadi-Abhari, A., Cross, A. W., Chong, F. T. & Gu, R. Contract-based verification of a realistic quantum compiler. arXiv:1908.08963 [quant-ph] (2019). 145. Ying, M. Toward automatic verification of quantum programs. Formal Aspects of Computing 31, 3 (2019). 146. Microsoft Quantum Documentation https://docs.microsoft.com/ quantum. 147. Qiskit Documentation https://qiskit.org/documentation/. bibliography 181

148. Kluyver, T., Ragan-Kelley, B., Pérez, F., Granger, B. E., Bussonnier, M., Frederic, J., Kelley, K., Hamrick, J. B., Grout, J., Corlay, S., Ivanov, P., Avila, D., Abdalla, S., Willing, C., et al. Jupyter Notebooks - a publishing format for reproducible computational workflows in Positioning and Power in Academic Publishing: Players, Agents and Agendas, 20th International Conference on Electronic Publishing (2016), 87. 149. Thomas, D. Code kata: How to become a better developer http : / / codekata.com/. 2007. 150. IBM Quantum Experience https://quantum-computing.ibm.com/. 151. Experience quantum impact with Azure Quantum https://cloudblogs. microsoft . com / quantum / 2019 / 11 / 04 / announcing - microsoft - azure-quantum/. 2019. 152. Ho, A. & Bacon, D. Announcing Cirq: An Open Source Framework for NISQ Algorithms https://ai.googleblog.com/2018/07/announcing- cirq-open-source-framework.html. Google AI blog entry from Jul 18, 2018. 2018. 153. Rigetti Quantum Cloud Services https : / / medium . com / rigetti / introducing-rigetti-quantum-cloud-services-c6005729768c. 154. Alpine Quantum Technologies (AQT) https://www.aqt.eu/solutions/. 155. QuTech. Quantum Inspire Home https://www.quantum-inspire.com/. 2018. 156. Amazon Braket https://aws.amazon.com/blogs/aws/amazon-braket- get-started-with-quantum-computing/. 157. Arute, F. et al. Quantum supremacy using a programmable supercon- ducting processor. Nature 574, 505 (2019). 158. Kandala, A., Temme, K., Córcoles, A. D., Mezzacapo, A., Chow, J. M. & Gambetta, J. M. Error mitigation extends the computational reach of a noisy quantum processor. Nature 567, 491 (2019). 159. Montanaro, A. Quantum algorithms: an overview. npj Quantum Infor- mation 2 (2016). 160. Roetteler, M. & Svore, K. M. Quantum Computing: Codebreaking and Beyond. IEEE Security Privacy 16, 22 (2018). 161. Montanaro, A. Quantum Speedup of Branch-and-Bound Algorithms. Physical Review Research 2, 013056 (2020). 182 bibliography

162. Chong, F. T., Franklin, D. & Martonosi, M. Programming languages and compiler design for realistic quantum hardware. Nature 549, 180 (2017). 163. Ross, J. The Dawn of Quantum Programming. Quantum Views 2, 4 (2018). 164. Nam, Y., Ross, N. J., Su, Y., Childs, A. M. & Maslov, D. Automated optimization of large quantum circuits with continuous parameters. npj Quantum Information 4 (2018). 165. Farhi, E., Goldstone, J. & Gutmann, S. A Quantum Approximate Optimization Algorithm. arXiv:1411.4028 [quant-ph] (2014). 166. Peruzzo, A., McClean, J. R., Shadbolt, P., Yung, M.-H., Zhou, X.-Q., Love, P., Aspuru-Guzik, A. & O’Brien, J. L. A variational eigenvalue solver on a photonic quantum processor. Nature Communications 5, 4213 (2014). 167. Moll, N., Barkoutsos, P., Bishop, L. S., Chow, J. M., Cross, A., Egger, D. J., Filipp, S., Fuhrer, A., Gambetta, J. M., Ganzhorn, M., Kandala, A., Mezzacapo, A., Müller, P., Riess, W., Salis, G., Smolin, J., Tavernelli, I. & Temme, K. Quantum optimization using variational algorithms on near-term quantum devices. Quantum Science and Technology 3, 030503 (2018). 168. Hoare, T. Null References: The Billion Dollar Mistake https : / / www . infoq.com/presentations/Null-References-The-Billion-Dollar- Mistake-Tony-Hoare/. 2009. 169. Iverson, K. E. Notation as a Tool of Thought. Commun. ACM 23, 444 (1980). 170. LaRose, R. Overview and comparison of gate level quantum software platforms. Quantum 3, 130. 171. Mosca, M., Roetteler, M. & Selinger, P. Quantum Programming Lan- guages (Dagstuhl Seminar 18381). Dagstuhl Reports 8, 112 (2018). 172. Kliuchnikov, V. Wrong QASM output for Teleportation circuit https: //github.com/epiqc/ScaffCC/issues/28. 2018. 173. Javadi-Abhari, A., Gokhale, P., Holmes, A., Franklin, D., Brown, K. R., Martonosi, M. & Chong, F. T. Optimized Surface Code Communication in Superconducting Quantum Computers in Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture (Association for Computing Machinery, Cambridge, Massachusetts, 2017), 692. bibliography 183

174. Scaffold GitHub repository https://github.com/epiqc/ScaffCC. 175. The Quipper System https : / / www . mathstat . dal . ca / ~selinger / quipper/doc/. 176. Abhari, A. J., Faruque, A., Dousti, M. J., Svec, L., Catu, O., Chakrabati, A., Chiang, C.-F., Vanderwilt, S., Black, J., Chong, F., Martonosi, M., Suchara, M., Brown, K., Pedram, M. & Brun, T. Scaffold: Quantum Programming Language tech. rep. TR-934-12 (Princeton University, 2012). 177. Abhari, A. J., Holmes, A., Patil, S., Heckey, J., Kudrow, D., Gokhale, P., Noursi, D., Ehudin, L., Ding, Y., Wu, X.-C. R. & Shi, Y. ScaffCC User Manual 2018. 178. Bennett, C. H., Brassard, G., Crépeau, C., Jozsa, R., Peres, A. & Woot- ters, W. K. Teleporting an unknown quantum state via dual classical and Einstein–Podolsky–Rosen channels. Physical Review Letters 70, 1895. 179. Brassard, G., Braunstein, S. & Cleve, R. Teleportation as a quantum computation. Physica D 1–2, 43. 180. PyQuil GitHub repository https://github.com/rigetti/pyquil. 181. ProjectQ GitHub repository https://github.com/ProjectQ-Framework/ ProjectQ. 182. QWIRE GitHub repository https://github.com/inQWIRE/QWIRE. 183. Amy, M. & Gheorghiu, V. staq – a full-stack quantum processing toolkit. arXiv:1912.06070 [quant-ph] (2019). 184. Staq GitHub repository https://github.com/softwareQinc/staq. 185. Killoran, N., Izaac, J., Quesada, N., Bergholm, V., Amy, M. & Weed- brook, C. Strawberry Fields: A Software Platform for Photonic Quan- tum Computing. Quantum 3, 129 (2019). 186. Strawberry Fields GitHub repository https://github.com/XanaduAI/ strawberryfields. 187. Sivarajah, S., Dilkes, S., Cowtan, A., Simmons, W., Edgington, A. & Duncan, R. t|ket>: A retargetable compiler for NISQ devices. Quan- tum Science and Technology (2020). 188. t|ket> GitHub repository https://github.com/CQCL/pytket. 184 bibliography

189. McCaskey, A. J., Lyakh, D. I., Dumitrescu, E. F., Powers, S. S. & Humble, T. S. XACC: a system-level software infrastructure for het- erogeneous quantum-classical computing. arXiv:1911.02452 [quant-ph] (2019). 190. XACC GitHub repository https://github.com/eclipse/xacc. 191. QuTiP Documentation http://qutip.org/documentation.html. 192. Johansson, J., Nation, P. & Nori, F. QuTiP: An open-source Python framework for the dynamics of open quantum systems. Computer Physics Communications 183, 1760 (2012). 193. Johansson, J., Nation, P. & Nori, F. QuTiP 2: A Python framework for the dynamics of open quantum systems. Computer Physics Communi- cations 184, 1234 (2013). 194. Q# 0.6: Language Features and More https://devblogs.microsoft. com/qsharp/qsharp-06-language-features-and-more/. 2019. 195. Why do we need Q#? https://devblogs.microsoft.com/qsharp/why- do-we-need-q/. 2018. 196. Rios, F. & Selinger, P. A categorical model for a quantum circuit description language in Proceedings 14th International Conference on Quantum Physics and Logic, QPL 2017 (2017), 164. 197. What are Qubits? https://devblogs.microsoft.com/qsharp/what- are-qubits/. 2019. 198. Official Page for Language Server Protocol https://microsoft.github. io/language-server-protocol/. 199. Mykhailova, M. & Svore, K. M. Teaching Quantum Computing through a Practical Software-Driven Approach: Experience Report in Proceedings of the 51st ACM Technical Symposium on Computer Science Education (Association for Computing Machinery, Portland, OR, USA, 2020), 1019. 200. Abraham, H. et al. Qiskit: An Open-source Framework for Quantum Computing https://zenodo.org/record/2562110. 2019. 201. Cross, A. The IBM Q experience and QISKit open-source quantum computing software. Bulletin of the American Physical Society 63 (2018). 202. Qiskit/Qiskit-Aqt-Provider https://github.com/Qiskit/qiskit-aqt- provider. 2020. 203. Qiskit/Qiskit-Honeywell-Provider https://github.com/Qiskit/qiskit- honeywell-provider. 2020. bibliography 185

204. Asfaw, A., Bello, L., Ben-Haim, Y., Bravyi, S., Capelluto, L., Vazquez, A. C., Ceroni, J., Harkins, F., Gambetta, J., Garion, S., Gil, L., Gonzalez, S. D. L. P., McKay, D., Minev, Z., Nation, P., Phan, A., Rattew, A., Schaefer, J., Shabani, J., Smolin, J., Temme, K., Tod, M. & Wootton., J. Learn Quantum Computation Using Qiskit (2020). 205. Jupyter/Jupyter-Book https://github.com/jupyter/jupyter- book. Library Catalog: github.com. 206. Paetznick, A. & Svore, K. M. Repeat-Until-Success: Non-deterministic decomposition of single-qubit unitaries. Quantum Information and Computation 14, 1277 (2014). 207. Broughton, M., Verdon, G., McCourt, T., Martinez, A. J., Yoo, J. H., Isakov, S. V., Massey, P., Niu, M. Y., Halavati, R., Peters, E., Leib, M., Skolik, A., Streif, M., Dollen, D. V., McClean, J. R., Boixo, S., Bacon, D., Ho, A. K., Neven, H. & Mohseni, M. TensorFlow Quantum: A Soft- ware Framework for Quantum Machine Learning. arXiv:2003.02989 [quant-ph] (2020). 208. McClean, J. R., Kivlichan, I. D., Sung, K. J., Steiger, D. S., Cao, Y., Dai, C., Fried, E. S., Gidney, C., Gimby, B., Gokhale, P., Häner, T., Hardikar, T., Havlíˇcek,V., Huang, C., Izaac, J., Jiang, Z., Liu, X., Neeley, M., O’Brien, T., Ozfidan, I., Radin, M. D., Romero, J., Rubin, N., Sawaya, N. P. D., Setia, K., Sim, S., Steudtner, M., Sun, Q., Sun, W., Zhang, F. & Babbush, R. OpenFermion: The Package for Quantum Computers. arXiv:1710.07629 [physics, quant-ph] (2017). 209. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D. G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Wicke, M., Yu, Y. & Zheng, X. TensorFlow: A System for Large-Scale Machine Learning in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (2016), 265. 210. Cirq source code remarks https : / / github . com / quantumlib / Cirq / blob/master/cirq/google/engine/engine.py. 211. Smith, J. M., Ross, N. J., Selinger, P. & Valiron, B. Quipper: Concrete Resource Estimation in Quantum Algorithms. arXiv:1412.0625 [quant- ph] (2014). 212. Anticoli, L., Piazza, C., Taglialegne, L. & Zuliani, P. Verifying Quan- tum Programs: From Quipper to QPMC. arXiv:1708.06312 [quant-ph] (2017). 186 bibliography

213. Mahmoud, M. Y. & Felty, A. P. Formalization of Metatheory of the Quipper Quantum Programming Language in a Linear Logic. Journal of Automated Reasoning 63, 967 (2019). 214. Ross, N. J. Algebraic and Logical Methods in Quantum Computation. arXiv:1510.02198 [quant-ph] (2015). 215. Fu, P., Kishida, K., Ross, N. J. & Selinger, P. A tutorial introduction to quantum circuit programming in dependently typed Proto-Quipper. arXiv:2005.08396 [cs.PL] (2020). 216. Childs, A. M., Maslov, D., Nam, Y. S., Ross, N. J. & Su, Y. Toward the first quantum simulation with quantum speedup in Proceedings of the National Academy of Sciences of the United States of America 115 (2018), 9456. 217. Abhari, A. J., Patil, S., Kudrow, D., Heckey, J., Lvov, A., Chong, F. T. & Martonosi, M. ScaffCC: Scalable compilation and analysis of quantum programs. 45, 2 (2015). 218. Soeken, M., Frehse, S., Wille, R. & Drechsler, R. RevKit: A Toolkit for Reversible Circuit Design. Multiple-Valued Logic and Soft Computing 18, 55 (2012). 219. Khammassi, N. QX Quantum Computer Simulator http://www.quantum- studio.net/. 220. Svore, K. M., Hastings, M. & Freedman, M. Faster Phase Estimation. Quantum Information and Computation 14, 306 (2013). 221. Kimmel, S., Low, G. H. & Yoder, T. J. Robust calibration of a universal single-qubit gate set via robust phase estimation. Physical Review A 92 (2015). 222. Reiher, M., Wiebe, N., Svore, K. M., Wecker, D. & Troyer, M. Elucidating reaction mechanisms on quantum computers in Proceedings of the National Academy of Sciences 114 (National Academy of Sciences, 2017), 7555. 223. Coppersmith, D. An approximate Fourier transform useful in quan- tum factoring. arXiv:0201067 [quant-ph] (2002). 224. Roetteler, M. & Beth, T. Representation-theoretical Properties of the Approximate Quantum Fourier Transform. Applicable Algebra in Engi- neering, Communication and Computing 3, 177 (2008). 225. IonQ hardware provider https://ionq.com/. 226. Honeywell Quantum Solutions https://www.honeywell.com/en-us/ company/quantum. bibliography 187

227. Turing, A. M. On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of The London Mathematical Society 41, 230 (1937).
228. Peruzzo, A., McClean, J., Shadbolt, P., Yung, M.-H., Zhou, X.-Q., Love, P. J., Aspuru-Guzik, A. & O'Brien, J. L. A Variational Eigenvalue Solver on a Photonic Quantum Processor. Nature Communications 5, 1 (2014).

curriculum vitae

personal data

Name: Bettina Heim
Date of Birth: July 2, 1989
Place of Birth: Herisau, Switzerland
Citizen of Switzerland

education

Fall 2016 – Summer 2020: Doctoral studies, ETH Zurich
Summer 2014 – Summer 2016: Master's degree, ETH Zurich
Summer 2011 – Summer 2014: Bachelor's degree, ETH Zurich

employment

Winter 2019 – Summer 2020: Senior Software Engineering Manager, Microsoft
Summer 2018 – Winter 2019: Senior Software Development Engineer, Microsoft
Spring 2017 – Summer 2018: Research Software Development Engineer, Microsoft
Spring 2015 – Fall 2016: Research Assistant, ETH Zurich
Spring 2013 – Fall 2014: Teaching Assistant, ETH Zurich

selected conference engagements

program committees

Committee member. International Conference on Compiler Construction (2020).
Committee member. International Workshop on Programming Languages for Quantum Computing (2020).
Committee member. International Symposium on Code Generation and Optimization (2019).

keynotes

Compiler and Language Design for Quantum Computing. International Conference on Compiler Construction (2018).
Quantum Computing - Vision and Reality. International Conference on Parallel Architectures and Compilation Techniques (2018).

invited talks

Q# - Going Beyond Quantum Circuits. Programming Languages for Quantum Computing (2020).
A Software Stack for Quantum Computing. Monte Verita Conference on Quantum Systems and Technology (2018).
Leveraging the Power of Quantum for Machine Learning. Beyond Digital Computing - The Power of Quantum and Neural Networks (2018).
Performance Assessment of Quantum Annealing Algorithms - Solving the Traveling Salesman Problem. Aspen Conference on Advances in Quantum Algorithms and Computation (2016).


workshops

Introduction to Quantum Computing. Grace Hopper Conference (2018).
Toolchains for Quantum Computing. CCC workshop on the Next Steps in Quantum Computing: Computer Science's Role (2018).
Surface Code Thresholds under Optimal Decoding. Workshop on Quantum Algorithms and Devices (2017).
Beyond the Reach of Classical Performance. IARPA workshop at the IEEE International Conference on Rebooting Computing (2016).
Classical versus Quantum Annealing - A Numerical Study on Ising Spin Glasses. Workshop on Classical and Quantum Optimization (2014).

publications

I. Heim, B., Soeken, M., Marshall, S., Granade, C., Roetteler, M., Geller, A., Troyer, M. & Svore, K. A Review of Quantum Programming Languages. (currently in review by Nature Review Physics) (2020).
II. Lemieux, J., Heim, B., Poulin, D., Svore, K. & Troyer, M. Efficient Quantum Walk Circuits for Metropolis-Hastings Algorithm. Quantum 4, 287 (2020).
III. Svore, K., Roetteler, M., Geller, A., Troyer, M., Azariah, J., Granade, C., Heim, B., Kliuchnikov, V., Mykhailova, M. & Paz, A. Q#: Enabling scalable quantum computing and development with a high-level domain-specific language in Proceedings of the Real World Domain Specific Languages Workshop (ACM Press, 2018).
IV. Heim, B., Brown, E. W., Wecker, D. & Troyer, M. Designing Adiabatic Quantum Optimization: A Case Study for the Traveling Salesman Problem. arXiv:1702.06248 [quant-ph] (2017).
V. Azinović, M., Herr, D., Heim, B., Brown, E. & Troyer, M. Assessment of Quantum Annealing for the Construction of Satisfiability Filters. SciPost Physics 2 (2017).
VI. Herr, D., Brown, E., Heim, B., Könz, M., Mazzola, G. & Troyer, M. Optimizing Schedules for Quantum Annealing. arXiv:1705.00420 [quant-ph] (2017).
VII. Heim, B., Svore, K. M. & Hastings, M. B. Optimal Circuit-Level Decoding for Surface Codes. arXiv:1609.06373 [quant-ph] (2016).
VIII. Steiger, D. S., Heim, B., Rønnow, T. F. & Troyer, M. Performance of quantum annealing hardware in Electro-Optical and Infrared Systems: Technology and Applications XII; and Quantum Information Science and Technology (eds Huckridge, D. A., Ebert, R., Gruneisen, M. T., Dusek, M. & Rarity, J. G.) 9648 (SPIE, 2015), 274.
IX. Heim, B., Rønnow, T. F., Isakov, S. V. & Troyer, M. Quantum versus classical annealing of Ising spin glasses. Science 348, 215 (2015).
