ProjectQ: An Open Source Software Framework for Quantum Computing

Damian S. Steiger, Thomas Häner, and Matthias Troyer

Institute for Theoretical Physics, ETH Zurich, 8093 Zurich, Switzerland

January 31, 2018

We introduce ProjectQ, an open source software effort for quantum computing. The first release features a compiler framework capable of targeting various types of hardware, a high-performance simulator with emulation capabilities, and compiler plug-ins for circuit drawing and resource estimation. We introduce our Python-embedded domain-specific language, present the features, and provide example implementations for quantum algorithms. The framework allows testing of quantum algorithms through simulation and enables running them on actual quantum hardware using a back-end connecting to the IBM Quantum Experience cloud service. Through extension mechanisms, users can provide back-ends to further quantum hardware, and scientists working on quantum compilation can provide plug-ins for additional compilation, optimization, gate synthesis, and layout strategies.

arXiv:1612.08091v2 [quant-ph] 29 Jan 2018

Accepted in Quantum 2018-01-21

1 Introduction

Quantum computers are a promising candidate for a technology which is capable of reaching beyond exascale performance. There has been tremendous progress in recent years and we soon expect to see quantum computing test beds with tens and hopefully soon hundreds or even thousands of qubits. As these test devices get larger, a full software stack for quantum computing is required in order to accelerate the development of quantum software and hardware, and to lift the programming of quantum computers from specifying individual quantum gates to describing quantum algorithms at higher levels of abstraction.

The open source effort ProjectQ aims to improve the development of practical quantum computing in three key areas. First, the development of new quantum algorithms is accelerated by allowing them to be implemented in a high-level language prior to testing them on efficient high-performance simulators and emulators. Second, the modular and extensible design encourages the development of improved compilation, optimization, gate synthesis and layout modules by quantum computer scientists, since these individual components can easily be integrated into ProjectQ's full stack framework, which provides tools for testing, debugging, and running quantum algorithms. Finally, the back-ends to actual quantum hardware (either open cloud services like the IBM Quantum Experience [1] or proprietary hardware) allow the execution of quantum algorithms on changing quantum computer test beds and prototypes. Compiling high-level quantum algorithms to quantum hardware will facilitate hardware-software co-design by giving feedback on the performance of algorithms: theorists can adapt their algorithms to perform better on quantum hardware and experimentalists can tune their next generation devices to better support common quantum algorithmic primitives.

We propose to use a device independent high-level language with an intuitive syntax and a modular compiler design, as discussed in Ref. [2]. The quantum compiler then transforms the high-level language to hardware instructions, optimizing over all the different intermediate representations of the quantum program, as depicted in Fig. 1. Programming quantum algorithms at a higher level of abstraction results in faster development thereof, while automatic compilation to low-level instruction sets allows users to compile their algorithms to any available back-end by merely changing one line of code. This includes not only the different hardware platforms, but also simulators, emulators, and resource estimators, which can be used for testing, debugging, and benchmarking algorithms. Moreover, our modular compiler approach allows fast adaptation to new hardware specifications in order to support all qubit technologies currently being developed.

[Figure 1 (diagram): user code in Python, e.g. "qureg = eng.allocate_qureg(3); Entangle | qureg; Measure | qureg", passes through high-level compilation to low-level instructions, and through low-level compilation to hardware instructions.]

Figure 1: High-level picture of what a compiler does: It transforms the high-level user code to gate sequences satisfying the constraints dictated by the target hardware (supported gate set, connectivity, ...) while optimizing the circuit. The resulting low-level instructions are then translated to, e.g., pulse sequences.

Our high-level quantum language is implemented as a domain-specific language embedded in Python; see the following code for an example:

    def AddConstant(eng, quint, c):
        with Compute(eng):
            QFT | quint
        # addition in the phases:
        phi_add(quint, c)
        Uncompute(eng)

To enable fast prototyping and future extensions, the compiler is also implemented in Python and makes use of the novel meta functions introduced in Ref. [2] in order to produce more efficient code. Having the entire compiler implemented in Python is sufficient and preferred for current and near-term quantum test beds, as Python is widely used and allows for fast prototyping. If certain compiler components prove to be bottlenecks, they can be moved to a compiled language such as C++ using, e.g., pybind [3]. Thus, ProjectQ is able to support both near-term testbeds and future large-scale quantum computers.

As a back-end, ProjectQ integrates a quantum emulator, as first introduced in Ref. [4], which simulates quantum algorithms by taking classical shortcuts and hence obtains speedups of several orders of magnitude. For the simulation at a low level, we include a new simulator which outperforms all other available simulators, including its predecessor in Ref. [4]. Furthermore, our compiler has been tested with actual hardware and one of our back-ends allows running quantum algorithms on the IBM Quantum Experience.

Related work. Several quantum programming languages, simulators, and compilers have been proposed and implemented in various languages. Yet, only a few of them are freely available: Quipper [5], a quantum program compiler implemented in Haskell, the ScaffCC compiler [6] based on the LLVM framework, and the LIQUi$|\rangle$ simulator, which is implemented in F# [7] and only available as a binary. While all of these tools are important for the development of quantum computing in their own right, they have not (yet) been developed to a complete and unified software stack such as ProjectQ, containing the means to compile, simulate, emulate, and, ultimately, run quantum algorithms on actual hardware.

Outline. After introducing the ProjectQ framework in Sec. 2, we motivate the methodology behind it in Sec. 3 using Shor's algorithm for factoring as an example. We then introduce the main features of our framework, including the high-level language, the compiler, and various back-ends in Sec. 4. Finally, we provide the road map for future extensions of the ProjectQ framework in Sec. 5, which includes quantum chemistry and math libraries, support for further hardware and software back-ends, and additional compiler components.

2 The ProjectQ Framework

ProjectQ is an extensible open source software framework for quantum computing, providing clean interfaces for extending and improving its components. ProjectQ is built on four core principles: open & free, simple learning curve, easily extensible, and high code quality.

Open & free: To encourage wide use, ProjectQ is being released as open source software under the Apache 2 license. This is one of the most permissive license models and allows, for example, free commercial use.

Simple learning curve: ProjectQ is implemented in Python (supporting both versions 2 and 3) because of its simple learning curve. Python is already widely used in the quantum community and easier to learn than C++ or functional programming languages. We make use of high-performance code written in C++ for some of the computational high performance kernels, but hide them behind a Python interface.

Easily extensible: ProjectQ is easily extensible due to the modular implementation of both compiler and back-ends. This allows users to easily adapt the compiler to support new gates or entirely different gate sets as will be shown in a later section.

High code quality: ProjectQ's code base follows high industry standards, including mandatory code-reviews, continuous integration testing (currently 99% line coverage using unit tests in addition to functional tests), and an extensive code documentation.

The initial release of ProjectQ, which is described in this paper, implements powerful core functionalities, and we will continue to add new features on a regular basis. We encourage contributions to the ProjectQ framework from anyone: Implementations of quantum algorithms, new compiler engines, and interfaces to actual quantum hardware back-ends are welcome. For more information about contributing see Ref. [8].

3 Quantum Programs and Compilation

We motivate the methodology behind ProjectQ by presenting an implementation of Shor's factoring algorithm [9] in a high-level language. Then, we show how, using only local optimization and rewriting rules, the automatically generated low-level code can be as efficient as the manually optimized one in [10], although all subroutines were implemented as stand-alone components.

[Figure 2 (circuit diagram): two controlled additions in sequence, each drawn as QFT, $\Phi_{\mathrm{add}}$, QFT$^\dagger$.]

Figure 2: Circuit for carrying out two controlled Fourier transform additions in sequence: An optimizer can identify the QFT with its inverse, which allows those two operations to be canceled. This identification is much harder or even impossible after decomposing operations and synthesizing rotation gates. Furthermore, since $\mathrm{QFT}^\dagger \mathrm{QFT} = \mathbb{1}$, the quantum Fourier transform does not need to be controlled, which can be achieved using our Compute/Uncompute meta-instructions (see Sec. 4.1.3).

At the heart of Shor's algorithm for factoring [9] lies modular exponentiation (of a classically-known constant by a quantum mechanical value) which is well-known to be implementable using constant-adders. Given a classical constant c, they transform a quantum register $|x\rangle$ representing the integer x as

    $|x\rangle \mapsto |x + c\rangle$.

Out of several implementations [11, 12, 13], we focus on Draper's addition in Fourier space [13], due to the large potential for optimization when executing several additions in sequence, which happens when using the construction in [10] to achieve modular addition. This constant-adder works as follows:

1. Apply a quantum Fourier transform (QFT) to $|x\rangle$.

2. Apply phase-shift gates to the qubits of $|x\rangle$, depending on the bit-representation of the constant c to add (i.e., the addition is carried out in the phases).

3. Apply an inverse QFT.

Modular exponentiation can be implemented using repeated modular multiplication and shift, which themselves can be built using controlled modular adders. Thus, at a lower level, (double- or single-) controlled constant-adders are performed. When executing two such controlled additions in sequence, the resulting circuit can be optimized significantly. As shown in Fig. 2, the (final) inverse QFT of one addition and the initial QFT of the next one can be canceled. Furthermore, some of the (controlled) phase-shift gates inside $\Phi_{\mathrm{add}}$ may be merged (depending on the two constants to add).

While the potential for these cancellations and optimizations is easy to see at this level of abstraction, carrying out such a cancellation is computationally very expensive once all gates have been translated to a low-level gate set, and may be impossible to do after translating into a discrete gate set (which introduces approximation errors). In our compilation framework, we thus define several intermediate gate sets. At every such intermediate level, inexpensive local optimization algorithms can be employed prior to further translation into the next lower-level representation.

The ProjectQ compiler is modular and allows new compilers to be built by combining existing and new components, as shown in Fig. 3. This design makes it possible to customize intermediate gate sets to improve optimization for specific algorithmic primitives. It also makes it possible to adapt the compilation process to different quantum hardware architectures by replacing just some of the compiler engines (including hardware-specific mappers), which maximizes the re-use of individual compiler components.

[Figure 3 (diagram): a quantum program (eDSL in Python) enters the compiler (Main Engine, Optimizer, Translator, Optimizer, ..., Mapper), which feeds the back-ends via a back-end interface (simulator, emulator, hardware, circuit drawer, resource estimator).]

Figure 3: ProjectQ's full stack software framework. Users write their quantum programs in a high-level domain-specific language embedded in Python. The quantum program is then sent to the MainEngine, which is the front end of the modular compiler. The compiler consists of individual compiler engines which transform the code to the low-level instruction sets supported by the various back-ends, such as interfaces to quantum hardware, a high-performance simulator and emulator, as well as a circuit drawer and a resource counter.

4 Features

In this section we will introduce the main features of ProjectQ, starting with a minimal example:

    1 from projectq import MainEngine
    2 from projectq.ops import H, Measure
    3
    4 eng = MainEngine()
    5 qubit1 = eng.allocate_qubit()
    6 H | qubit1
    7 Measure | qubit1
    8 print(int(qubit1))

This minimal code example allocates one qubit in state $|0\rangle$ and applies a Hadamard gate before measuring it in the computational basis and printing the outcome. While this is a valid quantum program implementing a random number generator, a more pythonic and better designed version is shown in code example 1.

Line 1 imports the MainEngine class, which is the front end of the quantum compiler as shown in Fig. 3. Every quantum program needs to create one MainEngine, which contains all compiler components (engines) as well as the back-end (see line 4). In section 4.2 we show how to select compiler engines and the back-ends. If a MainEngine is created without any arguments as done here, the default compiler engines are used with a simulator back-end.

Every quantum program operates on qubits, which are obtained by calling the allocate_qubit function of the MainEngine. To apply a Hadamard gate and then measure, we first need to import these gates in line 2 and apply them to the qubit in lines 6 and 7. The syntax for applying a quantum gate to a qubit mimics an operator notation. For example,

    $R_x(0.5)\,|\mathrm{qubit}\rangle$,    (1)

might indicate a rotation around the x-axis applied to a qubit. In ProjectQ, this is coded as:

    Rx(0.5) | qubit

The symbol | separates the gate operation with optional classical arguments (e.g., rotation angles) on the left from its quantum arguments on the right (the qubits to which the gate is being applied).

Finally, on line 8, qubit1 is converted to an int. This conversion operation returns the measurement result from line 7 which is then printed.
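In Python, an operator-style syntax such as "gate | qubit" can be obtained by overloading the "|" operator through the special method __or__. The sketch below illustrates only this mechanism with toy classes; ToyEngine, ToyQubit, and ToyGate are hypothetical names invented for this example and are not ProjectQ's actual classes, and the toy engine merely records instructions instead of compiling or executing them.

```python
class ToyEngine:
    """Hypothetical stand-in for a compiler front end: it only records instructions."""

    def __init__(self):
        self.instructions = []
        self._next_index = 0

    def allocate_qubit(self):
        qubit = ToyQubit(self, self._next_index)
        self._next_index += 1
        return qubit


class ToyQubit:
    """Hypothetical qubit handle that remembers which engine it belongs to."""

    def __init__(self, engine, index):
        self.engine = engine
        self.index = index


class ToyGate:
    """Hypothetical gate: classical arguments are stored on the gate object."""

    def __init__(self, name, *classical_args):
        self.name = name
        self.classical_args = classical_args

    def __or__(self, qubit):
        # "gate | qubit" calls this method: classical arguments sit on the left
        # (inside the gate), the quantum argument is passed in on the right.
        qubit.engine.instructions.append(
            (self.name, self.classical_args, qubit.index))


H = ToyGate("H")


def Rx(angle):
    return ToyGate("Rx", angle)


eng = ToyEngine()
qubit = eng.allocate_qubit()
H | qubit
Rx(0.5) | qubit
print(eng.instructions)
```

Because the gate object is built first and applied second, a rotation angle or other classical parameter is fixed before the "|" operator hands the instruction to the engine that owns the qubit.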

    import projectq.setups.default              # explicitly import default decompositions
    from projectq import MainEngine             # import the main compiler engine
    from projectq.ops import H, Measure         # import the required operations

    def my_rng(eng):                            # Python function definition
        qubit1 = eng.allocate_qubit()           # allocate qubit1
        H | qubit1                              # apply a Hadamard gate to qubit1
        Measure | qubit1                        # measure qubit1
        eng.flush()                             # force-execute all gates
        return int(qubit1)                      # access measurement result (conversion to int)

    if __name__ == "__main__":                  # only executes if this is main
        eng = MainEngine()                      # create a main compiler engine
        print("Result: {}".format(my_rng(eng))) # call my_rng(eng) and print the result

Code Example 1: Code example for implementing a quantum random number generator in our Python-embedded domain-specific language.

4.1 High level quantum language

4.1.1 The basic (quantum) types

The fundamental types of our high level quantum language are logical qubits from which more complex types can be built: quantum integers (quint), quantum fixed point numbers (qufixed) and quantum floats (qufloat). Similar to their classical counterparts, these types are just different interpretations of the contents of an underlying list of quantum bits, i.e., a quantum register or qureg.

To acquire an instance of an n-qubit quantum register, the function allocate_qureg(n) of the MainEngine has to be invoked with the number of qubits as a parameter, as in the following code snippet:

    ...
    eng = MainEngine()
    ...
    qureg = eng.allocate_qureg(n)

The function allocate_qubit() returns a quantum register of length one, which is a single qubit. Qubits can be allocated at any point in the program, which keeps the user from having to specify a-priori the maximum number of qubits needed in any part of the code.

ProjectQ's compiler takes care of automatic deallocation of qubits by exploiting Python's garbage collection, thus allowing qubit re-use. While not necessary, the user can still force deallocation by invoking Python's del statement. In addition to removing the burden of resource management from the user, letting the compiler handle the life-time of qubits allows automatic parallelization for back-ends featuring more qubits than the minimal circuit width. The simulator can be used as a debugging tool to validate uncompute sections since it throws an exception whenever a qubit in superposition is being deallocated. Thus, qubits have to be either measured or uncomputed prior to deallocation.

Certain subroutines such as the multi-controlled NOT construction by Barenco et al. [14] or the constant-addition circuit by Häner et al. [12] do not require clean ancilla qubits in a defined computational basis state (such as $|0\rangle$) but work with borrowed qubits in an unknown arbitrary quantum state. They guarantee that after completion of the circuit, these so-called dirty ancilla qubits have returned to their starting state. Our compiler can thus optimize the allocation of such ancilla qubits by simply providing a qubit which is currently unused, independent of its state:

    qubit = eng.allocate_qubit(dirty=True)

4.1.2 Quantum gates and functions

Operations on quantum data types can be implemented either as normal Python functions or as ProjectQ gates. A Python function implementing such an operation applies other gates or functions, as shown in this example of an addition by a constant:

    def add_constant(quint, c):
        QFT | quint
        # addition in the phases:
        phi_add(quint, c)
        get_inverse(QFT) | quint

which can then be called as

    add_constant(my_quint, 11)

The second approach is to define a custom ProjectQ gate representing the entire operation and then registering a decomposition rule, i.e., providing one possible function which can be used to replace the newly defined gate:

    class AddConstant(BasicGate):
        def __init__(self, c):
            self.c = c  # store constant to add

    # provide one possible decomposition:
    register_decomposition(AddConstant,
                           add_constant_decomposition1)

This enables the usual syntax for quantum gates, namely:

    AddConstant(11) | my_quint

where the constant to add (11 in this case) is stored within the AddConstant gate object.

While defining a custom gate involves more code, it is superior to a function-based implementation and results in cleaner user code. The gate-based approach allows optimizations at a higher level of abstraction since the function call executes the individual operations right away, thus removing the potential of optimizing at the highest level. In our example, defining the constant-adder as a gate allows merging two consecutive AddConstant gates by adding the respective constants. This is much harder in the function-based implementation, where the highest level of abstraction which the compiler receives is at the level of QFT and phase-shift gates.

Another advantage of implementing a new quantum operation, such as AddConstant, as a new ProjectQ gate is that different specialized decomposition rules can be defined from which the compiler can then choose the best one for the target back-end by evaluating a user-defined cost function. Furthermore, if the back-end natively supports certain gate operations, such as a many-qubit Mølmer-Sørensen gate [15] on ion trap quantum computers, the decomposition step may be skipped altogether. The AddConstant gate is also natively supported in our quantum emulator, which allows faster execution by orders of magnitude compared to simulating the individual gates of its implementation [4].

Most generally, the quantum input to any gate is a tuple of quantum types (qubit, qureg, quint, ...), which allows the following intuitive syntax for a Multiply instruction:

    Multiply | (quint_a, quint_b, res)

4.1.3 Meta-instructions

At an even higher level of abstraction, complex gate operations can be modified using so-called meta-instructions. These facilitate optimization processes in the compiler while allowing the user to write more concise code.

All of these meta-instructions are implemented using the Python context-handler syntax, i.e.,

    with MetaInstructionObject:
        ...
        ...

where MetaInstructionObject is one of the following:

• Control(eng, control_qubits): Condition an entire code block on one or several qubits being in state $|1\rangle$.

• Compute(eng)/CustomUncompute(eng): Annotate compute / uncompute sections as depicted in Fig. 4. This makes it possible to optimize the conditional execution of such sections as discussed in Ref. [2]. For an automatic uncompute, simply run the Uncompute(eng) function.

• Dagger(eng): Invert an entire unitary code block (hence the name: $U^\dagger U = \mathbb{1}$).

• Loop(eng, num_iterations): Run a code block num_iterations times. Back-ends natively supporting loop instructions will receive the loop body only once, whereas the loop is unrolled otherwise. Some hardware platforms exploit this by re-using the generated waveforms.

For an example employing some of these meta-instructions to arrive at the performance-level of hand-optimized code, see our implementation of Shor's algorithm for factoring in the Appendix.

[Figure 4 (circuit diagram): gates $U$, $V$, $U^\dagger$ in sequence, labeled compute, action, uncompute.]

Figure 4: A compute/action/uncompute section. Since $U^\dagger U = \mathbb{1}$, the $U$ and $U^\dagger$ gate can be executed unconditionally when control qubits are added to an operation of this form. These sections can be annotated using our Compute, Uncompute, and CustomUncompute meta-instructions, see Sec. 4.1.3.

Any quantum program implemented in our eDSL concludes with the statement

    eng.flush()

which makes sure that the entire circuit has passed through all compiler engines and is received by the back-end.

4.2 Compiler and compiler engines

Our compiler is not a monolithic block, but has a modular design, allowing fast adaptation to new hardware while optimally re-using existing components. The compiler MainEngine serves as the front-end of the quantum compiler and consists of a chain of compiler engines, each carrying out one specific task. This chain can be customized by the user. The most trivial compiler consists of one translation engine called AutoReplacer, which decomposes a quantum circuit into the native gate set of the back-end using the registered decomposition rules (see section 4.1). Such a compiler can be instantiated as follows:

    eng = MainEngine(engine_list=[AutoReplacer()])

Yet, as discussed in the previous section, optimizing at different levels of abstraction allows for more efficient code. Introducing multiple intermediate levels can be achieved using several translators, optimizers, and instruction filter engines, which define the set of supported gates at each level:

    eng = MainEngine(engine_list=[
        AutoReplacer(),
        InstructionFilter(intermediate_gate_set),
        LocalOptimizer(),
        AutoReplacer(),
        LocalOptimizer()])

As quantum operations propagate through this compilation chain, the AutoReplacer engines decompose all instructions into the gate set defined by the next engine, which is, e.g., an InstructionFilter. The last engine is always the back-end, which defines the gate set at the lowest level. For example, an intermediate gate set supporting high-level QFT gates makes the optimization mentioned in Sec. 3 easy: consecutive (controlled) additions in Fourier space can be executed more efficiently by canceling a QFT with its inverse.

Once suitable compiler engines and decomposition rules have been determined for a specific back-end, those can be saved as a setup for future use.

Unlike other quantum compiler designs, our compiler does not store the entire quantum circuit which, given the vast number of logical gates required for some quantum algorithms, would not be feasible due to memory requirements. Instead, every compiler engine can define how many gates it stores. While, for example, an AutoReplacer works on only one gate at a time, a LocalOptimizer saves a short sequence of gates before trying to optimize them. This locality of compilation makes it straightforward to parallelize the compilation process.

4.3 Back-ends

Our compiler supports a wide range of back-ends to

• run circuits on quantum hardware,

• simulate the individual gates of a quantum circuit,

• emulate the action of a quantum circuit by employing high-level shortcuts,

• estimate the required resources,

• draw a quantum circuit.

The default simulation back-end can be changed by specifying the backend-parameter. To target the IBM Quantum Experience back-end instead, one simply writes

    eng = MainEngine(backend=IBMBackend())

4.3.1 Hardware back-end: IBM Quantum Experience

ProjectQ comes with a hardware back-end for the IBM Quantum Experience device [1]. As an example program, consider entangling 3 qubits by first allocating 3 qubits in a quantum register, then applying the Entangle-operation to them, and finally measuring the quantum register:

    qureg = eng.allocate_qureg(3)
    Entangle | qureg
    Measure | qureg
    eng.flush()

The compiler replaces the Entangle gate by a Hadamard gate on the first qubit, followed by CNOT gates on all others conditioned on the first qubit (see Fig. 5 a)). The compiler then automatically flips CNOT gates where necessary (using 4 Hadamard gates), to make the code compatible with the connectivity of the IBM quantum chip (see Fig. 5 b)). In Fig. 5 c), the circuit is optimized before the final mapping to physical qubits is performed in Fig. 5 d). Running this circuit yields the outcomes with their respective probabilities in Fig. 6.

[Figure 5 (circuit diagrams), four panels acting on three qubits initialized in $|0\rangle$: (a) After decomposing. (b) After CNOT mapping. (c) After optimization. (d) After mapping to hardware.]

Figure 5: Individual stages of compiling an entangling operation for the IBM back-end. The high-level Entangle-gate is decomposed into its definition (Hadamard gate on the first qubit, followed by a sequence of controlled NOT gates on all other qubits). Then, the CNOT gates are remapped to satisfy the logical constraint that controlled NOT gates are allowed to act on one qubit only, followed by optimizing and mapping the circuit to the actual hardware.

[Figure 6 (bar chart): probability versus measurement outcome (000 to 111) for the IBM QE chip and the IBM QE simulator.]

Figure 6: Measurement outcomes with their respective probabilities when running an entangle operation on three qubits of the IBM Quantum Experience chip and simulator. The entangle operation corresponds to applying a Hadamard gate to the first qubit, followed by CNOT gates on all other qubits conditioned on the first qubit. The perfect outcome would be 50% all-zeros and 50% all-ones, as is the case for the (noise-less) simulation.

4.3.2 Simulation of quantum circuits

Simulating quantum programs at the level of individual gates can be achieved using our high-performance quantum simulator. The concrete gate set to be used can be specified by the user. The current version of the simulator in ProjectQ supports AVX instructions and OpenMP threads. Our simulator is substantially faster than the one we recently presented in Ref. [4], which already outperformed all other existing quantum simulators, see Fig. 7.

[Figure 7 (plot): time per run [s] and speedup versus number of qubits (15 to 22) for the simulator in [4] and the ProjectQ simulator.]

Figure 7: Runtime comparison of the simulator from [4] to the simulator in ProjectQ. The timed circuit consists of a Hadamard-transform, a chain of controlled z-rotations, and a final Hadamard transform. The lazy evaluation of gates in combination with intrinsics instructions allows the ProjectQ simulator to execute this circuit between 3 and 5 times faster. Both simulators were run on both cores of an Intel(R) Core(TM) i7-5600U CPU.

The simulation of quantum circuits at the level of gates is especially useful to simulate the effects of noise. We are working on the implementation of stochastic noise models and a high-performance density matrix simulator.

4.3.3 Emulation of quantum circuits

For efficient testing of quantum algorithms at a high level of abstraction, our simulator provides quantum emulation features as well: By specifying the level of abstraction at which to emulate (using, e.g., an InstructionFilter and an AutoReplacer, see Sec. 4.2), these features can be enabled or disabled. As an example, consider our AddConstant gate from Sec. 4.1: With a small modification to the gate definition, the addition can be carried out directly, without having to decompose it into QFT and phase-shift gates (and then further into 1- and 2-qubit gates only). Gates which execute classical mathematical functions on a superposition of values can derive from BasicMathGate and then provide a Python function mimicking this behavior in the __init__ function of the AddConstant gate, i.e.,

    # the function returns a tuple with one entry per quantum register
    BasicMathGate.__init__(self, lambda x: (x + c,))

Such shortcuts allow faster execution by orders of magnitude, especially if potential low-level implementations require many ancilla qubits to perform the computation. These extra qubits do not need to be simulated when using this kind of shortcut, which makes it possible to factor the number

    $4{,}028{,}033 = 2{,}003 \cdot 2{,}011$

on a regular laptop in less than 3 minutes.

The emulation of quantum circuits is very useful for determining mesh-sizes, time steps/slices, and other high-level parameters for, e.g., quantum chemistry applications [16], which require many mathematical functions such as $\arcsin(x)$, $\exp(x)$, $1/\sqrt{x}$, etc., to be evaluated on a superposition of values. The ProjectQ emulation feature then makes it possible to perform numerical studies using the same code that was used to obtain resource estimates; and this can be achieved by merely changing one line of code.

4.3.4 Resource estimation

The ResourceCounter back-end can be inserted at any point in the compiler chain. There, it keeps track of all gates it encounters and measures the maximal circuit width at that level as well. In order to get resource estimates for very large circuits, caching must be introduced at every intermediate layer / gate set. This feature will be added in the near future.

4.3.5 Circuit drawing back-end

Last but not least, all quantum programs can be drawn as circuit diagrams for publication in, e.g., an algorithmic paper. The CircuitDrawer back-end saves the circuit and, upon invocation of the get_latex() member function, returns TikZ-LaTeX code ready to be published. See Fig. 8 for an example output depicting quantum teleportation (including the initial creation of a Bell-pair). Just like the simulator or resource counter, this back-end can also be used as an intermediate compiler engine in order to draw the circuit at various levels of abstraction (which was done to create Fig. 5). This back-end will also be extended with further features in future versions of ProjectQ, giving the user yet more power over the mapping & drawing process.

5 Road map

In order to widen the scope of available tools, we will be extending ProjectQ on a regular basis with libraries, additional compiler engines, and more hardware back-ends.

5.1 Libraries

fermilib. Solving problems involving strongly interacting fermions is one of the most promising applications for near-term quantum devices.

[Figure 8 circuit diagram: three wires |ψ⟩, |0⟩_A, and |0⟩_B; the state |ψ⟩ appears on the last wire at the output.]

Figure 8: Quantum circuit for quantum teleportation. This circuit was automatically generated (modulo renaming of the qubits).

With external collaborators we are implementing fermilib [17], a library for designing quantum simulation algorithms to treat fermionic systems. fermilib will include integration with open source electronic structure packages to enable the computation of arbitrary molecular Hamiltonians. A well-engineered Python interface will allow for efficient manipulation of fermionic data structures with routines enabling normal ordering, fast orbital transformations, mapping qubit algebras, and more. In addition, it will provide tools to study fermionic systems beyond chemistry, including models for superconductivity such as the Hubbard model.

mathlib. While our emulator can use classical shortcuts to mimic the application of general mathematical functions, manually-tuned high-performance implementations of those functions at the level of, e.g., Toffoli gates are still required in order to get resource estimates or, at a later point in time, run those algorithms on real quantum hardware. We will thus further extend the (small) existing math library in ProjectQ with additional high-level quantum types (as discussed in Sec. 4.1) and corresponding operations.

5.2 Back-ends

We are working on adding support for further hardware back-ends. Among others, we aim to have an interface to J. Home's ion trap devices [18] in the near future. More hardware back-ends will follow soon. Hardware groups interested in interfacing to ProjectQ are highly encouraged to contact us [8].

Also, we will keep extending and improving our classical back-ends. We are currently working on including our distributed massively parallel quantum simulator [19], which can simulate up to 45 qubits on one of the world's largest supercomputers.

5.3 Compiler extensions

New compiler engines will be added, allowing the compiler to deal with more advanced layouting, which is required to employ quantum error-correction schemes. This is crucial not only to run large circuits on future hardware, but also to gain information about resource usage before large-scale quantum computers are available.

Acknowledgments

Special thanks go to the researchers from IBM, Lev S. Bishop, Fran Cabrera, Jorge Carballo, Jerry M. Chow, Andrew W. Cross, Ismael Faro, Stefan Filipp, Jay M. Gambetta, Paco Martin, Nikolaj Moll, Mark Ritter, and John Smolin for their help with interfacing to the IBM Quantum Experience chip.

We thank Jonathan Home, Matteo Marinelli, and Vlad Negnevitsky for working with us on an interface to their ion trap quantum computer.

Furthermore, we want to thank the following people for enlightening discussions: Jana Darulová, Michele Dolfi, Dominik Gresch, Andreas Hehn, Mario Könz, Natalie Pearson, Donjan Rodic, Slava Savenko, Andreas Wallraff, and Camillo Zapata Ocampo from ETH Zurich; Matthew Neeley, Daniel Sank, and Hartmut Neven from Google Quantum AI; Alan Geller, Martin Roetteler, Krysta Svore, Dave Wecker, and Nathan Wiebe from Microsoft Research; and Anne Matsuura and Mikhail Smelyanskiy from Intel.

We would also like to thank Ryan Babbush, Jarrod McClean, and Ian D. Kivlichan for collaborating with us on fermilib.

We acknowledge support by the Swiss National Science Foundation and the Swiss National Competence Center for Research, QSIT.

References

[1] IBM Quantum Experience. http://research..com/quantum/.

[2] Thomas Häner, Damian S. Steiger, Krysta Svore, and Matthias Troyer. A software methodology for compiling quantum programs. Quantum Science and Technology, 2018. DOI: https://doi.org/10.1088/2058-9565/aaa5cc.

[3] pybind. https://github.com/pybind.

[4] Thomas Häner, Damian S. Steiger, Mikhail Smelyanskiy, and Matthias Troyer. High performance emulation of quantum circuits. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '16, pages 74:1–74:9, Piscataway, NJ, USA, 2016. IEEE Press. ISBN 978-1-4673-8815-3. DOI: 10.1109/SC.2016.73.

[5] Alexander S. Green, Peter LeFanu Lumsdaine, Neil J. Ross, Peter Selinger, and Benoît Valiron. Quipper: a scalable quantum programming language. In ACM SIGPLAN Notices, volume 48, pages 333–342. ACM, 2013. DOI: 10.1145/2499370.2462177.

[6] Ali JavadiAbhari, Shruti Patil, Daniel Kudrow, Jeff Heckey, Alexey Lvov, Frederic T. Chong, and Margaret Martonosi. Scaffcc: a framework for compilation and analysis of quantum computing programs. In Proceedings of the 11th ACM Conference on Computing Frontiers, page 1. ACM, 2014. DOI: 10.1145/2597917.2597939.

[7] Dave Wecker and Krysta M. Svore. LIQUi|⟩: A software design architecture and domain-specific language for quantum computing. arXiv preprint arXiv:1402.4467, 2014.

[8] ProjectQ website. www.projectq.ch.

[9] Peter W. Shor. Algorithms for quantum computation: Discrete logarithms and factoring. In Foundations of Computer Science, 1994 Proceedings., 35th Annual Symposium on, pages 124–134. IEEE, 1994. DOI: 10.1109/SFCS.1994.365700.

[10] Stephane Beauregard. Circuit for shor's algorithm using 2n+3 qubits. Quantum Information and Computation, 3(2):175–185, 2003.

[11] Yasuhiro Takahashi, Seiichiro Tani, and Noboru Kunihiro. Quantum addition circuits and unbounded fan-out. Quantum Information and Computation, 10(9&10):0872–0890, 2010.

[12] Thomas Häner, Martin Roetteler, and Krysta M. Svore. Factoring using 2n+2 qubits with Toffoli based modular multiplication. Quantum Information and Computation, 17(7&8):0673–0684, 2017. DOI: 10.26421/QIC17.7-8.

[13] Thomas G. Draper. Addition on a quantum computer. arXiv preprint quant-ph/0008033, 2000.

[14] Adriano Barenco, Charles H. Bennett, Richard Cleve, David P. DiVincenzo, Norman Margolus, Peter Shor, Tycho Sleator, John A. Smolin, and Harald Weinfurter. Elementary gates for quantum computation. Physical Review A, 52(5):3457, 1995. DOI: 10.1103/PhysRevA.52.3457.

[15] Anders Sørensen and Klaus Mølmer. Quantum computation with ions in thermal motion. Physical Review Letters, 82(9):1971, 1999. DOI: 10.1103/PhysRevLett.82.1971.

[16] Ryan Babbush, Dominic W. Berry, Ian D. Kivlichan, Annie Y. Wei, Peter J. Love, and Alán Aspuru-Guzik. Exponentially more precise quantum simulation of fermions in second quantization. New Journal of Physics, 18(3):033032, 2016. DOI: 10.1088/1367-2630/18/3/033032.

[17] Fermilib. https://github.com/projectq-framework/fermilib.

[18] Ludwig E. de Clercq, Hsiang-Yu Lo, Matteo Marinelli, David Nadlinger, Robin Oswald, Vlad Negnevitsky, Daniel Kienzler, Ben Keitch, and Jonathan P. Home. Parallel transport quantum logic gates with trapped ions. Physical Review Letters, 116(8):080502, 2016. DOI: 10.1103/PhysRevLett.116.080502.

[19] Thomas Häner and Damian S. Steiger. 0.5 petabyte simulation of a 45-qubit quantum circuit. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '17, pages 33:1–33:10, New York, NY, USA, 2017. ACM. ISBN 978-1-4503-5114-0. DOI: 10.1145/3126908.3126947.

[20] Lov K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the twenty-eighth annual ACM Symposium on Theory of Computing, pages 212–219. ACM, 1996. DOI: 10.1145/237814.237866.

A Examples

In this section, we will discuss complete examples, which are also included with the ProjectQ sources.

A.1 Quantum Teleportation

Quantum teleportation can be implemented as follows:

    from projectq.meta import Control
    from projectq.ops import CNOT, H, Measure, X, Z

    # eng is a MainEngine; create_state is a user-supplied function
    # that prepares the state to send

    # allocate 2 qubits and turn them into a Bell-pair (entangle them)
    b1 = eng.allocate_qubit()
    b2 = eng.allocate_qubit()
    H | b1
    CNOT | (b1, b2)

    # Alice creates a nice state to send
    psi = eng.allocate_qubit()
    create_state(psi)

    # entangle it with Alice's b1
    CNOT | (psi, b1)

    # measure two values (once in Hadamard basis) and send the bits to Bob
    H | psi
    Measure | (psi, b1)
    print("Message: {}".format([int(psi), int(b1)]))

    # Bob may have to apply up to two operations depending on the
    # message sent by Alice:
    with Control(eng, b1):
        X | b2
    with Control(eng, psi):
        Z | b2
    # done.

When using the CircuitDrawer back-end, i.e.,

    eng = MainEngine(CircuitDrawer())

the quantum circuit depicted in Fig. 8 is generated.

A.2 Grover search

Grover's search algorithm [20] achieves a quadratic speedup for finding an element e, given a function

$$f(x) = \begin{cases} 1, & x = e \\ 0, & x \neq e \end{cases}$$

It requires two quantum oracles to be implemented; one is $U_f$, which marks the element e by adding a phase of −1 to the quantum state:

$$U_f |x\rangle = |x\rangle, \quad x \neq e$$
$$U_f |e\rangle = -|e\rangle$$

and the other oracle is a reflection across the uniform superposition. These two operators are then applied iteratively $\frac{\pi}{4}\sqrt{N}$ times, where N is the number of potential inputs to the function.

An example implementation of this algorithm with $e = 1010\ldots101_2$ (the binary representation of the solution is alternating) is available in the examples folder of the ProjectQ framework. The loop performing the two reflections looks as follows:

    from projectq.meta import Compute, Control, Loop, Uncompute
    from projectq.ops import All, H, X, Z

    # run num_it iterations
    with Loop(eng, num_it):
        # adds a (-1)-phase to the solution
        oracle(eng, x, oracle_out)

        # reflection across uniform superposition:
        # map uniform superposition to all-ones
        with Compute(eng):
            All(H) | x
            All(X) | x
        # phase-flip for all-ones:
        with Control(eng, x[0:-1]):
            Z | x[-1]
        # undo mapping
        Uncompute(eng)

Here, the Loop meta-instruction does not unroll the loop if the underlying hardware or further compiler engines support the execution or optimization of loops.
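The iteration count and the effect of the two reflections in Grover's search can be illustrated classically. The sketch below is an assumption-laden toy model in plain Python, not ProjectQ code: it tracks real amplitudes directly, uses the inversion-about-the-mean form of the reflection (which agrees with the gate sequence of the ProjectQ example up to a global sign), and the function name grover_success_probability is hypothetical.

```python
import math


def grover_success_probability(n, e):
    """Simulate Grover search for solution index e on n qubits.

    Returns the number of iterations used and the probability of
    measuring e afterwards.
    """
    N = 1 << n                                  # number of potential inputs
    amp = [1 / math.sqrt(N)] * N                # uniform superposition
    num_it = int(math.pi / 4 * math.sqrt(N))    # (pi/4) * sqrt(N), rounded down
    for _ in range(num_it):
        amp[e] = -amp[e]                        # oracle U_f: phase of -1 on |e>
        mean = sum(amp) / N
        amp = [2 * mean - a for a in amp]       # reflection across uniform superposition
    return num_it, amp[e] ** 2


# Alternating bit pattern 101010101 on 9 qubits as the marked element.
e = int("10" * 4 + "1", 2)
num_it, p_success = grover_success_probability(9, e)
```

Rounding the iteration count down is a choice made for this sketch; the examples folder of ProjectQ may pick the count differently.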

A.3 Shor's algorithm for factoring

Our eDSL allows the user to implement more complex functions nicely while keeping the efficiency of the resulting code at the level of a hand-optimized implementation. As an example, consider the modular adder proposed by Beauregard [10], which can be implemented using our powerful meta-instructions:

    from projectq.libs.math import AddConstant, SubConstant
    from projectq.meta import Compute, Control, CustomUncompute
    from projectq.ops import CNOT, X

    def add_constant_modN(eng, c, N, quint):
        assert(c < N and c >= 0)

        AddConstant(c) | quint

        with Compute(eng):
            SubConstant(N) | quint
            ancilla = eng.allocate_qubit()
            CNOT | (quint[-1], ancilla)
            with Control(eng, ancilla):
                AddConstant(N) | quint

        SubConstant(c) | quint

        with CustomUncompute(eng):
            X | quint[-1]
            CNOT | (quint[-1], ancilla)
            X | quint[-1]
            del ancilla

        AddConstant(c) | quint

A complete implementation of Shor's algorithm is also available in the examples folder of ProjectQ. Using the quantum math library, an implementation of this algorithm takes only a few lines. The iterative modular multiplication and shift, which is used to implement the modular exponentiation of a by a 2n-bit quantum number x in a uniform superposition, looks as follows:

    import math

    from projectq.libs.math import MultiplyByConstantModN
    from projectq.meta import Control
    from projectq.ops import H, Measure, R, X

    # do each of the bits of x separately, employing the
    # semi-classical inverse QFT
    ctrl_qubit = eng.allocate_qubit()
    # loop over all 2n bits
    for k in range(2 * n):
        # shift the a we multiply by
        current_a = pow(a, 1 << (2 * n - 1 - k), N)
        # x is in uniform superposition:
        H | ctrl_qubit

        # apply controlled modular multiplication
        with Control(eng, ctrl_qubit):
            MultiplyByConstantModN(current_a, N) | x

        # perform inverse QFT -> Rotations conditioned on
        # previous outcomes
        for i in range(k):
            if measurements[i]:
                R(-math.pi / (1 << (k - i))) | ctrl_qubit

        # final Hadamard of the inverse QFT
        H | ctrl_qubit

        # and measure
        Measure | ctrl_qubit
        eng.flush()

        # store the measurement result
        measurements[k] = int(ctrl_qubit)

        # and reset the qubit for the next iteration
        if measurements[k]:
            X | ctrl_qubit
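To see why the modular adder leaves its ancilla clean, it can help to trace the same sequence of steps on a classical basis state. The following sketch is a plain-Python model under stated assumptions, not part of ProjectQ: the register is taken to be an m-bit two's-complement integer with N ≤ 2^(m−1), so that its highest bit serves as a sign bit, and the function name add_constant_modN_classical is hypothetical.

```python
def add_constant_modN_classical(x, c, N, m):
    """Classical trace of the modular adder on an m-bit register.

    Returns (register, ancilla). The register should hold (x + c) % N and
    the ancilla should be back at 0 so that it can be deallocated.
    """
    assert 0 <= c < N and 0 <= x < N and N <= 1 << (m - 1)
    mask = (1 << m) - 1

    def high(r):
        # Highest bit of the register, i.e. the sign bit.
        return (r >> (m - 1)) & 1

    r = (x + c) & mask            # AddConstant(c)
    r = (r - N) & mask            # SubConstant(N)
    ancilla = high(r)             # CNOT | (quint[-1], ancilla): 1 iff x + c < N
    if ancilla:                   # with Control(eng, ancilla):
        r = (r + N) & mask        #     AddConstant(N)
    r = (r - c) & mask            # SubConstant(c)
    ancilla ^= 1 - high(r)        # X, CNOT, X on the highest bit
    r = (r + c) & mask            # AddConstant(c)
    return r, ancilla


result, ancilla = add_constant_modN_classical(x=5, c=4, N=7, m=4)
```

The comparison against c after the second subtraction is what resets the ancilla: the highest bit is clear exactly when no wrap-around by N occurred, which is also exactly when the ancilla was set.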

Accepted in Quantum 2018-01-21.