On the Principles of Differential Quantum Programming Languages
Total Page:16
File Type:pdf, Size:1020Kb
On the Principles of Differential Quantum Programming Languages Shaopeng Zhu Shih-Han Hung Shouvanik Chakrabarti Xiaodi Wu University of Maryland, College Park, USA Abstract where near-term Noisy Intermediate-Scale Quantum (NISQ) Variational Quantum Circuits (VQCs), or the so-called quan- computers [37], e.g., the 53-qubit quantum machine from tum neural-networks, are predicted to be one of the most im- Google [2] and IBM [20], becomes the important platform for portant near-term quantum applications, not only because of demonstrating quantum applications. Variational quantum their similar promises as classical neural-networks, but also circuits (VQCs) [15, 16, 35], or the so-called quantum neural because of their feasibility on near-term noisy intermediate- networks, are predicted to be one of the most important ap- size quantum (NISQ) machines. The need for gradient infor- plications on NISQ machines. This is not only because VQCs mation in the training procedure of VQC applications has bear a lot of similar promises like classical neural networks as stimulated the interest and development of auto-differentiation well as potential quantum speed-ups from the perspective of techniques on quantum circuits. We propose the first formal- machine learning (e.g., see the survey [8]), but also because ization of this technique, not only in the context of quantum VQC is, if not the only, one of the few candidates that can circuits but also for imperative quantum programs (e.g., with be implemented on NISQ machines. Because of this, a lot of controls), inspired by the success of differential programming study has already been devoted to the design, analysis, and languages in classical machine learning. In particular, we small-scale implementation of VQC (e.g., see the survey [6]). overcome a few unique difficulties caused by exotic quantum Typical VQC applications replace classical neural net- features (such as quantum no-cloning) and provide a rigor- works, which are just parameterized classical circuits, by ous formulation of differentiation applied to bounded-loop quantum circuits with classically parameterized unitary gates. imperative quantum programs, its code-transformation rules, Namely, one will have a "quantum" mapping from input to as well as a sound logic to reason about their correctness. output replacing classical mapping in machine learning ap- Moreover, we have implemented our code transforma- plications. An important component of these applications tion in OCaml and demonstrated the resource-efficiency of is a training procedure which optimizes a loss function the our scheme both analytically and empirically. We also con- now depends on the read-outs and the parameters of VQCs. duct a case study of training a VQC instance with controls, It is well-known that gradient-based approaches are the which shows the advantage of our scheme over existing best for the training procedure. However, computing the auto-differentiation on quantum circuits without controls. gradients of loss functions that depend on the read-outs of quantum circuits has a similar complexity of simulating quan- Keywords quantum programming languages, differential tum circuits, which is infeasible for classical computation. programming languages, auto differentiation, quantum ma- Thus, the ability of evaluating these "quantum" gradients effi- chine learning, collection of semantics ciently by quantum computation is critical for the scalability of VQC applications, e.g., on the recent 53-qubit machines. 1 Introduction Fortunately, analytical formulas of gradients used in VQCs Background. Recent years have witnessed the rapid de- have been studied by [16, 19, 29, 41, 42]. In particular, Schuld velopment of quantum computing, with practical advances et al. [41] proposed the so-called phase-shift rule that uses coming from both research and industry. Quantum program- two quantum circuits to compute the partial derivative re- ming is one topic that has been actively investigated. Early spective to one parameter for quantum circuits. One of the work on language design [22, 32, 39, 40, 44] has been fol- very successful tools for quantum machine learning, called lowed up recently by several implementations of these lan- PennyLane [7], implemented the phase-shift rule to achieve guages, including Quipper [24], Scaffold [1], LIQUiji [49], auto differentiation (AD) for the read-outs of quantum cir- Q# [46], and QWIRE [33]. Extensions of program logics have cuits. It also integrated automatic differentiation from es- also been proposed for verification of quantum programs tablished machine learning libraries such as TensorFlow or [3, 9, 10, 17, 27, 52, 55]. See also surveys [18, 43, 53]. PyTorch for any additional classical calculation in the train- With the availability of prototypes of quantum machines, ing procedure. However, none of these research was con- especially the recent establishment of quantum supremacy [2], ducted from the perspective of programming languages and the research of quantum computing has entered a new stage no rigorous foundation or principles have been formalized. Draft paper, November 22, 2019, Motivations. An important motivation of this paper is to 2020. provide a rigorous formalization of the auto-differentiation 1 Draft paper, November 22, 2019, S. Zhu, S. Hung, S. Chakrabarti &X.Wu technique applied to quantum circuits. In particular, we natural choice also serves the purpose of gradients computa- will provide a formal formulation of quantum programs, tion of loss functions in quantum machine learning applica- their semantics, and the meaning of differentiation on them. tions, which are typically phrased in terms of these read-outs. We will also study the code-transformation rules for auto- We also directly model the parameterization of quantum pro- differentiation and prove their correctness. grams after VQCs, i.e., each unitary gate becomes classically As we will highlight below, research on the formalization parameterized. The above modeling of quantum programs is will encounter many new challenges that have not been con- very different from classical ones. It is also unclear whether sidered and addressed by existing results [16, 19, 29, 41, 42]. any reasonable analogues of classical chain-rule and for- Consider one of the basic requirements, e.g., composition- ward/backward mode can exist within quantum programs. ality. As we will show, differentiating the composition of To model the meaning of one quantum program com- quantum programs will necessarily involve running multi- puting the derivative of another, we define the differential ple quantum programs on copies of initial quantum states. semantics of programs. There is a subtle quantum-unique How to represent the collection of quantum programs suc- design choice. The observable semantics of any quantum pro- cinctly and also bound the number of required copies is a gram will depend on the observable and its input state. Thus, totally new question. Among our techniques to address this any program computing its derivative could potentially de- question, we also need to change the previously proposed pend on these two extra factors. We find out this potential construct, e.g., the phase-shift rule [41], to something else. dependence is undesirable and propose the strongest possi- Moreover, we want to go beyond the restriction to quan- ble definition: i.e., one derivative computing program should tum circuits in [16, 19, 29, 41, 42]. We are inspired by classical work for any pair of observable and input states. We demon- machine learning examples that demonstrate the advantage strate that this strong requirement is not only achievable but of neural-networks with program features (e.g. controls) also critical for the composition of auto differentiation. over the plain ones (e.g., classical circuits) [23, 25], which is We are ready to describe the technical challenges for the also the major motivation of promoting the the paradigm compositionality. Consider the following quantum program: shift from deep learning towards differential programming QMUL ≡ U ¹θº;U ¹θº; (1.4) by prominent machine learning researchers. 1 2 Augmenting VQCs with controls is not only feasible on which performs U1¹θº and U2¹θº gates sequentially. Note NISQ machines (at least for simple ones), but also now a that gate application is matrix multiplication in quantum. logical step for the study of their applications in machine Roughly speaking, if the product rule of differentiation (as learning. (We indeed conduct one case study in Section 8.) exhibited in (1.3)) remains in the quantum setting, at least Thus, we are inspired to investigate the principles of dif- @ symbolically, then one should expect @θ ¹QMULº contains ferential quantum imperative languages beyond circuits. @ @ Research challenges & Solutions. We will rely on a little ¹U1¹θºº;U2¹θº and U1¹θº; ¹U2¹θºº (1.5) @θ @θ notation that should be self-explanatory. Please refer to a detailed preliminary on quantum information in Section 2. two different parts as sub-programs similarly in (1.3). @ ¹ ¹ ºº ¹ º @ ¹ ¹ ºº Let us start with a simple classical program However, we cannot run @θ U1 θ ;U2 andU1 θ ; @θ U2 θ together due to the quantum no-cloning theorem [51]. This MUL ≡ v3 = v1 × v2; (1.1) is because they share the same initial state and we cannot clone two copies of it and maintain the correct correlation where v3 is the product of v1 and v2. Consider the differenti- among different parts. Note that this is not an issue classi- ation with respect to θ, we have cally as we can store all vi;vÛi at the same time as in (1.3). @ As a result, quantum differentiation needs to run multiple ¹MULº ≡ v = v × v ; (1.2) @θ 3 1 2 (sub-)programs on multiple copies of the initial state. This poses a unique challenge for differentiation of quan- vÛ3 = vÛ1 × v2 + v1 ×v Û2; (1.3) tum composition: (1) we hope to have a simple scheme of where MUL keeps track of variablesv1;v2;v3 and their deriva- code transformation, ideally close-to-classical, for intuition tives vÛ1;vÛ2;vÛ3 at the same time.