arXiv:1908.04480v2 [quant-ph] 23 Oct 2020
Quantum adiabatic machine learning with zooming

Alexander Zlokapa,1 Alex Mott,2 Joshua Job,3 Jean-Roch Vlimant,1 Daniel Lidar,4 and Maria Spiropulu1

1 Division of Physics, Mathematics & Astronomy, Alliance for Quantum Technologies, California Institute of Technology, Pasadena, CA 91125, USA
2 DeepMind Technologies, London, UK
3 Lockheed Martin Advanced Technology Center, Sunnyvale, CA 94089, USA
4 Departments of Electrical and Computer Engineering, Chemistry, and Physics & Astronomy, and Center for Quantum Information Science & Technology, University of Southern California, Los Angeles, CA 90089, USA

Recent work has shown that quantum annealing for machine learning, referred to as QAML, can perform comparably to state-of-the-art machine learning methods with a specific application to Higgs boson classification. We propose QAML-Z, a novel algorithm that iteratively zooms in on a region of the energy surface by mapping the problem to a continuous space and sequentially applying quantum annealing to an augmented set of weak classifiers. Results on a programmable quantum annealer show that QAML-Z matches classical deep neural network performance at small training set sizes and reduces the performance margin between QAML and classical deep neural networks by almost 50% at large training set sizes, as measured by area under the ROC curve. The significant improvement of quantum annealing algorithms for machine learning and the use of a discrete quantum algorithm on a continuous optimization problem both open a new class of problems that can be solved by quantum annealers and suggest the approach in performance of near-term quantum machine learning towards classical benchmarks.

I. INTRODUCTION

Machine learning has gained an increasingly important role in scientific discovery across chemistry, biology, environmental science, and physics [1–5], including in the discovery of the Higgs boson [6]. Various quantum computing algorithms have been proposed for machine learning [7], including support vector machines, principal component analysis, least-squares fitting, topological analysis, and other optimization problems [8–13]. Many of these algorithms include strict data assumptions that provide critical caveats regarding sparsity, state preparation, and rank [14, 15]. Moreover, fault-tolerant quantum computing will be required to implement the large quantum circuits necessary for the proposed algorithms, which has not yet been experimentally established at a scale necessary for the implementation of the proposed algorithms. Similarly, quantum random access memory (qRAM) is typically required to store classical data, but engineering challenges persist in developing a sufficiently large qRAM [16].

One promising near-term avenue for quantum machine learning is quantum annealing [17] (for recent reviews see [18–20]) which can, e.g., perform binary classification [21, 22], learn Bayesian network structure [23], implement quantum Boltzmann machines [24], and train deep generative models [25]. Quantum annealing is the only current quantum computing paradigm that has resulted in architectures with a large enough number of (albeit relatively noisy) qubits [26–28] to address both real-world and fundamental science problems, e.g., in air traffic control [29], computational biology [30–32], and high energy physics [33–35]. Under the adiabatic theorem of quantum mechanics, quantum annealing evolves from an initial transverse field Hamiltonian to the target problem Hamiltonian, ensuring that the system remains in the ground state if the system is perturbed slowly enough, as given by the energy gap between the ground state and the first excited state [36–38]. The ground state of the problem Hamiltonian is then the solution (as in adiabatic quantum computing [39, 40]), although thermal excitations may move the system out of the ground state [41–47], which can be beneficial [48–51]. It is crucial to observe that evidence of a quantum speedup in quantum annealing remains uncertain [52–54], although quantum phenomena have been observed in D-Wave quantum annealers [55–57]. While this remains a speculative topic, quantum annealers may exhibit advantages other than a speedup, such as sampling from non-equilibrium distributions prepared during the anneal [58–60].

Here we propose a novel quantum algorithm inspired by the previous state-of-the-art quantum annealing for machine learning (QAML) algorithm [33], which constructs a single strong classifier from a linear combination of weak classifiers with binary coefficients of 1 or 0. We propose two modifications to QAML (zooming into the energy surface to optimize real-valued coefficients, and artificially augmenting the set of weak classifiers to create a stronger ensemble) and implement the proposed algorithm (QAML-Z) on the D-Wave quantum annealer to benchmark the results on a Higgs classification problem, with available source code [61] and data [62].

II. QAML-Z ALGORITHM

A. Background: QAML Algorithm

In the original QAML algorithm, a training set with $S$ examples of labeled data $\{x_\tau, y_\tau\}$ (where $x_\tau$ is an input vector and $y_\tau = \pm 1$ is a binary label for signal and background) is optimized with a set of $N$ weak classifiers $c_i$, each of which gives $c_i(x_\tau) = \pm 1/N$ for a signal or background prediction. Given spins $s_i \in \{0, 1\}$ obtained by transforming up/down spins, let $R(x_\tau)$ be a strong classifier given by

$$R(x_\tau) = \sum_{i=1}^{N} s_i c_i(x_\tau), \tag{1}$$

i.e., an ensemble of the weak classifiers where each weak classifier is either turned on or off (weight 1 or 0). To minimize classification error, we simply minimize the distance between $y$ and $R$:

$$\|y - R\|^2 = \sum_{\tau=1}^{S} \left( y_\tau - \sum_{i=1}^{N} s_i c_i(x_\tau) \right)^2 \tag{2a}$$

$$= \|y\|^2 - 2 \sum_{i=1}^{N} \sum_{\tau=1}^{S} s_i c_i(x_\tau) y_\tau + \sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{\tau=1}^{S} s_i c_i(x_\tau) s_j c_j(x_\tau). \tag{2b}$$

Removing the spin-independent term $\|y\|^2$ and the self-spin interactions $c_i^2(x_\tau)$ to construct a problem suitable for quantum annealing, we rewrite the Hamiltonian as follows (scaling by a factor of 2 for convenience after manipulating indices):

$$H = \sum_{i=1}^{N} \sum_{j>i} \sum_{\tau=1}^{S} s_i c_i(x_\tau) s_j c_j(x_\tau) - \sum_{i=1}^{N} \sum_{\tau=1}^{S} s_i c_i(x_\tau) y_\tau. \tag{3}$$

For convenience, we define the variables:

$$C_{ij} = \sum_{\tau=1}^{S} c_i(x_\tau) c_j(x_\tau), \tag{4}$$

$$C_i = \sum_{\tau=1}^{S} c_i(x_\tau) y_\tau. \tag{5}$$

Hence, in the original QAML algorithm, the following Ising model Hamiltonian is minimized after transforming the range to $s_i \in \{-1, 1\}$, adding an additional $\lambda$ regularization hyperparameter to penalize nonzero $s_i$ [22]:

$$H = \sum_{i=1}^{N} \left( \lambda - C_i + \frac{1}{2} \sum_{j>i} C_{ij} \right) s_i + \frac{1}{4} \sum_{i=1}^{N} \sum_{j>i} C_{ij} s_i s_j. \tag{6}$$

We observe the following limitations in the QAML algorithm: i) arbitrary linear combinations of weak classifiers $c_i$ are forbidden because the strong classifier $R$ is simply formed by turning weak classifiers $c_i$ on or off; ii) the diversity of the ensemble is limited by the selection of weak classifiers. If the set of weak classifiers can be expanded, more nuanced ensembles with more complex decision boundaries can be learned.

B. Zooming Extension

By iteratively performing quantum annealing, the binary weights on the weak classifiers can be made continuous, resulting in a stronger classifier. This is achieved by performing a search on the real numbers, effectively zooming in on a region of the energy surface each iteration (Figure 1). We denote the zooming variant of quantum annealing for machine learning as QAML-Z. Under this reformulation, the weights of the classifiers may be extended from the set $\{0, 1\}$ to the continuous interval $[-1, 1]$, enabling the subtraction of classifiers to reduce cross-correlations between weak classifiers.

[Figure 1: three panels showing the energy surface $E$, labeled $\mu_0(0) = 0$, $\mu_0(1) = -0.5$, and $\mu_0(2) = -0.25$.]

FIG. 1. Zooming extension. While QAML only performs one anneal, QAML-Z iteratively updates the weight $\mu$ (indicated by the red dot) of a weak classifier (index 0 in the diagram) in the strong classifier ensemble by performing a binary search over the energy surface using spin up/down outcomes.

Let each qubit have a mean $\mu_i(t)$ (starting at $\mu_i(0) = 0$ for all $i$) and let the search breadth be $\sigma(t) = b^t$, where $t = 0, 1, \ldots, T-1$ for $T$ iterations and $0 < b < 1$ is a free parameter. Each iteration, the Hamiltonian is centered around the previous mean and the search breadth is narrowed. Receiving spin up or spin down corresponds to shifting the new mean either right or left by a distance given by the search breadth. The weight given to each classifier is thus updated according to the old mean and consequent shift, resulting in a modified Hamiltonian according to the substitution $s_i c_i(x_\tau) \to \sigma(t) s_i c_i(x_\tau) + \mu_i(t) c_i(x_\tau)$. The full expression is:

$$H(t) = \sum_{i=1}^{N} \sum_{j>i} \sum_{\tau=1}^{S} \big( \sigma(t) s_i c_i(x_\tau) + \mu_i(t) c_i(x_\tau) \big) \times \big( \sigma(t) s_j c_j(x_\tau) + \mu_j(t) c_j(x_\tau) \big) - \sum_{i=1}^{N} \sum_{\tau=1}^{S} \big( \sigma(t) s_i c_i(x_\tau) + \mu_i(t) c_i(x_\tau) \big) y_\tau \tag{7a}$$

$$= \sum_{i=1}^{N} \left( -C_i + \sum_{j=1}^{N} \mu_j(t) C_{ij} \right) \sigma(t) s_i + \sum_{i=1}^{N} \sum_{j>i} C_{ij} \sigma^2(t) s_i s_j, \tag{7b}$$

where Eq. (7b) is derived after dropping constants from the Hamiltonian and applying the same $C_i$ and $C_{ij}$ notation as in QAML. This new Hamiltonian may be iteratively optimized for $t = 0, 1, \ldots, T-1$ to update

Additionally, we set the zoom parameter $b = 1/2$ to perform a binary search over the real numbers.
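
As a rough illustration of the zooming loop, the Python sketch below builds $C_i$ and $C_{ij}$ from Eqs. (4) and (5), forms the coefficients of Eq. (7b) at each iteration, and shifts each mean according to the returned spin. It is not the authors' implementation. An exhaustive search over spin configurations stands in for the quantum annealer, and the function name `zoom_weights` and the toy data are hypothetical. The shift of $b^{t+1}$ per iteration is an assumption, chosen so that the means reproduce the values $\mu_0(1) = -0.5$ and $\mu_0(2) = -0.25$ shown in Fig. 1.

```python
import itertools
import numpy as np

def zoom_weights(c, y, T=3, b=0.5):
    """Sketch of QAML-Z zooming (hypothetical helper, not the paper's code).

    c : (N, S) array of weak-classifier outputs c_i(x_tau)
    y : (S,) array of labels y_tau = +/-1
    Returns the means mu_i after T zooming iterations.
    """
    N = c.shape[0]
    C_i = c @ y                      # Eq. (5)
    C_ij = c @ c.T                   # Eq. (4)
    upper = np.triu(C_ij, k=1)       # keep only the j > i couplings
    mu = np.zeros(N)                 # mu_i(0) = 0 for all i
    # All spin configurations in {-1, +1}^N. Brute force is an assumed
    # stand-in for the annealer and is only feasible for small N.
    configs = [np.array(s) for s in itertools.product((-1, 1), repeat=N)]
    for t in range(T):
        sigma = b ** t                       # search breadth sigma(t) = b^t
        h = (-C_i + C_ij @ mu) * sigma       # linear terms of Eq. (7b)
        J = upper * sigma ** 2               # quadratic terms of Eq. (7b)
        best = min(configs, key=lambda s: h @ s + s @ J @ s)
        # Assumed update: shift each mean by +/- b^(t+1), matching Fig. 1.
        mu = mu + best * b ** (t + 1)
    return mu

# Toy data: one weak classifier (N = 1) that agrees with every label.
c = np.array([[1.0, -1.0, 1.0, -1.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])
print(zoom_weights(c, y, T=3))    # [0.875]
print(zoom_weights(-c, y, T=3))   # [-0.875]
```

With a single classifier that agrees with every label, the mean climbs toward +1 (0.5, 0.75, 0.875). A classifier that always disagrees is driven toward -1, which illustrates how weights in $[-1, 1]$ allow a classifier to be subtracted from the ensemble.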