
Variational learning for quantum artificial neural networks

Francesco Tacchino∗§¶, Stefano Mangini†‖¶, Panagiotis Kl. Barkoutsos∗, Chiara Macchiavello†‖∗∗, Dario Gerace†, Ivano Tavernelli∗ and Daniele Bajoni‡
∗IBM Quantum, IBM Research – Zurich, 8803 Rüschlikon, Switzerland
†University of Pavia, Department of Physics, via Bassi 6, 27100 Pavia, Italy
‡University of Pavia, Department of Industrial and Information Engineering, via Ferrata 1, 27100 Pavia, Italy
‖INFN Sezione di Pavia, via Bassi 6, I-27100 Pavia, Italy
∗∗CNR-INO, Largo E. Fermi 6, I-50125 Firenze, Italy
§Email: [email protected]
¶These authors contributed equally to this work.

Abstract—In the last few years, quantum computing and artificial neural networks fostered rapid developments in their respective areas of application, introducing new perspectives on how information processing systems can be realized and programmed. The rapidly growing field of Quantum Machine Learning aims at bringing together these two ongoing revolutions. Here we first review a series of recent works describing the implementation of artificial neurons and feed-forward neural networks on quantum processors. We then present an original realization of efficient individual quantum nodes based on variational unsampling protocols. We investigate different learning strategies involving global and local layer-wise cost functions, and we assess their performances also in the presence of statistical measurement noise. While keeping full compatibility with the overall memory-efficient feed-forward architecture, our constructions effectively reduce the quantum circuit depth required to determine the activation probability of single neurons upon input of the relevant data-encoding quantum states. This suggests a viable approach towards the use of quantum neural networks for pattern classification on near-term quantum hardware.

arXiv:2103.02498v1 [quant-ph] 3 Mar 2021

I. INTRODUCTION

In classical machine learning, artificial neurons and neural networks were originally proposed, more than half a century ago, as trainable algorithms for classification and pattern recognition [1], [2]. A few milestone results obtained in subsequent years, such as the backpropagation algorithm [3] and the Universal Approximation Theorem [4], [5], certified the potential of deep feed-forward neural networks as a computational model which nowadays constitutes the cornerstone of many artificial intelligence protocols [6], [7].

In recent years, several attempts were made to link these powerful but computationally intensive applications to the rapidly growing field of quantum computing; see Ref. [8] for a useful review. The latter holds the promise to achieve relevant advantages with respect to classical machines already in the near term, at least on selected tasks, including e.g. chemistry calculations [9], [10], classification and optimization problems [11]. Among the most relevant results obtained in this context, it is worth mentioning the use of trainable parametrized digital and continuous-variable quantum circuits as a model for quantum neural networks [12]–[21], the realization of quantum Support Vector Machines (qSVMs) [22] working in quantum-enhanced feature spaces [23], [24], and the introduction of quantum versions of artificial neuron models [25]–[32]. However, very few clear statements have been made concerning the concrete and quantitative achievement of quantum advantage in machine learning applications, and many challenges still need to be addressed [8], [33], [34].

In this work, we review a recently proposed quantum algorithm implementing the activity of binary-valued artificial neurons for classification purposes. Although formally exact, this algorithm in general requires quite large circuit depth for the analysis of the input classical data. To mitigate this effect we introduce a variational learning procedure, based on quantum unsampling techniques, aimed at critically reducing the quantum resources required for its realization. By combining memory-efficient encoding schemes and low-depth quantum circuits for the manipulation and analysis of quantum states, the proposed methods, currently at an early stage of investigation, suggest a practical route towards problem-specific instances of quantum computational advantage in machine learning applications.

II. A MODEL OF QUANTUM ARTIFICIAL NEURONS

The simplest formalization of an artificial neuron can be given following the classical model proposed by McCulloch and Pitts [1]. In this scheme, a single node receives a set of binary inputs {i_0, ..., i_{m−1}} ∈ {−1, 1}^m, which can either be signals from other neurons in the network or external data. The computational operation carried out by the artificial neuron consists in first weighting each input by a synapse coefficient w_j ∈ {−1, 1} and then providing a binary output O ∈ {−1, 1}, denoting either an active or rest state of the node, determined by an integrate-and-fire response

O = 1 if Σ_j w_j i_j ≥ θ, and O = −1 otherwise   (1)

where θ represents some predefined threshold.

A quantum procedure closely mimicking the functionality of a binary-valued McCulloch-Pitts artificial neuron can be designed by exploiting, on one hand, the superposition of computational basis states in quantum registers and, on the other hand, the natural non-linear activation behavior provided by quantum measurements. In this section, we briefly outline a device-independent algorithmic procedure [28] designed to implement such a computational model on a gate-based quantum processor. More explicitly, we show how classical input and weight vectors of size m can be encoded on quantum hardware by using only N = log2 m qubits [28], [35], [36]. For the loading and manipulation of data, we describe a protocol based on the generation of quantum hypergraph states [37]. This exact approach to artificial neuron operations will be used in the main body of this work as a benchmark to assess the performances of approximate variational techniques designed to achieve more favorable scaling properties in the number of logical operations with respect to classical counterparts.

Let ~i and ~w be binary input and weight vectors of the form

~i = (i_0, i_1, ..., i_{m−1})^T,   ~w = (w_0, w_1, ..., w_{m−1})^T   (2)

with i_j, w_j ∈ {−1, 1} and m = 2^N. A simple and effective way of encoding such collections of classical data can be given by making use of the relative quantum phases (i.e. factors ±1 in our binary case) in equally weighted superpositions of computational basis states. We then define the states

|ψ_i⟩ = (1/√m) Σ_{j=0}^{m−1} i_j |j⟩,   |ψ_w⟩ = (1/√m) Σ_{j=0}^{m−1} w_j |j⟩   (3)

where, as usual, we label computational basis states with integers j ∈ {0, ..., m−1} corresponding to the decimal representation of the respective binary string. The set of all possible states which can be expressed in the form above is known as the class of hypergraph states [37].

According to Eq. (1), the artificial neuron must first perform the inner product ~i · ~w. It is not difficult to see that, under the encoding scheme of Eq. (3), the inner product between inputs and weights is contained in the overlap [28]

⟨ψ_w|ψ_i⟩ = (~w · ~i)/m   (4)

We can explicitly compute such overlap on a quantum register through a sequence of ~i- and ~w-controlled unitary operations. First, assuming that we operate on an N-qubit quantum register starting in the blank state |0⟩^⊗N, we can load the input-encoding quantum state |ψ_i⟩ by performing a unitary transformation U_i such that

U_i |0⟩^⊗N = |ψ_i⟩   (5)

It is important to mention that this preparation step would most effectively be replaced by, e.g., a direct call to a quantum memory [38], or by the supply of data-encoding states readily generated in quantum form by quantum sensing devices to be analyzed or classified. It is indeed well known that the interface between classical data and their representation on quantum registers currently constitutes one of the major bottlenecks for Quantum Machine Learning applications [8].

Let now U_w be a unitary operator such that

U_w |ψ_w⟩ = |1⟩^⊗N = |m−1⟩   (6)

In principle, any m × m unitary matrix having the elements of ~w appearing in the last row satisfies this condition. If we apply U_w after U_i, the overall N-qubit quantum state becomes

U_w |ψ_i⟩ = Σ_{j=0}^{m−1} c_j |j⟩ ≡ |φ_{i,w}⟩   (7)

Using Eq. (6), we then have

⟨ψ_w|ψ_i⟩ = ⟨ψ_w| U_w† U_w |ψ_i⟩ = ⟨m−1|φ_{i,w}⟩ = c_{m−1}   (8)

We thus see that, as a consequence of the constraints imposed on U_i and U_w, the desired result ~i · ~w ∝ ⟨ψ_w|ψ_i⟩ is contained, up to a normalization factor, in the coefficient c_{m−1} of the final state |φ_{i,w}⟩.

The final step of the algorithm must access the computed input-weight scalar product and determine the activation state of the artificial neuron. In view of constructing a general architecture for feed-forward neural networks [30], it is useful to introduce an ancilla qubit a, initially set in the state |0⟩_a, on which the c_{m−1} ∝ ⟨ψ_w|ψ_i⟩ coefficient can be written through a multi-controlled NOT gate, where the role of controls is assigned to the N encoding qubits [28]:

|φ_{i,w}⟩|0⟩_a → Σ_{j=0}^{m−2} c_j |j⟩|0⟩_a + c_{m−1} |m−1⟩|1⟩_a   (9)

At this stage, a measurement of qubit a in the computational basis provides a probabilistic non-linear threshold activation behavior, producing the output |1⟩_a state, interpreted as an active state of the neuron, with probability |c_{m−1}|². Although this form of the activation function is already sufficient to carry out elementary classification tasks and to realize a logical XOR operation [28], more complex threshold behaviors can in principle be engineered once the information about the inner product is stored on the ancilla [27], [29]. Equivalently, the ancilla can be used, via quantum controlled operations, to pass on the information to other quantum registers encoding successive layers in a feed-forward network architecture [30]. It is worth noticing that directing all the relevant information into the state of a single qubit, besides enabling effective quantum synapses, can be advantageous when implementing the procedure on real hardware, on which readout errors constitute a major source of inaccuracy. Nevertheless, multi-controlled NOT operations, which are inherently non-local, can lead to complex decompositions into hardware-native gates, especially in the presence of constraints in qubit-qubit connectivity. When operating a single node to carry out simple classification tasks or, as we will do in the following sections, to assess the performances of individual portions of the proposed algorithm, the activation probability of the artificial neuron can then equivalently be extracted directly from the register of N encoding qubits by performing a direct measurement of |φ_{i,w}⟩ targeting the |m−1⟩ ≡ |1⟩^⊗N computational basis state.

A. Exact implementation with quantum hypergraph states

A general and exact realization of the unitary transformations U_i and U_w can be designed by using the generation algorithm for quantum hypergraph states [28]. The latter have been extensively studied as useful quantum resources [37], [39], and are formally defined as follows. Given a collection of N vertices V, we call a k-hyper-edge any subset of exactly k vertices. A hypergraph g_{≤N} = {V, E} is then composed of a set V of vertices together with a set E of hyper-edges of any order k, not necessarily uniform. Notice that this definition includes the usual notion of a mathematical graph if k = 2 for all (hyper-)edges. To any hypergraph g_{≤N} we associate a quantum hypergraph state via the definition

|g_{≤N}⟩ = Π_{k=1}^{N} Π_{{q_{v1},...,q_{vk}} ∈ E} C^k Z_{q_{v1},...,q_{vk}} |+⟩^⊗N   (10)

where q_{v1}, ..., q_{vk} are the qubits connected by a k-hyper-edge in E and, with a little abuse of notation, we assume C²Z ≡ CZ and C¹Z ≡ Z = R_z(π). For N qubits there are exactly 2^(2^N − 1) different hypergraph states. We can make use of well-known preparation strategies for hypergraph states to realize the unitaries U_i and U_w with at most a single N-controlled C^N Z gate and a collection of p-controlled C^p Z gates with p < N. It is worth pointing out already here that such an approach, while optimizing the number of multi-qubit logic gates to be employed, implies a circuit depth which scales linearly in the size of the classical input, i.e. O(m) ≡ O(2^N), in the worst case corresponding to a fully connected hypergraph [28].

To describe a possible implementation of U_i, assume once again that the quantum register of N encoding qubits is initially in the blank state |0⟩^⊗N. By applying parallel Hadamard gates (H^⊗N) we obtain the state |+⟩^⊗N, corresponding to a hypergraph with no edges. We can then use the target collection of classical inputs ~i as a control for the following iterative procedure:

for p = 1 to N do
    for j = 0 to m − 1 do
        if |j⟩ has exactly p qubits in |1⟩ and i_j = −1 then
            apply C^p Z to those qubits
            flip the sign of i_k in ~i for all k such that |k⟩ has the same p qubits in |1⟩
        end if
    end for
end for

Similarly, U_w can be obtained by first performing the routine outlined above (without the initial parallel Hadamard gates) tailored according to the classical control ~w: since all the gates involved in the construction are the inverse of themselves and commute with each other, this step produces a unitary transformation bringing |ψ_w⟩ back to |+⟩^⊗N. The desired transformation U_w is completed by adding parallel H^⊗N and NOT^⊗N gates [28].

III. VARIATIONAL REALIZATION OF A QUANTUM ARTIFICIAL NEURON

Although the implementation of the unitary transformations U_i and U_w outlined above is formally exact and optimizes the number of multi-qubit operations to be performed by leveraging the correlations between the ±1 phase factors, the overall requirements in terms of circuit depth pose in general severe limitations to its applicability on non-error-corrected quantum devices. Moreover, although with such an approach the encoding and manipulation of classical data is performed in an efficient way with respect to memory resources, the computational cost needed to control the execution of the unitary transformations and to actually perform the sequences of gates remains bounded by the corresponding classical limits. Therefore, the aim of this section is to explore conditions under which some of the operations introduced in our quantum model of artificial neurons can be obtained in more efficient ways by exploiting the natural capabilities of quantum processors.

In the following, we will mostly concentrate on the task of realizing approximate versions of the weight unitary U_w with significantly lower implementation requirements in terms of circuit depth. Although most of the techniques that we will introduce below could in principle work equally well for the preparation of encoding states |ψ_i⟩, it is important to stress already at this stage that such approaches cannot be interpreted as a way of solving the long-standing issue represented by the loading of classical data into a quantum register. Instead, they are pursued here as an efficient way of analyzing classical or quantum data presented in the form of a quantum state. Indeed, the variational approach proposed here requires ad-hoc training for every choice of the target vector ~w whose U_w needs to be realized. To this purpose, we require access to many copies of the desired |ψ_w⟩ state, essentially representing a quantum training set for our artificial neuron. As in our formulation a single node characterized by weight connections ~w can be used as an elementary classifier recognizing input data sufficiently close to ~w itself [28], the variational procedure presented here essentially serves the double purpose of training the classifier upon input of positive examples |ψ_w⟩ and of finding an efficient quantum realization of such a state analyzer.

A. Global variational training

According to Eq. (6), the purpose of the transformation U_w within the quantum artificial neuron implementation is essentially to reverse the preparation of a non-trivial quantum state |ψ_w⟩ back to the relatively simple product state |1⟩^⊗N.



Fig. 1. Variational learning via unsampling. (a) Global strategy, with optimization targeting all qubits simultaneously. (b) Local qubit-by-qubit approach, in which each layer is used to optimize the operation for one qubit at a time.

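The iterative C^pZ routine of Sec. II-A only manipulates the ±1 signs of equally weighted amplitudes, so it can be prototyped classically by tracking the sign pattern alone. The sketch below (our own helper names; it assumes i_0 = +1, since the routine fixes phases only up to a global sign) selects the multi-controlled-Z gates by increasing Hamming weight p, exactly as in the pseudocode, and verifies that applying them to the all-plus pattern of |+⟩^⊗N reproduces the target state:

```python
def bits(j, n):
    """Set of qubits in |1> for the computational-basis label j."""
    return {q for q in range(n) if (j >> q) & 1}

def hypergraph_gate_list(target):
    """Greedy C^pZ selection implementing the iterative routine of Sec. II-A.
    `target` is the +/-1 sign pattern of the desired hypergraph state."""
    m = len(target)
    n = m.bit_length() - 1
    residual = list(target)          # signs still to be fixed
    gates = []
    for p in range(1, n + 1):
        for j in range(m):
            if len(bits(j, n)) == p and residual[j] == -1:
                s = bits(j, n)
                gates.append(s)
                # a C^pZ on the qubits in s flips every |k> whose support contains s
                for k in range(m):
                    if s <= bits(k, n):
                        residual[k] = -residual[k]
    return gates

def signs_after(gates, m):
    """Amplitude signs produced by the gate list, starting from |+...+> (all +1)."""
    n = m.bit_length() - 1
    signs = [1] * m
    for s in gates:
        for k in range(m):
            if s <= bits(k, n):
                signs[k] = -signs[k]
    return signs

target = [1, 1, -1, 1, -1, -1, 1, -1]   # a 3-qubit +/-1 pattern with i_0 = +1
gates = hypergraph_gate_list(target)
print(gates, signs_after(gates, 8) == target)
```

Processing labels by increasing Hamming weight guarantees convergence: a gate applied at level p only affects basis states of weight ≥ p, so previously fixed signs are never broken, mirroring the linear-depth worst case, O(2^N) gates, noted in the text.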
Notice that in general the qubits in the state |ψ_w⟩ share multipartite entanglement [39]. Here we discuss a promising strategy for the efficient approximation of the desired transformation satisfying the necessary constraints, based on variational techniques. Inspired by the well-known variational quantum eigensolver (VQE) algorithm [40], and in line with a recently introduced unsampling protocol [41], we define the following optimization problem: given access to independent copies of |ψ_w⟩ and to a variational quantum circuit, characterized by a unitary operation V(θ~) and parametrized by a set of angles θ~, we wish to find a set of values θ~_opt that guarantees a good approximation of U_w. The heuristic circuit implementation typically consists of sequential blocks of single-qubit rotations followed by entangling gates, repeated up to a certain number that guarantees enough freedom for the convergence to the desired unitary [9].

Once the solution V(θ~_opt) is found, which in our setup corresponds to a fully trained artificial neuron, it would then provide a form of quantum advantage in the analysis of arbitrary input states |ψ_i⟩, as long as the circuit depth for the implementation of the variational ansatz is sub-linear in the dimension of the classical data, i.e. sub-exponential in the size of the qubit register. As is customarily done in near-term VQE applications, the optimization landscape is explored by combining executions of quantum circuits with classical feedback mechanisms for the update of the θ~ angles. In the most general scenario, and according to Eq. (6), a cost function can be defined as

F(θ~) = 1 − |⟨11...1| V(θ~) |ψ_w⟩|²   (11)

The solution θ~_opt is then represented by

θ~_opt = argmin_{θ~} F(θ~)   (12)

and leads to V(θ~_opt) ≃ U_w. We call this approach global variational unsampling, as the cost function in Eq. (11) requires all qubits to be simultaneously found as close as possible to their respective target state |1⟩, without making explicit use of the product structure of the desired output state |1⟩^⊗N. It is indeed well known that VQE can lead in general to exponentially difficult optimization problems [14]; however, the characteristic features of the problem under evaluation may actually allow for a less complex implementation of the VQE for unsampling purposes [41], as outlined in the following section. A schematic representation of the global variational training is provided in Fig. 1a.

B. Local variational training

An alternative approach to the global unsampling task, particularly suited for the case we are considering, in which the desired final state of the quantum register is fully unentangled, makes use of a local, qubit-by-qubit procedure. This technique, which was recently proposed and tested on a photonic platform as a route towards efficient certification of quantum processors [41], is highlighted here as an additional useful tool within a general quantum machine learning setting.

In the local variational unsampling scheme, the global transformation V(θ~) is divided into successive layers V_j(θ~_j) of decreasing complexity and size. Each layer is trained separately, in a serial fashion, according to a cost function which only involves the fidelity of a single qubit to its desired final state. More explicitly, every V_j(θ~_j) operates on qubits j, ..., N and has an associated cost function

F_j(θ~_j) = 1 − ⟨1| Tr_{j+1,...,N}[ρ_j] |1⟩   (13)

where the partial trace leaves only the degrees of freedom associated to the j-th qubit and, recursively, we define

ρ_j = V_j(θ~_j) ρ_{j−1} V_j†(θ~_j) for j ≥ 1, with ρ_0 = |ψ_w⟩⟨ψ_w|   (14)

At step j, it is implicitly assumed that all the parameters θ~_k for k = 1, ..., j−1 are fixed to the optimal values obtained by the minimization of the cost functions in the previous steps. Notice that, operationally, the evaluation of the cost function F_j can be automatically carried out by measuring the j-th qubit in the computational basis while ignoring the rest of the quantum register, as shown in Fig. 1b.

The benefits of local variational unsampling with respect to the global strategy are mainly associated to the reduced complexity of the optimization landscape per step. Indeed, the local version always operates on the overlap between single-qubit states, at the relatively modest cost of adding N − 1 smaller and smaller variational ansatzes. In the specific problem at study, we thus envision the local approach to become particularly effective, and more advantageous than the global one, in the limit of a large enough number of qubits, i.e. for the most interesting regime where the size of the quantum register, and therefore of the quantum computation, exceeds the current classical simulation capabilities.

C. Case study

To show an explicit example of the proposed construction, let us fix m = 16, N = 4. Following Ref. [28], we can visualize a 16-bit binary vector ~b, see Eq. (2), as a 4 × 4 binary pattern of black (b_j = −1) and white (b_j = 1) pixels. Moreover, we can assign to every possible pattern an integer label k_b corresponding to the decimal conversion of the binary string b̄_0 ... b̄_15, where b_j = (−1)^b̄_j. We choose as our target ~w the vector corresponding to k_w = 20032, which represents a black cross on white background in the north-west corner of the 4 × 4 image, see Fig. 2.

Fig. 2. Comparison of output activation p_out = |⟨ψ_w|ψ_i⟩|² among the exact (hypergraph states routine), global (n = 3) and local (n′ = 2) approximate implementations of U_w. The inset shows the general mapping of any 16-dimensional binary vector ~b onto the 4 × 4 binary image (b) and the cross-shaped ~w used in this example (c). The selected inputs on which the approximations are tested were chosen to cover all the possible cases for p_out, and are labeled with their corresponding integer k_i (see main text).

Starting from the global variational strategy, we construct a parametrized ansatz for V(θ~) as a series of entangling (E) and rotation (R(θ~)) cycles:

V(θ~) = ( Π_{c=1}^{n} R(θ_{c,1} ... θ_{c,4}) E ) R(θ_{0,1} ... θ_{0,4})   (15)

where n is the total number of cycles, which in principle can be varied to increase the expressibility of the ansatz by increasing its total depth. Rotations are assumed to be acting independently on the N = 4 qubits according to

R(θ_{c,1} ... θ_{c,4}) = ⊗_{q=1}^{4} exp(−i (θ_{c,q}/2) σ_y^(q))   (16)

where σ_y^(q) is the Pauli y-matrix acting on qubit q. At the same time, the entangling parts promote all-to-all interactions between the qubits according to

E = Π_q Π_{q′=q+1}^{4} CNOT_{qq′}   (17)

where CNOT_{qq′} is the usual controlled-NOT operation between control qubit q and target q′, acting on the space of all 4 qubits. For n cycles, the total number n_θ of θ-parameters, including the initial rotation layer R(θ_{0,1} ... θ_{0,4}), is therefore n_θ = 4 + 4n.

A qubit-by-qubit version of the ansatz can be constructed in a similar way by using the same structure of entangling and rotation cycles, decreasing the total number of qubits by one after each layer of the optimization. Here we choose a uniform number n′ of cycles per qubit (this condition will be relaxed afterwards, see Sec. III-D), thus setting, for all j ≠ 4,

V_j(θ~_j) = ( Π_{c=1}^{n′} R(θ_{c,j} ... θ_{c,4}) E ) R(θ_{0,j} ... θ_{0,4})   (18)

For j = 4, we add a single general single-qubit rotation with three parameters

V_4(α, β, γ) = exp(−i ((α, β, γ) · ~σ^(4)) / 2)   (19)

where ~σ = (σ_x, σ_y, σ_z) are again the usual Pauli matrices.
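To make the global strategy concrete, the following self-contained toy version (plain Python, no Qiskit; all function names are ours) trains the N = 2 analogue of Eqs. (11) and (15): a Ry + CNOT ansatz with a single entangling cycle, unsampling the entangled state encoding ~w = (1, −1, −1, −1). In place of the Nelder-Mead/COBYLA runs used in the paper, the classical feedback loop here is an analytic coordinate update in the style of Rotosolve, exploiting the fact that the cost is sinusoidal in each individual angle:

```python
import math
import random

N = 2                                # qubits; m = 4 classical entries
PSI_W = [0.5, -0.5, -0.5, -0.5]      # |psi_w> for w = (1,-1,-1,-1), Eq. (3)

def apply_ry(state, q, theta):
    """Ry(theta) on qubit q (qubit 0 = most significant bit of the index)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    bit = 1 << (N - 1 - q)
    out = state[:]
    for j in range(len(state)):
        if not j & bit:
            a, b = state[j], state[j | bit]
            out[j] = c * a - s * b
            out[j | bit] = s * a + c * b
    return out

def apply_cnot(state, control, target):
    cb, tb = 1 << (N - 1 - control), 1 << (N - 1 - target)
    out = state[:]
    for j in range(len(state)):
        if j & cb and not j & tb:
            out[j], out[j | tb] = state[j | tb], state[j]
    return out

def ansatz(params):
    """Eq. (15) with one entangling cycle: V = R(t2,t3) . CNOT . R(t0,t1)."""
    s = apply_ry(PSI_W, 0, params[0])
    s = apply_ry(s, 1, params[1])
    s = apply_cnot(s, 0, 1)
    s = apply_ry(s, 0, params[2])
    s = apply_ry(s, 1, params[3])
    return s

def cost(params):
    """Global cost, Eq. (11): 1 - |<11|V(theta)|psi_w>|^2."""
    return 1 - ansatz(params)[-1] ** 2

def rotosolve(params, sweeps=40):
    """Coordinate minimization: each slice of the cost in a single Ry angle
    is a sinusoid, so its exact minimizer has a closed form."""
    p = params[:]
    for _ in range(sweeps):
        for k in range(len(p)):
            theta = p[k]
            y0 = cost(p)
            p[k] = theta + math.pi / 2
            yp = cost(p)
            p[k] = theta - math.pi / 2
            ym = cost(p)
            p[k] = theta - math.pi / 2 - math.atan2(2 * y0 - yp - ym, yp - ym)
    return p

random.seed(7)
starts = ([random.uniform(-math.pi, math.pi) for _ in range(4)] for _ in range(40))
best = min((rotosolve(s) for s in starts), key=cost)
print(round(cost(best), 6))
```

Because this |ψ_w⟩ is maximally entangled, the CNOT is strictly necessary; an exact solution exists at θ~ = (π/2, 0, 3π/2, π), where the cost of Eq. (11) vanishes, and the trained circuit has the depth-3 R–E–R layout of the global ansatz.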
We implemented both versions of the variational training in Qiskit [42], combining exact simulation of the quantum circuits required to evaluate the cost function with classical Nelder-Mead [43] and Cobyla [44] optimizers from the scipy Python library. We find that the values n = 3 and n′ = 2 allow the routine to reach total fidelities to the target state |1⟩^⊗N well above 99.99%. As shown in Fig. 2, this in turn guarantees a correct reproduction of the exact activation probabilities of the quantum artificial neuron with a quantum circuit depth of 19 (29) for the global (qubit-by-qubit) strategy, as compared to a total depth equal to 49 for the exact implementation of U_w using hypergraph states. This counting does not include the gate operations required to prepare the input state, i.e. it only evidences the different realizations of the U_w implementation, assuming that each |ψ_i⟩ is provided already in the form of a wavefunction. Moreover, the multi-controlled C^p Z operations appearing in the exact version were decomposed into single-qubit rotations and CNOTs without the use of additional working qubits. Notice that these conditions are the ones usually met in real near-term superconducting hardware endowed with a fixed set of universal operations.

Fig. 3. Optimization of the global unitary with nearest-neighbours entanglement for three different structures differing in the number of entangling blocks n. The cost function is defined via |⟨11...1|V(θ~)|ψ_w⟩|² = 1 − F(θ~), see Eq. (11). Only for n = 3 does the learning model have enough expressibility to reach a good final fidelity. The classical optimizer used in this case was COBYLA [44].

D. Structure of the ansatz and scaling properties

In many practical applications, the implementation of the entangling block E could prove technically challenging, in particular for near-term quantum devices based, e.g., on superconducting wiring technology, for which the available connectivity between qubits is limited. For this reason, it is useful to consider a more hardware-friendly entangling scheme, which we refer to as nearest neighbours. In this case, each qubit is entangled with at most two other qubits, essentially assuming the topology of a linear chain:

E_nn = Π_{q=1}^{3} CNOT_{q,q+1}   (20)

This scheme may require even fewer two-qubit gates to be implemented with respect to the all-to-all scheme presented above. Moreover, this entangling unitary fits perfectly well on those quantum processors consisting of linear chains of qubits or heavy hexagonal layouts.

Fig. 4. Final fidelity obtained for the local variational training, using both the all-to-all entangler E (17) and the nearest-neighbour entangler E_nn (20). On top of each rectangle, in light blue, we report the depth of the quantum circuit implementing that given structure with that particular entangling scheme. For clarity, a structure '211' corresponds to a variational model having two repetitions (n′_1 = 2) for the first layer acting on all 4 qubits, and one cycle (n′_2 = n′_3 = 1) for the remaining two layers acting on 3 and 2 qubits, respectively. Each bar was obtained by executing the optimization process 10 times and then evaluating the means and standard deviations (shown as error bars). The optimization procedure was performed using COBYLA [44].

We implemented both global and local variational learning procedures with nearest-neighbours entanglers in Qiskit [42], using exact simulation of the quantum circuits with classical optimizers to drive the learning procedure. In the following, we report an extensive analysis of the performances and a comparison with the all-to-all strategy introduced in Sec. III-C above. All the simulations are performed by assuming the same cross-shaped target weight vector ~w depicted in Fig. 2.

In Figure 3 we show an example of the typical optimization procedure for three different choices of the ansatz depth (i.e. number of entangling cycles) n = 1, 2, 3, assuming a global cost function. Here we find that n = 3 allows the routine to reach a fidelity to the target state |1⟩^⊗N above 99%.

In the local qubit-by-qubit variational scheme, we can actually introduce an additional degree of freedom by allowing the number of cycles per qubit, n′, to vary between successive layers corresponding to the different stages of the optimization procedure. For example, we may want to use a deeper ansatz for the first unitary acting on all the qubits, and shallower ones for smaller subsystems. We thus introduce a different n′_j for each V_j(θ~_j) in Eq. (18), and we name structure the resulting string of per-layer cycle numbers.

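Operationally, the local cost of Eq. (13) reduces to the probability of not finding the j-th qubit in |1⟩, obtained by marginalizing over the rest of the register. A minimal sketch (pure Python on statevectors, function names ours) for the first layer of the N = 2 example:

```python
def single_qubit_marginal(state, q, n):
    """<1| Tr_rest[rho] |1> in Eq. (13): probability of qubit q being in |1>,
    i.e. the sum of |amplitude|^2 over basis labels with that bit set
    (qubit 0 = most significant bit of the index)."""
    bit = 1 << (n - 1 - q)
    return sum(abs(a) ** 2 for j, a in enumerate(state) if j & bit)

def local_cost(state, q, n):
    """F_j of Eq. (13): one minus the fidelity of qubit q to |1>."""
    return 1 - single_qubit_marginal(state, q, n)

# |psi_w> for w = (1,-1,-1,-1): each qubit is maximally mixed, so before any
# training the local cost of the first qubit is exactly 1/2
psi_w = [0.5, -0.5, -0.5, -0.5]
print(local_cost(psi_w, 0, 2))              # -> 0.5
# after a perfect disentangling layer the register reaches |11>, where F_j = 0
print(local_cost([0.0, 0.0, 0.0, 1.0], 0, 2))  # -> 0.0
```

This is exactly the quantity a hardware run would estimate by repeatedly measuring only the j-th qubit, ignoring the rest of the register.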

Fig. 5. Final fidelities for different structures of the local variational learning model with a nearest-neighbour entangler, for the case of N = 5 qubits. Similarly to the case with N = 4 qubits portrayed in Figure 4, the most depth-efficient structure is the one consisting of a constantly decreasing number of cycles.

Fig. 6. Number of iterations of the classical optimizer to reach a fidelity of F = 95%. Each point in the plot is obtained by running the optimization procedure 10 times and then evaluating the mean and standard deviation (shown as error bars in the plot). All results refer to exact simulations of the quantum circuits in the absence of statistical measurement sampling or device noise, performed with Qiskit statevector_simulator.

‘n_1n_2n_3’. The latter denotes a learning model consisting of three optimization layers: V_1(θ~_1) with n_1 entangling cycles, V_2(θ~_2) with n_2 and V_3(θ~_3) with n_3, respectively. In the last step of the local optimization procedure, i.e. when a single qubit is involved, we always assume a single 3-parameter rotation, see Eq. (19). A similar notation will also be applied in the following when scaling up to N > 4 qubits.

The effectiveness of different structures is explored in Figure 4. We see that, while the all-to-all entangling scheme typically performs better than the nearest neighbour one, this increase in performance comes at the cost of deeper circuits. Moreover, the stepwise decreasing structure ‘321’ for the nearest neighbour entangler proves to be an effective solution to the problem, achieving a good final accuracy (above 99%) with a low circuit depth. This trend is also confirmed for the higher dimensional case of N = 5 qubits, which we report in Fig. 5. Here, the dimension of the underlying pattern recognition task is increased by extending the original 16-bit weight vector ~w with extra 0s in front of the binary representation k_w. In fact, it can easily be seen that, assuming nearest neighbour entangling blocks, the decreasing structure ‘4321’ gives the best performance-depth tradeoff. This empirical fact, namely that the most efficient structure is typically the one consisting of decreasing depths, can be heuristically interpreted by recalling again that, in general, the optimization of a function depending on the state of a large number of qubits is a hard training problem [14]. Although we employ local cost functions, to complete our particular task each variational layer needs to successfully disentangle a single qubit from all the others still present in the register. It is therefore not totally surprising that the optimization carried out in larger subsystems requires more repetitions and parameters (i.e. larger n′_j) in order to make the ansatz more expressive.

By assuming that the stepwise decreasing structure remains sufficiently good also for larger numbers of qubits, we studied the optimization landscape of global, Eq. (11), and local, Eq. (13), cost functions by investigating how the hardness of the training procedure scales with increasing N. As commented above for N = 5, we keep the same underlying target ~w, which we expand by appending extra 0s in the binary representation. To account for the stochastic nature of the optimization procedure, we run many simulations of the same learning task and report the mean number of iterations needed for the classical optimizer to reach a given target fidelity F = 95%. Results are shown in Figure 6. The most significant message is that the use of the aforementioned local cost function seems to require higher classical resources to reach a given target fidelity when the number of qubits increases. This actually should not come as a surprise, since the number of parameters to be optimized in the two cases is different. In fact, in the global scenario there are N + N·n parameters to be optimized (the first N is due to the initial layer of rotations), while in the local case there are N + N·n′_1 for the first layer, (N − 1) + (N − 1)·n′_2 for the second, and so on, for a total of

    #local = Σ_{q=2}^{N} (q + q·n′_q) + 3    (21)

where the final 3 is due to the fact that the last layer always consists of a rotation on the Bloch sphere with three parameters, see Eq. (19). Using the stepwise decreasing structure, that is n′_q = q − 1, we eventually obtain Σ_{q=2}^{N} [q + q(q − 1)] = Σ_{q=2}^{N} q² ∼ O(N³), compared to #global ∼ O(N²). Here we are assuming a number of layers n = N − 1, consistently with the N = 4 qubits case (see Figure 3). While in the global case the optimization makes full use of the available parameters to globally optimize the state towards |1⟩⊗N, the local unitary has to go through multiple disentangling stages, requiring (at least for the cases presented here) more classical iteration steps. At the same time, it would probably be interesting to investigate other examples in which the number of parameters between the two alternative schemes remains fixed, as this would most likely narrow the differences and provide a more direct comparison.
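The parameter counting above can be checked with a few lines of code. This is an illustrative sketch only: the function names and the mapping of cycle numbers are conventions chosen here to mirror Eq. (21) and the global count N + N·n, not code from the paper.

```python
def n_params_global(N, n):
    # One initial layer of N single-qubit rotations, plus N rotations
    # after each of the n entangling cycles.
    return N + N * n

def n_params_local(N, cycles):
    # Eq. (21): the stage acting on q qubits contributes q initial rotations
    # plus q rotations per entangling cycle (cycles[q] = n'_q); the last
    # stage is always a single 3-parameter rotation, see Eq. (19).
    return sum(q + q * cycles[q] for q in range(2, N + 1)) + 3

def stepwise(N):
    # Stepwise decreasing structure n'_q = q - 1, e.g. '321' for N = 4.
    return {q: q - 1 for q in range(2, N + 1)}

for N in range(4, 8):
    print(N, n_params_global(N, n=N - 1), n_params_local(N, stepwise(N)))
```

With n′_q = q − 1 the local count grows as Σ q² ∼ O(N³), while the global count N + N(N − 1) grows as O(N²), in line with the discussion above.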

In agreement with similar investigations [45], we can actually conclude that only modest differences between global and local layer-wise optimization approaches are present when dealing with exact simulations (i.e. free from statistical and hardware noise) of the quantum circuit. Indeed, both strategies achieve good results and a final fidelity F(θ~) > 99%. At the same time, it becomes interesting to investigate how the different approaches behave in the presence of noise, and specifically statistical noise coming from measurement operations. For this reason, we implemented the measurement sampling using the Qiskit qasm_simulator and employed a stochastic (SPSA) classical optimization method. Each benchmark circuit is executed nshots = 1024 times in order to reconstruct the statistics of the outcomes. Moreover, we repeat the stochastic optimization routine multiple times to analyze the average behaviour of the cost function. In Figure 7 we show the optimization procedure for the local and global cost functions in the presence of measurement noise, with both of them reaching acceptable and comparable final fidelities Flocal = 0.87 ± 0.02 and Fglobal = 0.89 ± 0.02. Notice that for the local case (Figure 7(a)) each colored line indicates the optimization of a V_j(θ~_j) from Eq. (13). We observe that the training for the local model generally requires fewer iterations, with an effective optimization of each single layer. On the contrary, in the presence of measurement noise the global variational training struggles to find a good direction for the optimization and eventually follows a slowly decreasing path to the minimum. These findings look to be in agreement, e.g., with results from Refs. [45], [46]: with the introduction of statistical shot noise, the performances of the global model are heavily affected, while the local approach proves to be more resilient and capable of finding a good gradient direction in the parameter space [46]. In all these simulations, the parameters in the global unitary and in the first layer of the local unitary were initialized with a random distribution in [0, 2π). All subsequent layers in the local model were initialized with all parameters set to zero in order to allow for smooth transitions from one optimization layer to the following. This strategy was actually suggested as a possible way to mitigate the occurrence of barren plateaus [45], [47].

Fig. 7. Optimization of cost functions for the local (a) and global (b) case in the presence of measurement noise for N = 5 qubits. In each figure we plot the mean values averaged on 5 runs of the simulation, with shaded colored areas denoting one standard deviation. The number of measurement repetitions in each simulation was 1024. The final fidelities at the end of the training procedure in this case were Flocal = 0.87 ± 0.02 and Fglobal = 0.89 ± 0.02. Notice the difference in the horizontal axis bounds. (a) Optimization of the local cost functions V_j(θ~_j) (see Eq. (13)), plotted with different colors for clarity. The vertical dashed lines denote the end of the optimization of one layer and the start of the optimization of the following one. (b) Optimization of the global cost function V(θ~) in Eq. (11).

We conclude the scaling analysis by reporting in Fig. 8 a summary of the quantum circuit depths required to implement the target unitary transformation with different strategies and for increasing sizes of the qubit register up to N = 7. As can be seen, all the variational approaches scale much better when compared to the exact implementation of the target Uw, with the global ones requiring shallower depths in this specific case. In addition, we recall that the use of an all-to-all entangling scheme requires longer circuits due to the implementation of all the CNOTs, but generally needs fewer ansatz cycles (see Figure 4). Finally, while the global procedures seem to provide a better alternative compared to local ones in terms of circuit depth, they might be more prone to suffering from classical optimization issues [14], [45] when trained and executed on real hardware, as suggested by the data reported in Fig. 7. The overall promising results confirm the significant advantage brought by variational strategies compared to the exponential increase of complexity required by the exact formulation of the algorithm.

Fig. 8. Scaling of circuit depth for the implementation of Uw computed with Qiskit. The labels local and global refer to the local and global variational approaches, while a2a and nn refer to the all-to-all and nearest-neighbour entangling schemes, respectively. The number of ansatz cycles used for both the global (n) and local/qubit-by-qubit (n′) variational constructions and for each entangling structure is increased with the number of qubits up to the minimum value guaranteeing a fidelity of the approximation above 98%.
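The stochastic optimization under finite-shot noise described above can be sketched with a plain SPSA loop on a toy sampled cost. This is an assumption-laden illustration: the cost function, the gain schedules (a, c, alpha, gamma), and all names are placeholders chosen here, not the actual circuit costs of Eqs. (11) and (13). The number of shots matches the nshots = 1024 of the text, and the parameters are initialized at random in [0, 2π) as in the simulations.

```python
import numpy as np

rng = np.random.default_rng(0)
SHOTS = 1024  # measurement repetitions per cost evaluation, as in the text

def sampled_cost(theta):
    """Toy stand-in for a measured cost: estimate 1 - P(all qubits in |1>)
    for independent single-qubit rotations, from a finite number of shots."""
    p = np.prod(np.sin(theta / 2.0) ** 2)        # exact success probability
    return 1.0 - rng.binomial(SHOTS, p) / SHOTS  # finite-shot estimate

def spsa_step(theta, cost, k, a=0.2, c=0.15, alpha=0.602, gamma=0.101):
    """One SPSA iteration: a simultaneous random perturbation of all
    parameters yields a gradient estimate from only two cost evaluations,
    which makes the method robust to statistical measurement noise."""
    ak = a / (k + 1) ** alpha                    # decaying step size
    ck = c / (k + 1) ** gamma                    # decaying perturbation size
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    g = (cost(theta + ck * delta) - cost(theta - ck * delta)) / (2.0 * ck)
    return theta - ak * g * delta

theta = rng.uniform(0.0, 2.0 * np.pi, size=3)    # random init in [0, 2*pi)
for k in range(300):
    theta = spsa_step(theta, sampled_cost, k)
```

In a layer-wise scheme, each local cost V_j(θ~_j) would be minimized in turn by such a loop, with the layers after the first initialized at zero so that the transition between optimization stages stays smooth.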
IV. CONCLUSIONS

In this work, we reviewed an exact model for the implementation of artificial neurons on a quantum processor and we introduced variational training methods for efficiently handling the manipulation of classical and quantum input data. Through extensive numerical analysis, we compared the effectiveness of different circuit structures and learning strategies, highlighting potential benefits brought by hardware-compatible entangling operations and by layerwise training routines. Our work suggests that quantum unsampling techniques represent a useful resource, upon input of quantum training sets, to be integrated in quantum machine learning applications. From a theoretical perspective, our proposed procedure allows for an explicit and direct quantification of possible quantum computational advantages for classification tasks. It is also worth pointing out that such a scheme remains fully compatible with recently introduced architectures for quantum feed-forward neural networks [30], which are needed in general to deploy, e.g., complex convolutional filters. Moreover, although the interpretation of quantum hypergraph states as memory-efficient carriers of classical information guarantees an optimal use of the available dimension of an N-qubit Hilbert space, the variational techniques introduced here can in principle be used to learn different encoding schemes designed, e.g., to include continuous-valued features or to improve the separability of the data to be classified [23], [24], [48]. In all envisioned applications, our proposed protocols are intended as an effective method for the analysis of quantum states as provided, e.g., by external devices or sensors, while it is worth stressing that the general problem of efficiently loading classical data into quantum registers still stands open. Finally, on a more practical level, a successful implementation on near-term quantum hardware of the variational learning algorithm introduced in this work will necessarily rely on a deeper analysis of the impact of realistic noise effects both on the training procedure and on the final optimized circuit. In particular, we anticipate that the reduced circuit depth produced via the proposed method could critically lessen the quality requirements for quantum hardware, eventually leading to meaningful implementations of quantum neural networks within the near-term regime.

ACKNOWLEDGMENTS

A preliminary version of this work was presented at the 2020 IEEE International Conference on Quantum Computing and Engineering [49]. We acknowledge support from SNF grant 200021 179312. IBM, the IBM logo, and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. The current list of IBM trademarks is available at https://www.ibm.com/legal/copytrade.

REFERENCES

[1] W. S. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.
[2] F. Rosenblatt, “The perceptron: A perceiving and recognizing automaton,” Cornell Aeronautical Laboratory, Inc., Tech. Rep. 85-460-1, 1957.
[3] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, no. 6088, pp. 533–536, 1986.
[4] G. Cybenko, “Approximation by superpositions of a sigmoidal function,” Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303–314, 1989.
[5] K. Hornik, “Approximation capabilities of multilayer feedforward networks,” Neural Networks, vol. 4, no. 2, pp. 251–257, 1991.
[6] J. M. Zurada, Introduction to Artificial Neural Systems. West Group, 1992.
[7] R. Rojas, Neural Networks: A Systematic Introduction. Springer, 1996.
[8] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, “Quantum machine learning,” Nature, vol. 549, no. 7671, pp. 195–202, 2017.
[9] A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, “Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets,” Nature, vol. 549, no. 7671, pp. 242–246, 2017.
[10] Y. Cao, J. Romero, J. P. Olson, M. Degroote, P. D. Johnson, M. Kieferová, I. D. Kivlichan, T. Menke, B. Peropadre, N. P. D. Sawaya, S. Sim, L. Veis, and A. Aspuru-Guzik, “Quantum Chemistry in the Age of Quantum Computing,” Chemical Reviews, vol. 119, no. 19, pp. 10856–10915, 2019.
[11] N. Moll, P. Barkoutsos, L. S. Bishop, J. M. Chow, A. Cross, D. J. Egger, S. Filipp, A. Fuhrer, J. M. Gambetta, M. Ganzhorn, A. Kandala, A. Mezzacapo, P. Müller, W. Riess, G. Salis, J. Smolin, I. Tavernelli, and K. Temme, “Quantum optimization using variational algorithms on near-term quantum devices,” Quantum Science and Technology, vol. 3, no. 3, p. 030503, 2018.
[12] E. Farhi and H. Neven, “Classification with Quantum Neural Networks on Near Term Processors,” arXiv:1802.06002, 2018.
[13] E. Grant, M. Benedetti, S. Cao, A. Hallam, J. Lockhart, V. Stojevic, A. G. Green, and S. Severini, “Hierarchical quantum classifiers,” npj Quantum Information, vol. 4, no. 1, p. 65, 2018.
[14] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, “Barren plateaus in quantum neural network training landscapes,” Nature Communications, vol. 9, no. 1, p. 4812, 2018.
[15] N. Killoran, T. R. Bromley, J. M. Arrazola, M. Schuld, N. Quesada, and S. Lloyd, “Continuous-variable quantum neural networks,” Phys. Rev. Research, vol. 1, p. 033063, 2019.
[16] M. Benedetti, E. Lloyd, S. Sack, and M. Fiorentini, “Parameterized quantum circuits as machine learning models,” Quantum Science and Technology, vol. 4, no. 4, p. 043001, 2019.
[17] A. Mari, T. R. Bromley, J. Izaac, M. Schuld, and N. Killoran, “Transfer learning in hybrid classical-quantum neural networks,” Quantum, vol. 4, p. 340, 2020.
[18] I. Cong, S. Choi, and M. D. Lukin, “Quantum convolutional neural networks,” Nature Physics, vol. 15, pp. 1273–1278, 2019.
[19] M. Schuld, A. Bocharov, K. M. Svore, and N. Wiebe, “Circuit-centric quantum classifiers,” Phys. Rev. A, vol. 101, p. 032308, 2020.
[20] M. Henderson, S. Shakya, S. Pradhan, and T. Cook, “Quanvolutional neural networks: powering image recognition with quantum circuits,” Quantum Machine Intelligence, vol. 2, no. 2, 2020.
[21] M. Broughton, G. Verdon, T. McCourt, A. J. Martinez, J. H. Yoo, S. V. Isakov, P. Massey, M. Y. Niu, R. Halavati, E. Peters, M. Leib, A. Skolik, M. Streif, D. Von Dollen, J. R. McClean, S. Boixo, D. Bacon, A. K. Ho, H. Neven, and M. Mohseni, “TensorFlow Quantum: A Software Framework for Quantum Machine Learning,” arXiv:2003.02989, 2020.
[22] P. Rebentrost, M. Mohseni, and S. Lloyd, “Quantum Support Vector Machine for Big Data Classification,” Physical Review Letters, vol. 113, no. 13, p. 130503, 2014.
[23] V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow, and J. M. Gambetta, “Supervised learning with quantum-enhanced feature spaces,” Nature, vol. 567, no. 7747, pp. 209–212, 2019.
[24] M. Schuld and N. Killoran, “Quantum Machine Learning in Feature Hilbert Spaces,” Physical Review Letters, vol. 122, no. 4, p. 040504, 2019.
[25] M. Schuld, I. Sinayskiy, and F. Petruccione, “Simulating a perceptron on a quantum computer,” Physics Letters A, vol. 379, no. 7, pp. 660–663, 2015.
[26] N. Wiebe, A. Kapoor, and K. M. Svore, “Quantum Perceptron Models,” arXiv:1602.04799, 2016.
[27] Y. Cao, G. G. Guerreschi, and A. Aspuru-Guzik, “Quantum Neuron: an elementary building block for machine learning on quantum computers,” arXiv:1711.11240, 2017.
[28] F. Tacchino, C. Macchiavello, D. Gerace, and D. Bajoni, “An artificial neuron implemented on an actual quantum processor,” npj Quantum Information, vol. 5, p. 26, 2019.
[29] E. Torrontegui and J. J. García-Ripoll, “Unitary quantum perceptron as efficient universal approximator,” EPL (Europhysics Letters), vol. 125, no. 3, p. 30004, 2019.
[30] F. Tacchino, P. Barkoutsos, C. Macchiavello, I. Tavernelli, D. Gerace, and D. Bajoni, “Quantum implementation of an artificial feed-forward neural network,” Quantum Science and Technology, vol. 5, no. 4, p. 044010, 2020.
[31] L. B. Kristensen, M. Degroote, P. Wittek, A. Aspuru-Guzik, and N. T. Zinner, “An Artificial Spiking Quantum Neuron,” arXiv:1907.06269, 2019.
[32] S. Mangini, F. Tacchino, D. Gerace, C. Macchiavello, and D. Bajoni, “Quantum computing model of an artificial neuron with continuously valued input data,” Machine Learning: Science and Technology, vol. 1, no. 4, p. 045008, 2020.
[33] L. G. Wright and P. L. McMahon, “The Capacity of Quantum Neural Networks,” arXiv:1908.01364, 2019.
[34] N.-H. Chia, A. Gilyén, T. Li, H.-H. Lin, E. Tang, and C. Wang, “Sampling-based sublinear low-rank matrix arithmetic framework for dequantizing quantum machine learning,” in STOC 2020: Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pp. 387–400, 2020.
[35] M. Schuld, M. Fingerhuth, and F. Petruccione, “Implementing a distance-based classifier with a quantum interference circuit,” EPL (Europhysics Letters), vol. 119, no. 6, 2017.
[36] P. Rebentrost, T. R. Bromley, C. Weedbrook, and S. Lloyd, “Quantum Hopfield neural network,” Physical Review A, vol. 98, no. 4, p. 042308, 2018.
[37] M. Rossi, M. Huber, D. Bruß, and C. Macchiavello, “Quantum hypergraph states,” New Journal of Physics, vol. 15, no. 11, p. 113022, 2013.
[38] V. Giovannetti, S. Lloyd, and L. Maccone, “Quantum Random Access Memory,” Physical Review Letters, vol. 100, no. 16, p. 160501, 2008.
[39] M. Ghio, D. Malpetti, M. Rossi, D. Bruß, and C. Macchiavello, “Multipartite entanglement detection for hypergraph states,” Journal of Physics A: Mathematical and Theoretical, vol. 51, no. 4, p. 045302, 2018.
[40] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, “A variational eigenvalue solver on a photonic quantum processor,” Nature Communications, vol. 5, no. 1, p. 4213, 2014.
[41] J. Carolan, M. Mohseni, J. P. Olson, M. Prabhu, C. Chen, D. Bunandar, M. Y. Niu, N. C. Harris, F. N. C. Wong, M. Hochberg, S. Lloyd, and D. Englund, “Variational quantum unsampling on a quantum photonic processor,” Nature Physics, vol. 16, p. 322, 2020.
[42] G. Aleksandrowicz et al., “Qiskit: An open-source framework for quantum computing,” 2019.
[43] J. A. Nelder and R. Mead, “A Simplex Method for Function Minimization,” The Computer Journal, vol. 7, no. 4, pp. 308–313, 1965.
[44] M. J. Powell, “A direct search optimization method that models the objective and constraint functions by linear interpolation,” in Advances in Optimization and Numerical Analysis, S. Gomez and J.-P. Hennart, Eds. Dordrecht: Kluwer Academic, 1994, pp. 51–67.
[45] A. Skolik, J. R. McClean, M. Mohseni, P. van der Smagt, and M. Leib, “Layerwise learning for quantum neural networks,” Quantum Machine Intelligence, vol. 3, no. 5, 2021.
[46] M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, “Cost-function-dependent barren plateaus in shallow quantum neural networks,” arXiv:2001.00550, 2020.
[47] E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, “An initialization strategy for addressing barren plateaus in parametrized quantum circuits,” Quantum, vol. 3, p. 214, 2019.
[48] H. Buhrman, R. Cleve, J. Watrous, and R. de Wolf, “Quantum Fingerprinting,” Physical Review Letters, vol. 87, no. 16, p. 167902, 2001.
[49] F. Tacchino, P. Barkoutsos, C. Macchiavello, D. Gerace, I. Tavernelli, and D. Bajoni, “Variational learning for quantum artificial neural networks,” in Proceedings of the IEEE International Conference on Quantum Computing and Engineering, 2020.