A Simple Quantum Neural Net with a Periodic Activation Function

Ammar Daskin
Department of Computer Engineering, Istanbul Medeniyet University, Kadikoy, Istanbul, Turkey
Email: adaskin25-at-gmail-dot-com

Abstract—In this paper, we propose a simple neural net that requires only O(n log_2 k) qubits and O(nk) quantum gates: here, n is the number of input parameters and k is the number of weights applied to these parameters in the proposed neural net. We describe the network in terms of a quantum circuit and then draw its equivalent classical neural net, which involves O(k^n) nodes in the hidden layer. We then show that the network uses a periodic activation function of the cosine values of the linear combinations of the inputs and weights. The backpropagation is described through gradient descent, and the iris and breast cancer datasets are used for the simulations. The numerical results indicate that the network can be used in machine learning problems and that it may provide exponential speedup over the same structured classical neural net.

Index Terms—quantum machine learning, quantum neural networks.

Neural networks are composed of many non-linear components that mimic the learning mechanism of the human brain. The training of a network is done by adjusting the weight constants applied to the input parameters. However, the numbers of input parameters and layers considered in these networks increase the computational cost dramatically. Quantum computers are believed to be more powerful computational machines which may allow us to solve many intractable problems in science and engineering. Although building useful quantum computers with many qubits is the main focus of recent experimental research efforts [1], the complete use of these computers is only possible with novel algorithms that provide computational speed-up over classical algorithms.

Although there were many early efforts to describe quantum perceptrons (e.g. [2]) and neural network models (e.g. [3], [4], [5]), along with general discussions of quantum learning [6], [7], research in quantum machine learning [8], [9], [10] and quantum big data analysis (e.g. [11], [12]) gained momentum in recent years. Various quantum learning algorithms and subroutines have been proposed (see the review articles [8], [9], [10] and the survey [13] on general quantum learning theory). While many of the recent algorithms are based on variational quantum circuits [14], [15], [16], [17], [18], some of them employ quantum algorithms directly: for instance, Ref. [19] uses the Grover search algorithm [20] to extract the solution from a state prepared by directly mapping weights and inputs to the qubits; the measurement at the output of a layer is used to decide the inputs to the hidden layers. In addition, Ref. [21] has used phase estimation to imitate the output of a classical perceptron, where the binary input is mapped to the second register of the algorithm and the weights are implemented by phase gates.

The main problem in current quantum learning algorithms is to tap the full power of artificial neural networks in the quantum realm by providing robust algorithms that map data from the classical realm to the quantum one and process this data in a nonlinear way, similar to classical neural networks. It has been shown that a repeat-until-success circuit can be used to create a quantum perceptron with nonlinear behavior as a main building block of quantum neural nets [22]. It is also explained in Ref. [23] how mapping data into a Hilbert space can help kernel-based learning algorithms.

Superposition is one of the physical phenomena that allow us to design computationally more efficient quantum algorithms. In this paper, we present a quantum neural net that fully utilizes the superposition phenomenon. After describing the network as a quantum circuit, we analyze the quantum state at the circuit output and show that it relates to a neural net with a periodic activation function involving the cosine values of the weighted sums of the input parameters. We then present the complexity of the network and show numerical simulations for two different data sets.

I. QUANTUM NEURAL NET

In classical neural networks, linear combinations of input parameters with different weights are fed into multiple neurons. The output of each neuron is determined by an activation function such as the following one (see Ref. [24] for a smooth introduction):

\text{output} = \begin{cases} 0 & \text{if } \sum_j w_j x_j \leq \text{threshold} \\ 1 & \text{if } \sum_j w_j x_j > \text{threshold} \end{cases}   (1)

Nonlinear activation functions such as hyperbolic and sigmoid functions are more commonly used to make the output of a neuron smoother, i.e. a small change in any weight causes a small change in the output. It has also been argued that periodic activation functions may improve the general performance of neural nets in certain applications [25], [26], [27].
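As a point of reference for the quantum construction that follows, the sketch below (ours, not part of the paper) contrasts the step activation of Eq. (1) with a periodic, cosine-based activation of the same weighted sum; the function names and the example numbers are illustrative assumptions, and the periodic form 1 − cos(·) anticipates the activation that will be derived in Eq. (12).

```python
import numpy as np

def step_neuron(x, w, threshold=0.0):
    """Classical perceptron output as in Eq. (1):
    0 if sum_j w_j x_j <= threshold, 1 otherwise."""
    s = np.dot(w, x)
    return 0 if s <= threshold else 1

def periodic_neuron(x, w):
    """A periodic activation of the same weighted sum,
    f(a) = 1 - cos(a), anticipating Eq. (12)."""
    return 1.0 - np.cos(np.dot(w, x))

# Illustrative (assumed) input and weights.
x = np.array([0.5, 1.2])
w = np.array([0.8, -0.3])
print(step_neuron(x, w), periodic_neuron(x, w))
```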
Here, let us first assume that an input parameter x_j is expected to be seen with k different weights {w_{j1}, ..., w_{jk}} in the network. For each input, we construct the following operator to represent the input behavior of a parameter x_j:

U_{x_j} = \begin{pmatrix} e^{i w_{j1} x_j} & & & \\ & e^{i w_{j2} x_j} & & \\ & & \ddots & \\ & & & e^{i w_{jk} x_j} \end{pmatrix}.   (2)

Since U_{x_j} is a k-dimensional matrix, for each input x_j we employ log_2 k qubits. Therefore, n input parameters lead to n operators U_{x_j} and require n log_2 k qubits in total. This is depicted by the following circuit, in which each register of log_2 k qubits is acted on by its own operator:

[Circuit: n parallel registers acted on by U_{x_1}, U_{x_2}, ..., U_{x_n}, respectively.]

We can also describe the above circuit by the following tensor product:

U(\omega, x) = U_{x_1} \otimes U_{x_2} \otimes \cdots \otimes U_{x_n}.   (3)

In matrix form, this is equal to:

\begin{pmatrix} e^{i \sum_j w_{j1} x_j} & & & \\ & e^{i (\sum_{j<n} w_{j1} x_j + w_{n2} x_n)} & & \\ & & \ddots & \\ & & & e^{i \sum_j w_{jk} x_j} \end{pmatrix}.   (4)

The diagonal elements of the above matrix describe an input with different weight-parameter combinations. Here, each combination is able to describe a path (or a neuron in the hidden layer) we may have in a neural net. The proposed network with one output and n inputs is constructed by plugging this matrix into the circuit drawn in Fig. 1.

[Fig. 1: The proposed quantum neural network with 1-output and n-input parameters: a Hadamard gate is applied to the first qubit, the first qubit controls U(ω, x) acting on |ψ⟩, and a second Hadamard gate is applied to the first qubit before it is measured.]

In the circuit, initializing |ψ⟩ as an equal superposition state allows the system qubits to equally impact the first qubit, which yields the output. In order to understand how this might work as a neural net, we will go through the circuit step by step. At the beginning, the initial input to the circuit is defined by:

|0\rangle |\psi\rangle = \frac{1}{\sqrt{N}} |0\rangle \sum_j^{N} |j\rangle,   (5)

where N = k^n describes the matrix dimension and |j⟩ is the jth vector in the standard basis. After applying the Hadamard gate and the controlled U(ω, x) to the first qubit, the state becomes

\frac{1}{\sqrt{2N}} \left( |0\rangle \sum_j^{N} |j\rangle + |1\rangle \sum_j^{N} e^{i\alpha_j} |j\rangle \right).   (6)

Here, α_j describes the phase value of the jth eigenvalue of U. After the second Hadamard gate, the final state reads the following:

\frac{1}{2\sqrt{N}} \left( |0\rangle \sum_j^{N} \left(1 + e^{i\alpha_j}\right) |j\rangle + |1\rangle \sum_j^{N} \left(1 - e^{i\alpha_j}\right) |j\rangle \right).   (7)

If we measure the first qubit, the probabilities of seeing |0⟩ and |1⟩, respectively P_0 and P_1, can be obtained from the above equation as:

P_0 = \frac{1}{4N} \sum_j^{N} \left|1 + e^{i\alpha_j}\right|^2 = \frac{1}{2N} \sum_j^{N} \left(1 + \cos\alpha_j\right),   (8)

P_1 = \frac{1}{4N} \sum_j^{N} \left|1 - e^{i\alpha_j}\right|^2 = \frac{1}{2N} \sum_j^{N} \left(1 - \cos\alpha_j\right).   (9)

If a threshold function is applied to the output, then

z = \begin{cases} 0 & \text{if } P_1 \leq P_0 \\ 1 & \text{if } P_1 > P_0 \end{cases}   (11)

Here, applying the measurement a few times, we can also obtain enough statistics for P_0 and P_1, and therefore describe z as the success probability of the desired output: i.e., z = P_d.

The whole circuit can also be represented as the equivalent neural net shown in Fig. 2. In the figure, f is the activation function described by:

f(\alpha) = 1 - \cos(\alpha).   (12)

[Fig. 2: The equivalent representation of the quantum neural net for two input parameters and two weights for each input, i.e. n = 2 and k = 2: the inputs x_1 and x_2 are connected through the weights ω_11, ω_12, ω_21, ω_22 to hidden nodes with activation f(Σ), whose outputs are combined into the output z.]
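To make the construction above concrete, the following numerical sketch (ours, under the stated assumptions; the names u_xj and forward and the example weights are illustrative) builds the diagonal of U(ω, x) as the tensor product of the diagonals of Eq. (2), reads off the phases α_j, and evaluates the measurement probabilities of Eqs. (8) and (9).

```python
import numpy as np

def u_xj(weights_j, x_j):
    """Diagonal of the k x k operator of Eq. (2): e^{i w_jl x_j} for l = 1..k."""
    return np.exp(1j * np.asarray(weights_j) * x_j)

def forward(weights, x):
    """Forward pass of the 1-output network of Fig. 1.

    weights: (n, k) array, row j holding the k weights of input x_j.
    Returns (P0, P1) of Eqs. (8)-(9)."""
    n, k = weights.shape
    diag = np.array([1.0 + 0j])
    for j in range(n):
        # Tensor product of diagonal matrices = Kronecker product of their diagonals.
        diag = np.kron(diag, u_xj(weights[j], x[j]))
    alphas = np.angle(diag)                      # phases alpha_j of U(w, x), Eq. (4)
    N = k ** n
    P0 = np.sum(1 + np.cos(alphas)) / (2 * N)    # Eq. (8)
    P1 = np.sum(1 - np.cos(alphas)) / (2 * N)    # Eq. (9)
    return P0, P1

# Illustrative example with n = 2 inputs and k = 2 weights per input (as in Fig. 2).
weights = np.array([[0.3, -1.1],
                    [0.7,  0.2]])
x = np.array([1.0, 0.5])
P0, P1 = forward(weights, x)
print(P0, P1, P0 + P1)
```

Since P_0 + P_1 = 1 by construction, the last printed value also serves as a sanity check of Eqs. (8)-(9).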
A. The Cost Function

We will use the following to describe the cost of the network:

C = \frac{1}{2s} \sum_j^{s} \left(d_j - z_j\right)^2,   (13)

where d_j is the desired output for the jth sample and s is the size of the training dataset.

B. Backpropagation with Gradient Descent

The update rule for the weights is described by the following:

\omega_i = \omega_i - \eta \frac{\partial C}{\partial \omega_i}.   (14)

Here, the partial derivative can be found via the chain rule. For instance, from Fig. 2 with an input {x_1, x_2}, we can obtain the gradient for the weight ω_11 as (the constant coefficients are omitted):

\frac{\partial C_j}{\partial \omega_{11}} = \frac{\partial C_j}{\partial z_j} \frac{\partial z_j}{\partial \alpha} \frac{\partial \alpha}{\partial \omega_{11}} \approx \left(d_j - z_j\right) P_{d_j} x_1.   (15)

(A small numerical sketch of this update rule is given below.)

IV. DISCUSSION

A. Adding Biases

Biases can be added in a few different places in Fig. 1. As an example, for input x_j, we can apply a gate U_{b_j} with diagonal phases representing biases to U_{x_j}. One can also add a bias gate to the output qubit before the measurement.

B. Generalization to Multiple Outputs

Different means may be considered to generalize the network for multiple outputs. As shown in Fig. 3, one can generalize the network by sequential applications of the U_j's. Here, a U represents a generalized multi-qubit phase gate controlled by
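The following is the small sketch referred to in Sec. I-B: one gradient-descent step on the weights for a single training sample, using the same forward pass as the earlier sketch and the approximate gradient of Eq. (15) with the constant coefficients omitted. The learning rate, the example data, and the choice to broadcast the same gradient to all k weights of an input are our own illustrative assumptions, not prescriptions from the paper.

```python
import numpy as np

def forward(weights, x):
    """P0, P1 of Eqs. (8)-(9) for the network of Fig. 1 (same construction as above)."""
    n, k = weights.shape
    diag = np.array([1.0 + 0j])
    for j in range(n):
        diag = np.kron(diag, np.exp(1j * weights[j] * x[j]))
    alphas = np.angle(diag)
    N = k ** n
    return np.sum(1 + np.cos(alphas)) / (2 * N), np.sum(1 - np.cos(alphas)) / (2 * N)

def gradient_step(weights, x, d, eta=0.1):
    """One weight update following Eqs. (14)-(15).

    The gradient uses the paper's approximation (d - z) * P_d * x_j with constant
    coefficients omitted; eta and the broadcasting over the k weights of each
    input are illustrative choices."""
    P0, P1 = forward(weights, x)
    P_d = P1 if d == 1 else P0          # probability of the desired output
    z = P_d                             # z taken as the success probability (Sec. I)
    grad = (d - z) * P_d * x[:, None]   # approximate gradient of Eq. (15), shape (n, 1)
    return weights - eta * grad         # Eq. (14)

# Illustrative single-sample update with assumed data (n = 2, k = 2).
weights = np.array([[0.3, -1.1],
                    [0.7,  0.2]])
x, d = np.array([1.0, 0.5]), 1
weights = gradient_step(weights, x, d)
print(weights)
```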