Realizing a Quantum Generative Adversarial Network Using a Programmable Superconducting Processor
Total Page:16
File Type:pdf, Size:1020Kb
Realizing a quantum generative adversarial network using a programmable superconducting processor Kaixuan Huang,1, ∗ Zheng-An Wang,2, ∗ Chao Song,3, ∗ Kai Xu,2 Hekang Li,3 Zhen Wang,3 Qiujiang Guo,3 Zixuan Song,3 Zhi-Bo Liu,1, y Dongning Zheng,2, 4 Dong-Ling Deng,5, z H. Wang,3 Jian-Guo Tian,1 and Heng Fan2, 6, x 1The Key Laboratory of Weak Light Nonlinear Photonics, Ministry of Education, Teda Applied Physics Institute and School of Physics, Nankai University, Tianjin 300457, China 2Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China 3Interdisciplinary Center for Quantum Information, State Key Laboratory of Modern Optical Instrumentation, and Zhejiang Province Key Laboratory of Quantum Technology and Device, Department of Physics, Zhejiang University, Hangzhou 310027, China 4CAS Center for Excellence in Topological Quantum Computation, University of Chinese Academy of Sciences, Beijing 100190, China 5Center for Quantum Information, Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China 6School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100190, China Generative adversarial networks are an emerging technique with wide applications in machine learning [1], which have achieved dramatic success in a number of challenging tasks including image and video generation [2]. When equipped with quantum pro- cessors, their quantum counterparts|called quantum generative adversarial networks (QGANs)|may even exhibit exponential advantages in certain machine learning ap- plications [3{9]. Here, we report an experimental implementation of a QGAN using a programmable superconducting processor, in which both the generator and the dis- criminator are parameterized via layers of single- and multi-qubit quantum gates. The programmed QGAN runs automatically several rounds of adversarial learning with quantum gradients to achieve a Nash equilibrium point, where the generator can repli- cate data samples that mimic the ones from the training set. Our implementation is promising to scale up to noisy intermediate-scale quantum devices, thus paving the way for experimental explorations of quantum advantages in practical applications with near-term quantum technologies. The interplay between quantum physics and machine indeed be trained via the adversarial learning process learning gives rise to an emergent research frontier of to replicate the statistics of the single-qubit quantum quantum machine learning that has attracted tremen- data output from a quantum channel simulator. Yet, in dous attention recently [10{13]. In particular, cer- this experiment the gradient, which is crucial for train- tain carefully-designed quantum algorithms for machine ing the QGAN, was estimated numerically by the finite- learning, or more broadly artificial intelligence, may ex- difference method with a classical computer. This finite- hibit exponential advantages compared to their best pos- difference approach induces an inaccuracy to the gradient sible classical counterparts [3, 11{16]. An intriguing ex- and therefore may significantly retard the convergence ample concerns quantum generative adversarial networks to the equilibrium point [15]. In addition, the quan- (QGANs) [3], where near-term quantum devices have the tum states involved in this experiment were all single- potential to showcase quantum supremacy [17] with real- qubit states, and entanglement, a characterizing feature life practical applications. of quantumness and a vital resource for quantum advan- tages, was absent during the learning process [12]. arXiv:2009.12827v1 [quant-ph] 27 Sep 2020 The general framework of QGANs consists of a gen- erator learning to generate statistics for data mimick- In this paper, we add these two crucial yet miss- ing those of a true data set, and a discriminator trying ing blocks by reporting an experiment realization of a to discriminate generated data from true data [3]. The QGAN based on a programmable superconducting pro- generator and discriminator follow an adversarial learn- cessor with multiple qubits. Superconducting qubits are ing procedure to optimize their strategies alternatively a promising platform for realizing QGANs, owing to their and arrive at a Nash equilibrium point, where the gen- flexible design, excellent scalability and remarkable con- erator learns the underlying statistics of the true data trollability. In our implementation, both the generator and the discriminator can no longer distinguish the dif- and discriminator are composed of multiqubit parameter- ference between the true and generated data. Different ized quantum circuits, also referred to as quantum neu- versions of QGANs have been proposed [3{9]. Recently, ral networks in some contexts [18{21]. Here, we bench- a proof-of-principle experimental QGAN demonstration mark the functionality of the quantum gradient method has been reported [7], showing that the generator can by learning an arbitrary mixed state, where the state is 2 replicated with a fidelity up to 0.999. We further uti- and a quantum process tomography (QPT) is performed lize our QGAN to learn an classical XOR gate, and the to characterize the overlap fidelity between the generated generator is successfully trained to exhibit a truth table data based on the updated parameters and the true data. close to that of the XOR gate. The above mentioned QGAN is experimentally re- We first introduce a general recipe for our QGAN and alized on a superconducting quantum processor using then apply it to two typical scenarios: learning quantum 5 frequency-tunable transmon qubits labeled as Qj for states and the classical XOR gate. The overall structure j = 0 to 4, where all qubits are interconnected by a cen- of the QGAN is outlined in Fig.1 (a), which includes tral bus resonator as illustrated in Fig.1 (b). The role an input label, a real source (R), a generator (G), a dis- arrangement of Q0 to Q4 can be visualized by the exem- criminator (D), and a quantum gradient subroutine. The plary experimental sequence shown in Fig1 (c). Q 0 is input label sorts the training data stored in R, and also to assist the quantum gradient subroutine. Q1-Q2 stores instructs G to generate data samples mimicking R. D re- the label which is passed to D, with the output score D;R/G z ceives the label and corresponding data samples from ei- encoded in Q1 by Sn = hσ1 i=2 + 1=2. For the ex- ther G or R, and then evaluates with appropriate scores, perimental instances with G inserted, Q3-Q4 also stores based on which a loss function V is constructed to dif- the label as the input to G and the output data sample ferentiate between R and G. The adversarial training from G is passed to D via Q3; for the experimental in- procedure is repeated in conjugation with the quantum stances with R in replacement of G, only Q3 stores the gradient subroutine, which yields the partial derivatives data sample from R as designated by the label. Details of V with respect to the parameters constructing D or of the device parameters can be found in supplementary G, so as to maximize V for the optimal configuration of materials (SM) [24] and Ref. [25]. D and to minimize V for the optimal G in terms of the By tuning the qubits on resonance but detuned from chosen values of the constructing parameters. the bus resonator in frequency, these qubits can be all Specifically, denoting the sets of parameters construct- effectively connected, which enables the flexible realiza- ~ ~ ing G and D as θG and θD, respectively, the loss function tions of the multiqubit entangling gates among arbitrar- V is written as ily selected qubits. The all-to-all interactions are de- 1 N h i scribed in the dispersive regime by the effective Hamil- X D;R ~ D;G ~ ~ + − − + + − V = S (θD) − S (θD; θG) ; tonian H = P λ (σ σ + σ σ )[25], where σ (σ ) N n n I jk j k j k j j n=1 denotes the raising (lowering) operator of Qj, and λjk is where N is the total number of data samples selected the effective coupling strength between Qj and Qk medi- D;R D;G for training and Sn (Sn ) represents the score of the ated by the bus resonator. Evolution under this Hamil- nth data sample from R (G) evaluated by D. G and tonian for an interaction time τ leads to the entangling −iHIτ D are trained alternately with D being trained first. In operator with the form of UENT = e , which can D's turn, we maximize the loss function by iteratively steer the interacting qubits into highly entangled state. ~ ~i+1 ~i optimizing θD according to θ = θ +αDr V , where i In our QGAN, the parameterized quantum circuits that D D θ~D is the iteration step index and r V denotes the gradient comprise G and D leverage the naturally available multi- θ~D vector of the loss function at the ith step; we train G qubit UENTs, with the interaction time of the two-qubit by minimizing the square of the loss function with an UENT fixed at around 50 ns for G and the three-qubit ~i+1 ~i 2 iteration relation θ = θ − αGr V . The learning one fixed at around 55 ns for D [24]. We note that, in G G θ~G rates can be adjusted by tuning αD and αG which are this hardware-efficient realization of the QGAN, the en- typically on the order of unity. tangling operators can be any device-tailored operations An important quantity that plays a vital role in train- that generate sufficient entanglement. ing our QGAN is the gradient of the loss function with As laid out in Fig.1 (c), the entangling operators UENT respect to a given parameter. Interestingly, owing to the are interleaved with the single-qubit X and Z rotations, special structures of our quantum circuits for the genera- which successively rotate Qj around x- and z-axis in the x z tor and the discriminator, a common approach to obtain Bloch sphere by angles of θj;l;m and θj;l;m, where l is such a gradient is to shift the corresponding parameter the layer index and m 2 fG; Dg.