Quantum Gradient Estimation and Its Application to Quantum Reinforcement Learning

A.J. Cornelissen
August 21, 2018

Master thesis

Student number: 4322231
Master's program: Applied Mathematics
Specialization: Analysis
To be defended on: September 4th, 2018
Assessment committee: Prof. Dr. R.M. de Wolf, Prof. Dr. J.M.A.M. van Neerven, Dr. M.P.T. Caspers
Research institute: Delft University of Technology, in collaboration with Centrum Wiskunde & Informatica
Faculty: Electrical Engineering, Mathematics and Computer Science
Contact: [email protected]

Abstract

In 2005, Jordan showed how to estimate the gradient of a real-valued function with a high-dimensional domain on a quantum computer. Subsequently, in 2017, Gilyén et al. showed how to do this with a different input model. They also proved optimality of their algorithm for $\ell^\infty$-approximations of functions satisfying some smoothness conditions.

In this text, we expand the ideas of Gilyén et al., and extend their algorithm such that functions with fewer regularity constraints can be used as input. Moreover, we show that their algorithm is essentially optimal in the query complexity to the phase oracle, even for classes of functions that have more stringent smoothness conditions. Finally, we also prove that their algorithm is optimal for approximating gradients with respect to general $\ell^p$-norms, where $p \in [1, \infty]$.

Furthermore, we investigate how Gilyén et al.'s algorithm can be used to do reinforcement learning on a quantum computer. We elaborate on Montanaro's ideas for quantum Monte-Carlo simulation, and show how they can be used to implement quantum value estimation of Markov reward processes. We also show essential optimality of this algorithm in the query complexity of all its oracles. Next, we show how we can construct a quantum policy evaluation algorithm, and how we can use these algorithms as subroutines in Gilyén et al.'s quantum gradient estimation algorithm to perform quantum policy optimization.

The most important open question is whether it is possible to improve the query complexity of the extension of Gilyén et al.'s algorithm when function classes containing functions of Gevrey-type 1 are used as input, since at the moment, for this specific parameter setting, the algorithm is no better than a very simple classical gradient estimation procedure. An improvement of this result would automatically improve the quantum policy optimization routine as well.

Preface

This thesis report is the result of the final project that was part of my master's program Applied Mathematics at the Delft University of Technology. The research was done in close collaboration with Centrum Wiskunde & Informatica in Amsterdam, from January to August 2018. For me, this project was the first real experience with conducting research along the frontiers of the current knowledge in the field of quantum computing. This enthused me a lot, and I hope to continue pushing the boundaries of our common knowledge in the years to come.

I owe a debt of gratitude to a lot of people who helped me through this project. First and foremost, I would like to thank Ronald de Wolf, without whom I probably would never have found these interesting problems that lie at the interface between mathematics and quantum computing. I would like to thank him for taking the time and energy to supervise this project, and for regularly providing helpful insights and discussions. Secondly, I would also like to thank Martijn Caspers, who throughout the project was always willing to help me out, and who arranged the possibility to present my findings to all who were interested. Simultaneously, I would like to thank Ronald de Wolf, Martijn Caspers and Jan van Neerven for taking the time to take part in my assessment committee.
Furthermore, I would like to thank András Gilyén for his insightful research paper [GAW17] that I was able to learn from, and for expressing recognition for my work by referencing this master thesis in his paper. Additionally, I would like to thank my parents and sister, for supporting me not only over the course of the last few months, but also throughout the rest of my studies. Finally, I would like to thank all my fellow students who showed interest in the research that I was conducting. Especially, I would like to thank Erik Meulman for the many useful discussions we have had over the course of these last nine months, and for inspiring me to think about reinforcement learning in a quantum computing setting.

Arjan Cornelissen, August 20th, 2018

Contents

Abstract
Preface
1 Introduction
2 Introduction to quantum mechanics
  2.1 Mathematical background
    2.1.1 Notation
    2.1.2 Hilbert spaces
    2.1.3 Tensor products
  2.2 The postulates of quantum mechanics
  2.3 Projective measurements
3 Quantum computing
  3.1 Qubits
    3.1.1 Single-qubit systems
    3.1.2 Multiple-qubit systems
  3.2 Quantum gates
    3.2.1 Single-qubit gates
    3.2.2 Multiple-qubit gates
  3.3 Quantum circuits
  3.4 Quantum algorithms
  3.5 Examples of quantum circuits and quantum algorithms
    3.5.1 SWAP
    3.5.2 Toffoli gate
    3.5.3 Quantum Fourier transform
    3.5.4 Quantum Fourier adder
    3.5.5 Phase estimation
    3.5.6 Amplitude amplification
    3.5.7 Amplitude estimation
4 Quantum gradient estimation
  4.1 Nomenclature
    4.1.1 Derivatives
    4.1.2 Gevrey function classes
    4.1.3 Phase oracle queries
    4.1.4 Quantum gradient estimation algorithms
  4.2 Fractional phase queries
    4.2.1 Block-encodings
    4.2.2 Implementation of a (1, 1, 0)-block-encoding of sin(f)G
    4.2.3 Approximation of the function exp(it arcsin(x))
    4.2.4 Implementation of block-encodings of polynomials of arbitrary operators
    4.2.5 Addition of real and complex parts
    4.2.6 Quantum circuit of the fractional phase query
  4.3 Gilyén et al.'s quantum gradient estimation algorithm
    4.3.1 Grid
    4.3.2 Numerical method
    4.3.3 Algorithm
5 Optimality of Gilyén et al.'s quantum gradient estimation algorithm
  5.1 Lower bound of specific cases
  5.2 Lower bound of more general cases
  5.3 Essential optimality of Gilyén et al.'s algorithm and further research
6 Quantum reinforcement learning
  6.1 Introduction to reinforcement learning
    6.1.1 State spaces, action spaces and rewards
    6.1.2 Markov processes
  6.2 Quantum value evaluation
    6.2.1 Classical Monte-Carlo methods
    6.2.2 Quantum speed-ups
    6.2.3 Essential optimality of query complexity
  6.3 Quantum policy evaluation
  6.4 Quantum policy optimization
    6.4.1 The class of functions of Gevrey-type 1 is closed under composition
    6.4.2 Smoothness properties of the policy evaluation function of a Markov decision process
    6.4.3 Quantum algorithm for quantum policy optimization
    6.4.4 Applications
7 Conclusion
Bibliography
A Mathematical background of tensor products of Hilbert spaces
B Error-propagation lemmas
C Hybrid method
  C.1 Principle of deferred measurement
  C.2 Hybrid method

1 Introduction

Over the past few decades, the world has undergone dramatic changes as it entered the period collectively referred to as the information age. Machines are taking over more and more tasks that previously had to be performed by hand, sharing information all over the globe is becoming easier and less costly every year, and ever more computational power becomes available to the masses at ever decreasing costs.
All of these developments were made possible by major advances on the microscopic level, as computer processor manufacturers have been able to develop smaller and smaller chips, capable of performing more and more operations per second. This development is reaching its physical limit, though. As computer chip manufacturers are creating transistors that measure just a few nanometers across, they face new problems that stem from the laws of quantum mechanics. Most notably, quantum effects like quantum tunneling prevent the development of transistors that are much smaller than the ones that are being created today.

So, to keep up with the demands of society, researchers are faced with finding new ways to improve the existing technology. One of the most radical ideas in this respect is to not regard these quantum phenomena as problematic, but to try to utilize them to one's advantage instead. The research field collectively referred to as quantum computing is concerned with these ideas. Specifically, it investigates the development of a device referred to as a quantum computer, capable of harnessing quantum mechanical effects to perform computations.

In Chapter 2, we provide the reader with an elementary introduction to quantum mechanics, and in Chapter 3 we show how the fundamental laws of quantum mechanics constitute the basic building blocks of quantum computing. There, we also elaborate on some commonly used techniques in this field.

A very prominent question in the field of quantum computing is how computations that can be done on a normal computer can be sped up using techniques that are based on quantum mechanics. A comprehensive list, known as the quantum algorithm zoo, has been compiled by Stephen Jordan [Jor]. It features some of the well-known quantum algorithms that achieve a significant speed-up over classical algorithms, most notably Shor's algorithm and Grover's algorithm, as well as some lesser-known problems that can be solved faster on a quantum computer than on a classical computer.
