Implementation and Testing of Current Quantum Pattern Matching Algorithms Applied to Genomic Sequencing
POLITECNICO DI TORINO
Master Degree Course in Computer Engineering
Master Degree Thesis

Implementation and testing of current quantum pattern matching algorithms applied to genomic sequencing

Supervisor: Prof. Bartolomeo Montrucchio
Candidate: Raffaele Palmieri
Company Supervisors at LINKS: Dr. Olivier Terzo, Dr. Andrea Scarabosio, Dr. Alberto Scionti, Carmine D'Amico
Academic year 2018/2019

Summary

Quantum Computing is a new and mostly unexplored research field that promises to revolutionize how information is computed for problems that still demand a lot of time and resources from classical computers. An example of such problems is Genome Sequencing and Alignment, in which genomes are sequenced by sequencing machines, producing short reads that are then assembled to obtain the whole genome. In particular, Grover's algorithm is a quantum search algorithm that could find application in the context of sequence alignment. In this thesis we analyzed the most important and recent Quantum Pattern Matching algorithms developed for Genome Sequencing, all based on the fundamental Grover's algorithm, and implemented them in Python 3 by means of the Qiskit library (developed by IBM and giving free access to the IBM Quantum Computing cloud platform). We then tested the algorithms and drew some final considerations about them and about the current and future status of Quantum Computing.

Contents

1 Introduction
  1.1 Scope of the thesis
  1.2 Organisation of the thesis
2 Why Quantum Computing?
  2.1 A new way of computing information
    2.1.1 Shor's algorithm
    2.1.2 Grover's algorithm
    2.1.3 Deutsch-Jozsa algorithm
    2.1.4 Obstacles that separate us from quantum supremacy
  2.2 Quantum Computing: basic concepts
    2.2.1 Qubits and quantum states
    2.2.2 Quantum gates and quantum circuits
  2.3 Real quantum computers and quantum simulators
    2.3.1 Technologies for real quantum computers
    2.3.2 Quantum simulators
    2.3.3 Quantum Assembly
  2.4 Tools used
3 An overview of Genome Sequencing
  3.1 An introduction to the problem
  3.2 DNA Structure
  3.3 Genome Sequencing
    3.3.1 Sequencing methods: an historical perspective
    3.3.2 From Sequencing to Reconstruction
  3.4 DNA Alignment
    3.4.1 Exact matching methods
    3.4.2 Approximate matching methods
  3.5 DNA Big Data
4 Quantum Pattern Matching algorithms
  4.1 Quantum search applied to Pattern Matching
  4.2 The fundamental Grover's Algorithm
    4.2.1 The Oracle
    4.2.2 The Algorithm
  4.3 Quantum Associative Memory
    4.3.1 Arbitrary amplitude distribution initialization
    4.3.2 The Algorithm
  4.4 Fast Quantum Search based on Hamming distance
  4.5 Quantum indexed Bidirectional Associative Memory
    4.5.1 Quantum Associative Memory with Distributed Queries
    4.5.2 The QiBAM model
  4.6 Quantum Pattern Recognition
5 Implementation and Testing
  5.1 A brief introduction to the implementation and testing work
    5.1.1 Tools used
    5.1.2 How testing was done
    5.1.3 Chosen algorithms
    5.1.4 Decoherence and number of quantum gates
  5.2 Quantum Associative Memory
  5.3 Quantum indexed Bidirectional Associative Memory
  5.4 Quantum Pattern Recognition
  5.5 A brief summary about analyzed algorithms
  5.6 An example of noisy quantum computation
    5.6.1 Noise, number of gates and circuit depth
6 Conclusions and future developments
  6.1 Present and Future of Quantum Computing
A Quantum Associative Memory
  A.1 Python code
  A.2 Other tests
  A.3 Verifying the validity of the implementation
B Quantum indexed Bidirectional Associative Memory
  B.1 Python code
  B.2 Other tests
  B.3 Verifying the validity of the implementation
C Quantum Pattern Recognition
  C.1 Python code
  C.2 Other tests
Bibliography

Chapter 1
Introduction

1.1 Scope of the thesis

In this thesis we present the basic concepts of Quantum Computing and their application to a specific field, Genome Sequencing. Quantum Computing is a new and potentially revolutionary way of computing information, which finds its strongest advantage in the concept of Quantum Parallelism. Thanks to the peculiar characteristics of Quantum Computing, a wide variety of problems that take a very long time to solve (even to the point of being practically unapproachable with classical computation) become approachable. Genome Sequencing is one of the fields that could benefit greatly from Quantum Computing: it requires processing a very large amount of data coming from the genomes of a wide variety of living beings, so it is an important and interesting field of application, for both medical and biological reasons.
In order to discuss Quantum Computing applied to Genome Sequencing in more depth, we examined some Quantum Pattern Matching algorithms, choosing the most recent and, in our opinion, most promising ones. We implemented these algorithms in Python 3 using the Qiskit library, which is developed by IBM and gives free access to the IBM Quantum Experience cloud platform; this allowed us to test the algorithms and draw some considerations from the results obtained. In the end, some final considerations about the current and future status of the Quantum Computing field will be given.

1.2 Organisation of the thesis

The thesis is organised in chapters as follows:
• in Chapter 2 we will talk about the idea of Quantum Computing and the companies involved in this mostly unexplored research field; after that, we will give some basic concepts about Quantum Computing;
• in Chapter 3 we will talk about the problem of Genome Sequencing, giving an historical perspective and mentioning the most famous classical algorithms for pattern matching used in the Genome Sequencing and Alignment field;
• in Chapter 4 we will introduce the Quantum Pattern Matching algorithms chosen for implementation and testing, giving some mathematical insights about them;
• in Chapter 5 we will present the results obtained, together with an important consideration about noise in Quantum Computing;
• in Chapter 6 we will give some final considerations about the future of Quantum Computing.

Chapter 2
Why Quantum Computing?

2.1 A new way of computing information

"Nature isn't classical, dammit, and if you want to make a simulation of nature, you'd better make it quantum mechanical, and by golly it's a wonderful problem, because it doesn't look so easy."
With these words, pronounced in 1981 during a conference co-organized by MIT and IBM [1], Richard Feynman encouraged the scientific community to exploit the peculiar characteristics of quantum mechanics in order to give birth to a new way of computing information. Today's computing platforms are called classical computers, to distinguish them from quantum computers. The interest in this unexplored branch of computer science is large and growing: why is Quantum Computing so fascinating? To answer this question, we briefly give some examples of the potentially disruptive performance of a quantum computer, comparing it to a classical computer. We will also talk about the reasons not to be over-hyped about Quantum Computing: important physical obstacles are yet to be overcome if we want to build powerful quantum computers with a large number of qubits and capable of error correction. After that, we will introduce some basic concepts of Quantum Computing.

2.1.1 Shor's algorithm

Shor's algorithm is perhaps the best example to give an idea of the potential of Quantum Computing. This algorithm finds the prime factorization of a number: given a number that is the product of two primes, it can efficiently find the two prime integers which, when multiplied, give the original number. It is important to remark that many modern cryptography systems (like RSA) rely on the fact that, while computing the product of two very large prime numbers is quick, figuring out which very large prime numbers multiplied together yield a certain product is time consuming: RSA picks prime numbers so large that it would take more than a quadrillion years on a classical computer to discover them. On the other hand, Shor's algorithm can run on a quantum computer in polynomial time, with $O(d^3)$ complexity (where $d$ is the number of decimal digits of the integer to factor), against the $O(\exp(d^{1/3}))$ of the best-known classical algorithm [2].
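The asymmetry that RSA relies on can be illustrated with a minimal classical sketch in Python 3. This is neither Shor's algorithm nor the best-known classical factoring method: it uses naive trial division, chosen here only because it is short, and the function name `trial_division_factor` and the example primes are assumptions made for illustration. It shows that building the product takes a single multiplication, while recovering the factors takes a number of steps that grows with the size of the smaller prime.

```python
import math


def trial_division_factor(n):
    """Split n into (p, q, steps) using naive trial division.

    p is the smallest non-trivial factor of n and q = n // p.
    steps counts how many candidate divisors were tried.
    If n is prime, the function returns (n, 1, steps).
    Illustrative only: this is not Shor's algorithm.
    """
    steps = 1  # the check against 2 counts as the first step
    if n % 2 == 0:
        return 2, n // 2, steps
    # Only odd candidates up to sqrt(n) need to be tried.
    for candidate in range(3, math.isqrt(n) + 1, 2):
        steps += 1
        if n % candidate == 0:
            return candidate, n // candidate, steps
    return n, 1, steps


if __name__ == "__main__":
    # Hypothetical small primes; real RSA primes have hundreds of digits.
    p, q = 10007, 10009
    n = p * q  # one multiplication: the "easy" direction
    found_p, found_q, steps = trial_division_factor(n)  # the "hard" direction
    print(f"n = {n}, factors = ({found_p}, {found_q}), divisors tried = {steps}")
```

With primes of this size the search finishes instantly, but the number of divisors tried scales roughly with the smaller prime itself, so each extra digit in the primes multiplies the work by about ten. This is the kind of growth that makes factoring impractical for the key sizes used by RSA.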