
Simulating Quantum Computers Using OpenCL

Adam Kelly

November 9, 2018

arXiv:1805.00988v2 [quant-ph] 7 Nov 2018

Quantum computing is an emerging technology, promising a paradigm shift in computing and allowing for speed-ups in many different problems. However, quantum devices are still in their early stages, most with only a small number of qubits. This places a reliance on simulation to develop quantum algorithms and to verify these devices. While there exist many algorithms for the simulation of quantum circuits, there are (at the time of writing) no tools which use OpenCL to parallelize this simulation, thereby taking advantage of devices such as GPUs while still remaining portable.

In this paper, such a tool is described, including optimizations in areas such as gate application. This leads to a new approach that outperforms other popular state vector based simulators. An implementation of the proposed simulator is available at https://qcgpu.github.io.

1 Introduction

Quantum computing is a paradigm shift in computing. Quantum devices are thought to be the key to solving some types of problems, such as factoring semi-prime integers [19], searching for elements in an unstructured database [15, 23], simulation of quantum systems, optimization [13] and chemistry problems.

These problems are not feasible to solve using classical computers, but quantum computers may change that. It is estimated, however, that hundreds [4] up to thousands [5] of qubits (the quantum analogue to bits) will be needed. Even so, the way that quantum computers operate does not violate the Church-Turing principle [10]. This means that quantum computers can be, to some extent, simulated using classical computers.

There are some existing quantum computers, such as IBM's Q Experience [3], a semi-public cloud based quantum computer with up to 20 qubits. While the number of qubits available at the moment is small, as it increases, many issues are being raised. One of these issues is the ability to assess the correctness, performance and scalability of quantum algorithms.

It is this issue which simulators of quantum computers address. They allow the user to test quantum algorithms using a limited number of qubits, and to calculate measurements, state amplitudes and density matrices.

In this work, a simulator using OpenCL is described. OpenCL itself is introduced in section 1.2.

1.1 Existing Research

The idea of using classical computers to simulate quantum computers and quantum mechanics is nothing new. There exists a variety of software libraries that can be used to do so, each with different purposes. Some libraries, such as QuTiP [16], are aimed at solving a wide variety of quantum mechanical problems, whereas others are more specialized, such as Quipper [14] for controlling quantum computers and qHipster for simulating quantum computers using distributed computing techniques [20]. A comprehensive list of tools is available on Quantiki [2].

While the area of simulation is well established, there are, to my knowledge, no simulation tools that can take advantage of hardware acceleration. It is well known that dedicated hardware can speed up certain types of computations. This is becoming increasingly apparent in fields such as machine learning, gaming and cryptocurrency mining.

While this research mainly looks at state vector simulations, there are other ways of doing simulations. These include using the Feynman path integral formulation of quantum mechanics [6, 8], using tensor networks [18] and applying different simulations for circuits made up of certain types of quantum gates [7]. These techniques (while not covered in this work) will hopefully be included in the simulation software at a later date.

1.2 OpenCL

OpenCL (Open Computing Language) is a general-purpose framework for heterogeneous parallel computing on cross-vendor hardware, such as CPUs, GPUs, DSPs (digital signal processors) and FPGAs (field-programmable gate arrays). It provides an abstraction for low-level hardware routing and a consistent memory and execution model for dealing with massively parallel code execution. This allows the framework to scale from embedded systems to hardware from Nvidia, ATI, AMD, Intel and other manufacturers, all without having to rewrite the source code for various architectures. A more detailed overview of OpenCL is given in [22].

Figure 1: The OpenCL programming model/architecture

The main advantage of using OpenCL over a hardware-specific framework is its portability-first approach. OpenCL has the largest hardware coverage, and as a header only library, it requires no specific tools or other dependencies. Aside from this, OpenCL is very well suited to tasks that can be expressed as a program working in parallel over simple data structures (such as arrays/vectors). The disadvantages of OpenCL, however, come from this lack of a hardware-specific approach. Using proprietary frameworks can sometimes be faster than using OpenCL, and it can sometimes also be more straightforward to develop kernels for the devices.

OpenCL is an open standard maintained by the non-profit Khronos Group. It views a computing system as a number of compute devices (such as CPUs, or accelerators such as GPUs) attached to a host processor (a CPU). OpenCL executes functions called kernels on these devices, and these kernels are written in a C-like language, OpenCL C. A compute device is made up of several compute units, which contain multiple processing elements. It is the processing elements that execute kernels. This is shown in figure 1.

At the host level, a compute device is selected. The OpenCL API then uses its platform layer to submit work to the device and manage things like work distribution and memory. The work is defined using kernels, which execute in parallel over a predefined, n-dimensional computation domain. Each independent element of this execution is a work item. Work items are equivalent to Nvidia CUDA threads. The groups of work items, called work groups, are equivalent to CUDA thread blocks.

With this, a general pipeline for most GPGPU OpenCL applications can be described. First, a CPU host defines an n-dimensional computation domain over some region of DRAM. Every index of this n-dimensional domain will be a work item, and each work item will execute the same given kernel. The host then defines a grouping of these work items into work groups. The work items in a work group will execute concurrently within a compute unit and will share some local memory. The work groups are placed on a work queue. The hardware will then load the data from DRAM into the global device RAM and execute each work group on the work queue. On the device, the multiprocessor will execute the kernel using multiple threads at once. If there are more work groups than threads on the device, they will be serialized.

There are some limitations. First, the global work size must be a multiple of the work group size. That is to say, the work group must fit evenly into the data structure. Secondly, the number of elements in the n-dimensional vector must be less than or equal to the value of the CL_KERNEL_WORK_GROUP_SIZE flag. This is important to the QCGPU library as it sets a hard limitation on the size of the state vector being stored on the GPU. CL_KERNEL_WORK_GROUP_SIZE is a hardware flag, and OpenCL will return an error code if either of these conditions is violated. This can be avoided by using an approach similar to the distributed memory techniques used in other simulations. This feature is planned to be implemented soon.

1.3 Quantum Computing

Before considering quantum computing, let's first start with classical computation. A classical computer is the type of computer that you may have at home. Laptops, tablets, phones and smart TVs are all examples of classical computers.

A quantum computer is different. It takes advantage of principles of quantum mechanics such as superposition, entanglement and measurement to perform computation (see the following section). Because of this, it is thought to be able to do computations that are infeasible for classical computers.

1.3.1 Qubits and State

In a classical computer, information is represented as a bit. A bit is a binary system, and thus can be in one of two states, 0 or 1. In a quantum computer, information is represented as a qubit. The qubit is the quantum analogue of a bit. Using Dirac notation [11], a qubit can be in the state $|0\rangle$ or $|1\rangle$, or (more importantly) a superposition (linear combination) of these states. Mathematically, the state of a single qubit $|\psi\rangle$ is

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \tag{1.1}$$

such that $\alpha, \beta \in \mathbb{C}$. The coefficients must also satisfy the normalization condition $|\alpha|^2 + |\beta|^2 = 1$.

In the above state, the complex numbers $\alpha$ and $\beta$ are known as amplitudes. The states $|0\rangle$ and $|1\rangle$ are known as basis states. Importantly, given any state $|\psi\rangle$, it is impossible to extract the amplitudes of any basis state.

The vector notation for states is commonly used. The basis states $|0\rangle$ and $|1\rangle$ are vectors that form an orthonormal basis for that qubit's state space. The standard representation (and the one followed throughout QCGPU and this paper) is

$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

1.3.2 Manipulating the State

In a classical computer, bits are manipulated using logic gates. There is a quantum analogue to this too. Just as the state of a system of qubits was defined using vectors, the way that state changes can be described in the same way. The state of a qubit (or multiple qubits) is changed by quantum logic gates, or just gates.
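
As a rough illustration of the ideas in sections 1.3.1 and 1.3.2, the sketch below stores a single-qubit state as a two-element list of complex amplitudes, checks the normalization condition, and applies a gate as a 2x2 matrix acting on that vector. It is plain Python for readability, not the OpenCL implementation used by QCGPU. The helper names `is_normalized` and `apply_gate` are hypothetical, and the Hadamard and Pauli-X gates are standard textbook examples chosen here as an assumption; they are not taken from the text above.

```python
import math

# Basis states in the standard vector representation.
KET_0 = [1 + 0j, 0 + 0j]
KET_1 = [0 + 0j, 1 + 0j]

# Two standard single-qubit gates, written as 2x2 matrices (row-major).
# They are illustrative choices, not something defined in the text above.
S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]
PAULI_X = [[0, 1], [1, 0]]


def is_normalized(state, tol=1e-12):
    """Return True if the squared magnitudes of the amplitudes sum to 1."""
    return abs(sum(abs(a) ** 2 for a in state) - 1.0) < tol


def apply_gate(gate, state):
    """Apply a 2x2 gate to a single-qubit state (matrix-vector product)."""
    return [sum(gate[r][c] * state[c] for c in range(2)) for r in range(2)]


# Pauli-X maps |0> to |1>.
flipped = apply_gate(PAULI_X, KET_0)

# Hadamard maps |0> to an equal superposition of |0> and |1>.
plus = apply_gate(HADAMARD, KET_0)

# Applying Hadamard twice returns to |0> (up to floating-point rounding).
back = apply_gate(HADAMARD, plus)
```

Each of these gates keeps the state normalized, which is what allows the result of one gate to be fed directly into the next.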