The University of Chicago Design, Optimization, And

THE UNIVERSITY OF CHICAGO DESIGN, OPTIMIZATION, AND SIMULATION OF SCALABLE QUANTUM COMPUTING SYSTEMS A DISSERTATION SUBMITTED TO THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES IN CANDIDACY FOR THE DEGREE OF DOCTOR OF PHILOSOPHY DEPARTMENT OF COMPUTER SCIENCE BY XIN-CHUAN WU CHICAGO, ILLINOIS MARCH 2021 Copyright © 2021 by Xin-Chuan Wu All Rights Reserved TABLE OF CONTENTS LIST OF FIGURES . vi LIST OF TABLES . ix ACKNOWLEDGMENTS . x ABSTRACT . xi 1 INTRODUCTION . 1 2 THESIS STATEMENT . 3 3 ARTISAN: SCALABLE QUANTUM CIRCUIT OPTIMIZATION USING HIERAR- CHICAL SYNTHESIS . 4 3.1 Introduction . .4 3.2 Background . .7 3.2.1 Principles of Quantum Computation . .8 3.2.2 Quantum Circuit Synthesis . .8 3.3 Scalable Circuit Optimization using Synthesis . .9 3.3.1 Artisan Overview . 10 3.3.2 Physical Qubit Mapping . 11 3.3.3 Circuit Partitioning . 11 3.3.4 Quantum Circuit Synthesis . 15 3.3.5 Circuit Composition . 16 3.4 Experimental Setup . 16 3.4.1 Benchmarks . 17 3.4.2 Experimental Parameters . 18 3.5 Evaluation . 18 3.5.1 CNOT Reduction . 20 3.5.2 Impact of Synthesis Distance . 20 3.5.3 Running on Real Hardware . 22 3.5.4 Running on Noise Simulation . 22 3.5.5 Scalability . 24 3.5.6 Discussion . 26 3.6 Related Work . 27 3.7 Conclusion . 29 4 TILT: ACHIEVING HIGHER FIDELITY ON A TRAPPED-ION LINEAR-TAPE QUANTUM COMPUTING ARCHITECTURE . 30 4.1 Background . 35 4.1.1 Principles of Quantum Computation . 35 4.1.2 Trapped-Ion QC Systems . 35 4.2 TILT Architecture . 37 iii 4.2.1 Feasibility . 38 4.2.2 Gate Selection . 38 4.2.3 Compared with QCCD . 39 4.2.4 TILT Challenges . 40 4.3 LinQ for TILT Architecture . 41 4.3.1 LinQ Overview . 41 4.3.2 Native Gate Decomposition . 43 4.3.3 Qubit Mapping and Swap Insertion . 43 4.3.4 Tape Movement Scheduling . 46 4.3.5 TILT Simulation . 48 4.4 Experimental Setup . 49 4.4.1 Benchmarks . 49 4.4.2 Simulation Parameters . 50 4.5 Evaluation . 51 4.5.1 LinQ Swap Insertion Performance . 51 4.5.2 Architecture Comparison . 54 4.5.3 Compilation Time and Execution Time . 55 4.6 Trapped-Ion Scaling . 57 4.7 Related Work . 58 4.8 Conclusion . 60 5 FULL-STATE QUANTUM CIRCUIT SIMULATION BY USING DATA COMPRES- SION . 61 5.1 Background and Related Work . 64 5.1.1 Principles of Quantum Computation . 64 5.1.2 Quantum Circuit Simulation . 65 5.1.3 Data Compression Techniques . 67 5.2 Simulation Design . 69 5.2.1 Overview of Our Simulation Flow . 70 5.2.2 MCDRAM Memory Configuration . 71 5.2.3 Integration Details . 72 5.2.4 Compressed Block Cache . 73 5.2.5 Simulation Checkpoint . 74 5.2.6 MPI Configuration . 74 5.2.7 Variable Error Bound Compression . 74 5.2.8 Lower Bounds on Simulation Accuracy . 75 5.3 Adaptive Compression Optimized for Quantum Circuit Simulation . 76 5.3.1 Assessment of Existing Error-Bounded Lossy Compressors . 77 5.3.2 Optimizing the Compression Strategy . 79 5.4 Evaluation . 85 5.4.1 Experimental Setup . 85 5.4.2 Scalability . 86 5.4.3 Benchmarks . 86 5.4.4 Experimental Results . 88 5.4.5 Discussion . 89 iv 5.5 Conclusion . 90 6 FUTURE WORK . 95 6.1 Quantum Circuit Optimization using Synthesis . 95 6.1.1 Approximate Synthesis . 95 6.1.2 Pulse-Level Optimization . 95 6.1.3 Artisan in Parallel . 96 6.2 TILT Architecture Scaling . 96 6.2.1 Sympathetic Cooling . 96 6.2.2 QCCD Architectures . 97 6.2.3 Photonic Interconnects . 97 6.3 Quantum Circuit Simulation by using Lossy Compression . 98 6.3.1 Noise Simulation . 98 7 CONCLUSION . 99 A CURRICULUM VITAE . 101 REFERENCES . 105 v LIST OF FIGURES 3.1 An example of Artisan circuit optimization. (a) The original 5-qubit circuit. (b)The original circuit is partitioned into three blocks, and each block only con- tains gates on 3 qubits. The first circuit block is only on q0, q1, and q4. The second block is on q1, q2, and q4. The third block is on q0, q1, and q3. All gates after partitioning still respect the gate dependency in the original circuit. (c) The synthesized circuit. After running quantum synthesis for each block, the CNOT counts in the first and second block are reduced, and the third block still has 2 CNOTs. (d) The single-qubit gates can be combined to produce the final optimized circuit. This circuit is effectively equivalent to the original circuit but with fewer CNOT gates. .5 3.2 Our compilation framework for scalable circuit optimization using synthesis (Ar- tisan). .6 3.3 Example of valid and invalid qubit groups. 13 3.4 An example of gate dependency graph and executable gates on a qubit group. 14 3.5 Experimental platform used in our evaluation. 18 3.6 The numbers of CNOTs are reduced in the Artisan-optimized circuits. 19 3.7 Average CNOT count in a block. 19 3.8 Compilation time. 21 3.9 State infidelity due to the synthesis distance. Compared with the gate error on NISQ devices, the state infidelity due to synthesis distance is negligible. 21 3.10 Running optimized small circuits on IBM's Athens, a 5-qubit device. (Lower dTV is better.) . 23 3.11 Correlation between CNOT reduction rate and dTV improvement measured on IBM's Athens. Each data point is a benchmark from the small set. The correlation coefficient is 78%. The results show that our CNOT reductions are indicative of fidelity savings. 23 3.12 Running optimized small circuits on Qiskit noise simulation. (Lower dTV is better.) 24 3.13 Correlation between dTV.

Load more