Tensor Network Benchmarking for Quantum Computing
Eugene Dumitrescu
Collaborators: Alex McCaskey, Dmitry Liakh, Travis Humble, Raphael Pooser, Pavel Lougovski
ORNL is managed by UT-Battelle for the US Department of Energy.
This work is supported by the DOE Quantum Testbed Pathfinder and ORNL LDRD projects.

Goals for this talk:
1. Outline capabilities of near-term quantum devices
2. Discuss potential first programs:
• “Find”, realize, and sample states from quantum physics, a type of quantum simulation
3. How large-scale classical computations validate, benchmark, and improve quantum hardware/software
Quantum Computing Institute

What is (quantum) information?
• The physical state of a system is the set of positions and momenta of all constituent particles.
• A vector in “phase space”.

$$\frac{dq_\mu}{dt} = \frac{\partial H}{\partial p_\mu}, \qquad \frac{dp_\mu}{dt} = -\frac{\partial H}{\partial q_\mu}$$

A state of n bits is described by … n bits

$$\hat{H}\psi(r) = E\psi(r)$$

• State described by complex-valued wavefunction.
• Superposition of ‘classical’ states leads to entanglement and a richer computational space

$$|\psi\rangle = \sum_{\vec{i}=0,1} c_{i_1,i_2,\ldots,i_n}\, |i_1,i_2,\ldots,i_n\rangle$$

Babadi, Demler, Knap PRX (2015)

State space: Quantum Logic
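• Information is physically encoded
• Quantum bit (qubit): $|\psi\rangle = |0\rangle$, $|\psi\rangle = |1\rangle$, or $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$
• Gates act in sequence: $|\psi\rangle$, $U_1|\psi\rangle$, $U_2 U_1|\psi\rangle$
[Figure: qubit states $|0\rangle$ and $|1\rangle$; circuit diagram with qubits $|q_1\rangle$, $|q_2\rangle$, $|q_3\rangle$, gates $U_1$ to $U_4$, and a time axis]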
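The state and gate picture above can be sketched numerically. This is a minimal illustration, assuming NumPy and using the Hadamard and Pauli-Z gates as stand-ins for the unspecified $U_1$ and $U_2$; the helper name `apply` is hypothetical.

```python
import numpy as np

# Computational basis states |0> and |1>.
ket0 = np.array([1.0, 0.0], dtype=complex)
ket1 = np.array([0.0, 1.0], dtype=complex)

# Illustrative single-qubit gates (assumed stand-ins for U1 and U2).
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard
Z = np.array([[1, 0], [0, -1]], dtype=complex)               # Pauli-Z


def apply(gates, state):
    """Apply gates in time order: the first gate in the list acts first."""
    for U in gates:
        state = U @ state
    return state


# U2 U1 |0> with U1 = H, U2 = Z gives (|0> - |1>)/sqrt(2).
psi = apply([H, Z], ket0)
# Born-rule probabilities |alpha|^2 and |beta|^2.
probs = np.abs(psi) ** 2
```

Because the gates are unitary, the probabilities in `probs` always sum to one.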
The Quantum Advantage
Algorithms:
• QFT (P. Shor)
• Phase Estimation
• Grover search
• Linear algebra
• Quantum simulation

Applications:
• Factoring/Cryptography
• Partition Functions (Sampling)
• Discrete Optimization
• Machine Learning/AI
• Materials Science
• Chemistry
• Biological Systems
• High-energy Physics
• Linear Systems (PDEs)
Qubits: 2017 and beyond

[Figure: devices from Rigetti, IBM, Google, and Delft, NL. Ritter, 3:30 PM]

Limitations: ~50 qubits, ~50 gates
Future: ~500 qubits, ~500 gates

Near term applications?
• Can realize many states supported by a 50 qubit system ==> quantum supremacy
• Could experimental (qubit) realizations find applications in quantum systems?
=> IBM, Google: Chemistry, ML
=> ORNL: Nuclear, Cond-matt

Resources: a closer look
New 20/50 qubit devices

Pros
• High dimensional state space
• Fast operation
• Computationally supreme?

Cons
• Locality constraints
• Temporal (noise) constraints
HPC with QPU

ORNL Testbed Pathfinder, PI: R. Pooser (2PM tomorrow)
Our solution - XACC Specification

Treat near-term QPUs as accelerators within a larger HPC environment. How do we program this?

We have provided a solution - XACC Specification:
XACC - Heterogeneous CPU-QPU Programming Model
https://github.com/ORNL-QCI/xacc
• Familiar API and programming model
• OpenCL-like - high-level kernel compilation and execution API
• LLVM-like - language and hardware agnostic through a well designed intermediate representation
• Program quantum code once, in your language, and XACC handles the rest.

Quantum Intermediate Representation (QIR) Specification
Key insight: Provide common representation to map N QPLs to N QPUs
• QuantumInstruction: abstraction for single QPU instructions (e.g. Hadamard, CNOT)
• QuantumFunction: abstraction for composition of QPU instructions (e.g. qfoo1(), qfoo2())

Quantum Programming Landscape with XACC
• Quantum programming languages: Scaffold, ProjectQ, pyQuil, ..., QPL-N
• Compiler Frontend
• XACC Quantum Intermediate Representation: takes QIR and performs hardware independent and dependent transformations.
• Backend Generator
• QPUs: IBM QPU, Google QPU, Rigetti QPU, ..., QPU-N, Simulator

Enabling Quantum Acceleration in Scientific High Performance Computing - ExaTensor Midterm Review

Near Term Algorithms and Programs
Pros lead to supremacy

A quantum supreme device:
• Performs a classically intractable computation
• Does NOT mean outperforming classical computers at ANY other computation
• ‘Supreme’ in a very limited sense
• Supremacy hurdle must be overcome before some exponential advantages can be realized (e.g. Shor)

Applications:
• A different kind of characterization tool compared to quantum state tomography.
• Randomized benchmarking 2.0?
• Exponentially large Hilbert space used to
  • Encode strongly correlated quantum mechanical states
  • Large dimensional ‘fitting’ (a la neural networks)
Supremacy applied: Quantum Many Body
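$$\hat{H}\psi(r) = E\psi(r)$$

• Matrix (vector) representation of quantum modes and dynamics scales exponentially
• Number of Hamiltonian terms grows polynomially ($N^4$, $N^8$)
• Use QPU to represent quantum objects as needed!

$$\hat{H} = \sum_{pq} h_{pq}\, \hat{c}_p^\dagger \hat{c}_q + \frac{1}{2} \sum_{pqrs} h_{pqrs}\, \hat{c}_p^\dagger \hat{c}_q^\dagger \hat{c}_r \hat{c}_s$$

Quantum circuit = trial wavefunction:

$$|\psi(\theta)\rangle = U(\theta)|\mathrm{HF}\rangle$$

$$T(\theta) = \sum_{i \in \mathrm{virt},\, j \in \mathrm{occ}} t^{i}_{j}\, c_i^\dagger c_j + \sum_{i>j \in \mathrm{virt},\, k>l \in \mathrm{occ}} t^{ij}_{kl}\, c_i^\dagger c_j^\dagger c_k c_l + O(T_3(\theta)), \qquad U(\theta) \equiv e^{T(\theta) - T^\dagger(\theta)}$$

Minimize Energy/Objective:

$$\langle E(\theta) \rangle = \sum_i \langle \psi(\theta)|\, h_i\, |\psi(\theta)\rangle$$

Hybrid computation and variational methods

• No coherent feedback
• No quantum error correction
• Quantum state preparation: $|\psi(\theta)\rangle = U(\theta)|\psi_0\rangle$ … convergence?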
Shen 2017 (HeH+ ion trap)

update $\theta$: classical computation/post-processing
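The hybrid loop (prepare $|\psi(\theta)\rangle$, measure the energy terms, update $\theta$ classically) can be sketched on a toy problem. Everything below is an assumption for illustration: a one-qubit Hamiltonian with made-up coefficients, an $R_y(\theta)$ ansatz in place of UCC, exact expectation values in place of sampled ones, and plain gradient descent with the parameter-shift rule as the classical optimizer.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Toy Hamiltonian H = sum_i h_i P_i with hypothetical coefficients.
terms = [(0.5, Z), (0.3, X)]


def ansatz(theta):
    """Trial state Ry(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)


def energy(theta):
    """E(theta) = sum_i h_i <psi(theta)|P_i|psi(theta)>, evaluated exactly."""
    psi = ansatz(theta)
    return sum(h * np.real(np.vdot(psi, P @ psi)) for h, P in terms)


def vqe(theta=0.1, lr=0.4, steps=200):
    """Classical outer loop: gradient descent on the measured energy."""
    for _ in range(steps):
        # Parameter-shift gradient, exact for a single Ry rotation.
        grad = 0.5 * (energy(theta + np.pi / 2) - energy(theta - np.pi / 2))
        theta -= lr * grad
    return theta, energy(theta)


theta_opt, e_opt = vqe()
# Reference value from exact diagonalization of the 2x2 Hamiltonian.
e_exact = np.linalg.eigvalsh(sum(h * P for h, P in terms))[0]
```

On hardware each call to `energy` would be replaced by repeated circuit executions and averaging, which is where the sampling cost enters.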
Peruzzo, McClean 2014

Example Chemistry Code using XACC
github.com/QISKit/
Source https://github.com/ORNL-QCI/xacc-vqe
Metrics for experimental performance
O’Malley (Google) PRX 2016
Kandala (IBM) Nature 2017
• Total (classical + quantum) runtime is the metric for a successful computation,
  • i.e. one that determines energy/properties to within a predetermined tolerance.
  • Keep time constant and study precision scaling (future work)
• Simulate performance of noiseless algorithm
  • Sets lower bound on runtime via gate count/circuit depth
• Noisy simulations: generalize runtime to include classical computations with additional classical post-processing.
• Execute program on hardware.
  • Additional post-processing may be required
• Computational results within Bayesian framework?

[Figure: total runtime vs. system size, with classical and hybrid curves]
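Enhanced post processing: partial tomography

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \qquad |\alpha|^2,\ |\beta|^2$$

$$|\psi\rangle = \alpha'|{+}\rangle + \beta'|{-}\rangle$$

$$|\psi\rangle = \alpha''|{+i}\rangle + \beta''|{-i}\rangle$$

[Figure labels: $|0\rangle$, $|1\rangle$, $\rho$]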
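Measuring the same state in the three bases above is enough to rebuild a single-qubit density matrix via $\rho = \frac{1}{2}(I + \langle X\rangle X + \langle Y\rangle Y + \langle Z\rangle Z)$. The sketch below is an assumption-laden illustration: it performs full (not partial) tomography of one qubit, and it uses exact outcome probabilities rather than finite samples. The names `BASES`, `expectation`, and `reconstruct` are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Columns are the +1 and -1 eigenvectors of each Pauli operator.
BASES = {
    "X": np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2),
    "Y": np.array([[1, 1], [1j, -1j]], dtype=complex) / np.sqrt(2),
    "Z": np.eye(2, dtype=complex),
}


def expectation(psi, basis):
    """<P> = p(+1) - p(-1) from the amplitudes of psi in the chosen basis."""
    amps = BASES[basis].conj().T @ psi
    p_plus, p_minus = np.abs(amps) ** 2
    return p_plus - p_minus


def reconstruct(psi):
    """Rebuild rho from the three Pauli expectation values."""
    ex, ey, ez = (expectation(psi, b) for b in "XYZ")
    return 0.5 * (I2 + ex * X + ey * Y + ez * Z)


psi = np.array([0.6, 0.8j])
rho = reconstruct(psi)
```

With finite sampling the three expectation values carry statistical error, so the reconstructed `rho` is only an estimate and may need to be projected back onto a physical state.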
HPC Tensor Network Simulation: Quantum Validation and Benchmarking
-Dirac 1929
• Exponential quantum state space exposes computational intractability of brute-force simulations
• Classical computation limited to <50 qubits
• Compression methods!
R. Orus ‘15
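The exponential wall for brute-force simulation follows from simple arithmetic: a full state vector holds $2^n$ complex amplitudes. The sketch below assumes 16 bytes per amplitude (double-precision complex) and counts only the state vector itself, so it gives an upper bound on qubit count; real simulators need extra workspace. The function names are hypothetical.

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store all 2^n amplitudes of an n-qubit state."""
    return (2 ** n_qubits) * bytes_per_amplitude


def max_qubits(ram_bytes, bytes_per_amplitude=16):
    """Largest n whose full state vector fits in the given memory."""
    n = 0
    while statevector_bytes(n + 1, bytes_per_amplitude) <= ram_bytes:
        n += 1
    return n


# 50 qubits already require 2^54 bytes, i.e. 16 PiB, for the vector alone.
bytes_50 = statevector_bytes(50)
# A 32 GiB machine holds at most a 31-qubit state vector.
laptop_limit = max_qubits(32 * 2 ** 30)
```

Each added qubit doubles the requirement, which is why compression rather than more memory is the scalable route.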
Compressed quantum spaces
Entanglement structure ~ complexity of state representation
$$|\psi\rangle = \sum_{\vec{i}=0,1} c_{i_1,i_2,\ldots,i_n}\, |i_1,i_2,\ldots,i_n\rangle$$

Product states, e.g. $|0\rangle|0\rangle|0\rangle$ or $|1\rangle|1\rangle|1\rangle$, vs. general entangled states
Q: Way to systematically describe entanglement structure? A: Schmidt Decomposition across a bipartition
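$$|\psi\rangle = \sum_i \lambda_i\, |i\rangle_A |i\rangle_B, \qquad \langle i|j\rangle = \delta_{ij}$$

Example: $\lambda_0 = \lambda_1 = \frac{1}{\sqrt{2}}$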
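Numerically, the Schmidt coefficients across a bipartition are the singular values of the amplitude vector reshaped into a matrix. A minimal sketch, assuming NumPy and a qubit ordering in which the left block holds the most significant bits; `schmidt_coefficients` is a hypothetical helper name.

```python
import numpy as np


def schmidt_coefficients(psi, n_left, n_right):
    """Singular values of psi reshaped as a (2^n_left x 2^n_right) matrix."""
    M = np.asarray(psi, dtype=complex).reshape(2 ** n_left, 2 ** n_right)
    return np.linalg.svd(M, compute_uv=False)


# Product state |00>: a single nonzero coefficient, no entanglement.
product = np.kron([1, 0], [1, 0])
lam_product = schmidt_coefficients(product, 1, 1)

# Bell state (|00> + |11>)/sqrt(2): two equal coefficients 1/sqrt(2).
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
lam_bell = schmidt_coefficients(bell, 1, 1)
```

The number of non-negligible coefficients sets the bond dimension a tensor network needs across that cut.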
Many-body TN states
$$|\psi\rangle = \sum_{\vec{i}=0,1} c_{i_1,i_2,\ldots,i_n}\, |i_1,i_2,\ldots,i_n\rangle$$

• Compression for multi-linear mapping (i.e. tensor)
• Auxiliary spaces (small) encode correlations. Gauge freedom.
• Come from the renormalization group.
• Like a convolutional neural network for entanglement
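The compression idea can be illustrated by factoring a state vector into a matrix product state with repeated SVDs. This is a textbook-style sketch under stated assumptions, not the implementation used in the simulators discussed in this talk; `to_mps`, `from_mps`, `chi_max`, and `tol` are hypothetical names.

```python
import numpy as np


def to_mps(psi, n, chi_max=None, tol=1e-12):
    """Split an n-qubit state vector into n tensors of shape (left, 2, right)."""
    tensors = []
    rest = np.asarray(psi, dtype=complex).reshape(1, -1)
    for _ in range(n - 1):
        chi_left = rest.shape[0]
        # Group (left bond, current qubit) as rows, remaining qubits as columns.
        rest = rest.reshape(chi_left * 2, -1)
        U, S, Vh = np.linalg.svd(rest, full_matrices=False)
        keep = int(np.sum(S > tol))          # drop numerically zero weights
        if chi_max is not None:
            keep = min(keep, chi_max)        # optional lossy truncation
        keep = max(keep, 1)
        tensors.append(U[:, :keep].reshape(chi_left, 2, keep))
        rest = S[:keep, None] * Vh[:keep]    # push weights to the right
    tensors.append(rest.reshape(rest.shape[0], 2, 1))
    return tensors


def from_mps(tensors):
    """Contract the auxiliary bonds to recover the full state vector."""
    out = tensors[0]
    for A in tensors[1:]:
        out = np.tensordot(out, A, axes=([-1], [0]))
    return out.reshape(-1)


# 10-qubit GHZ state: 1024 amplitudes, but bond dimension 2 everywhere.
n = 10
ghz = np.zeros(2 ** n)
ghz[0] = ghz[-1] = 1 / np.sqrt(2)
mps = to_mps(ghz, n)
bond_dims = [t.shape[2] for t in mps]
```

For weakly entangled states the tensors hold far fewer numbers than the $2^n$ amplitudes; for generic states the bond dimension grows and the advantage disappears.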
[Figure: MPS, Tree, and MERA networks. DMRG: White '92; Vidal '09 PRL]

Tensor Network Implementation: Look to future QPU Algorithms
SVD replaced by GPU tensor optimization https://github.com/DmitryLyakh/TAL_SH Single node - multi CPU https://github.com/g1257/merapp Single node - GPU
Distributed TN CPU Distributed TN GPU
May 2018
Source https://github.com/ORNL-QCI/tnqvm
Simulation Scalability
Device specs | i7 CPU, 32 GB RAM | 18,688 AMD Opteron 16-core CPUs, 18,688 Nvidia Tesla K20X GPUs, 10 PFLOPS | ~9200 POWER9 44-core CPUs, ~27,600 Nvidia Tesla V100 GPUs, 150–300 PFLOPS | Notes
Brute force | ~25 qubits | ~42 qubits | ~45 qubits | Scales with total RAM
Tree (Shor) | 39 qubits | ~80 qubits | ~90 qubits | 1/2 hard to simulate
MPS (VQE) | >>100 qubits | >>200 qubits | >>250 qubits | Classical part more difficult
MPS (Supremacy) | 1D: 50 qubits, 2D: 25 qubits | 1D: 90 qubits, 2D: 45 qubits | 1D: 84 qubits, 2D: 42 qubits | PEPS algorithm better for 2D
MERA (Supremacy) | 1D: Unbounded, 2D: ??? | 1D: Unbounded, 2D: >50? | 1D: Unbounded, 2D: >60 | Performance unknown. Algorithm under development.

MERA++: G. Alvarez, D. Liakh
Dual Register System Simulation

$$|\psi\rangle = \frac{1}{\sqrt{r}} \sum_{i=0}^{r-1} \left( \sum_{j=0}^{\lceil (2^{2l}-1)/r \rceil - 1} |jr + i\rangle \right) \otimes |x^i \bmod N\rangle$$
Choose best tensor network algorithms to simulate quantum computations (different from states of matter)
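The structure of the dual register state explains why a tensor network with a single cut between the registers suits this computation: the Schmidt rank across that cut equals the period $r$. The sketch below checks this for a small case. It assumes a first register of $2l$ qubits and a second of $l$ qubits, and the choice $N = 15$, $x = 7$ is purely illustrative; function names are hypothetical.

```python
import numpy as np


def dual_register_state(x, N, l):
    """Amplitudes of sum_a |a>|x^a mod N> as a (2^(2l) x 2^l) matrix."""
    dim1 = 2 ** (2 * l)   # exponent register
    dim2 = 2 ** l         # value register, large enough to hold x^a mod N
    psi = np.zeros((dim1, dim2))
    for a in range(dim1):
        psi[a, pow(x, a, N)] = 1.0
    return psi / np.sqrt(dim1)


def schmidt_rank(psi, tol=1e-10):
    """Number of non-negligible singular values across the register cut."""
    s = np.linalg.svd(psi, compute_uv=False)
    return int(np.sum(s > tol))


# For N = 15 and x = 7 the period is r = 4 (7, 4, 13, 1, ...).
rank = schmidt_rank(dual_register_state(7, 15, 4))
```

The rank depends on $r$ rather than on the $2^{2l}$ dimension of the first register, so the bond between the registers stays small even as the registers grow.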
[Figure: panels (a)-(e). Dumitrescu (2017)]

Simulations

[Figure: discretized circuit, UCC (Trotterization); simulated VQE]
• Study complexity of VQE state preparation circuits.
• Exploit symmetries and structure to enable realizable quantum programs
• Verified VQE energetics of larger molecules with UCC.
• Looking for realistic solutions.
• Digitizing IBM-style interlaced entangler/non-local gate ansatz
Better quantum circuit found using tnqvm!
Cost of VQE

• Fundamental building block is a parametrized quantum circuit: $T_{\text{circuit}} \approx d\,(100\,\text{ns}) + 2\,\mu\text{s}$
• Perform ensemble averaging (determines variance to leading order) over circuit for each set of non-commuting observables: $N_{\text{sampling}} \approx N_{\text{ensemble}} \times N_{\text{terms}}$, with $N_{\text{terms}} \sim N_{\text{orbitals}}^4$
• Classical optimization until convergence (or fixed time): $N_{\text{updates}}$
  • get linear circuit depth by increasing classical complexity
  • scales poorly with number of classical parameters (don’t want to get stuck doing classical optimization)
  • pre-sampling to generate Jacobian, measure all Hessians
• Error mitigation/tomography/analysis overhead: $N_{\text{overhead}} \sim \exp(L)$ observables; a simplified error model ($\sim L$) sacrifices accuracy
• Find ‘sweet spot’: $T_{\text{total}} = T_{\text{circuit}} \times N_{\text{ensemble}} \times (N_{\text{terms}} + N_{\text{overhead}})$
• Communication/Latency cost?
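The runtime estimate above is easy to evaluate for concrete numbers. The sketch below assumes that $d$ is the circuit depth, that 100 ns is the time per layer, and that 2 µs is a fixed per-shot overhead; the unit prefactor in the $N_{\text{orbitals}}^4$ term count and all input values are illustrative assumptions, and the function names are hypothetical.

```python
def circuit_time(depth, gate_time=100e-9, fixed_overhead=2e-6):
    """T_circuit ~ d * 100 ns + 2 us, in seconds."""
    return depth * gate_time + fixed_overhead


def n_terms(n_orbitals):
    """N_terms ~ N_orbitals^4, with an assumed prefactor of one."""
    return n_orbitals ** 4


def total_time(depth, n_ensemble, terms, n_overhead=0):
    """T_total = T_circuit * N_ensemble * (N_terms + N_overhead), in seconds."""
    return circuit_time(depth) * n_ensemble * (terms + n_overhead)


# Example: depth 10, 1000 shots per term, 100 terms, no extra overhead.
t_shot = circuit_time(10)
t_total = total_time(10, 1000, 100)
```

The estimate covers a single energy evaluation; multiplying by the number of optimizer updates gives the cost of a full VQE run under the same assumptions.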
Conclusions
• Introduced modern QPU
• Good at navigating Hilbert space, preparing (classically intractable amplitude) probability distribution
• Look for ‘important’ regions which encode relevant problem
• Hybrid algorithms get the most out of short depth circuits by augmenting quantum hardware with classical resources
• Relevant to quantum systems (Chemistry, Cond-matt, Nuclear, etc.)
• Simulations vs. Execution
• Computational complexity is a function of classical and quantum algorithms
• Classical overhead improves accuracy (but avoid exponentially scaling correction schemes!)
• TN algorithms enable simulation of today’s hybrid programs and provide an excellent tool to evaluate future programs, debug, etc.
• Quantum programs should become intractable (even for TN algorithms) in future. Benchmarking before this stage gives us confidence in their performance.