<<

Research Faculty Summit 2018 Systems | Fueling future disruptions An HPC Systems Guy’s View of Computing

Torsten Hoefler ETH Zurich, Switzerland (Professor) Microsoft Quantum, Redmond (Visiting Researcher) Who is this guy and what is he doing here?

1 professor, 6 scientific staff, 13 PhD students 6.5k staff, 20k students, focus on research

Climate Sim Deep Learning

[1]

Vector DAPP

[1] M. Besta, TH: Slim Fly: A Cost Effective -Diameter Network Topology, IEEE/ACM SC14, best student paper 0 = |1 What is a and how do I get one? 1 “I don't like it, and I'm sorry I ever had anything to do with it.” ⟩ |+ Schrödinger (about the probability interpretation of ) ⟩ 11 0 ==0 = 0 ++1 =|1 + = 1 1 00 1 | = |0 2 2 0 0 1 0 1 ΨΨ 𝛼𝛼𝛼𝛼0 For𝛼𝛼𝛼𝛼1 example:⟩ + =𝛼𝛼 0 +𝛼𝛼 |1 Ψ⟩ ⟩ 1 1 2 2 ⟩ One qubit can include a lot of information in and but can only sample one bit while losing

0 1 (encoding n bits takes operations) 𝛼𝛼 𝛼𝛼 Ω 𝑛𝑛 live in a vector space of 2 complex numbers (all combinations + entanglement) 𝑛𝑛 𝑛𝑛 = | e.g., = 00 + 01 + 10 + |11 .. n 𝑖𝑖 2 0 1 2 3 Ψ �𝑛𝑛 𝛼𝛼 𝑖𝑖⟩ Ψ 𝛼𝛼 𝛼𝛼 𝛼𝛼 𝛼𝛼 ⟩ 𝑖𝑖=0 2 −1 Example: adding 2 numbers in O(log ) time

𝑛𝑛 Quantum Adder final adder state Reminder: Classical Adder (entangled “”) |0101 H 𝑛𝑛 101 … 1 1 … [ ] = (|00 + 01=+ 10 + |11| ) 4 2 ⟩ .. 0 𝑎𝑎 ⟩ 𝑎𝑎 𝑛𝑛 �𝑛𝑛 𝑖𝑖⟩⟩ |00 H 𝑖𝑖=0 2 −1 … 01 010 1 + 1 𝑎𝑎 𝑛𝑛 𝑎𝑎 𝑎𝑎 00 … |0010 H 11| … 0 … 𝑎𝑎 + 1 ⟩ 1 1 0 [ ] = (|00 + 01= + 10 + |11| ) 0 4 2 logarithmic + logarithmic 1𝑠𝑠 .. 𝑎𝑎+𝑏𝑏 |0⟩ 𝑏𝑏 𝑛𝑛 �𝑛𝑛 𝑖𝑖⟩ Ψprint(a,⟩ a+b) 𝑛𝑛−1 𝑏𝑏 11 ⟩ 𝑖𝑖=0 2 −1 ⟩ depth 𝑎𝑎 depth 1 H 11 (“measure”) 𝑏𝑏 𝑛𝑛 𝑏𝑏 𝑎𝑎 𝑏𝑏 𝑛𝑛−1 0 0𝑠𝑠 |0⟩ 010 … linear 1 linear 𝑐𝑐 00 |0 work |

⟩ … 𝑏𝑏 work … 0 𝑐𝑐 ⟩ 𝑛𝑛−1 |0 | ⋅⟩ 𝑏𝑏 2𝑛𝑛 ⟩ ⋅⟩ We add all 2 numbers in parallel but only recover classical bits! 𝑛𝑛 𝑛𝑛 A Corollary to Holevo’s Theorem (1973): at most classical bits can be extracted from a with qubits even though that system requires 2 1 complex numbers to be represented!𝒏𝒏 𝑛𝑛 𝒏𝒏 My corollary:− practical quantum read a linear-size input and modify an exponential-size quantum state such that the correct (polynomial size) output is likely to be measured.

Question: Are quantum algorithms good at solving problems where a solution is verifiable efficiently (polynomial time)? Answer: Kind of  So quantum computers can solve NP- problems!? A problem is in NP if a solution can be verified deterministically in polynomial time.

• Even quantum computers may not solve NP-hard problems (limited by linearity of operators). But since quantum is at least as powerful as classic, we do not know! • New : Bounded-error Quantum Polynomial time (BQP)

NPI – e.g., graph isomorphism factoring, ? NP BQP PSPACE NP-complete P Hardware and software architecture for

300K Logical layer qubits bits abstraction Q# programming >100kW Quantum Instruction language SW circuits stream + data

Classical Logic Q intermediate 77K representation (liquid Gate synthesis Build Gates Factory nitrogen) (T, Rotation, control and Microcoded MW <1kW multi-control, …) qubit routing instructions Control for 4K Q error correction QEC Codes Microcoded (liquid (Steane etc., QEC (varies MW helium) instructions <10W 1:n mapping) with code)

<0.1K Physical control Physical Analog pulse Qubit control (dilution HW refrigeration) quantum generators pulses <0.01W qubits state Full Example: Grover’s search allocate log qubits Q# code ⌈ 2 𝐷𝐷 ⌉ ( ) 𝐷𝐷 𝐶𝐶 → 𝑓𝑓 𝑥𝑥 ↦ • Task: find for which = (invert ( )) • Classical requires𝑥𝑥 ∈ 𝐷𝐷 ( ) queries𝑓𝑓 𝑥𝑥 𝑦𝑦 𝑓𝑓 𝑥𝑥 • Quantum requires𝑂𝑂 𝐷𝐷 queries 1 𝑂𝑂 𝐷𝐷 1 average 2 𝑛𝑛

0 ⋯ ⋯ |0 … 00 |0 … 01 |0 … 10 |0 … 11 ⋯ | ⋯ |1 … 10 |1 … 11

⟩ ⟩ ⟩ ⟩ ⋯ 𝑥𝑥⟩ ⋯ ⟩ ⟩ Quadratic speedup? Grover on a real machine Performance estimates must be understood to be believed (inspired by Donald Knuth’s “An must be seen to be believed”)

1. Query complexity model – how algorithms are developed ( ) • = 2 queries ( = 2 - represented by bits) 𝜋𝜋 𝑛𝑛 𝑛𝑛 2. Express𝑇𝑇 4 (oracle and𝐷𝐷 diffusion ) as𝑛𝑛 n-bit unitary • Assuming n-bit operations for oracle! 𝑓𝑓 𝑥𝑥 • = 2 = 2 𝑂𝑂 n-bit operations - 𝜋𝜋 𝑛𝑛 𝜋𝜋 𝑛𝑛 3. Decompose𝑇𝑇 𝑂𝑂 4 unitary into two𝑇𝑇-𝑡𝑡bit (+arbitrary4 rotation) gates • = 2 2( 1) operations - = 2 4( 1) 𝜋𝜋 𝑛𝑛 𝜋𝜋 𝑛𝑛 4. Design𝑇𝑇 𝑂𝑂2 approximate4 ⋅ 𝑛𝑛 − implementations in discrete𝑇𝑇𝑡𝑡 4 gate⋅ set𝑛𝑛 −(using HTHT...) • = 2 2( 1) discrete T gate operations - = 2 48( 1) 𝜋𝜋 𝑛𝑛 𝜋𝜋 𝑛𝑛 � 5. Mapping𝑇𝑇 𝑂𝑂2 4 to real⋅ 𝑛𝑛hardware− (swaps and teleport)𝑇𝑇𝑡𝑡 4 ⋅ 𝑛𝑛 − • Not to simple to model, depends on oracle – potentially 2 slowdown 6. 𝑛𝑛 Θ • Not so simple, depends on quality of physical bits and circuit depth, huge constant slowdown Quadratic speedup? Grover on a real machine Performance estimates must be understood to be believed (inspired by Donald Knuth’s “An algorithm must be seen to be believed”)

1. Query complexity model – how algorithms are developed age (of the) universe (13.772 billion years) • = 2 queries ( = 2 - represented by bits) Quantum𝜋𝜋 computer with logical error rates 10 100 years 2. Express (oracle𝑛𝑛 and diffusion𝑛𝑛 operator) as n-bit unitary and𝑇𝑇 gate4 times of 10𝐷𝐷 s vs. classical at 1 𝑛𝑛teraop/s.−24 • Assuming n-bit operations for oracle! 𝑓𝑓1 year𝑥𝑥 −6 ≤ • = 2 n-bit operations - = 2 𝜋𝜋 𝑂𝑂 𝜋𝜋

𝑛𝑛 𝑛𝑛 seconds 3. Decompose𝑇𝑇 𝑂𝑂 4 unitary into38 two billion𝑇𝑇-𝑡𝑡bit (+arbitrary4years rotation) gates • = 2 2( 1) elementary operations - = 2 4( 1) 𝜋𝜋 𝑛𝑛 𝜋𝜋 𝑛𝑛 4. Design𝑇𝑇 𝑂𝑂2 approximate4 ⋅ 𝑛𝑛 − implementations in discrete𝑇𝑇𝑡𝑡 4 gate⋅ set𝑛𝑛 −(using HTHT...) • = 2 2( 1) discrete T gate operations - = 2 48( 1) 𝜋𝜋 𝑛𝑛 𝜋𝜋 𝑛𝑛 � 5. Mapping𝑇𝑇 𝑂𝑂2 4 to real⋅ 𝑛𝑛hardware− (swaps and teleport)𝑇𝑇𝑡𝑡 4 ⋅ 𝑛𝑛 − • Not to simple to model, depends on oracle – probably 2 slowdown search bits 6. Quantum error correction 𝑛𝑛 Θ • Not so simple, depends on quality of physical bits and circuit depth, huge constant slowdown from Grassl et al.: “Applying Grover's algorithm to AES: quantum resource estimates”, arXiv:1512.04965 Real applications? Quantum Quantum Chemistry/Physics

. Original idea by Feynman – use quantum effects to evaluate quantum effects . Design catalysts, exotic materials, …

Breaking encryption & bitcoin

. Big hype – destructive impact – single-shot (but big) business case . Not trivial (requires arithmetic) but possible Your connection is not private

Accelerating heuristical solvers . Quadratic speedup can be very powerful! . Requires much more detailed resource analysis  systems problem

Quantum machine learning

. Feynman may argue: “quantum advantage” assumes that circuits cannot be simulated classically  they represent very complex functions that could be of use in ML? Quantum me on the rocky Thanks! path to develop my intuition for • Special thanks to Matthias Troyer and Doug Carmean quantum computation

• Thanks to: Thomas Haener, Damian Steiger, Martin Roetteler, Nathan Wiebe, Mike Upton, Bettina Heim, Vadym Kliuchnikov, Jeongwan Haah, Dave Wecker, Krysta Svore

• And the whole MSFT Quantum / QuArC team!

All used images belong to the respective owners – source list in appendix How does a quantum computer work?

Quantum systems are most Qubits are arranged on a naturally seen as accelerators Quantum circuits use predication (commonly 2D) substrate (no control flow) Work in close cooperation with a Reuse big parts of process traditional control circuit Circuit view simplifies reasoning technology in microelectronics but requires classical envelope

Qubits are error prone, need to be Commonly limited to neighbor highly isolated (major challenge) interactions between qubits Quantum error correction enabled Limited range, may require the dream of quantum computers swapping across chip

Operations (“gates”) are applied Operations (“gates”) have highly to qubits in place! varying complexity As opposed to bits flowing Some are literally free (classical through traditional computers! tracking), some are very expensive Backup For the unexpected discussions Physical qubit implementations

• Photons • Polarization encoding, number of photons (Fock state), arrival timing • , charge (number), • Nuclei • Spin through NMR • Josephson junctions/superconducting • Charge (incl. ), flux, phase • Quantum dots • Spin, localization • Majorana • MZMs Elements of a quantum computation

• Classical control flow • Execute outer loop and classical parts of programs (W=?, D=?) • Gate application, Measurement, Initialization • (W=1, D=1), T gates (W=?, D=?), Measurement (W=?, D=?), Initialization (W=?, D=1) • Qubit control (depends on technology, may be complex as well) • Error correction • Surface code (W=?, D=?, distribution?), (W=?, D=?, distribution) • Input-specific recompilation • (inputs may change during execution due to measurement results, maybe precompile, check algorithms) Basic components for a universal quantum computer

• State preparation (usually |0>) • Gate application (a universal set is H, CNOT, pi/8 phase rotation) • Measurement (sufficient in standard basis |0>, |1>) • Wait (apply identity, to sync with operations on other qubits)

• All needs to be performed fault-tolerant! Thank you!