Why and when pausing is beneficial in quantum annealing

Huo Chen (1, 2) and Daniel A. Lidar (1, 2, 3, 4)

(1) Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA
(2) Center for Quantum Information Science & Technology, University of Southern California, Los Angeles, California 90089, USA
(3) Department of Chemistry, University of Southern California, Los Angeles, California 90089, USA
(4) Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90089, USA

Recent empirical results using quantum annealing hardware have shown that mid-anneal pausing has a surprisingly beneficial impact on the probability of finding the ground state for a variety of problems. A theoretical explanation of this phenomenon has thus far been lacking. Here we provide an analysis of pausing using a master equation framework, and derive conditions for the strategy to result in a success probability enhancement. The conditions, which we identify through numerical simulations and then prove to be sufficient, require that relative to the pause duration the relaxation rate is large and decreasing right after crossing the minimum gap, small and decreasing at the end of the anneal, and is also cumulatively small over this interval, in the sense that the system does not thermally equilibrate. This establishes that the observed success probability enhancement can be attributed to incomplete quantum relaxation, i.e., is a form of beneficial non-equilibrium coupling to the environment.

arXiv:2005.01888v2 [quant-ph] 5 Aug 2020

I. INTRODUCTION

Quantum annealing [1–4] stands out among the multitude of concurrent approaches being developed to explore quantum computing, as having achieved the largest scale to date when measured in terms of the sheer number of controllable qubits. Today's commercial quantum annealers feature thousands of superconducting flux qubits and are being used routinely to test whether this approach can provide a quantum advantage over classical computing [5–11]. While there is no consensus that such an advantage has been demonstrated, there is significant progress on the development of "software" methods that improve quantum annealing performance. Such methods take advantage of the advanced control capabilities of quantum annealers to implement protocols that result in higher success probabilities, shorter times to solution, faster equilibration, etc. Continued progress in this direction is clearly critical as a complementary approach to improving the underlying hardware by reducing physical sources of noise and decoherence.

Among the various empirical protocols that have been developed to improve the performance of quantum annealing, such as error suppression and correction [12–15] and inhomogeneous driving [16–19], the mid-anneal pausing protocol stands out as particularly powerful. Pausing superficially resembles the idea of slowing down near the minimum gap, as in the optimal schedule for the Grover problem [3, 20], but the context here is entirely different because pausing happens in an open system subject to thermal relaxation. The first study [21] to systematically test this approach empirically, using a D-Wave 2000Q device [22], demonstrated a dramatic improvement in the probability of finding the ground state (i.e., the success probability) when an anneal pause was inserted shortly after crossing the minimum gap. Follow-up studies confirmed that pausing is advantageous on different problems such as portfolio optimization problems [23] and training deep generative machine learning models [24]. Numerical studies [25, 26] of the p-spin model also agree with these empirical results. However, despite a useful qualitative explanation offered for the thermalization mechanism by which pausing improves success probabilities [21], a thorough analysis of the exact mechanism of this phenomenon is still lacking. Here we provide such an analysis, and identify sufficient conditions for pausing to provide an enhancement.

Our analysis is based on a detailed investigation of a quantum two-level system model coupled to an Ohmic bath. The two-level system can be either a single qubit or a multi-qubit system whose lowest two energy levels are separated by a large gap from the rest of the spectrum. The analysis builds on tools from the theory of open quantum systems [27–29], specifically master equations appropriate for time-dependent (driven) Hamiltonians [30–35]. Through numerical investigation we identify a set of sufficient conditions, stated in terms of the properties of the relaxation rate along the anneal, and prove a theorem guaranteeing that it is advantageous to pause mid-anneal. The advantage gained is a higher success probability than is attainable without pausing.

We thus establish, in a rigorous sense, that there exists a non-trivial optimal pausing point under a set of reasonable assumptions. We do not identify the optimal pausing duration, but we do prove that the optimal pausing point occurs after the minimum gap, in accordance with the prior empirical and numerical evidence. This result is stated in Theorem 1 below, which can be summarized as saying that an optimal pausing point exists if the relaxation rate right after the minimum gap is large relative to the pause duration but small at the end of the anneal, decreases right after crossing the minimum gap and also at the end of the anneal, and is also cumulatively small over this interval, in the sense that the system does not fully thermally equilibrate.

The structure of this paper is as follows. In Sec. II we define our model of the two-level system. In the multi-qubit case this involves deriving an effective Hamiltonian for the projection to the low energy subspace of the full Hamiltonian. We introduce a certain parametrization of the gap and the geometric phase that ensures the problem is hard for quantum annealing, in the sense that the success probability is low even on a timescale that is large compared to the inverse of the minimum spectral gap along the anneal. In Sec. III we treat the same model as an open quantum system using master equation techniques, specifically the Redfield equation with and without the rotating wave approximation, and the adiabatic master equation. Then, in Sec. IV we introduce a pause into the annealing schedule and study its effects. We first demonstrate numerically that an optimal pausing position exists before the end of the anneal, depending on the monotonicity properties of the relaxation rate after the minimum gap is crossed. Building on these observations we then prove a theorem establishing sufficient conditions for the existence of such an optimal pausing point. We conclude in Sec. V, and present additional technical details in the Appendix.

II. "HARD" SINGLE-QUBIT AND MULTI-QUBIT CLOSED SYSTEM MODELS

In this section we consider two scenarios: single qubit annealing, and a projected two-level system (TLS) arising from multi-qubit annealing. We define a model that makes these problems "hard" for quantum annealing, in the sense of a small success probability even over a timescale that is long compared to the heuristic adiabatic timescale (given by the inverse of the minimum gap along the anneal path).

A. Single qubit

We write the single qubit annealing Hamiltonian in the form

$$H_S(s) = -\frac{1}{2}\bigl[A(s) Z + B(s) X\bigr] , \tag{1}$$

where $s = t/t_f$ is the dimensionless instantaneous time, $t$ is the actual time, $t_f$ is the total anneal time, and $A(s)$ and $B(s)$ are the annealing schedules. Note that we permuted the Pauli matrices $X$ and $Z$ of the conventional single qubit annealing Hamiltonian in order to have the same expression for both the single qubit and projected TLS cases. After transforming to the adiabatic frame [36], the dimensionless interaction picture Hamiltonian becomes:

$$\tilde{H}_S(s) = \frac{1}{2}\left[\frac{d\theta}{ds} Y - t_f \Omega(s) Z\right] , \tag{2}$$

where $\Omega(s)$ and $\theta(s)$ are a reparameterization of the annealing schedules: $A(s) = \Omega(s)\cos\theta(s)$ and $B(s) = \Omega(s)\sin\theta(s)$ (see Appendix A for a detailed explanation). We call $\theta(s)$ the annealing angle and $\dot\theta(s)$ the angular progression. The term $\dot\theta Y/2$ has its origin as a geometric phase [37]. Loosely, $\Omega(s)$ corresponds to the time-dependent gap and $d\theta/ds$ corresponds to how fast or slow the Hamiltonian changes. For a typical single qubit annealing process, the boundary conditions $\theta(0) = 0$ and $\theta(1) = \pi/2$ need to be satisfied (recall that in Eq. (1) we permuted the Pauli $X$ and $Z$ matrices relative to the standard notation).

B. Projected TLS from a multi-qubit model

For general multi-qubit annealing, the Hamiltonian is

$$H_S(s) = A(s) H_d + B(s) H_p = \sum_n E_n(s)\, |n(s)\rangle\langle n(s)| , \tag{3}$$

where $\{|n(s)\rangle\}$ is the instantaneous energy eigenbasis and $E_n(s)$ are the instantaneous energies. $H_d$ and $H_p$ are the driver and problem Hamiltonians, respectively. Henceforth we assume that $H_S = (H_S)^T$, i.e., that $H_S(s)$ is real for all $s$. The system density matrix can be written in the instantaneous energy eigenbasis:

$$\rho(s) = \sum_{nm} \rho_{nm}\, |n\rangle\langle m| . \tag{4}$$

We call the associated matrix $\tilde\rho = [\rho_{nm}]$ the density matrix in the adiabatic frame, and show in Appendix B that it obeys the von Neumann equation $\dot{\tilde\rho} = -i[\tilde H, \tilde\rho]$ with the effective Hamiltonian

$$\tilde H = \begin{pmatrix} t_f E_0 & -i\langle 0|\dot 1\rangle & \cdots \\ i\langle 0|\dot 1\rangle & t_f E_1 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} . \tag{5}$$

If we truncate the effective Hamiltonian (5) to the lowest two energy levels and shift it by a constant term, we find:

$$\tilde H_2 = \langle 0|\dot 1\rangle\, Y - \frac{t_f \Omega(s)}{2} Z , \tag{6}$$

where $\Omega(s) = E_1 - E_0$ is the energy gap between the lowest two energy levels. We call this the projected TLS Hamiltonian. An alternative way to derive this effective Hamiltonian is via the well-known adiabatic intertwiner (see, e.g., Ref. [38]). This TLS approximation is valid when (i) there is a large gap separating the two-level subspace from higher excited states (where "large" is in the sense of the adiabatic theorem [39]), and (ii) the geometric terms connecting the two-level subspace to the higher levels in the adiabatic frame are negligible [40]. One may also invoke the Schrieffer-Wolff transformation to establish similar conditions [41, 42].
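As a rough illustration of the reparameterization behind Eq. (2), the sketch below recovers $\Omega(s)$ and $\theta(s)$ from a pair of schedules and assembles the 2x2 adiabatic-frame matrix. The linear schedules $A(s) = 1 - s$, $B(s) = s$ and all function names are assumptions made for this example only; they are not the schedules used in the paper.

```python
import math

def schedules(s):
    # Hypothetical linear schedules A(s) = 1 - s, B(s) = s (illustration only).
    return 1.0 - s, s

def gap_and_angle(s):
    # Invert A = Omega*cos(theta), B = Omega*sin(theta).
    a, b = schedules(s)
    return math.hypot(a, b), math.atan2(b, a)

def angular_progression(s, h=1e-6):
    # Finite-difference estimate of d(theta)/ds, one-sided at the endpoints.
    lo, hi = max(0.0, s - h), min(1.0, s + h)
    return (gap_and_angle(hi)[1] - gap_and_angle(lo)[1]) / (hi - lo)

def adiabatic_frame_hamiltonian(s, tf):
    # Eq. (2): (1/2) [theta_dot * Y - tf * Omega * Z] as a 2x2 nested list,
    # with Y = [[0, -i], [i, 0]] and Z = diag(1, -1).
    omega, _ = gap_and_angle(s)
    y = 0.5 * angular_progression(s)
    z = 0.5 * tf * omega
    return [[-z, -1j * y], [1j * y, z]]

def eigenvalue_splitting(s, tf):
    # Difference between the two eigenvalues of the matrix above.
    omega, _ = gap_and_angle(s)
    return math.hypot(angular_progression(s), tf * omega)
```

With these toy schedules the boundary conditions $\theta(0) = 0$ and $\theta(1) = \pi/2$ hold by construction, which is the single qubit case described above.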

Since the annealing Hamiltonian [Eq. (3)] is real, $\langle 0|\dot 1\rangle$ in Eq. (5) is also real. This effective Hamiltonian is equivalent to Eq. (2) with $\langle 0|\dot 1\rangle$ playing the role of $\dot\theta/2$. Thus, we can define the angular progression as $\dot\theta = 2\langle 0|\dot 1\rangle$ for the projected TLS. Having done so, a general TLS Hamiltonian can also be written as Eq. (2), where the annealing angle $\theta$ in the general case does not need to satisfy the same boundary condition as in the single qubit case.

C. Hard Problem Instances from Gap and Angular Progression Considerations

We call an instance "easy" when a high ground state probability is achieved within an annealing time that is much shorter than the timescale set by the inverse of the minimum gap along the anneal (we refer to this as the "heuristic" adiabatic condition; it is not to be confused with the rigorous adiabatic condition, which provides a sufficient condition for convergence to the ground state [39, 43]). Conversely, we call an instance "hard" when the ground state probability is low for such an annealing time.

Our strategy is to create toy models that share the same features as certain known hard examples for quantum annealing [25, 44]. By closely examining those problems, we identify one crucial characteristic they share: a sharp peak in the angular progression appears at the minimum gap, along with a $\pi$ jump of the annealing angle $\theta$ (see Appendix C where these examples are illustrated). We take a reverse engineering approach by first specifying the analytic form of the gap $\Omega(s)$ and angular progression $\dot\theta(s)$ in the adiabatic frame. The gap is parametrized as a Gaussian in the form

$$\Omega(s) = E_0\left(1 - (1 - \Delta)\, e^{-\frac{(s - \mu_g)^2}{2\alpha_g^2}}\right) , \tag{7}$$

where the parameters $\Delta$, $\mu_g$ and $\alpha_g$ respectively control the gap size, position and width. The angular progression can also be chosen as Gaussian

$$\dot\theta(s) = C\, e^{-\frac{(s - \mu_\theta)^2}{2\alpha_\theta^2}} \tag{8}$$

with position and width parameters $\mu_\theta$, $\alpha_\theta$. The normalization constant $C$ is chosen according to the boundary condition. We will discuss both the single qubit and projected TLS cases.

D. Single qubit

As a consequence of the aforementioned boundary conditions $\theta(0) = 0$ and $\theta(1) = \pi/2$, the normalization constant is

$$C = \sqrt{\frac{\pi}{2}}\, \frac{1}{2\alpha_\theta} , \tag{9}$$

and the annealing angle resulting from Eq. (8) is

$$\theta(s) = \frac{\pi}{4}\left[\operatorname{erf}\!\left(\frac{\mu_\theta}{\sqrt{2}\,\alpha_\theta}\right) + \operatorname{erf}\!\left(\frac{s - \mu_\theta}{\sqrt{2}\,\alpha_\theta}\right)\right] . \tag{10}$$

In order to ensure the hardness of the problem and monotonic schedules, we need to overlap the peak region of $\dot\theta$ and the minimum gap, i.e., $\mu_\theta \approx \mu_g$. In such a region, the diabatic term is much larger than the adiabatic term and the Landau-Zener transition is strong. A choice of such schedules is illustrated in Fig. 1. It is important to note that, in this construction, the hardness of the problem is not solely determined by the minimum gap $\Delta$. Indeed, the rigorous adiabatic condition involves the derivative of the Hamiltonian as well [39, 43]. An example of an easy instance with a small minimum gap is given in Appendix D.

FIG. 1. Example schedules with the following parameter choices: $\mu_\theta = \mu_g = 0.5$, $\alpha_g = 0.5$, $E_0 = 15/\pi$ GHz, $\Delta = 0.001$ and $\alpha_\theta = 1/100$.

E. Projected TLS

The boundary conditions for the projected TLS are different from the single qubit case because there is no simple relation between the schedules and the annealing angle. However, a common feature of the small gap instances [25, 44] is a localized pulse of angular progression that is present at the minimum gap. Also, this pulse induces a step-function-like $\pi$ shift of the annealing angle across this region, leading to a near-perfect Landau-Zener transition. The simplest toy model we can construct is to keep the Gaussian form of the gap [Eq. (7)] and angular progression [Eq. (8)] but use a different boundary condition $\theta(0) = 0$ and $\theta(1) = \pi$, which comes from the examination of both the p-spin model [25] and the D-Wave 16-qubit gadget problem [44]. In this case, the normalization constant becomes $C = \sqrt{\pi}/(\sqrt{2}\,\alpha_\theta)$.

However, we stress that the core of this construction is the angular progression pulse at the minimum gap. There

is no constraint on $\dot\theta(s)$ at other $s$ as long as the system can follow its eigenstates before and after the pulse. In fact, in the problem studied in Ref. [44], the annealing angle first gradually decreases to a non-zero value before the $\pi$ jump (see Appendix C).

III. OPEN SYSTEM MODEL

For the open quantum system model, we directly start with the multi-qubit case. We adopt a standard noise model for quantum annealing: each qubit couples to a bosonic bath via a system operator $O$:

$$H_{SB} = \sum_{\alpha k} g_{\alpha k}\, O_\alpha \otimes \bigl(b^\dagger_{\alpha k} + b_{\alpha k}\bigr) = \sum_\alpha g_\alpha\, O_\alpha \otimes B_\alpha . \tag{11}$$

Here $g_\alpha$ and $O_\alpha$ are dimensionless and $B_\alpha$ has dimensions of energy. The parameters $g_\alpha$ serve as expansion variables, which can later be set to one. After moving to the adiabatic frame, the system-bath interaction becomes

$$\tilde H_{SB}(s) = t_f \sum_{\alpha m n} g_\alpha\, O^{mn}_\alpha(s) \otimes B_\alpha , \tag{12}$$

where

$$O^{mn}_\alpha(s) = \langle m(s)|O_\alpha|n(s)\rangle\, |m\rangle\langle n| . \tag{13}$$

By defining

$$S_\alpha(s) = \sum_{mn} O^{mn}_\alpha(s) , \tag{14}$$

the total projected TLS Hamiltonian in the adiabatic frame can be further simplified as

$$\tilde H = \frac{1}{2}\bigl[\dot\theta(s) Y - t_f \Omega(s) Z\bigr] + \sum_\alpha g_\alpha t_f\, S_\alpha(s) \otimes B_\alpha + H_B . \tag{15}$$

From now on, for conciseness we will omit the tilde symbol for adiabatic frame operators. We investigate three approaches for solving the open system dynamics.

A. Redfield Equation

Before proceeding, we define the bath correlation function

$$C_{\alpha\alpha'}(s, s') = \mathrm{Tr}\bigl[B_\alpha(s) B_{\alpha'}(s') \rho_B\bigr] = C^*_{\alpha'\alpha}(s', s) \tag{16}$$

in terms of the rotated $B_\alpha$ operator

$$B_\alpha(s) = U_B^\dagger(s)\, B_\alpha\, U_B(s) , \tag{17}$$

where $U_B(s) = \exp[-i t_f H_B s]$ is the free bath evolution. We call a set $\{B_\alpha(s)\}_\alpha$ independent if $C_{\alpha\alpha'}(s, s') = \delta_{\alpha\alpha'} C_\alpha(s, s')$ $\forall \alpha, \alpha'$, and identical if $C_\alpha(s, s') = C(s, s')$ $\forall \alpha$.

By assuming the $\{B_\alpha(s)\}_\alpha$ are independent, the Redfield equation in the adiabatic frame can be shown to be [36]:

$$\dot\rho_S(s) = -i[H_S(s), \rho_S(s)] - \sum_\alpha (g_\alpha t_f)^2 \bigl[S_\alpha(s), \Lambda_\alpha(s)\rho_S(s)\bigr] + \mathrm{h.c.} , \tag{18}$$

where

$$\Lambda_\alpha(s) = \int_0^s ds'\, C_\alpha(s, s')\, U(s, s')\, S_\alpha(s')\, U^\dagger(s, s') , \tag{19}$$

and

$$U(s, s') = T_+ \exp\left[-i t_f \int_{s'}^{s} H_S(s'')\, ds''\right] , \tag{20}$$

and $T_+$ denotes time-ordering. An important observation is that, after moving to the adiabatic frame, the transformed system Hamiltonian [Eq. (2)] has a different gap than the original one [Eq. (1)], due to the rescaling by $t_f$. We define the new gap (in energy units) in the adiabatic frame as

$$\Delta(s) = \sqrt{\dot\theta^2(s)/t_f^2 + \Omega^2(s)} . \tag{21}$$

B. Redfield Equation with Rotating Wave Approximation

From our construction of $\Omega(s)$ and $\dot\theta(s)$ [Eqs. (7) and (8)], it follows that at the minimum gap point $s = \mu_g$, $\Delta(s)$ is large. As a consequence, we can safely apply the rotating wave approximation (RWA) to the adiabatic frame Redfield Eq. (18) without worrying about the presence of a small gap. After the RWA, Eq. (18) becomes

$$\dot\rho_S = -i[H_S(s) + H_{LS}, \rho_S] - \Gamma_d\bigl(\rho_{ba}|b\rangle\langle a| + \rho_{ab}|a\rangle\langle b|\bigr) + \Gamma_t\bigl(\rho_{aa} - e^{-\beta\Delta}\rho_{bb}\bigr)\bigl(|b\rangle\langle b| - |a\rangle\langle a|\bigr) , \tag{22}$$

where $\rho_{ab} = \langle a|\rho_S|b\rangle$ with $\{|a\rangle, |b\rangle\}$ being the ground and excited states of $H_S(s)$. All quantities in Eq. (22) are $s$-dependent, and the effective dephasing and thermalization rates $\Gamma_d$ and $\Gamma_t$, respectively, are given by [36] (see footnote 1):

$$\Gamma_d(s) = \frac{t_f}{2}\Gamma_t(s)\bigl(1 + e^{-\beta\Delta(s)}\bigr) + \frac{t_f}{2}\sum_\alpha \gamma_\alpha(0)\bigl(S^{aa}_\alpha - S^{bb}_\alpha\bigr)^2 \tag{23a}$$

$$\Gamma_t(s) = t_f \sum_\alpha g_\alpha^2\, \gamma_\alpha(\Delta)\, \bigl|S^{ab}_\alpha\bigr|^2 \tag{23b}$$

where the projected system-bath coupling operators are:

$$S^{ab}_\alpha = \langle a|S_\alpha(s)|b\rangle . \tag{24}$$

The Lamb shift is:

$$H_{LS}(s) = \sum_\alpha g_\alpha^2 t_f \bigl(\mathcal{S}_\alpha(\Delta(s))\,|b\rangle\langle b| + \mathcal{S}_\alpha(-\Delta(s))\,|a\rangle\langle a|\bigr) . \tag{25}$$

The functions $\gamma_\alpha(\omega)/2$ and $\mathcal{S}_\alpha(\omega)$ are the real and imaginary parts of the noise spectral density (the one-sided Fourier transform of the bath correlation function).

Footnote 1: The expression we arrive at here for $\Gamma$ is slightly different from Ref. [36] since here the RWA is done in the adiabatic frame, while in Ref. [36] the RWA is done in an additional rotating frame.

C. Adiabatic Master Equation

Outside the peak region of the angular progression, Eq. (22) becomes the adiabatic master equation (AME) [30]. The AME is a special case of Eq. (22). It can be derived by ignoring the geometric part of the Hamiltonian in the unitary part of the Redfield Eq. (18), which holds in the adiabatic limit $t_f \gg 1$. It has the same form as Eq. (22), with $\Delta(s)$ being the physical gap [Eq. (21) with $\dot\theta = 0$] and $\{|a\rangle, |b\rangle\}$ being the instantaneous eigenstates.

IV. PAUSING

So far we only considered the case of a linear annealing parameter $s = t/t_f$. In this section we study the effect of including a pause.

A. Model

To incorporate a pause, let us define $s = s(\tau; s_p, s_d)$ where $\tau = t/t_f$ is the dimensionless time, and where $s_p$ and $s_d$ are the pausing position and pausing duration, respectively. The explicit form of $s(\tau; s_p, s_d)$ is given below and illustrated in Fig. 2; from now on we suppress the explicit dependence on $s_p$ and $s_d$ for simplicity:

$$s(\tau) = \begin{cases} \tau & \tau \le s_p \\ s_p & s_p < \tau \le s_p + s_d \\ \tau - s_d & s_p + s_d < \tau \le 1 + s_d \end{cases} . \tag{26}$$

Note that pausing increases the total annealing time to

$$t_f' = \tau_f t_f , \qquad \tau_f = 1 + s_d . \tag{27}$$

To prevent confusion, henceforth we will denote by $Q(s)$ the original quantity and by $Q(\tau)$ the corresponding paused quantity, where $Q$ can be any function or operator.

The dimensionless Hamiltonian in the adiabatic frame then becomes:

$$H(\tau) = \frac{1}{2}\left[\frac{d\theta}{d\tau} X - t_f' \Omega(\tau) Z\right] + t_f' \sum_\alpha g_\alpha\, S_\alpha(\tau) \otimes B_\alpha + H_B , \tag{28}$$

where $d\theta/d\tau$ can be calculated using the chain rule:

$$\frac{d\theta}{d\tau} = \begin{cases} 0 & s_p < \tau \le s_p + s_d \\ d\theta/ds & \text{elsewhere} \end{cases} . \tag{29}$$

FIG. 2. Example of annealing parameter $s$ against dimensionless time $\tau = t/t_f$. The pause position $s_p$ and duration $s_d$ are chosen as $s_p = 0.5$ and $s_d = 0.5$.

B. Numerical Results

For our numerical simulations, we use the Gaussian gap [Eq. (7)] and angular progression [Eq. (8)] with two different boundary conditions: $\theta(1) = \pi/2$ and $\theta(1) = \pi$. The other parameters are the same as in Fig. 1. The instantaneous populations during a 100 ns anneal are shown in Fig. 3. We observe that the role of boundary conditions in our setup (with a single minimum gap) is to determine the portion of population transferred to the excited state when traversing the minimum gap.

The simplest $S_\alpha(s)$ we consider is inspired by the single qubit model with both dephasing and relaxation noise. In this case, $\{O_\alpha\} \equiv \{X, Z\}$ in the interaction Hamiltonian (11). In the adiabatic frame we have $\{S_\alpha(s)\} \equiv \{Z(s), X(s)\}$, where

$$Z(s) = \cos[\theta(s)]\, Z - \sin[\theta(s)]\, X \tag{30a}$$

$$X(s) = \cos[\theta(s)]\, X + \sin[\theta(s)]\, Z . \tag{30b}$$

Furthermore, we assume the $\{B_\alpha(s)\}_\alpha$ are independent and identical with an Ohmic spectral density [30]:

$$\gamma(\omega) = \frac{\eta g^2\, \omega\, e^{-\omega/\omega_c}}{1 - e^{-\beta\omega}} , \tag{31}$$

where $\eta g^2$ is a dimensionless system-bath coupling constant and $\omega_c$ is the high-frequency cutoff.

In Fig. 4, results of the three variants of MEs described in Sec. III are compared against the closed system case. Clearly, the relaxation present in the open system case

drastically increases the ground state probability. In the parameter region we consider, all three variants of MEs give the same predictions. In principle, the adiabatic and RWA versions of the Redfield equation are unreliable in small gap problems [34]. However, in our construction the small gap/diabatic region is so narrow that any error introduced during that period can be safely ignored. This numerically justifies the delta function approximation we make in Sec. IV C below.

FIG. 3. Populations of instantaneous eigenstates during an anneal with total time $t_f = 100$ ns. $\delta\theta$ denotes the jump of the annealing angle $\theta$ across the minimum gap region. In our model, the magnitude of this jump is directly determined by the boundary condition.

In addition, numerical results for a pausing schedule of the type shown in Fig. 2 are presented in Fig. 5. In these simulations we fixed the pausing duration $s_d$ and investigated the final success probability for different pausing positions $s_p$. Four primary observations arise from these results:

(1) Increasing the total annealing time $t_f'$ improves the final success probability. This is because we are neither close to the adiabatic limit (closed system) nor to thermal equilibrium (open system).

(2) There is a peak (or more precisely, oscillations) in the success probability if we pause right around the minimum gap. However, for the parameters chosen here this is mainly a closed system effect which is suppressed in the open system setting (we show in Sec. IV D below that the choice of the cutoff frequency matters a great deal). The oscillations can be understood as the interference pattern between two different paths leading to the final ground state [36], because pausing splits the single region of diabatic evolution into two and effectively creates a Mach-Zehnder interferometer [49].

(3) Pausing only helps if it happens after the minimum gap. This phenomenon can be explained by our analytic model presented in Sec. IV C.

(4) All the MEs still produce the same results in the presence of pausing, which suggests we may use the AME, since it has the simplest structure for pursuing analytical results.

C. Theoretical Analysis

We now present a theoretical analysis in support of the numerical results above, and to address the experimental findings in Ref. [21]. Without loss of generality, we will assume $\theta(1) = \pi/2$ in the derivation. The proof also goes through for other boundary conditions. The first assumption we make is the localization of the geometric phase. To be more specific, we assume

$$\dot\theta(s) \approx 0 \quad \forall s \notin [\mu_\theta - c\alpha_\theta,\ \mu_\theta + c\alpha_\theta] , \tag{32}$$

where $c$ is a dimensionless constant and $c\alpha_\theta$ measures the effective width of the Gaussian pulse. Later, we will take the limit $\alpha_\theta \to 0$. This condition holds for all the hard instances we consider in this work. In the small $\alpha_\theta$ limit, the Redfield Eq. (18) can be treated separately inside and outside the region $[\mu_\theta - c\alpha_\theta, \mu_\theta + c\alpha_\theta]$. Within this Landau-Zener (LZ) region, the geometric phase dominates and all the other terms can be considered as a perturbation. With the detailed derivation given in Appendix E, we prove that, in the limit of weak coupling and small $\alpha_\theta$, the evolution across the LZ region can be approximated by a "diabatic pulse" unitary of the following form

$$U = \begin{pmatrix} \cos(\varphi) & -i\sin(\varphi) \\ -i\sin(\varphi) & \cos(\varphi) \end{pmatrix} , \tag{33}$$

where

$$\varphi = \frac{\pi}{4}\, e^{-(t_f/t_{ad})^2} , \qquad t_{ad} = \frac{\sqrt{2}}{\alpha_\theta \int_0^{\mu_\theta} \Omega(s)\, ds} . \tag{34}$$

In the limit $\alpha_\theta \to 0$ the closed-system adiabatic time scale $t_{ad}$ diverges. This is an approximation to real computational (small gap) problems where the adiabatic time scale is infinite for practical purposes.

Outside the LZ region, $d\theta/d\tau \approx 0$ and hence the Hamiltonian (28) can be written as

$$H(\tau) = -\frac{t_f'}{2}\Omega(\tau) Z + t_f' \sum_\alpha g_\alpha\, S_\alpha(\tau) \otimes B_\alpha + H_B . \tag{35}$$

We also assume the position of the pause is outside the LZ region:

$$s_p \notin [\mu_\theta^-, \mu_\theta^+] , \qquad \mu_\theta^\pm \equiv \mu_\theta \pm c\alpha_\theta . \tag{36}$$

This is in accordance with the experimental protocol of Ref. [21], where pausing was found to be effective past the position of the avoided crossing, and is explained theoretically below in terms of the absence of a pausing effect before $s = \mu_\theta$.

Following Ref. [31], the AME can be written as two fully decoupled parts, which simplifies the derivation compared to the other master equations considered above:

$$\dot\rho_{00}(\tau) = -\dot\rho_{11}(\tau) = t_f' \Gamma_{01}(\tau)\rho_{11}(\tau) - t_f' \Gamma_{10}(\tau)\rho_{00}(\tau) \tag{37a}$$

$$\dot\rho_{01}(\tau) = -i t_f' \bigl(\omega_{01}(\tau) + \Sigma_{01}(\tau)\bigr)\rho_{01}(\tau) - \xi_{01}(\tau)\rho_{01}(\tau) \tag{37b}$$

[Figure 4 plot, panels (a) and (b): curves labeled Schrödinger, AME, Redfield, and Redfield+RWA.]

FIG. 4. Success probability (without pausing) calculated in both closed and open system settings with boundary conditions: (a) $\theta(1) = \pi/2$; (b) $\theta(1) = \pi$. Open system simulations were done with all three variants of MEs described in Sec. III, with an Ohmic bath spectral density [Eq. (31)]. Here and below the Ohmic bath parameters were chosen as typical of flux qubits (e.g., Refs. [30, 44–48]): $2\pi\eta g^2 = 10^{-4}$, $T = 16$ mK and $\omega_c/2\pi = 4$ GHz. The results of the open system simulations overlap. Note the different vertical axis scales in (a) and (b).

where

$$\omega_{01}(\tau) = -\omega_{10}(\tau) = -\Omega(\tau) \tag{38a}$$

$$\Gamma_{01}(\tau) = \sum_\alpha \gamma_\alpha(\Omega(\tau))\, \bigl|S^{01}_\alpha(\tau)\bigr|^2 = e^{\beta\Omega(\tau)}\, \Gamma_{10}(\tau) \tag{38b}$$

$$\Sigma_{01}(\tau) = -\Sigma_{10}(\tau) = \sum_\alpha \bigl|S^{01}_\alpha\bigr|^2 \bigl(\mathcal{S}(\omega_{01}) - \mathcal{S}(\omega_{10})\bigr) \tag{38c}$$

$$\xi_{01}(\tau) = \frac{1}{2}(\Gamma_{01} + \Gamma_{10}) + \frac{1}{2}\sum_\alpha \gamma_\alpha(0)\bigl(S^{00}_\alpha + S^{11}_\alpha\bigr) \tag{38d}$$

Our primary interest is in the ground state population [Eq. (37a)]. Using $\rho_{11} = 1 - \rho_{00}$ and $\Gamma_{10} = e^{-\beta\Omega}\Gamma_{01}$, it can be rewritten as

$$\dot\rho_{00}(\tau) = \Gamma(\tau)\Bigl[1 - \bigl(1 + e^{-\beta\Omega(\tau)}\bigr)\rho_{00}(\tau)\Bigr] \tag{39a}$$

$$\Gamma(\tau) \equiv t_f' \Gamma_{01}(\tau) , \tag{39b}$$

where $\Gamma(\tau)$ is the dimensionless relaxation rate subject to pausing (see footnote 2). The off-diagonal elements of the density matrix decay exponentially with a rate determined by $\xi_{01}(\tau)$. As a consequence, if we start in the ground/thermal state of the initial Hamiltonian, right before the LZ region the system density matrix will, to an exponentially good approximation, have only diagonal elements:

$$\rho(\mu_\theta^-) = \begin{pmatrix} P(\mu_\theta^-) & 0 \\ 0 & 1 - P(\mu_\theta^-) \end{pmatrix} . \tag{40}$$

After crossing the LZ region, using Eq. (33) the state becomes

$$\rho(\mu_\theta^+) = \begin{pmatrix} P_\varphi & -i(P_0 - P_1)\sin 2\varphi \\ i(P_0 - P_1)\sin 2\varphi & 1 - P_\varphi \end{pmatrix} \tag{41a}$$

$$P_0 = P(\mu_\theta^-) = 1 - P_1 \tag{41b}$$

$$P_\varphi = P_0 \cos^2\varphi + P_1 \sin^2\varphi . \tag{41c}$$

Note that $P_\varphi$ is the ground state population right after the minimum gap is crossed. We can see from Eqs. (40) and (41) that pausing before $\mu_\theta$ has no effect on the final results. Therefore, in the following discussion, we will assume the pausing position is after the diabatic pulse: $s_p > \mu_\theta^+$. The solution of Eq. (39a) from $\mu_\theta^+$ to $\tau_f = 1 + s_d$ can be written as

$$\rho_{00}(\tau_f) = \exp\left[-\int_{\mu_\theta^+}^{\tau_f} d\tau \bigl(1 + e^{-\beta\Omega(\tau)}\bigr)\Gamma(\tau)\right] \times \left\{P_\varphi + \int_{\mu_\theta^+}^{\tau_f} d\tau\, \Gamma(\tau) \exp\left[\int_{\mu_\theta^+}^{\tau} d\tau' \bigl(1 + e^{-\beta\Omega(\tau')}\bigr)\Gamma(\tau')\right]\right\} \tag{42a}$$

$$= F_d(\tau_f, s_p, s_d) + F_g(\tau_f, s_p, s_d) , \tag{42b}$$

where

$$F_d(\tau_f, s_p, s_d) = P_\varphi\, G(\mu_\theta^+) \tag{43a}$$

$$F_g(\tau_f, s_p, s_d) = \int_{\mu_\theta^+}^{\tau_f} d\tau\, \Gamma(\tau)\, G(\tau) , \tag{43b}$$

Footnote 2: Recall our convention of denoting by $Q(s)$ the original quantity and by $Q(\tau)$ the corresponding paused quantity. In the case of $\Gamma(s)$ this includes the paused anneal time $t_f'$, so in that sense it represents a mixed quantity.


FIG. 5. Success probability for pausing schedules as shown in Fig. 2. The boundary condition for every plot is $\theta(0) = 0$ and $\theta(1) = \pi/2$ (for $\theta(1) = \pi$, the plots are of similar shapes, with the lowest success probability being 0). For each curve the pausing duration $s_d$ is fixed and the pausing position $s_p$ is varied. (a) Closed system. The inset zooms in around $s_p = 0.5$. (b) AME. (c) AME success probabilities for different $t_f'$ values with $s_d = 4$. (d) Final success probabilities from three MEs with $s_d = 1$. For (a), (b) and (d), the original total annealing time $t_f$ is set to 100 ns and the Ohmic bath parameters are chosen as: $2\pi\eta g^2 = 10^{-4}$, $T = 16$ mK and $\omega_c/2\pi = 4$ GHz. The dashed lines in (b) and (c) indicate the thermal ground state population of the final Hamiltonian.

and where

$$G(\tau) = \exp\left[-\int_\tau^{\tau_f} d\tau'\, X(\tau')\right] \tag{44a}$$

$$X(\tau) = \bigl(1 + e^{-\beta\Omega(\tau)}\bigr)\Gamma(\tau) . \tag{44b}$$

Note that the functions $\Omega(\tau)$ and $\Gamma(\tau)$ have an implicit dependence on $s_p$ and $s_d$:

$$\Omega(\tau) = \Omega(s(\tau)) \tag{45a}$$

$$\Gamma(\tau) = (1 + s_d)\, t_f \sum_\alpha \gamma_\alpha[\Omega(s(\tau))]\, \bigl|S^{01}_\alpha(s(\tau))\bigr|^2 , \tag{45b}$$

where we combined Eqs. (38b) and (39b). To achieve the maximum success probability, we need to solve the following optimization problem:

$$\operatorname*{argmax}_{\{s_p, s_d\}} \bigl[F_d(\tau_f, s_p, s_d) + F_g(\tau_f, s_p, s_d)\bigr] . \tag{46}$$

D. Numerical evidence for an optimal pausing position

First, noticing that the quantities defined in Eqs. (44b), (45a), and (45b) can be functions of either $s$ (unpaused) or $\tau$ (paused), to avoid any ambiguity we henceforth use the notation $\bar Q(s)$ when the argument is $s$. For example, $\bar\Gamma(1)$ means $\Gamma(s = 1)$ instead of $\Gamma(\tau = 1)$. Note that this modifies our previous convention of denoting by $Q(s)$ the original quantity and by $Q(\tau)$ the corresponding paused quantity.
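The objective in Eq. (46) can be evaluated numerically for given $s_p$ and $s_d$. The following is a minimal sketch, not the authors' code: it evaluates $F_d + F_g$ from Eqs. (42) to (45) by trapezoidal quadrature. The callables `gap` (standing for $\bar\Omega(s)$) and `rate` (standing for the unpaused $\bar\Gamma_{01}(s)$), the function names, and the grid size are all assumptions of this example, and the caller is responsible for consistent units so that `beta * gap(s)` and `tf * rate(s)` are dimensionless.

```python
import math

def s_of_tau(tau, sp, sd):
    """Paused schedule of Eq. (26)."""
    if tau <= sp:
        return tau
    if tau <= sp + sd:
        return sp
    return tau - sd

def final_ground_population(p_phi, mu_plus, sp, sd, tf, beta, gap, rate, n=4000):
    """Evaluate rho_00(tau_f) = F_d + F_g [Eqs. (42)-(44)] on a uniform tau grid."""
    tau_f = 1.0 + sd
    h = (tau_f - mu_plus) / n
    gam, x = [], []
    for k in range(n + 1):
        s = min(s_of_tau(mu_plus + k * h, sp, sd), 1.0)
        g = (1.0 + sd) * tf * rate(s)                     # Eq. (45b)
        gam.append(g)
        x.append((1.0 + math.exp(-beta * gap(s))) * g)    # Eq. (44b)
    # tail[k] approximates the integral of X from tau_k to tau_f,
    # so that G(tau_k) = exp(-tail[k]) as in Eq. (44a).
    tail = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        tail[k] = tail[k + 1] + 0.5 * h * (x[k] + x[k + 1])
    g_vals = [math.exp(-t) for t in tail]
    f_d = p_phi * g_vals[0]                               # Eq. (43a)
    integrand = [gam[k] * g_vals[k] for k in range(n + 1)]
    f_g = h * (sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))  # Eq. (43b)
    return f_d + f_g
```

For a fixed $s_d$, scanning `sp` over a grid in $(\mu_\theta^+, 1]$ and keeping the largest returned value gives a brute-force estimate of the optimal pausing position.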


FIG. 6. $\bar\Gamma(s)$ [the dimensionless relaxation rate in Eq. (38b) as a function of $s$, i.e., with $\tau$ replaced by $s$] for an Ohmic bath with different cutoff frequencies: $\omega_c/2\pi = 0.5$ GHz, monotonically decreasing $\bar\Gamma(s)$ (left); $\omega_c/2\pi = 1$ GHz, non-monotonic (middle); $\omega_c/2\pi = 4$ GHz, monotonically increasing $\bar\Gamma(s)$ (right).
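The cutoff dependence described in this caption can be explored with a short sketch that combines the Gaussian gap of Eq. (7) with the Ohmic spectral density of Eq. (31). Several points are assumptions of this example rather than statements from the paper: frequencies are treated as ordinary frequencies in GHz with $\beta = h/(k_B T)$; the two couplings of Eq. (30) are taken to be equal, so that $\sum_\alpha |S^{01}_\alpha|^2 = \sin^2\theta + \cos^2\theta = 1$ and the rate is proportional to $\gamma(\Omega(s))$; and the starting point `s_min`, the function names, and the grid are hypothetical choices.

```python
import math

K_B_OVER_H_GHZ_PER_K = 20.836619  # Boltzmann constant over Planck constant, GHz per kelvin

def gaussian_gap(s, e0=15 / math.pi, delta=1e-3, mu_g=0.5, alpha_g=0.5):
    """Gap of Eq. (7) in GHz, with the Fig. 1 parameters as defaults."""
    return e0 * (1 - (1 - delta) * math.exp(-(s - mu_g) ** 2 / (2 * alpha_g ** 2)))

def ohmic_gamma(omega, omega_c, beta, eta_g2=1e-4 / (2 * math.pi)):
    """Ohmic spectral density of Eq. (31), evaluated for omega > 0."""
    return eta_g2 * omega * math.exp(-omega / omega_c) / (1 - math.exp(-beta * omega))

def classify_rate(omega_c, temperature_k=0.016, s_min=0.55, n=200):
    """Label gamma(Omega(s)) on [s_min, 1] as decreasing, increasing, or non-monotonic."""
    beta = 1 / (K_B_OVER_H_GHZ_PER_K * temperature_k)
    grid = [s_min + k * (1 - s_min) / n for k in range(n + 1)]
    rates = [ohmic_gamma(gaussian_gap(s), omega_c, beta) for s in grid]
    steps = [b - a for a, b in zip(rates, rates[1:])]
    if all(d < 0 for d in steps):
        return "decreasing"
    if all(d > 0 for d in steps):
        return "increasing"
    return "non-monotonic"
```

Only the dimensionless combinations $\beta\omega$ and $\omega/\omega_c$ enter the shape of $\gamma$, so the classification does not depend on the overall coupling prefactor or on the constant factor $(1 + s_d) t_f$ in $\bar\Gamma(s)$.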


FIG. 7. Success probability vs pausing position for the three Ohmic bath cases shown in Fig. 6, in the same order from left to right. The existence of a maximum in the left and middle panels shows that an optimal pausing position exists for the corresponding decay rates. The qubit frequency $\Omega(s)$ [Eq. (2)] changes between 1.88 GHz and 4.77 MHz during the anneal. Other parameters are: $\eta g_z^2 = \eta g_y^2 = 10^{-4}/2\pi$ and $T = 16$ mK. The analytic solution refers to Eq. (42). For master equation simulations, the schedules are chosen according to Fig. 1. The pausing duration is fixed at $s_d = 4$. All three panels share the same legend.
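For the ground state population alone, a curve of success probability against pausing position can be approximated by integrating the scalar equation (39a) directly and scanning $s_p$. The sketch below uses a classical fourth-order Runge-Kutta step. It is an illustrative stand-in, not the master equation solver used for the figure, and the callables `gap` and `rate`, the function names, and the grid sizes are assumptions of this example.

```python
import math

def s_of_tau(tau, sp, sd):
    """Paused schedule of Eq. (26)."""
    if tau <= sp:
        return tau
    if tau <= sp + sd:
        return sp
    return tau - sd

def integrate_population(p_phi, mu_plus, sp, sd, tf, beta, gap, rate, n=2000):
    """RK4 integration of Eq. (39a) from tau = mu_plus to tau_f = 1 + sd."""
    def rhs(tau, rho):
        s = min(s_of_tau(tau, sp, sd), 1.0)
        gamma = (1.0 + sd) * tf * rate(s)                 # Eq. (45b)
        return gamma * (1.0 - (1.0 + math.exp(-beta * gap(s))) * rho)

    h = (1.0 + sd - mu_plus) / n
    rho = p_phi
    for k in range(n):
        tau = mu_plus + k * h
        k1 = rhs(tau, rho)
        k2 = rhs(tau + 0.5 * h, rho + 0.5 * h * k1)
        k3 = rhs(tau + 0.5 * h, rho + 0.5 * h * k2)
        k4 = rhs(tau + h, rho + h * k3)
        rho += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return rho

def best_pause_position(p_phi, mu_plus, sd, tf, beta, gap, rate, m=20):
    """Brute-force scan of s_p over a uniform grid on [mu_plus, 1] for fixed s_d."""
    grid = [mu_plus + k * (1.0 - mu_plus) / m for k in range(m + 1)]
    return max(grid, key=lambda sp: integrate_population(
        p_phi, mu_plus, sp, sd, tf, beta, gap, rate))
```

In the special case of a constant gap and a subthermal initial population, the pause simply adds $s_d \bar\Gamma(s_p)$ to the accumulated relaxation, so the scan picks the grid point where the supplied rate is largest. A non-trivial interior optimum requires the gap dependence of the thermal factor to compete with the rate.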

Then, we analyze the optimal pausing position $s_p^*$ for a given pausing duration $s_d$. We consider an Ohmic bath with different cutoff frequencies. This leads to different behaviors of the dimensionless relaxation rate $\bar\Gamma(s)$, from decreasing to non-monotonic to increasing, as illustrated in Fig. 6. Before proceeding, we emphasize that, as is clear from Eq. (45b), the monotonicity properties of $\bar\Gamma(s)$ depend on three different factors:

• The gap of the projected TLS
• The projected system-bath coupling operators [Eq. (24)]
• The noise spectrum

In our examples, the first two are fixed by Eqs. (7) and (30), respectively. Thus, at a given temperature, what remains is only the noise spectrum; more specifically, the only tuning parameter for the monotonicity of $\bar\Gamma(s)$ is the Ohmic bath cutoff frequency $\omega_c$. The relationship between $\omega_c$ and the monotonicity of $\gamma(\omega)$ within the qubit frequency range, which is sufficient to determine the monotonicity of $\bar\Gamma(s)$, is not straightforward. Thus the values in Fig. 6 were chosen after numerical investigation. More generally, the first two factors above are of equal importance to the noise spectrum. For example, in the 16-qubit problem of Ref. [44] with pure dephasing couplings, the strength of the projected system-bath coupling operators $\sum_\alpha |S_\alpha^{01}(s(\tau))|^2$ plays a key role as well: it has a peak around the minimum gap region and decreases to almost zero afterwards.

The corresponding success probabilities as a function of pausing position, obtained by numerically solving the AME and via the analytic expression (42), are shown in Fig. 7, and are in excellent agreement. There are two main observations:

• An optimal pausing position exists in the middle of the anneal for the cases illustrated in Fig. 6(a) and Fig. 6(b), where $\bar\Gamma(s)$ is not monotonically increasing.
• When $\bar\Gamma(s)$ is monotonically increasing after the minimum gap, it is always better to (trivially) pause near the end of the anneal.

Guided by these numerical results, we next prove that a non-trivial optimal pausing position exists provided that the dimensionless relaxation rate given in Eq. (39b) (i.e., comparing the actual relaxation rate $\Gamma_{01}$ to the anneal time $t_f'$) is monotonically decreasing, with respect to $s$, from the end of the avoided crossing to the end of the anneal.

E. Existence of optimal pausing point

In this section we prove the following theorem, which provides sufficient conditions for the existence of a non-trivial optimal pausing position, thus generalizing and formalizing the numerical evidence exhibited above.

Theorem 1 (Optimal pausing point) Let $\bar\Gamma(s) = (1+s_d)\,t_f\,\bar\Gamma_{01}(s)$ be the dimensionless relaxation rate, let the instantaneous ground state probability after the minimum gap is crossed be denoted $P_\varphi$ [Eq. (41c)], and let $\bar P_{\rm th}(s) = \frac{1}{1+e^{-\beta\bar\Omega(s)}}$ be the thermal ground state probability at $s$.
There exists a non-trivial optimal pausing point $s_p^* \in \left[\mu_\theta^+, 1\right]$ for a fixed pausing duration $s_d$, i.e.,

$$\left.\frac{\partial\rho_{00}}{\partial s_p}\right|_{s_p=s_p^*} = 0 , \tag{47}$$

if

1. The dimensionless relaxation rate decreases at the boundaries of the interval $\left[\mu_\theta^+, 1\right]$, i.e., $\bar\Gamma'(\mu_\theta^+) < 0$ and $\bar\Gamma'(1) < 0$.$^{3}$
2. The dimensionless relaxation rate is large right after the minimum gap: $s_d\bar\Gamma(\mu_\theta^+) > c_1$ where $c_1 = O(1)$.
3. The dimensionless relaxation rate is small at the end of the anneal: $s_d\bar\Gamma(1) < c_2$ where $c_2 = O(1)$.
4. The ground state population at the end of the anneal is subthermal: $1 - (1-P_\varphi)\,e^{-\int_{\mu_\theta^+}^{1}\bar\Gamma(s)\,ds} \le \bar P_{\rm th}(1)$.

We remark that the existence of a non-trivial optimal pausing point is possible under a substantially broader set of conditions than implied by Theorem 1, as is shown in our proof below. The Theorem states a simplified set of conditions for ease of presentation and interpretation. The reader who is not interested in the technical details of the proof may skip ahead to the conclusions in Sec. V.

F. Proof of Theorem 1

The proof follows. Because $\partial\rho_{00}/\partial s_p$ is a continuous function of $s_p$, a sufficient condition for Eq. (47) is

$$\left.\frac{\partial\rho_{00}}{\partial s_p}\right|_{s_p=\mu_\theta^+} > 0 \quad\text{and}\quad \left.\frac{\partial\rho_{00}}{\partial s_p}\right|_{s_p=1} < 0 . \tag{48}$$

We thus divide the proof into two parts, one for each of these two inequalities.

1. Proof of $\left.\frac{\partial\rho_{00}}{\partial s_p}\right|_{s_p=\mu_\theta^+} = \left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=\mu_\theta^+} + \left.\frac{\partial F_d}{\partial s_p}\right|_{s_p=\mu_\theta^+} > 0$

Consider first the $F_d$ term [Eq. (43a)]:

$$F_d = P_\varphi \exp\left[-\int_{\mu_\theta^+}^{\tau_f} d\tau\, X(\tau)\right] . \tag{49}$$

The partial derivative at $s_p = \mu_\theta^+$ is

$$\left.\frac{\partial F_d}{\partial s_p}\right|_{s_p=\mu_\theta^+} = -P_\varphi\, s_d\, \bar X'(\mu_\theta^+)\, G(\mu_\theta^+) , \tag{50}$$

where the factor of $s_d$ arises from $\int_{\mu_\theta^+}^{\tau_f} \left.\frac{\partial X(\tau)}{\partial s_p}\right|_{s_p=\mu_\theta^+} d\tau = \bar X'(\mu_\theta^+)\int_{\mu_\theta^+}^{\mu_\theta^+ + s_d} 1\, d\tau$. The relation between $\partial X/\partial s_p$ and $\bar X'$ is detailed in Appendix F. Likewise, the partial derivative at $s_p = 1$ is

$$\left.\frac{\partial F_d}{\partial s_p}\right|_{s_p=1} = -P_\varphi\, s_d\, \bar X'(1)\, G(\mu_\theta^+) . \tag{51}$$

As for $F_g$, again in Appendix F we derive the following identities:

$$\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=\mu_\theta^+} = s_d\, G(\mu_\theta^+)\int_0^1 dx\, e^{\bar S x}\left[\bar\Gamma'(\mu_\theta^+) - s_d\,\bar\Gamma(\mu_\theta^+)\,\bar X'(\mu_\theta^+)(1-x)\right] , \quad \bar S \equiv s_d \bar X(\mu_\theta^+) \tag{52a}$$

$$\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=1} = -s_d\,\bar X'(1)\int_{\mu_\theta^+}^{1} d\tau\, G(\tau)\Gamma(\tau) + \int_1^{1+s_d} d\tau\, G(\tau)\left[\bar\Gamma'(1) - \bar\Gamma(1)\bar X'(1)(s_d + 1 - \tau)\right] \tag{52b}$$

To make further progress we now assume that:

Assumption 1 $\bar\Gamma(s)$ is decreasing at the boundaries of the interval $\left[\mu_\theta^+, 1\right]$,$^{4}$ i.e.,

$$\bar\Gamma'(\mu_\theta^+) < 0 , \qquad \bar\Gamma'(1) < 0 . \tag{53}$$

Note that $\bar\Omega'(\mu_\theta^+) > 0$ since the gap grows for $s > \mu_\theta$ [recall Eq. (7) and that we assumed $\mu_\theta \approx \mu_g$]. As a consequence, upon taking the derivative of Eq. (44b) we obtain

$$\bar X' = -\beta\bar\Omega' e^{-\beta\bar\Omega}\,\bar\Gamma + \bar\Gamma'\left(1 + e^{-\beta\bar\Omega}\right) \tag{54}$$

$^{3}$ Note that henceforth, for notational simplicity, we use the prime symbol for the derivative with respect to $s$ throughout this work.

$^{4}$ These two assumptions are deduced from our observations of examples we have studied numerically. In principle, one could choose a different set of assumptions and perform a similar analysis. The final result will be different from Theorem 1.

and hence:

$$\bar X'(\mu_\theta^+) \le \bar\Gamma'(\mu_\theta^+) < 0 . \tag{55}$$

Combining Eqs. (42), (50), and (52a), we thus find the following equivalent sufficient condition for $\left.\partial\rho_{00}/\partial s_p\right|_{s_p=\mu_\theta^+} > 0$:

$$\bar\Gamma'(\mu_\theta^+)\int_0^1 e^{\bar S x}\,dx > s_d\,\bar\Gamma(\mu_\theta^+)\,\bar X'(\mu_\theta^+)\int_0^1 e^{\bar S x}(1-x)\,dx + P_\varphi\,\bar X'(\mu_\theta^+) . \tag{56}$$

Then, by explicitly carrying out the integrals, replacing one factor of $\bar S$ by $s_d\bar X(\mu_\theta^+)$, and dividing inequality (56) by $\bar X'(\mu_\theta^+) < 0$, we have:

$$\frac{\bar\Gamma'(\mu_\theta^+)}{\bar X'(\mu_\theta^+)}\,\frac{e^{\bar S}-1}{\bar S} < \frac{\bar\Gamma(\mu_\theta^+)}{\bar X(\mu_\theta^+)}\left(\frac{e^{\bar S}-1}{\bar S} - 1\right) + P_\varphi . \tag{57}$$

Let us denote

$$\bar P_{\rm th}(s) = \frac{\bar\Gamma(s)}{\bar X(s)} = \frac{1}{1+e^{-\beta\bar\Omega(s)}} \tag{58a}$$

$$\bar Q(s) = \frac{\bar\Gamma'(s)}{\bar X'(s)\,\bar P_{\rm th}(s)} . \tag{58b}$$

We then note that

$$1 - \bar Q(\mu_\theta^+) = \frac{\beta\bar\Omega' e^{-\beta\bar\Omega}\,\bar\Gamma}{\beta\bar\Omega' e^{-\beta\bar\Omega}\,\bar\Gamma - \left(1+e^{-\beta\bar\Omega}\right)\bar\Gamma'} > 0 , \tag{59}$$

where the inequality holds at $\mu_\theta^+$: we know the gap is increasing so $\bar\Omega'(\mu_\theta^+) > 0$ and the numerator is positive, and $\bar\Gamma'(\mu_\theta^+) < 0$ by Assumption 1, so the denominator is also positive.

Eq. (57) can thus be rewritten as

$$\frac{e^{\bar S}-1}{\bar S} > \frac{1 - P_\varphi/\bar P_{\rm th}(\mu_\theta^+)}{1 - \bar Q(\mu_\theta^+)} \equiv x . \tag{60}$$

Note that the function $\frac{e^{\bar S}-1}{\bar S} \ge 1$ and is monotonically increasing for $\bar S \ge 0$. Therefore the inequality is automatically satisfied for $x < 1$ by any $\bar S > 0$.

Let us denote by $\bar S^*$ the solution of inequality (60) replaced by an equality; this transcendental equation has a formal solution in terms of the Lambert-W function [50], i.e., the inverse function of $f(W) = We^W$:

$$\bar S^*(x) = -\frac{1}{x}\left[1 + x\,W_{-1}\!\left(-e^{-1/x}/x\right)\right] = s_d^*\,\bar X(\mu_\theta^+) , \tag{61}$$

where $W_{-1}(z)$ is one of the two real branches of $W(z)$ satisfying

$$W_{-1}(z) \le -1 , \qquad -1/e \le z < 0 , \tag{62}$$

with $W_{-1}(-1/e) = -1$. The function $\frac{e^{\bar S}-1}{\bar S}$ is monotonically increasing, so inequality (60) is satisfied for all $\bar S > \max(0, \bar S^*)$, i.e., for all $s_d > s_d^* = \max[0, \bar S^*/\bar X(\mu_\theta^+)]$. We can therefore replace condition (60) with

Assumption 2

$$s_d\,\bar\Gamma(\mu_\theta^+) > \bar S^*(x)\,\bar P_{\rm th}(\mu_\theta^+) , \tag{63}$$

where $\bar P_{\rm th}(\mu_\theta^+)$ is the thermal ground state probability at $s = \mu_\theta^+$.

Moreover, using the recursive expression

$$W_{-1}(z) = \ln(-z) - \ln(-W_{-1}(z)) , \tag{64}$$

Eq. (61) can be written as

$$\bar S^*(x) = \ln x + \ln\left[-W_{-1}\!\left(-e^{-1/x}/x\right)\right] < 2\ln(x) . \tag{65}$$

The above upper bound is derived in Appendix H. Thus we can replace Eq. (63) by $s_d\bar\Gamma(\mu_\theta^+) > 2\ln(x)$, which grows very mildly, and can for practical purposes be replaced by an $O(1)$ constant. This is how Assumption 2 is stated in Theorem 1.

Note that the case $s_d^* = 0$ arises only when $x < 0$ in Eq. (60) [since then the solution $\bar S^*$ given by Eq. (61) is negative]. This, in turn, arises when $P_\varphi > \bar P_{\rm th}(\mu_\theta^+)$, i.e., when the instantaneous ground state probability is greater than the thermal ground state probability, both at $\tau = \mu_\theta^+$. Indeed, this conforms with the expectation that in this case pausing is not advantageous.

2. Proof of $\left.\frac{\partial\rho_{00}}{\partial s_p}\right|_{s_p=1} = \left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=1} + \left.\frac{\partial F_d}{\partial s_p}\right|_{s_p=1} < 0$

Note that using Eq. (54) we have $\bar X'(1) < 0$ since, by Assumption 1, $\bar\Gamma'(1) < 0$, and $\bar\Omega'(1) > 0$ since the gap grows at the end of the anneal, as per Eq. (7) (this need not always be the case [51]).

Combining Eqs. (51) and (52b), we thus have$^{5}$

$$\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=1} + \left.\frac{\partial F_d}{\partial s_p}\right|_{s_p=1} = s_d\,|\bar X'(1)|\left[P_\varphi\,G(\mu_\theta^+) + \int_{\mu_\theta^+}^{1} d\tau\, G(\tau)\Gamma(\tau) - \frac{\bar\Gamma'(1)}{s_d\,\bar X'(1)}\int_1^{\tau_f} d\tau\, G(\tau) + \bar\Gamma(1)\int_1^{\tau_f} d\tau\, G(\tau)\left(1 + \frac{1}{s_d}(1-\tau)\right)\right] . \tag{66}$$

$^{5}$ Note that if $\bar X'(1) > 0$, the RHS of Eq. (66) is automatically negative, since the prefactor $|\bar X'(1)|$ comes from $-\bar X'(1)$, and the $-\bar X'(1)$ inside the square brackets becomes $\bar X'(1)$, so every term inside these brackets is positive. In this case Assumptions 3 and 4 of Theorem 1 can be dropped. However, $\bar X'(1) < 0$ in our model.

Therefore it suffices to find a condition under which the expression inside the square brackets in Eq. (66) is negative. Our strategy for doing so is to replace this expression with a simpler but negative upper bound, and to iterate this until we arrive at a conceptually simple final expression.

Now note that for all $\tau \in [1, \tau_f]$:

$$G(\tau) = \exp\left[-\int_\tau^{\tau_f} d\tau'\, X(\tau')\right] = \exp\left[-(\tau_f - \tau)\bar X(1)\right] , \tag{67}$$

since when $s_p = 1$ all the schedule-dependent functions are constant for $\tau$ in the range $[1, \tau_f]$, as a result of Eq. (26).

Therefore,

$$\int_1^{\tau_f} d\tau\, G(\tau) = \frac{1 - e^{-s_d\bar X(1)}}{\bar X(1)} , \tag{68}$$

and

$$\int_1^{\tau_f} d\tau\, G(\tau)\left(1 + \frac{1}{s_d}(1-\tau)\right) \tag{69a}$$

$$= \frac{1 - e^{-s_d\bar X(1)}\left[1 + s_d\bar X(1)\right]}{s_d\,\bar X(1)^2} . \tag{69b}$$

We can find upper bounds involving $G$ by using

$$X(\tau) = \left(1 + e^{-\beta\Omega(\tau)}\right)\Gamma(\tau) \ge \Gamma(\tau) . \tag{70}$$

Thus, using Eqs. (67) and (70) we have:

$$\int_{\mu_\theta^+}^{1} d\tau\, G(\tau)\Gamma(\tau) = e^{-s_d\bar X(1)}\int_{\mu_\theta^+}^{1} ds\, e^{-\int_s^1 d\tau'\, X(\tau')}\,\bar\Gamma(s) \tag{71a}$$

$$\le e^{-s_d\bar X(1)}\int_{\mu_\theta^+}^{1} ds\, e^{-\int_s^1 ds'\,\bar\Gamma(s')}\,\bar\Gamma(s) \tag{71b}$$

$$= e^{-s_d\bar X(1)}\left(1 - e^{-\int_{\mu_\theta^+}^{1}\bar\Gamma(s)\,ds}\right) , \tag{71c}$$

and

$$G(\mu_\theta^+) = \exp\left[-\int_{\mu_\theta^+}^{1} d\tau\, X(\tau)\right]\exp\left[-\int_1^{1+s_d} d\tau\, X(\tau)\right] \tag{72a}$$

$$\le \exp\left[-\int_{\mu_\theta^+}^{1} ds\,\bar\Gamma(s)\right]\exp\left[-s_d\bar X(1)\right] . \tag{72b}$$

Next, let us rewrite $\left|\bar\Gamma'(1)/\bar X'(1)\right|$ by using Eq. (54). First, let

$$\epsilon \equiv \frac{\bar\Gamma(1)\,\beta\bar\Omega'(1)}{|\bar\Gamma'(1)|\left(1 + e^{\beta\bar\Omega(1)}\right)} . \tag{73}$$

In Appendix I, we argue that for spectral densities with an exponential tail (e.g., the Ohmic case we consider) $\epsilon \ll 1$ for sufficiently large $\beta\bar\Omega(1)$. Then, using Assumption 1 again:

$$\left|\frac{\bar\Gamma'(1)}{\bar X'(1)}\right| = \frac{|\bar\Gamma'|}{\left|\bar\Gamma'\left(1+e^{-\beta\bar\Omega}\right) - \beta\bar\Omega' e^{-\beta\bar\Omega}\,\bar\Gamma\right|} \tag{74a}$$

$$= \frac{1}{(1+\epsilon)\left(1+e^{-\beta\bar\Omega(1)}\right)} = \frac{1}{1+\epsilon}\,\bar P_{\rm th}(1) . \tag{74b}$$

Defining

$$\lambda \equiv s_d\bar X(1) = s_d\left(1 + e^{-\beta\bar\Omega(1)}\right)\bar\Gamma(1) , \tag{75}$$

we can now combine all these bounds to provide an upper bound on the expression in square brackets in Eq. (66):

$$\left[\cdots\right] \le P_\varphi\, e^{-\lambda}\, e^{-\int_{\mu_\theta^+}^{1}\bar\Gamma(s)\,ds} + e^{-\lambda} - e^{-\lambda}\, e^{-\int_{\mu_\theta^+}^{1}\bar\Gamma(s)\,ds} - \frac{1}{1+\epsilon}\,\frac{1-e^{-\lambda}}{\lambda}\,\bar P_{\rm th}(1) + \frac{1 - e^{-\lambda}(1+\lambda)}{\lambda}\,\bar P_{\rm th}(1) , \tag{76}$$

an expression we require to be negative. We thus arrive at the sufficient condition

$$\int_{\mu_\theta^+}^{1}\bar\Gamma(s)\,ds \le \ln\left[\frac{1-P_\varphi}{1 - \bar P_{\rm th}(1)F(\lambda)}\right] , \tag{77}$$

where

$$F(\lambda) = 1 - \frac{e^{\lambda}-1}{\lambda}\,\frac{\epsilon}{1+\epsilon} . \tag{78}$$

Since $\bar\Gamma(s) \ge 0$ for all $s$, the bound must be positive to be sensible. Thus the argument of the logarithm must be lower bounded by 1. In order for the bound in Eq. (77) to be positive it is therefore sufficient to require that

$$P_\varphi/\bar P_{\rm th}(1) < F(\lambda) . \tag{79}$$

Without loss of generality, a sufficient condition for Eq. (79) is: $\exists\, c > 1$ such that

$$F(\lambda) > \frac{1}{1+c} > \frac{P_\varphi}{\bar P_{\rm th}(1)} , \tag{80}$$

which is not unreasonable because in practice, we would expect $\bar P_{\rm th}(1) \lesssim 1$, $P_\varphi \sim 0.5$ (for a hard instance) and $\epsilon \ll 1$. Substituting Eq. (78) into Eq. (80), we have

$$\frac{e^{\lambda}-1}{\lambda} < \left(1 + \epsilon^{-1}\right)\frac{c}{1+c} \equiv x . \tag{81}$$

Recall that the solution $\lambda^*(x)$ to this inequality considered as an equality is the Lambert-W function [Eq. (61), with $\lambda^*$ replacing $\bar S^*$], and the function $\frac{e^{\lambda}-1}{\lambda}$ is monotonically increasing. Therefore Eq. (81) is satisfied as long as $\lambda \equiv s_d\bar X(1) < \lambda^*$:

Assumption 3

$$s_d\,\bar\Gamma(1) < \lambda^*(x) < 2\ln(x) , \tag{82}$$

where as before we may view $\lambda^*$ in practice as an $O(1)$ constant. This is how Assumption 3 is stated in Theorem 1.

Finally, using the lower bound (80), Eq. (77) can be replaced with:

$$\int_{\mu_\theta^+}^{1}\bar\Gamma(s)\,ds \le \ln\left[\frac{1-P_\varphi}{1 - \bar P_{\rm th}(1)/(1+c)}\right] . \tag{83}$$

Rewritten as

Assumption 4

$$1 - (1-P_\varphi)\,e^{-\int_{\mu_\theta^+}^{1}\bar\Gamma(s)\,ds} \le \frac{1}{1+c}\,\bar P_{\rm th}(1) \le \bar P_{\rm th}(1) , \tag{84}$$

this can be interpreted as follows: the excited state population right after crossing the minimum gap is $1 - P_\varphi$, and after multiplying this by $\exp\left[-\int_{\mu_\theta^+}^{1}\bar\Gamma(s)\,ds\right]$ we have what is left of this excited state population at the end of the anneal. On the other hand, $\bar P_{\rm th}(1)$ is the thermal ground state population assuming equilibration. In other words, Eq. (84) states that the actual ground state population reached at the end of the anneal is less than the thermal ground state population, i.e., the system has not fully equilibrated. This is the version of Assumption 4 given in Theorem 1.

V. CONCLUSIONS

We have established numerically as well as analytically, via an open system analysis of two-level system models, that pausing-induced quantum thermal relaxation can play a positive role in quantum annealing, at least according to the success probability metric. More specifically, we have shown here that under certain conditions on the relaxation rate after the minimum gap is crossed, the ground state probability increases when pausing occurs before the end of the anneal. For this to occur the relaxation rate should be decreasing both after the minimum gap is crossed and at the end of the anneal, and cumulatively small over this interval, so that the system does not fully thermally equilibrate. In addition, the pause duration should be large relative to the inverse relaxation rate after the minimum gap is crossed, but small relative to the inverse relaxation rate at the end of the anneal. This provides a set of sufficient conditions relating to non-equilibrium dynamics and incomplete quantum thermal relaxation that explain the improved pause-based performance reported in a series of recent experimental quantum annealing studies [21, 23, 24]. The framework we have established also provides tools to solve for the optimal pause position $s_p^*$ [Eq. (47)]. We expect that analytic solutions for $s_p^*$ can be derived in a problem-specific manner with further approximations.

Our results leave open a number of interesting questions for future studies. We have not determined the optimal pause time, nor did we demonstrate that pausing guarantees a quantum speedup. Indeed, computationally meaningful metrics such as the time-to-solution [5] may not be enhanced, owing to the extra time cost incurred by pausing [52], and it is also possible that classical models of quantum annealing, such as the spin-vector Monte Carlo algorithm [53], similarly benefit from pausing [54].

ACKNOWLEDGMENTS

The authors are grateful to Jenia Mozgunov and Humberto Munoz Bauza for useful discussions and feedback. We used the Julia programming language [55] and the DifferentialEquations.jl package [56] for all our numerical calculations.
The research is based upon work (partially) supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA) and the Defense Advanced Research Projects Agency (DARPA), via the U.S. Army Research Office contract W911NF-17-C-0050. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, DARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.

Appendix A: Single-qubit adiabatic frame

We recall how to transform the Hamiltonian $H_S(s) = -\frac{1}{2}\left[A(s)Z + B(s)X\right]$ into the adiabatic frame. First, we reparametrize the annealing schedules in terms of the gap $\Omega(s)$ and rotation angle $\theta(s)$:

$$A(s) = \Omega(s)\cos\theta(s) , \qquad B(s) = \Omega(s)\sin\theta(s) . \tag{A1}$$

Then, we rescale it to a dimensionless quantity by a change of variable $s = t/t_f$ in the von Neumann equation

$$\frac{d\rho}{ds} = -i\left[t_f H_S(s), \rho\right] . \tag{A2}$$

Finally, we rotate the system with respect to the unitary

$$U = \exp\left[i\theta(s)Y/2\right] . \tag{A3}$$

The dimensionless interaction picture Hamiltonian is

$$\tilde H_S(s) = t_f\, U^\dagger(s) H_S(s) U(s) - i U^\dagger(s)\frac{\partial}{\partial s}U(s) = \frac{1}{2}\left[\frac{d\theta}{ds}\,Y - t_f\,\Omega(s)\,Z\right] . \tag{A4}$$

The important observation is that in this rotating frame the eigenstates of the $Z$ operator always align with the instantaneous energy eigenstates of the original Hamiltonian. So, it can be thought of as a co-rotating frame of the adiabatic basis. This co-rotating frame, which we refer to as the adiabatic frame throughout this paper, can be extended to the multi-qubit case as described next.
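The single-qubit frame change can be checked numerically. The following Python sketch is not the authors' code (their calculations used Julia); it builds $H_S$ from hypothetical schedule values `A` and `B`, recovers $\Omega$ and $\theta$ by inverting Eq. (A1), and verifies that a rotation about $Y$ by the angle $\theta$ maps $H_S$ to $-\Omega Z/2$. The helper names and the `sign` parameter are assumptions of this sketch: sign conventions for the rotation direction vary, and for $H_S$ exactly as written above the check passes with `sign = -1`.

```python
import numpy as np

# Pauli matrices and the 2x2 identity.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def gap_and_angle(A, B):
    """Invert Eq. (A1): A = Omega*cos(theta), B = Omega*sin(theta)."""
    return np.hypot(A, B), np.arctan2(B, A)


def y_rotation(theta, sign):
    """Return exp(sign * 1j * theta * Y / 2), using Y @ Y = I."""
    return np.cos(theta / 2) * I2 + sign * 1j * np.sin(theta / 2) * Y


def rotated_hamiltonian(A, B, sign):
    """Return U^dagger H_S U for H_S = -(A Z + B X)/2 (hypothetical helper)."""
    H = -0.5 * (A * Z + B * X)
    _, theta = gap_and_angle(A, B)
    U = y_rotation(theta, sign)
    return U.conj().T @ H @ U


A, B = 0.3, 1.1  # hypothetical schedule values at one fixed s
omega, theta = gap_and_angle(A, B)
H_rot = rotated_hamiltonian(A, B, sign=-1)
print(np.round(H_rot.real, 12))  # diagonal, with entries -omega/2 and +omega/2
```

The sketch only tests the diagonalization; the geometric term $-iU^\dagger\partial_s U$ changes sign together with the chosen rotation direction.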

Appendix B: Multi-qubit adiabatic frame

Starting from the von Neumann equation (A2), we have

$$\sum_{nm}\left[\dot\rho_{nm}\,|n\rangle\langle m| + \rho_{nm}\,|\dot n\rangle\langle m| + \rho_{nm}\,|n\rangle\langle\dot m|\right] = -i t_f\sum_{nm}(E_n - E_m)\,\rho_{nm}\,|n\rangle\langle m| . \tag{B1}$$

To derive an effective equation of motion for the density matrix in the adiabatic frame $\tilde\rho = [\rho_{nm}]$, we wish to write $|\dot n\rangle\langle m|$ and $|n\rangle\langle\dot m|$ in the $\{|n\rangle\}$ basis. For example, the first term can be written as

$$|\dot n\rangle\langle m| = \sum_{n'}\langle n'|\dot n\rangle\,|n'\rangle\langle m| . \tag{B2}$$

It is important to emphasize that $|\dot n\rangle$ means the derivative with respect to $s$ of the $n$'th eigenstate of an $s$-dependent Hamiltonian and does not obey the Schrödinger equation. We assume that $H_S(s)$ is non-degenerate. We show below that an explicit formula for $\langle m|\dot n\rangle$ is given by

$$\langle m|\dot n\rangle = \frac{\left\langle m(s)\right|\frac{dH_S(s)}{ds}\left|n(s)\right\rangle}{E_n(s) - E_m(s)} \quad (m \ne n) , \qquad \langle n|\dot n\rangle = 0 , \tag{B3}$$

which directly leads to $\langle m|\dot n\rangle = -\langle n|\dot m\rangle^*$. Substituting Eq. (B2) into Eq. (B1), we obtain

$$\dot\rho_{nm} = -i t_f (E_n - E_m)\rho_{nm} - \sum_{n'\ne n}\rho_{n'm}\,\langle n|\dot n'\rangle - \sum_{m'\ne m}\rho_{nm'}\,\langle\dot m'|m\rangle$$

$$= -i t_f (E_n - E_m)\rho_{nm} - i\left[-i\sum_{n'\ne n}\langle n|\dot n'\rangle\,\rho_{n'm} + i\sum_{m'\ne m}\rho_{nm'}\,\langle m'|\dot m\rangle\right] \tag{B4}$$

for each $\rho_{nm}$. Thus, an effective Hamiltonian satisfying $\dot{\tilde\rho} = -i[\tilde H, \tilde\rho]$ for $\tilde\rho = [\rho_{nm}]$ is the one given in Eq. (5) in the main text (where we assumed that $H_S(s)$ is real).

Let us now prove Eq. (B3). Writing the system Hamiltonian in its eigenbasis as

$$H_S(s) = \sum_n E_n(s)\,|n\rangle\langle n| , \tag{B5}$$

we will derive the expression for the geometric term $\langle m|\dot n\rangle$ under the assumption that $H_S(s)$ is non-degenerate and real. Taking the derivative of $H_S|n\rangle = E_n|n\rangle$ with respect to $s$ and multiplying both sides by $\langle m|$, we have

$$\langle m|\dot H_S|n\rangle - \dot E_n\langle m|n\rangle = (E_n - E_m)\langle m|\dot n\rangle . \tag{B6}$$

If $m \ne n$ and $|m\rangle$, $|n\rangle$ are non-degenerate, the above expression reduces to Eq. (B3). For the case where $m = n$, Eq. (B6) becomes

$$\langle n|\dot H_S|n\rangle = \dot E_n . \tag{B7}$$

On the other hand, we can also take the derivative of $\langle n|H_S|n\rangle = E_n$, which leads to

$$\langle\dot n|H_S|n\rangle + \langle n|\dot H_S|n\rangle + \langle n|H_S|\dot n\rangle = \dot E_n . \tag{B8}$$

By cancelling out $\langle n|\dot H_S|n\rangle$ and $\dot E_n$ and noticing that $\langle n|\dot n\rangle$ is real, we can deduce that $\langle n|\dot n\rangle = 0$. It is important to note that the eigenvector $|n\rangle$ is only uniquely determined up to a constant factor of $\pm 1$. As a result, we need to implement a continuity constraint

$$\lim_{\Delta s\to 0}\langle n(s)|n(s+\Delta s)\rangle = 1 \tag{B9}$$

to ensure the continuity of the geometric term.

This result can be extended to a general complex-valued Hamiltonian. In this case, the eigenvectors of the Hamiltonian are uniquely determined up to a constant of unit modulus. However, by enforcing the continuity condition (B9), we have an analytic $|n(s)\rangle$ that also satisfies $\langle n|\dot n\rangle = 0$.

A method to calculate $\langle m|\dot n\rangle$ for degenerate Hamiltonians is provided in Ref. [57]. A special case that is not discussed in this reference is when two or more states $|m\rangle$ and $|n\rangle$ become degenerate in a closed interval $s \in [s_a, s_b]$. In such a case, we can still obtain a pair of orthogonal states $|m(s_a)\rangle$ and $|n(s_a)\rangle$ by enforcing the continuity condition (B9) across the boundary. Within the interval the $\langle m|\dot n\rangle$ are usually 0 in practice. A sufficient condition for this is

$$\lim_{\Delta s\to 0}\langle m(s)|n(s+\Delta s)\rangle/\Delta s = 0 \tag{B10}$$

for all $s \in [s_a, s_b]$. By expanding $|n(s+\Delta s)\rangle$ as a Taylor series in $\Delta s$, Eq. (B10) reduces to

$$\lim_{\Delta s\to 0}\left[\langle m|n\rangle/\Delta s + \langle m|\dot n\rangle + O(\Delta s)\right] = 0 , \tag{B11}$$

which implies $\langle m|\dot n\rangle = 0$. One example of condition (B10) is when the transverse field becomes zero during the anneal and the problem Hamiltonian has degenerate excited states.

Appendix C: Annealing angle behavior for previously studied small gap quantum annealing examples

Here we provide a brief look at the annealing angle aspect of two previously studied quantum annealing examples with small gaps. In the following examples, all the results are produced with a linear schedule instead of the D-Wave schedule used in the references.

The first example, shown in Fig. 8(a) and (b), is the p-spin model [25]. The angular progression is localized around $s = 0.483$. Across the region, there is a $\pi$ jump of the annealing angle.

The second example, shown in Fig. 8(c), is the D-Wave 16-qubit gadget problem [44]. Unlike the p-spin model, the annealing angle in this case does not stay zero during the first half of the anneal. There is still a sharp $\pi$ jump across the minimum gap region.
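The geometric-term formula of Appendix B, Eq. (B3), lends itself to a quick numerical check. The Python sketch below is illustrative only: the three-level real Hamiltonian $H(s) = (1-s)H_0 + sH_1$ and all function names are hypothetical and not taken from the paper. It compares Eq. (B3) with a central finite difference of the eigenvectors, where the sign ambiguity of each eigenvector is fixed by the continuity condition (B9).

```python
import numpy as np

# Hypothetical real, non-degenerate three-level example H(s) = (1 - s) H0 + s H1.
H0 = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 1.0],
               [0.0, 1.0, 0.0]])
H1 = np.diag([-1.0, 0.3, 0.9])


def hamiltonian(s):
    return (1.0 - s) * H0 + s * H1


def eigensystem(s, reference=None):
    """Eigenpairs of H(s); columns of `vecs` are the eigenvectors |n(s)>."""
    energies, vecs = np.linalg.eigh(hamiltonian(s))
    if reference is not None:
        # Continuity condition (B9): flip signs so that <n_ref|n> > 0.
        vecs = vecs * np.sign(np.sum(reference * vecs, axis=0))
    return energies, vecs


def geometric_terms_formula(s):
    """Matrix with entries <m|n'> from Eq. (B3); the diagonal is zero."""
    energies, vecs = eigensystem(s)
    numer = vecs.T @ (H1 - H0) @ vecs                # <m| dH/ds |n>
    denom = energies[None, :] - energies[:, None]    # E_n - E_m at index [m, n]
    np.fill_diagonal(denom, np.inf)                  # enforces <n|n'> = 0
    return numer / denom


def geometric_terms_finite_difference(s, ds=1e-5):
    """Same matrix from a central difference of sign-fixed eigenvectors."""
    _, v0 = eigensystem(s)
    _, v_plus = eigensystem(s + ds, reference=v0)
    _, v_minus = eigensystem(s - ds, reference=v0)
    return v0.T @ (v_plus - v_minus) / (2.0 * ds)


formula = geometric_terms_formula(0.4)
numeric = geometric_terms_finite_difference(0.4)
print(np.max(np.abs(formula - numeric)))  # small finite-difference error
```

Without the sign fix in `eigensystem`, the finite difference can pick up spurious jumps of order $1/\Delta s$, which is the practical reason for imposing Eq. (B9).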

FIG. 8. (a) Annealing angle for the p-spin model with $n = 20$, $p = 19$ [25]. The inset zooms in around the peak of angular progression, shown in (b). In (c) we show the annealing angle for the D-Wave 16-qubit gadget [44]. The inset illustrates the angular progression.

Appendix D: Easy problem with small gap: constant angular progression

The simplest example we can construct of an easy problem with a small gap is one with the same small gap structure as in Eq. (7) but a constant $\dot\theta(s) = \pi/2$. If we define the adiabatic time scale $t_{\rm ad}$ as the point where

$$P_G(t) \ge 0.99 \quad \forall t \ge t_{\rm ad} , \tag{D1}$$

we find in the numerical example shown in Fig. 9 that the adiabatic time scale is much smaller than the inverse gap

$$t_{\rm ad} \ll h/(E_0\Delta)^2 \approx 142502\ \text{ns} , \tag{D2}$$

where

$$h = \max_{s\in[0,1]}\left\|\dot H(s)\right\| , \tag{D3}$$

and $\|\cdot\|$ is the operator norm [for the Frobenius norm we instead find 201528 ns].

FIG. 9. Closed system success probability versus total annealing time. The gap is chosen according to Eq. (7) with parameters $\mu_g = 0.5$, $\alpha_g = 0.5$, $E_0 = 15/\pi$ GHz and $\Delta = 0.001$.

Appendix E: Local perturbation around the non-adiabatic transition

Our analysis in this appendix closely mirrors Ref. [36], with some modifications. We start with the Redfield Eq. (18) and define Liouville operators

$$\mathcal{L}_A = i\,\frac{t_f\,\Omega(s)}{2}\,[Z, \cdot] \tag{E1a}$$

$$\mathcal{L}_G = -i\,\frac{\dot\theta}{2}\,[Y, \cdot] \tag{E1b}$$

$$\mathcal{L}_R = -\sum_\alpha (g_\alpha t_f)^2\left[S_\alpha(s), \Lambda_\alpha(s)\,\cdot\,\right] + \text{h.c.} , \tag{E1c}$$

which represent the adiabatic, geometric and Redfield parts in the ME. Now we present a perturbation method for the evolution across the region $[\mu_\theta - c\alpha_\theta, \mu_\theta + c\alpha_\theta]$. First we rotate the equation with respect to the adiabatic parts of the Hamiltonian and denote the resulting Liouville operators as $\mathcal{L}_{\tilde G}$ and $\mathcal{L}_{\tilde R}$.

Using any unitarily invariant norm, such as the operator norm, we can bound

$$\|\mathcal{L}_{\tilde R}(\rho)\| \le 4\sum_\alpha (g_\alpha t_f)^2\,\|S_\alpha(s)\|\,\|\Lambda_\alpha(s)\|\,\|\rho\|_1 \tag{E2a}$$

$$\le \sum_\alpha \left(2 g_\alpha t_f \|S_\alpha\|\right)^2\int_0^s ds'\,|C_\alpha(s, s')| \tag{E2b}$$

where we used $\|\rho\|_1 = 1$ (trace norm), and we made use of Eq. (19). We define $g = \max_\alpha g_\alpha$ and note that we can always choose a normalization such that $\|S_\alpha\| \le 1$. Then

$$\|\mathcal{L}_{\tilde R}(\rho)\| \le (2 g t_f)^2\int_0^s ds'\,|C(s, s')| . \tag{E3}$$

Noting that the correlation function is translation-invariant

$$C(s, s') = C(s - s') , \tag{E4}$$

the bound (E3) can be further simplified by using the following inequality

$$\int_0^s ds'\,|C(s, s')| < \int_0^\infty ds\,|C(s)| . \tag{E5}$$

Defining

$$\frac{1}{\tau_{SB}} = t_f\int_0^\infty |C(s)|\,ds , \tag{E6}$$

where $\tau_{SB}$ can be interpreted as the fastest system decoherence timescale [34], the final expression becomes

$$\|\mathcal{L}_{\tilde R}(\rho)\| \le 4 g^2 t_f/\tau_{SB} . \tag{E7}$$

This bound allows us to rigorously establish conditions under which it is safe to drop the dissipative part of the evolution.

Next, using the bound above, we write down the evolution operator and its first order Magnus expansion:

$$\mathcal{E}(\mu_\theta + c\alpha_\theta, \mu_\theta - c\alpha_\theta) \tag{E8a}$$

$$= T\exp\left\{-i\int_{\mu_\theta - c\alpha_\theta}^{\mu_\theta + c\alpha_\theta}\left[\mathcal{L}_{\tilde G}(\tau) + \mathcal{L}_{\tilde R}(\tau)\right]d\tau\right\} \tag{E8b}$$

$$= \exp\left\{-i\int_{\mu_\theta - c\alpha_\theta}^{\mu_\theta + c\alpha_\theta}\mathcal{L}_{\tilde G}(\tau)\,d\tau + O\!\left(4\alpha_\theta t_f g^2/\tau_{SB}\right)\right\} . \tag{E8c}$$

We made use of the formal Magnus expansion of the superoperator in going from line (E8b) to line (E8c), wherein the lowest order term in the time-ordered propagator is the argument of line (E8b), and the next-order term is a commutator of the two Liouville operators. As long as $\alpha_\theta t_f g^2/\tau_{SB} \ll 1$, which requires either weak coupling or small $\alpha_\theta$ (as confirmed for the two examples mentioned in Appendix C), we may ignore the dissipative part due to $\mathcal{L}_{\tilde R}$. After dropping this term we are left with

$$\exp\left\{-i\int_{\mu_\theta - c\alpha_\theta}^{\mu_\theta + c\alpha_\theta}\mathcal{L}_{\tilde G}(\tau)\,d\tau\right\}\cdot\rho = \tilde U\rho\,\tilde U^\dagger , \tag{E9}$$

where the unitary is

$$\tilde U(\mu_\theta + c\alpha_\theta, \mu_\theta - c\alpha_\theta) = \exp\left\{-\frac{i}{2}\int_{\mu_\theta - c\alpha_\theta}^{\mu_\theta + c\alpha_\theta}\dot\theta(s)\,\tilde Y(s)\,ds\right\} , \tag{E10}$$

and $\tilde Y(s)$ is $Y$ in the interaction picture generated by $\mathcal{L}_A$ [Eq. (E1a)]:

$$\tilde Y(s) = U_A^\dagger(s)\,Y\,U_A(s) = -i e^{-i t_f\int_0^s\Omega(s')\,ds'}S_+ + \text{h.c.} , \tag{E11}$$

where $S_+ = (\sigma^x + i\sigma^y)/2$. Substituting $\tilde Y(s)$ into Eq. (E10), we have

$$\tilde U(\mu_\theta + c\alpha_\theta, \mu_\theta - c\alpha_\theta) = \exp\{-\phi S_+ + \text{h.c.}\} \tag{E12}$$

where

$$\phi = \frac{1}{2}\int_{\mu_\theta - c\alpha_\theta}^{\mu_\theta + c\alpha_\theta}\dot\theta\, e^{-i t_f\int_0^s\Omega(s')\,ds'}\,ds . \tag{E13}$$

Because $\dot\theta$ is highly localized within the integral limit, we can further simplify the expression as

$$\phi = \frac{1}{2}\int_{-\infty}^{\infty}\dot\theta\, e^{-i t_f\int_0^{\mu_\theta}\Omega(s')\,ds'}\,ds = \frac{1}{2}\int_{-\infty}^{\infty}\dot\theta\, e^{-i t_f\mu_\theta\tilde\Omega s}\,ds , \tag{E14}$$

where

$$\tilde\Omega = \frac{\int_0^{\mu_\theta}\Omega(s')\,ds'}{\mu_\theta} . \tag{E15}$$

Using Eq. (8), together with the boundary condition $\theta(1) = \pi/2$, the integral in Eq. (E14) can be carried out, yielding

$$\phi = \frac{\pi}{4}\,e^{-(t_f/t_{\rm ad})^2 + i\mu_\theta t_f\tilde\Omega} , \tag{E16}$$

where $t_{\rm ad} = \sqrt{2}/(\alpha_\theta\mu_\theta\tilde\Omega)$ [Eq. (34)]. Finally, we can write down the matrix form of the unitary (E12):

$$\tilde U = \exp\{-\phi S_+ + \text{h.c.}\} \tag{E17a}$$

$$= \begin{pmatrix}\cos(|\phi|) & -\sin(|\phi|)\,e^{i\mu_\theta t_f\tilde\Omega}\\ \sin(|\phi|)\,e^{-i\mu_\theta t_f\tilde\Omega} & \cos(|\phi|)\end{pmatrix} \tag{E17b}$$

$$= U_A^\dagger(\mu_\theta)\begin{pmatrix}\cos(|\phi|) & -\sin(|\phi|)\\ \sin(|\phi|) & \cos(|\phi|)\end{pmatrix}U_A(\mu_\theta) . \tag{E17c}$$

If we rotate the unitary back into the adiabatic frame, it becomes Eq. (33).

As a final remark, in the limit of $\alpha_\theta \to 0$, the unitary non-adiabatic transition has the same effects as an ideal beam splitter:

$$U \to \frac{\sqrt{2}}{2}(I - iY) . \tag{E18}$$

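The matrix form of the transition unitary in Appendix E, Eqs. (E17a) and (E17b), and the beam-splitter limit of Eq. (E18) can be verified with a few lines of Python. This is a sketch under one stated assumption: the "h.c." in Eq. (E12) is read as the term that makes the exponent anti-Hermitian, i.e. the exponent is $-\phi S_+ + \phi^* S_-$. The function names are hypothetical, and the phase of `phi` plays the role of $\mu_\theta t_f\tilde\Omega$.

```python
import numpy as np

S_PLUS = np.array([[0, 1], [0, 0]], dtype=complex)   # (sigma_x + i sigma_y)/2
S_MINUS = S_PLUS.conj().T
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def transition_unitary(phi):
    """exp(-phi S_+ + conj(phi) S_-), computed by diagonalizing -i times the exponent."""
    K = -phi * S_PLUS + np.conj(phi) * S_MINUS       # anti-Hermitian exponent
    evals, evecs = np.linalg.eigh(-1j * K)           # -iK is Hermitian
    return evecs @ np.diag(np.exp(1j * evals)) @ evecs.conj().T


def closed_form(phi):
    """Matrix of Eq. (E17b), with |phi| the mixing angle and arg(phi) the phase."""
    r, a = abs(phi), np.angle(phi)
    return np.array([[np.cos(r), -np.sin(r) * np.exp(1j * a)],
                     [np.sin(r) * np.exp(-1j * a), np.cos(r)]])


phi = 0.6 * np.exp(0.8j)                 # hypothetical value of phi
U = transition_unitary(phi)
beam_splitter = transition_unitary(np.pi / 4)   # |phi| = pi/4 with zero phase
print(np.round(beam_splitter, 6))
```

With $|\phi| = \pi/4$ and zero phase the result equals $(\sqrt{2}/2)(I - iY)$, which is the balanced beam splitter of Eq. (E18).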
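Appendix F: Derivation of Eq. (52)

Taking the partial derivative of Eq. (43b) with respect to $s_p$, we have

$$\frac{\partial F_g}{\partial s_p} = \int_{\mu_\theta^+}^{\tau_f} d\tau\left(G\,\frac{\partial\Gamma}{\partial s_p} + \Gamma\,\frac{\partial G}{\partial s_p}\right) \tag{F1a}$$

$$= \int_{\mu_\theta^+}^{\tau_f} d\tau\, G\left(\frac{\partial\Gamma}{\partial s_p} - \Gamma\int_\tau^{\tau_f}\frac{\partial X}{\partial s_p}\,d\tau'\right) . \tag{F1b}$$

Denoting either $\Gamma(\tau)$ or $X(\tau)$ by $F(\tau)$, their partial derivatives with respect to $s_p$ are given by (for a proof see Appendix G):

$$\frac{\partial F(\tau)}{\partial s_p} = \begin{cases} 0 & \tau \le s_p \\ \bar F'(s_p) & s_p < \tau \le s_p + s_d \\ 0 & s_p + s_d < \tau \le 1 + s_d \end{cases} \tag{F2}$$

where $\bar F'(s_p) = \left.\frac{d\bar F(s)}{ds}\right|_{s=s_p}$. This can also be thought of as the chain rule

$$\frac{\partial F(\tau)}{\partial s_p} = \frac{d\bar F(s)}{ds}\,\frac{\partial s(\tau)}{\partial s_p} , \tag{F3}$$

where $\frac{\partial s(\tau)}{\partial s_p}$ follows by differentiating Eq. (26)

$$\frac{\partial s(\tau)}{\partial s_p} = \begin{cases} 0 & \tau \le s_p \\ 1 & s_p < \tau \le s_p + s_d \\ 0 & s_p + s_d < \tau \le 1 + s_d \end{cases} . \tag{F4}$$

Let us consider the two end points of Eq. (F1b), $\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=\mu_\theta^+}$ and $\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=1}$.

1. Derivation of Eq. (52a)

Consider $\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=\mu_\theta^+}$. This integral can be simplified by realizing that both $\left.\frac{\partial\Gamma}{\partial s_p}\right|_{s_p=\mu_\theta^+}(\tau)$ and $\left.\frac{\partial X}{\partial s_p}\right|_{s_p=\mu_\theta^+}(\tau)$ are rectangular functions within the pausing region $\left[\mu_\theta^+, \mu_\theta^+ + s_d\right]$ [Eq. (F2)]. As a result, $\left.\frac{\partial\Gamma}{\partial s_p}\right|_{\mu_\theta^+}(\tau)$ and $\int_\tau^{\tau_f}\left.\frac{\partial X}{\partial s_p}\right|_{\mu_\theta^+}(\tau')\,d\tau'$ are zero when $\tau > \mu_\theta^+ + s_d$. The former follows directly from the definition and the latter is because the integrand is zero in the entire region of integration. This allows us to change the integration limit from $\tau_f$ to $\mu_\theta^+ + s_d$ in Eq. (F1b) and write

$$\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=\mu_\theta^+} = \int_{\mu_\theta^+}^{\mu_\theta^+ + s_d} d\tau\, G(\tau)\left(\left.\frac{\partial\Gamma}{\partial s_p}\right|_{\mu_\theta^+} - \Gamma\int_\tau^{\tau_f}\left.\frac{\partial X}{\partial s_p}\right|_{\mu_\theta^+}d\tau'\right)$$

$$= \int_{\mu_\theta^+}^{\mu_\theta^+ + s_d} d\tau\, G(\tau)\left[\bar\Gamma'(\mu_\theta^+) - \bar\Gamma(\mu_\theta^+)\,\bar X'(\mu_\theta^+)(s_d + \mu_\theta^+ - \tau)\right] . \tag{F5}$$

To obtain the second equality above, we explicitly carried out the integration

$$\int_\tau^{\tau_f}\left.\frac{\partial X(\tau')}{\partial s_p}\right|_{s_p=\mu_\theta^+}d\tau' = \bar X'(\mu_\theta^+)\int_\tau^{\mu_\theta^+ + s_d} 1\,d\tau' = \bar X'(\mu_\theta^+)(s_d + \mu_\theta^+ - \tau) . \tag{F6}$$

Using Eq. (44a) and noticing that $X(\tau')$ is constant for $\tau' \in [\mu_\theta^+, \tau]$ with $\tau \in [\mu_\theta^+, \mu_\theta^+ + s_d]$ (the pausing region), we have:

$$\left.G(\tau)\right|_{s_p=\mu_\theta^+} = \exp\left\{\left(\int_{\mu_\theta^+}^{\tau} - \int_{\mu_\theta^+}^{1+s_d}\right)X(\tau')\,d\tau'\right\} = \exp\left[\bar X(\mu_\theta^+)\left(\tau - \mu_\theta^+\right)\right]G(\mu_\theta^+) . \tag{F7}$$

Thus Eq. (F5) can be further simplified with a change of variable $\tau = s_d x + \mu_\theta^+$, upon which we arrive at Eq. (52a):

$$\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=\mu_\theta^+} = s_d\,G(\mu_\theta^+)\int_0^1 dx\, e^{\bar S x}\left[\bar\Gamma'(\mu_\theta^+) - s_d\,\bar\Gamma(\mu_\theta^+)\,\bar X'(\mu_\theta^+)(1-x)\right] ,$$

where $\bar S = s_d\bar X(\mu_\theta^+)$.

2. Derivation of Eq. (52b)

Consider $\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=1}$. The expression at this end point can similarly be obtained by splitting the integral into two parts $\int_{\mu_\theta^+}^{\tau_f}d\tau = \int_{\mu_\theta^+}^{1}d\tau + \int_1^{1+s_d}d\tau$ [recall, per Eq. (27), that $\tau_f = 1 + s_d$]. The integrands of the first integral $\int_{\mu_\theta^+}^{1}d\tau$ satisfy:

$$\left.\frac{\partial\Gamma(\tau)}{\partial s_p}\right|_{s_p=1} = 0 \tag{F8a}$$

$$\int_\tau^{\tau_f}\left.\frac{\partial X(\tau')}{\partial s_p}\right|_{s_p=1}d\tau' = \bar X'(1)\int_1^{\tau_f} 1\,d\tau' = s_d\,\bar X'(1) . \tag{F8b}$$

Eq. (F8b) follows from the same reasoning as Eq. (F6). Eq. (F8a) follows from the chain rule applied to Eq. (45b), which gives $\left.\frac{\partial s(\tau)}{\partial s_p}\right|_{s_p=1} = 0$ [Eq. (F4)] as an overall prefactor since the integration over $\tau$ goes up to $\tau = 1$.

Again using Eqs. (F6) and (F4), the integrands of the second integral $\int_1^{1+s_d}d\tau$ satisfy:

$$\left.\frac{\partial\Gamma(\tau)}{\partial s_p}\right|_{s_p=1} = \bar\Gamma'(1) \tag{F9a}$$

$$\int_\tau^{\tau_f}\left.\frac{\partial X(\tau')}{\partial s_p}\right|_{s_p=1}d\tau' = \bar X'(1)\int_\tau^{\tau_f} 1\,d\tau' = \bar X'(1)(s_d + 1 - \tau) . \tag{F9b}$$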

Combining these results into Eq. (F1b), the final expression becomes Eq. (52b):

$$\left.\frac{\partial F_g}{\partial s_p}\right|_{s_p=1} = -s_d\,\bar X'(1)\int_{\mu_\theta^+}^{1}d\tau\, G(\tau)\Gamma(\tau) + \int_1^{1+s_d}d\tau\, G(\tau)\left[\bar\Gamma'(1) - \bar\Gamma(1)\bar X'(1)(s_d + 1 - \tau)\right] .$$

Appendix G: Proof of Eq. (F2)

To prove Eq. (F2), we start from the definition of the partial derivative:

$$\frac{\partial F}{\partial s_p} = \lim_{\Delta\to 0}\frac{F(\tau, s_p + \Delta, s_d) - F(\tau, s_p, s_d)}{\Delta} . \tag{G1}$$

From Fig. 10, we see that in the limit $\Delta \to 0$

$$F(\tau, s_p + \Delta, s_d) = \begin{cases} F(s(\tau)) & \tau < s_p \\ F(s_p + \Delta) & s_p < \tau < s_p + s_d \\ F(s(\tau)) & s_p + s_d < \tau < 1 + s_d \end{cases} , \tag{G2}$$

so

$$\frac{\partial F}{\partial s_p} = \lim_{\Delta\to 0}\frac{F(s_p + \Delta) - F(s_p)}{\Delta} = F'(s = s_p) \tag{G3}$$

for $s_p < \tau < s_p + s_d$ and zero elsewhere.

FIG. 10. Graphical proof of Eq. (F2). The partial derivative is obtained in the limit of $\Delta \to 0$.

Appendix H: Proof of Eq. (65)

To prove Eq. (65) we use

$$f(x) = x + W_{-1}\!\left(-e^{-1/x}/x\right) > 0 \quad \forall x > 1 , \tag{H1}$$

which can be proved with $f(1) = 0$ and

$$f'(x) = 1 + \frac{W_{-1}\!\left(-e^{-1/x}/x\right)}{1 + W_{-1}\!\left(-e^{-1/x}/x\right)}\,\frac{1-x}{x^2} \tag{H2a}$$

$$= \frac{|1 + W_{-1}|\,x^2 - |W_{-1}|\,x + |W_{-1}|}{|1 + W_{-1}|\,x^2} \tag{H2b}$$

$$> 0 \quad \forall x > 1 . \tag{H2c}$$

Line (H2a) is obtained using the derivative formula

$$\frac{dW(z)}{dz} = \frac{W(z)}{z\,(1 + W(z))} \tag{H3}$$

and line (H2b) comes from Eq. (62). To prove line (H2c), we consider the quadratic function in the numerator of line (H2b), whose roots are:

$$x_\pm = -\frac{|W_{-1}| \pm \sqrt{-3|W_{-1}|^2 + 4|W_{-1}|}}{2\,|1 + W_{-1}|} . \tag{H4}$$

For $|W_{-1}| > 4/3$, the discriminant is smaller than zero. For $1 < |W_{-1}| \le 4/3$, the discriminant lies within $[0, 1)$ so both roots are negative. In both cases, the quadratic function is positive for $x > 1$, which leads to line (H2c).

Appendix I: Demonstration that $\epsilon \ll 1$

To show that $\epsilon \ll 1$ for $\epsilon \equiv \frac{\bar\Gamma(1)\,\beta\bar\Omega'(1)}{|\bar\Gamma'(1)|\left(1 + e^{\beta\bar\Omega(1)}\right)}$ [Eq. (73)], we start with the expression for $\bar\Gamma(s)$ [Eq. (45b)]. Because $\sum_\alpha |S_\alpha^{01}(s)|^2 = 1$ for the system-bath coupling operators in our model Eq. (30), we obtain

$$\bar\Gamma(s) = \bar\Gamma[\bar\Omega(s)] = (1 + s_d)\,t_f\,\gamma[\bar\Omega(s)] , \tag{I1}$$

where we have made the gap dependence of $\bar\Gamma(s)$ explicit. Next, $\epsilon$ can be simplified as

$$\epsilon = \frac{\bar\Omega'(1)\,\bar\Gamma(1)}{|\bar\Gamma'(1)|}\,\frac{\beta}{1 + e^{\beta\bar\Omega(1)}} \tag{I2a}$$

$$= \frac{\gamma[\bar\Omega(1)]}{\left|\frac{d\gamma}{d\bar\Omega}[\bar\Omega(1)]\right|}\,\frac{\beta}{1 + e^{\beta\bar\Omega(1)}} , \tag{I2b}$$

where we used the chain rule to write $\bar\Gamma'(s) = (1 + s_d)\,t_f\,\bar\Omega'(s)\,\frac{d\gamma}{d\bar\Omega}$.

Any spectral density with an exponential high-frequency cutoff $e^{-\omega/\omega_c}$, such as the Ohmic bath [Eq. (31)], will result in

$$\frac{\gamma[\bar\Omega(1)]}{\left|\frac{d\gamma}{d\bar\Omega}[\bar\Omega(1)]\right|} \sim \omega_c \quad \text{for } \omega > \omega_c . \tag{I3}$$

Thus, as long as $\bar\Omega(1) > \omega_c, 1/\beta$ we find that $\epsilon$ is exponentially small in the final energy gap $\bar\Omega(1)$.

We plot $\log_{10}\epsilon$ for different temperatures and cutoff frequencies for the Ohmic case in Fig. 11, and confirm that for reasonable parameters indeed $\epsilon \ll 1$.

This argument fails if $\bar\Gamma'(s^*) = 0$ for $s^* \in [\mu_\theta^+, 1]$. For such cases, we only need to shift the end point from 1 to $s^*$. Then the optimal pausing position will be in the interval $[\mu_\theta^+, s^*]$.

FIG. 11. Heat map of $\log_{10}\epsilon$ for different $\omega_c$ and $T$ values for the Ohmic bath spectrum [Eq. (31)], covering the entire parameter region shown in Fig. 6. The gap $\bar\Omega(1) \approx 1.88$ GHz is chosen in accordance with parameters used in Fig. 1. The maximum of $\epsilon$ appears near the line $\omega_c = 2\pi\bar\Omega(1)$.
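The Lambert-W threshold of Eq. (61) and the logarithmic bound of Eq. (65), proved in Appendix H, can also be checked numerically; the same function gives $\lambda^*(x)$ in Eq. (82). The Python sketch below is an illustration rather than the authors' implementation: the helper names are hypothetical, and the branch $W_{-1}$ is obtained by plain bisection on $w e^w = z$ so that only the standard library is needed.

```python
import math


def lambert_w_minus1(z, iterations=200):
    """Real branch W_{-1}(z) for -1/e <= z < 0, via bisection on w*exp(w) = z."""
    if not (-1.0 / math.e <= z < 0.0):
        raise ValueError("W_{-1} is real only for -1/e <= z < 0")
    hi = -1.0                     # w*exp(w) = -1/e <= z at w = -1
    lo = -2.0
    while lo * math.exp(lo) < z:  # w*exp(w) rises towards 0 as w -> -infinity
        lo *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) > z:
            lo = mid              # value too large: the root lies at larger w
        else:
            hi = mid
    return 0.5 * (lo + hi)


def s_star(x):
    """Eq. (61): the non-trivial root S of (exp(S) - 1)/S = x, for x > 1."""
    w = lambert_w_minus1(-math.exp(-1.0 / x) / x)
    return -(1.0 + x * w) / x


for x in (1.5, 2.0, 5.0, 50.0):
    S = s_star(x)
    residual = math.expm1(S) / S - x   # vanishes if S solves Eq. (60) as an equality
    print(x, S, 2.0 * math.log(x), residual)
```

For each sampled $x > 1$ the printed threshold lies below $2\ln x$, consistent with treating the constants $c_1$ and $c_2$ of Theorem 1 as $O(1)$ quantities.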

[1] Tadashi Kadowaki and Hidetoshi Nishimori, “Quantum annealing in the transverse Ising model,” Phys. Rev. E 58, 5355 (1998).
[2] Arnab Das and Bikas K. Chakrabarti, “Colloquium: Quantum annealing and analog quantum computation,” Rev. Mod. Phys. 80, 1061–1081 (2008).
[3] Tameem Albash and Daniel A. Lidar, “Adiabatic quantum computation,” Reviews of Modern Physics 90, 015002 (2018).
[4] Philipp Hauke, Helmut G. Katzgraber, Wolfgang Lechner, Hidetoshi Nishimori, and William D. Oliver, “Perspectives of quantum annealing: Methods and implementations,” Reports on Progress in Physics (2020).
[5] Troels F. Rønnow, Zhihui Wang, Joshua Job, Sergio Boixo, Sergei V. Isakov, David Wecker, John M. Martinis, Daniel A. Lidar, and Matthias Troyer, “Defining and detecting quantum speedup,” Science 345, 420–424 (2014).
[6] Vasil S. Denchev, Sergio Boixo, Sergei V. Isakov, Nan Ding, Ryan Babbush, Vadim Smelyanskiy, John Martinis, and Hartmut Neven, “What is the computational value of finite-range tunneling?” Phys. Rev. X 6, 031015 (2016).
[7] Andrew D. King, Juan Carrasquilla, Jack Raymond, Isil Ozfidan, Evgeny Andriyash, Andrew Berkley, Mauricio Reis, Trevor Lanting, Richard Harris, Fabio Altomare, Kelly Boothby, Paul I. Bunyk, Colin Enderud, Alexandre Fréchette, Emile Hoskinson, Nicolas Ladizinsky, Travis Oh, Gabriel Poulin-Lamarre, Christopher Rich, Yuki Sato, Anatoly Yu. Smirnov, Loren J. Swenson, Mark H. Volkmann, Jed Whittaker, Jason Yao, Eric Ladizinsky, Mark W. Johnson, Jeremy Hilton, and Mohammad H. Amin, “Observation of topological phenomena in a programmable lattice of 1,800 qubits,” Nature 560, 456–460 (2018).
[8] R. Harris, Y. Sato, A. J. Berkley, M. Reis, F. Altomare, M. H. Amin, K. Boothby, P. Bunyk, C. Deng, C. Enderud, S. Huang, E. Hoskinson, M. W. Johnson, E. Ladizinsky, N. Ladizinsky, T. Lanting, R. Li, T. Medina, R. Molavi, R. Neufeld, T. Oh, I. Pavlov, I. Perminov, G. Poulin-Lamarre, C. Rich, A. Smirnov, L. Swenson, N. Tsai, M. Volkmann, J. Whittaker, and J. Yao, “Phase transitions in a programmable quantum spin glass simulator,” Science 361, 162 (2018).
[9] Tameem Albash and Daniel A. Lidar, “Demonstration of a scaling advantage for a quantum annealer over simulated annealing,” Physical Review X 8, 031016 (2018).
[10] Salvatore Mandrà and Helmut G. Katzgraber, “A deceptive step towards quantum speedup detection,” Quantum Sci. Technol. 3, 04LT01 (2018).
[11] Alex Mott, Joshua Job, Jean-Roch Vlimant, Daniel Lidar, and Maria Spiropulu, “Solving a Higgs optimization problem with quantum annealing for machine learning,” Nature 550, 375 (2017).
[12] Kristen L. Pudenz, Tameem Albash, and Daniel A. Lidar, “Error-corrected quantum annealing with hundreds of qubits,” Nat. Commun. 5, 3243 (2014).
[13] Walter Vinci, Tameem Albash, and Daniel A. Lidar, “Nested quantum annealing correction,” npj Quant. Inf. 2, 16017 (2016).
[14] Walter Vinci and Daniel A. Lidar, “Scalable effective-temperature reduction for quantum annealers via nested quantum annealing correction,” Physical Review A 97, 022308 (2018).
[15] Adam Pearson, Anurag Mishra, Itay Hen, and Daniel A. Lidar, “Analog errors in quantum annealing: doom and hope,” npj Quantum Information 5, 107 (2019).
[16] Trevor Lanting, Andrew D. King, Bram Evert, and Emile Hoskinson, “Experimental demonstration of perturbative anticrossing mitigation using nonuniform driver Hamiltonians,” Physical Review A 96, 042322 (2017).
[17] Juan I. Adame and Peter L. McMahon, “Inhomogeneous driving in quantum annealers can result in orders-of-magnitude improvements in performance,” Quantum Science and Technology 5, 035011 (2020).

[18] Ting-Jui Hsu, “Quantum annealing with anneal path control: Application to 2-SAT problems with known energy landscapes,” Communications in Computational Physics 26, 928–946 (2019).
[19] Sheir Yarkoni, Hao Wang, Aske Plaat, and Thomas Bäck, “Boosting quantum annealing performance using evolution strategies for annealing offsets tuning,” in Quantum Technology and Optimization Problems, edited by Sebastian Feld and Claudia Linnhoff-Popien (Springer International Publishing, Cham, 2019) pp. 157–168.
[20] Jérémie Roland and Nicolas J. Cerf, “Quantum search by local adiabatic evolution,” Phys. Rev. A 65, 042308 (2002).
[21] Jeffrey Marshall, Davide Venturelli, Itay Hen, and Eleanor G. Rieffel, “Power of Pausing: Advancing Understanding of Thermalization in Experimental Quantum Annealers,” Physical Review Applied 11, 044083 (2019).
[22] D-Wave Systems Inc., “The D-Wave 2000Q Quantum Computer Technology Overview,” (2018).
[23] Davide Venturelli and Alexei Kondratyev, “Reverse quantum annealing approach to portfolio optimization problems,” Quantum Machine Intelligence 1, 17–30 (2019).
[24] Walter Vinci, Lorenzo Buffoni, Hossein Sadeghi, Amir Khoshaman, Evgeny Andriyash, and Mohammad H. Amin, “A path towards quantum advantage in training deep generative models with quantum annealers,” (2019), arXiv:1912.02119 [quant-ph].
[25] G. Passarelli, V. Cataudella, and P. Lucignano, “Improving the quantum annealing of the ferromagnetic p-spin model through pausing,” arXiv:1902.06788 (2019).
[26] Gianluca Passarelli, Ka-Wa Yip, Daniel A. Lidar, Hidetoshi Nishimori, and Procolo Lucignano, “Reverse quantum annealing of the p-spin model with relaxation,” Physical Review A 101, 022331 (2020).
[27] H.-P. Breuer and F. Petruccione, The Theory of Open Quantum Systems (Oxford University Press, Oxford, 2002).
[28] Robert Alicki and K. Lendi, Quantum Dynamical Semigroups and Applications (Springer Science & Business Media, 2007).
[29] Daniel A. Lidar, “Lecture notes on the theory of open quantum systems,” arXiv:1902.00967 (2019).
[30] Tameem Albash, Sergio Boixo, Daniel A. Lidar, and Paolo Zanardi, “Quantum adiabatic Markovian master equations,” New J. of Phys. 14, 123016 (2012).
[31] Tameem Albash and Daniel A. Lidar, “Decoherence in adiabatic quantum computation,” Physical Review A 91, 062320 (2015).
[32] Anatoly Yu. Smirnov and Mohammad H. Amin, “Theory of open quantum dynamics with hybrid noise,” New Journal of Physics 20, 103037 (2018).
[33] Roie Dann, Amikam Levy, and Ronnie Kosloff, “Time-dependent Markovian quantum master equation,” Physical Review A 98, 052129 (2018).
[34] Evgeny Mozgunov and Daniel Lidar, “Completely positive master equation for arbitrary driving and small level spacing,” Quantum 4, 227 (2020).
[35] Frederik Nathan and Mark S. Rudner, “Universal Lindblad equation for open quantum systems,” (2020), arXiv:2004.01469 [cond-mat.mes-hall].
[36] Humberto Munoz-Bauza, Huo Chen, and Daniel Lidar, “A double-slit proposal for quantum annealing,” npj Quantum Information 5, 51 (2019).
[37] Walter Vinci and Daniel A. Lidar, “Non-stoquastic Hamiltonians in quantum annealing via geometric phases,” npj Quantum Information 3, 38 (2017).
[38] A. T. Rezakhani, D. F. Abasto, D. A. Lidar, and P. Zanardi, “Intrinsic geometry of quantum adiabatic evolution and quantum phase transitions,” Physical Review A 82, 012321 (2010).
[39] Sabine Jansen, Mary-Beth Ruskai, and Ruedi Seiler, “Bounds for the adiabatic approximation with applications to quantum computation,” J. Math. Phys. 48, 102111 (2007).
[40] Walter Vinci and Daniel A. Lidar, “Non-stoquastic Hamiltonians in quantum annealing via geometric phases,” npj Quant. Inf. 3, 38 (2017).
[41] Sergey Bravyi, David P. DiVincenzo, and Daniel Loss, “Schrieffer–Wolff transformation for quantum many-body systems,” Annals of Physics 326, 2793–2826 (2011).
[42] Gioele Consani and Paul A. Warburton, “Effective Hamiltonians for interacting superconducting qubits: local basis reduction and the Schrieffer–Wolff transformation,” New J. of Phys. 22, 053040 (2020).
[43] Daniel A. Lidar, Ali T. Rezakhani, and Alioscia Hamma, “Adiabatic approximation with exponential accuracy for many-body systems and quantum computation,” Journal of Mathematical Physics 50, 102106 (2009).
[44] N. G. Dickson, M. W. Johnson, M. H. Amin, R. Harris, F. Altomare, A. J. Berkley, P. Bunyk, J. Cai, E. M. Chapple, P. Chavez, F. Cioata, T. Cirip, P. deBuen, M. Drew-Brook, C. Enderud, S. Gildert, F. Hamze, J. P. Hilton, E. Hoskinson, K. Karimi, E. Ladizinsky, N. Ladizinsky, T. Lanting, T. Mahon, R. Neufeld, T. Oh, I. Perminov, C. Petroff, A. Przybysz, C. Rich, P. Spear, A. Tcaciuc, M. C. Thom, E. Tolkacheva, S. Uchaikin, J. Wang, A. B. Wilson, Z. Merali, and G. Rose, “Thermally assisted quantum annealing of a 16-qubit problem,” Nature Communications 4, 1903 (2013).
[45] Fei Yan, Simon Gustavsson, Archana Kamal, Jeffrey Birenbaum, Adam P. Sears, David Hover, Ted J. Gudmundsen, Danna Rosenberg, Gabriel Samach, S. Weber, Jonilyn L. Yoder, Terry P. Orlando, John Clarke, Andrew J. Kerman, and William D. Oliver, “The flux qubit revisited to enhance coherence and reproducibility,” Nature Communications 7, 12964 (2016).
[46] C. M. Quintana, Yu Chen, D. Sank, A. G. Petukhov, T. C. White, Dvir Kafri, B. Chiaro, A. Megrant, R. Barends, B. Campbell, Z. Chen, A. Dunsworth, A. G. Fowler, R. Graff, E. Jeffrey, J. Kelly, E. Lucero, J. Y. Mutus, M. Neeley, C. Neill, P. J. J. O’Malley, P. Roushan, A. Shabani, V. N. Smelyanskiy, A. Vainsencher, J. Wenner, H. Neven, and John M. Martinis, “Observation of classical-quantum crossover of 1/f flux noise and its paramagnetic temperature dependence,” Physical Review Letters 118, 057702 (2017).
[47] S. Novikov, R. Hinkey, S. Disseler, J. I. Basham, T. Albash, A. Risinger, D. Ferguson, D. A. Lidar, and K. M. Zick, “Exploring more-coherent quantum annealing,” in 2018 IEEE International Conference on Rebooting Computing (ICRC) (2018) pp. 1–7.
[48] Mostafa Khezri, Jeffrey A. Grover, James I. Basham, Steven M. Disseler, Huo Chen, Sergey Novikov, Kenneth M. Zick, and Daniel A. Lidar, “Anneal-path correction in flux qubits,” (2020), arXiv:2002.11217 [quant-ph].

[49] William D. Oliver, Yang Yu, Janice C. Lee, Karl K. Berggren, Leonid S. Levitov, and Terry P. Orlando, “Mach-Zehnder interferometry in a strongly driven superconducting qubit,” Science 310, 1653 (2005).
[50] Eric W. Weisstein, “Lambert W-Function.” From MathWorld–A Wolfram Web Resource.
[51] Boris Altshuler, Hari Krovi, and Jérémie Roland, “Anderson localization makes adiabatic quantum optimization fail,” Proceedings of the National Academy of Sciences 107, 12446–12450 (2010).
[52] Zoe Gonzalez Izquierdo, Shon Grabbe, Stuart Hadfield, Jeffrey Marshall, Zhihui Wang, and Eleanor Rieffel, “Ferromagnetically shifting the power of pausing,” arXiv:2006.08526 (2020).
[53] Seung Woo Shin, Graeme Smith, John A. Smolin, and Umesh Vazirani, “How “quantum” is the D-Wave machine?” arXiv:1401.7087 (2014).
[54] Jeffrey Marshall and Tameem Albash, (2020), private communication.
[55] J. Bezanson, A. Edelman, S. Karpinski, and V. Shah, “Julia: A Fresh Approach to Numerical Computing,” SIAM Review 59, 65–98 (2017).
[56] Christopher Rackauckas and Qing Nie, “DifferentialEquations.jl – A Performant and Feature-Rich Ecosystem for Solving Differential Equations in Julia,” Journal of Open Research Software 5 (2017), 10.5334/jors.151.
[57] Alan L. Andrew and Roger C. E. Tan, “Computation of Derivatives of Repeated Eigenvalues and the Corresponding Eigenvectors of Symmetric Matrix Pencils,” SIAM Journal on Matrix Analysis and Applications 20, 78–100 (1998).