CS682 Project Report: Hallgren’s Efficient Quantum Algorithm for Solving Pell’s Equation

by Ashish Dwivedi (17111261) under the guidance of Prof Rajat Mittal

November 15, 2017

Abstract

In this project we study an excellent result in the quantum computation model. Hallgren in 2002 [Hal07] showed that a seemingly difficult problem in the classical computation model, solving Pell’s equation, is efficiently solvable in the quantum computation model. This project explains the idea of Hallgren’s quantum algorithm for solving Pell’s equation.

Contents

0.1 Introduction
0.2 Background
0.3 Hallgren’s periodic function
0.4 Quantum Algorithm to find irrational period on R
0.4.1 Discretization of the periodic function
0.4.2 Quantum Algorithm
0.4.3 Classical Post Processing
0.5 Summary

0.1 Introduction

In this project we study one of those problems that have no known efficient solution in the classical computation model and are believed to be hard, but for which efficient quantum algorithms have been found in recent years. One such problem is finding solutions to Pell’s equation.

Pell’s equations are equations of the form $x^2 - dy^2 = 1$, where $d$ is a given non-square positive integer and $x$ and $y$ are indeterminates. Clearly, $(1, 0)$ is a solution; it is called the trivial solution. We want non-trivial integer solutions of these equations.

No efficient algorithm is known for solving such equations in the classical computation model. But in 2002, Sean Hallgren [Hal07] gave an algorithm in the quantum computation model that runs in time polynomial in the input size ($\log d$).

Our main reference is Hallgren’s paper "Polynomial-Time Quantum Algorithms for Pell’s Equation and the Principal Ideal Problem". We also refer to an excellent article by R. Jozsa [Joz03], which contains all the necessary background for Hallgren’s paper, and to a survey article by H. W. Lenstra [LJ02].
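To make the problem statement concrete, the following is a minimal classical sketch that finds the smallest non-trivial solution using the textbook continued fraction expansion of $\sqrt{d}$. This is not Hallgren's algorithm, and its running time is not polynomial in $\log d$ in general; the function name `fundamental_solution` and the sample values of `d` are chosen here purely for illustration.

```python
from math import isqrt, log, sqrt

def fundamental_solution(d):
    """Smallest positive (x, y) with x*x - d*y*y == 1, via the continued fraction of sqrt(d)."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a = 0, 1, a0            # state of the continued fraction recurrence
    x_prev, x = 1, a0             # numerators of successive convergents
    y_prev, y = 0, 1              # denominators of successive convergents
    while x * x - d * y * y != 1:
        m = a * q - m
        q = (d - m * m) // q
        a = (a0 + m) // q
        x_prev, x = x, a * x + x_prev
        y_prev, y = y, a * y + y_prev
    return x, y

for d in (2, 7, 61):
    x, y = fundamental_solution(d)
    # log(x + y*sqrt(d)) measures the size of the solution on a logarithmic scale
    print(d, (x, y), log(x + y * sqrt(d)))
```

Even for the small input d = 61 the smallest solution already has ten decimal digits, which hints at why writing down the solution itself can be too expensive.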

0.2 Background

Given a square-free positive integer $d$, the equation $x^2 - dy^2 = 1$ is known as Pell’s equation. We look for its non-trivial positive integer solutions. The name has nothing to do with John Pell (1611-1685): Euler (1707-1783) mistakenly attributed to Pell the work of Brouncker (1620-1684). The problem has a history of about 2000 years (Greece and India). Brahmagupta (7th century AD) gave a composition rule for solutions, and the "Chakravala method" for solving Pell’s equation is usually credited to Bhaskara II (12th century). At the time of Pythagoras these equations were used to approximate the square root of 2. Later Lagrange (1736-1813) proved that this equation has a non-trivial solution for every such $d$, and in fact infinitely many. He also showed that the smallest solution $x_1 + y_1\sqrt{d}$ alone generates all other solutions of Pell’s equation.

Since the smallest solution generates all other solutions, we only need to find the smallest solution. The problem is that the fundamental solutions for different $d$ are distributed very unevenly and can be exponentially large. In fact, Lenstra [LJ02] mentions in his survey an upper bound of $O(e^{\sqrt{d}})$ on the magnitude of the smallest solution of Pell’s equation. So even writing down the smallest solution can take $O(\sqrt{d})$ bits, which is exponential in the input size $\log d$.

So we introduce a quantity called the regulator, defined as $R := \ln(x_1 + y_1\sqrt{d})$. Now our task is to find the irrational number $R$ to a sufficient amount of precision. The survey [LJ02] mentions that the best known classical algorithm to find $R$ to $n$ decimal places takes $O(e^{\sqrt{\log d}}\,\mathrm{poly}(n))$ time assuming GRH. The idea of Hallgren’s algorithm has two parts:

• Classical part: set up a periodic function $h(x)$, $x \in \mathbb{R}$, whose period is the regulator $R$. The function $h$ should be efficiently computable, i.e. for $x$ accurate to $n$ decimal digits, computing $h(x)$ should take $\mathrm{poly}(\log x, \log d, n)$ time. This part is entirely classical and reduces solving Pell’s equation to finding the (irrational) period of an efficiently computable function.

• Quantum part: generalise the standard period finding algorithm to find irrational periods.

To define such a periodic function we need some background in algebraic number theory. Since it is not possible to include everything here, we refer the reader to the article by Jozsa [Joz03], which contains sufficient background. We give informal definitions of the algebraic objects as we use them, and state the properties required in the construction of the periodic function $h$.

The quadratic number field is the set $\mathbb{Q}[\sqrt{d}] = \{a + b\sqrt{d} \mid a, b \in \mathbb{Q}\}$. The ring of polynomials $\mathbb{Z}[x]$ is the set of all polynomials with integer coefficients. Algebraic integers $\alpha \in \mathbb{Q}[\sqrt{d}]$ are roots of monic polynomials $f \in \mathbb{Z}[x]$. $O$ is the set of all algebraic integers of $\mathbb{Q}[\sqrt{d}]$. Units of $O$ are the elements of $O$ that have multiplicative inverses in $O$. An element $x + y\sqrt{d}$ of $O$ is a unit iff $x^2 - dy^2 = \pm 1$. The smallest unit $\epsilon_0 > 1$ is called the fundamental unit. It is a property of the units that every unit $\epsilon$ is, up to sign, a power of the fundamental unit: $\epsilon = \pm\epsilon_0^k$. Hence, to solve Pell’s equation we only need to find $\epsilon_0$. Take $R = \ln(\epsilon_0)$.

Ideals of $O$ are sets $I \subset \mathbb{Q}[\sqrt{d}]$ such that $I \cdot O = I$. An integral ideal is an ideal contained in $O$; a fractional ideal is an ideal contained in $\mathbb{Q}[\sqrt{d}]$. An ideal $I$ is called a principal ideal if $I = \gamma O$ for some $\gamma$. Two principal ideals $\alpha O$ and $\gamma O$ satisfy $\alpha O = \gamma O$ iff $\gamma = \alpha\epsilon$ for a unit $\epsilon$. This follows from the fact that $\epsilon O = O$. We can use this property of principal ideals to define a periodic function whose period is $R$. Let $P$ be the set of all principal ideals. Define a periodic function $h : \mathbb{R} \to P$ as

$$h(x) = e^x O$$

Here the principal ideals $\gamma O$ are being considered as a function of $x = \ln(\gamma)$. Clearly the period of this function is $R = \ln(\epsilon_0)$, since $e^{x+R} O = e^x \epsilon_0 O = e^x O$.
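The statement that powers of one unit generate further solutions, spaced evenly on a logarithmic scale, can be checked numerically. The sketch below is only an illustration with exact integer arithmetic for $d = 2$; the helper name `unit_powers` is hypothetical, and the starting unit $3 + 2\sqrt{2}$ is the smallest solution of $x^2 - 2y^2 = 1$.

```python
from math import isclose, log, sqrt

def unit_powers(x1, y1, d, count):
    """Return the first `count` powers (x_k, y_k) of x1 + y1*sqrt(d), for k = 1..count."""
    powers = []
    x, y = x1, y1
    for _ in range(count):
        powers.append((x, y))
        # (x + y*sqrt(d)) * (x1 + y1*sqrt(d)), expanded with exact integers
        x, y = x * x1 + d * y * y1, x * y1 + y * x1
    return powers

d, x1, y1 = 2, 3, 2                      # 3 + 2*sqrt(2) has norm 3*3 - 2*2*2 = 1
R = log(x1 + y1 * sqrt(d))
for k, (x, y) in enumerate(unit_powers(x1, y1, d, 5), start=1):
    assert x * x - d * y * y == 1                    # every power solves Pell's equation
    assert isclose(log(x + y * sqrt(d)), k * R)      # the k-th power sits at k*R on the log scale
```

The second assertion is the reason the logarithm of the smallest solution behaves like a period.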

But we are not done. We want our function $h$ to be efficiently computable, and the way $h$ has been defined has some problems. We will need to check whether two ideals are the same, and the arithmetic needed to check $e^x O = e^{x+R} O$ would require exponential time. Also, there are infinitely many distinct ideals $\alpha O$, so identifying the ideal $\alpha O$ may need the full precision of $\alpha$. We need an effective notion of checking (approximate) equality of two ideals.

To circumvent these difficulties we use the concept of reduced ideals. Again we do not give the complicated definition of reduced ideals. We just state some properties of these ideals and some algebraic operations defined on them. Assuming these, we will be able to understand how the function $h$ is defined and why it is efficiently computable and periodic with period $R$.

Reduced ideals are special principal ideals. The set of all reduced ideals $RI = \{J_0, J_1, \dots, J_{k_0-1}\}$ of $O$ is a finite set whose size is exponential in $\log d$. Each reduced ideal is completely determined by a pair of integers $0 \le a, b < \sqrt{D}$, where $D$ is the discriminant of $\mathbb{Q}[\sqrt{d}]$. Hence reduced ideals have descriptions of size $\mathrm{poly}(\log d)$, which is great. The set $RI$ forms a principal cycle of length $R$, the regulator.

There is a distance function $\delta$ defined for reduced ideals. If $I = \alpha O$ then $\delta(I) = \ln|\alpha| \bmod R$. $\delta(I)$ is the distance of $I$ from $O$, and $O$ itself has distance 0. $\delta$ is also defined for two ideals $J_i = \alpha_i O$ and $J_k = \alpha_k O$ as $\delta(J_i, J_k) = \ln(|\alpha_i| / |\alpha_k|) \bmod R$.

For a reduced ideal $I = \mathbb{Z} + \gamma\mathbb{Z}$, define $\rho(I) = \frac{1}{\gamma} I$. $\rho$ acts as a movement operator on reduced ideals. $\rho(I)$ is again principal for a principal ideal $I$. If $J$ is reduced then $\rho(J)$ is the next reduced ideal in the principal cycle. Applying $\rho$ repeatedly to an ideal $I$ gives a reduced ideal $I'$ in a polynomial number of steps. Similar to $\rho$, an operator $\sigma$ is defined that is used to step backwards: $\sigma(I)$ conjugates $I$, i.e. it takes $\sqrt{D} \to -\sqrt{D}$. It turns out that $\rho^{-1} = \sigma\rho\sigma$. So using $\rho$ and $\sigma$ we can move back and forth among reduced ideals.
The distance function has more properties. For $I_1 = \gamma I_2$ it is defined as $\delta(I_1, I_2) = \ln(\gamma) \bmod R$. The distance of $e^x O$ from $O$ is $\delta(e^x O) = x \bmod R$. This allows us to order the ideals along the number line. $\delta(I, \rho(I))$ is efficiently computable. The spacing between $J_i$ and $J_{i+1}$ has lower bound $3/(32D)$ and upper bound $\frac{1}{2}\ln D$. The distance between $J_i$ and $J_{i+2}$ is $\ge \ln 2$. For a non-reduced $I$, the reduced ideal $I_{red}$ obtained from it satisfies $|\delta(I, I_{red})| \le \ln D$, and $I_{red} \in \{J_{k-1}, J_k, J_{k+1}\}$.

We also have a jump operator to cover an exponential distance between two ideals in polynomial time. The jump operator is $*$. Jumping is defined using the product of two ideals, for which $\delta(I_1 \cdot I_2) = \delta(I_1) + \delta(I_2)$, without mod $R$.
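A classical analogue of the principal cycle can be computed for small $d$ from the continued fraction expansion of $\sqrt{d}$. The sketch below is an illustration under simplifying assumptions, not the construction used in [Hal07] or [Joz03]: it works in the order $\mathbb{Z}[\sqrt{d}]$ rather than in $O$, it treats each complete quotient $(P + \sqrt{d})/Q$ as one point of the cycle, and it uses $\ln((P + \sqrt{d})/Q)$ as a stand-in for the step distance $\delta(J_i, J_{i+1})$. The name `cycle_steps` is hypothetical. The steps of one period sum to the logarithm of the smallest unit greater than 1 of $\mathbb{Z}[\sqrt{d}]$ (of norm $\pm 1$).

```python
from math import isclose, isqrt, log, sqrt

def cycle_steps(d):
    """One period of complete quotients (P + sqrt(d)) / Q of sqrt(d), with log step lengths."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    P, Q, a = 0, 1, a0
    steps = []
    while True:
        P = a * Q - P
        Q = (d - P * P) // Q
        a = (a0 + P) // Q
        steps.append((P, Q, log((P + sqrt(d)) / Q)))
        if Q == 1:               # complete quotient a0 + sqrt(d): the cycle closes
            return steps

steps = cycle_steps(7)
length = sum(step for _, _, step in steps)
# The cycle for d = 7 has four steps and total length log(8 + 3*sqrt(7)).
assert isclose(length, log(8 + 3 * sqrt(7)))
```

Each step is described by two small integers `P` and `Q`, which mirrors the remark that a reduced ideal has a short description even though the cycle can be very long.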

0.3 Hallgren’s periodic function

Now we define our periodic function $h$ whose period is the regulator $R$. We define $h : \mathbb{R} \to RI \times \mathbb{R}$ as

$$h(x) = (I_x,\ \hat{x} - \delta(I_x))$$

where $I_x$ is the nearest reduced principal ideal to the left of $x$ and $\hat{x} = x \bmod R$, so $0 \le \hat{x} < R$. Since $I_x$ is also a principal ideal, by our previous discussion $h(x)$ is periodic with period $R$. We can see that $h(x)$ is one-to-one within one period. In fact we introduced $\hat{x} - \delta(I_x)$ in the definition of $h$ precisely to make it one-to-one within one period.

Now we can see that our function $h$ is efficiently computable, given the way reduced ideals are defined. Since $h(x)$ outputs $I_x$, we need to find $I_x$ for a given $x$. We can’t use $\rho$ to move one step at a time among reduced ideals to reach $I_x$ because, though finite, there are exponentially many of them. So we start from $O$ and use the jump operator repeatedly to move rapidly and cover an exponential distance in polynomial time. We terminate at the $N$th iteration, when $\delta(I_{N+1})$ first exceeds $x$. Now the ideal $I_N$ may not be reduced (the only problem with the jump operator). But we know that after applying $\rho$ a constant number of times to $I_N$ we get back to some reduced ideal, and applying $\rho$ further keeps us in the principal cycle. Using these ideas and repeated squaring we ultimately reach $I_x$. While doing all this we keep track of the distance from $O$, using the bounds on the distance, so we are able to calculate $\hat{x} - \delta(I_x)$ to the desired accuracy.
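The shape of $h$ can be illustrated with a toy version for small $d$. The sketch below rests on the same simplifying assumptions as any continued-fraction model of the cycle: it works in $\mathbb{Z}[\sqrt{d}]$, represents cycle points by their cumulative log distances, and locates $I_x$ by a plain search over one period instead of the jump operator, so it is exponentially slower than Hallgren's procedure. The names `cycle_positions` and `toy_h` are hypothetical.

```python
from bisect import bisect_right
from math import isclose, isqrt, log, sqrt

def cycle_positions(d):
    """Positions 0 = pos[0] < pos[1] < ... < pos[n] = R of the toy cycle points."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    P, Q, a = 0, 1, a0
    pos = [0.0]
    while True:
        P = a * Q - P
        Q = (d - P * P) // Q
        a = (a0 + P) // Q
        pos.append(pos[-1] + log((P + sqrt(d)) / Q))
        if Q == 1:               # the cycle closes; pos[-1] is the toy period R
            return pos

def toy_h(x, pos):
    """Return (index of the nearest cycle point at or left of x mod R, offset from it)."""
    R = pos[-1]
    x_hat = x % R
    i = min(bisect_right(pos, x_hat) - 1, len(pos) - 2)
    return i, x_hat - pos[i]

pos = cycle_positions(7)
R = pos[-1]
i0, off0 = toy_h(1.1, pos)
i1, off1 = toy_h(1.1 + 3 * R, pos)
# Shifting x by a multiple of R leaves the output unchanged (up to rounding error).
assert i0 == i1 and isclose(off0, off1, abs_tol=1e-9)
```

The second component is what separates two inputs that fall between the same pair of cycle points, which is why it makes the function one-to-one within a period.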

0.4 Quantum Algorithm to find irrational period on R

In this section we see how Shor’s idea for finding the period of functions over $\mathbb{Z}$ extends to finding an irrational period of functions over $\mathbb{R}$.

0.4.1 Discretization of the periodic function

To work with a function $f : \mathbb{R} \to \mathbb{R}$ over the reals we first need to discretize it so that it is defined on $\mathbb{Z}$ or $\mathbb{Q}$ while still carrying some information about the period of $f$. One way to discretize is to form a new function $\hat{f} : \mathbb{Z} \to \frac{1}{N}\mathbb{Z}$ for a suitably chosen $N$ such that

$$\hat{f}(k) = \lfloor f(k/N) \rfloor_N$$

where $\lfloor x \rfloor_N$ denotes the value $x$ rounded down to an integer multiple of $1/N$. We want our function $\hat{f}$ to contain at least some approximate information about the period of $f$. But this is not guaranteed, because $f$ may vary arbitrarily within an interval of length $1/N$, so evaluating $\hat{f}$ this way may not give any idea about the period of $f$. For this, we introduce the notion of "weak periodicity".

Definition 0.1. A function $f : \mathbb{Z} \to X$ is called "weakly periodic" or "pseudo-periodic" at offset $k$ with period $S \in \mathbb{R}$ if for all $l \in \mathbb{Z}$ either $f(k) = f(k + \lfloor lS \rfloor)$ or $f(k) = f(k + \lceil lS \rceil)$. We write $f(k + [lS])$ to denote either of the two possible options.
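Definition 0.1 can be tested on a toy function. The sketch below assumes the irrational period $S = \sqrt{200}$ and the function $f(k) = \lfloor k \bmod S \rfloor$, neither of which comes from the source; exact integer square roots are used so that the floors and ceilings are not affected by floating point rounding. The helper names are hypothetical.

```python
from math import isqrt

D = 200                                   # toy period S = sqrt(200) ~ 14.142 (irrational)

def floor_mul(l):
    """floor(l * sqrt(D)) for an integer l >= 0, computed exactly."""
    return isqrt(l * l * D)

def ceil_mul(l):
    """ceil(l * sqrt(D)) for an integer l >= 0 (l * sqrt(D) is irrational when l > 0)."""
    return floor_mul(l) + 1 if l > 0 else 0

def f(k):
    """Toy function f(k) = floor(k mod S) for an integer k >= 0."""
    m = isqrt(k * k // D)                 # m = floor(k / S)
    return k - ceil_mul(m)                # floor(k - m * S)

def pseudo_periodic_at(k, max_l):
    """Check the condition of Definition 0.1 at offset k for l = 0..max_l."""
    return all(
        f(k) in (f(k + floor_mul(l)), f(k + ceil_mul(l)))
        for l in range(max_l + 1)
    )

good = [k for k in range(15) if pseudo_periodic_at(k, 50)]
print(good)                               # offsets at which the toy f is pseudo-periodic
```

In this toy example the offsets 0 to 13 satisfy the definition, while offset 14 (the last integer below $S$) does not, which is a small-scale picture of a function that is pseudo-periodic at most but not all offsets.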

The idea behind discretization and pseudo-periodicity is that we apply the original period finding algorithm to pseudo-periodic functions, and we will see that it works with good probability. We just need to be a little careful in the calculations and use a different idea in the classical post-processing than the one used in the original period finding algorithm. We discretize Hallgren’s periodic function $h$ as $\hat{h} : \mathbb{Z} \to RI \times \mathbb{R}$:

$$\hat{h}(k) = (I_{k/N},\ \lfloor \widehat{k/N} - \delta(I_{k/N}) \rfloor_N)$$

where $\widehat{k/N} = k/N \bmod R$.

Our new discretized function has the following properties:

• $\hat{h}$ is one-to-one within its approximate period, $0 \le k \le \lfloor NR \rfloor$. This follows easily from the fact that for distinct $k$ and $l$, $|k/N - l/N| \ge 1/N$.

• Since $h$ is efficiently computable (in time $O(\mathrm{poly}(\log d))$), $\hat{h}(k)$ is also efficiently computable, with an overhead of $O(\log N)$ and $O(\log k)$. (We will only pick $N$ and $k$ of size $\mathrm{poly}(d)$.)

• $\hat{h}$ is pseudo-periodic with period $NR$, which is what we were hoping for when discretizing $h$. For this we need to pick $N$ sufficiently large, i.e. $N > \log d / d_{min}$, where $d_{min} = 3/(32D)$ is a lower bound on the distance between two consecutive reduced ideals. Then $1/N$ is sufficiently small to make $\hat{h}$ pseudo-periodic. Note that $\hat{h}$ is not pseudo-periodic at all $k \le \lfloor NR \rfloor$, but only at those $k$ for which $k/N$ is not the nearest multiple of $1/N$ to the left of some reduced ideal. But the bad $k$s form only a small fraction, $1/\log d$, of the total values, and that suffices for our purpose. This property has a simple proof in [Joz03], to which we refer.

0.4.2 Quantum Algorithm

In this section we describe a quantum algorithm to find the period $S$ of a given pseudo-periodic function $f : \mathbb{Z} \to X$. We assume that $f(k)$ is efficiently computable, in time $\mathrm{poly}(\log k, \log S)$, and that $f$ is one-to-one within its period ($0 \le k \le \lfloor S \rfloor$). We also assume that there is an efficient verification procedure for the period of $f$: given an integer $m$, the procedure efficiently checks whether $m$ is within one of some integer multiple of the period. For such a function $f$ we show that our quantum algorithm outputs an integer $a$ within one of the period $S$, in time $\mathrm{poly}(\log S)$ and with probability $\Omega(1/\mathrm{poly}(\log S))$.

Note that if we take $f = \hat{h}$, it is clear from the properties of $\hat{h}$ that $f$ satisfies the first two requirements of the quantum algorithm. We will see in the next section that $f$ also possesses an efficient verification procedure. For now, we concentrate only on the quantum algorithm. The algorithm performs the following steps:

1. Choose an integer $q \ge 3S^2$.

2. Compute the pseudo-periodic function f on a uniform superposition of dimension q.

3. Measure the second register to get another nearly uniform superposition.

4. Apply the quantum Fourier transform over $\mathbb{Z}_q$.

5. Measure the first register to get some value $c$, and repeat all of the above to get another value $d$.

6. Compute the continued fraction expansion of $c/d$.

7. For each convergent $c_n/d_n$, compute $m = \lfloor c_n q / c \rceil$ and use the verification procedure to check whether $m$ is near a multiple of the period.

8. Output the smallest of all $m$ from the last step that pass the verification procedure.

The algorithm is similar to the original period finding algorithm over $\mathbb{Z}$. It differs mainly in steps 6-8, which are part of the classical post-processing and will be described in the next section. In this section we explain how steps 1-5 work for a pseudo-periodic function $f$ and give us specific $c$ and $d$ efficiently and with good probability.

In step 1, since we don’t know the period $S$, how do we pick such a $q$? Instead of $S$ we use $q \ge 3M^2$ for some upper bound $M$ on $S$. We can get this upper bound efficiently by starting at $M = 2$ and doubling $M$ repeatedly until it is large enough for period finding to work. We also take $q$ to be a power of two. We start with the superposition

$$\frac{1}{\sqrt{q}} \sum_{m=0}^{q-1} |m\rangle\, |f(m)\rangle$$

When we measure the second register we get a new superposition consistent with the observed value $f(k)$,

$$|\psi_0\rangle = \frac{1}{\sqrt{p}} \sum_{l=0}^{p-1} |k + [lS]\rangle$$

where $p$ is approximately $q/S$, i.e. $pS$ is the largest multiple of $S$ that is $\le q$. (Since $f$ is pseudo-periodic for a large fraction of all values, we don’t need to worry about the exact $p$ for $k$.) On applying the Fourier transform of dimension $q$ to $|\psi_0\rangle$ we get $\sum_{j=0}^{q-1} a_j |j\rangle$, where, with $\omega = e^{2\pi i/q}$,

$$a_j = \frac{1}{\sqrt{pq}} \sum_{l=0}^{p-1} \omega^{j[lS]}$$

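The Fourier transform removes the offset $k$: it only contributes a global phase, which does not affect the probabilities. We are interested in the probability $|a_j|^2$. We have $[lS] = lS + \delta_l$, where $-1 < \delta_l < 1$, so

$$\sum_{l=0}^{p-1} \omega^{j[lS]} = \sum_{l=0}^{p-1} \omega^{jlS}\, \omega^{j\delta_l}$$

When we take $\delta_l = 0$ the sum becomes

$$\sum_{l=0}^{p-1} \omega^{jlS} = \frac{\omega^{pjS} - 1}{\omega^{jS} - 1} = \omega^{(p-1)jS/2}\, \frac{\sin(\pi j p S/q)}{\sin(\pi j S/q)}$$

hence

$$\left| \frac{1}{\sqrt{pq}} \sum_{l=0}^{p-1} \omega^{jlS} \right|^2 = \frac{1}{pq}\, \frac{\sin^2(\pi j p S/q)}{\sin^2(\pi j S/q)}$$

We are interested in $j = \lfloor kq/S \rceil = kq/S + \epsilon$ where $|\epsilon| \le \frac{1}{2}$. Putting this $j$ in the above formula and using the inequality $x^2 - \frac{1}{3}x^4 \le \sin^2 x \le x^2$ we get

$$\left| \frac{1}{\sqrt{pq}} \sum_{l=0}^{p-1} \omega^{jlS} \right|^2 = \Omega(1/S)$$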

or $|a_j| = \Omega(1/\sqrt{S})$ for $j$ of the form $\lfloor kq/S \rceil$ (in the ideal case $\delta_l = 0$).

In a similar way, if we calculate the deviation in amplitude

$$\left| \sum_{l=0}^{p-1} \omega^{jlS}\, \omega^{j\delta_l} - \sum_{l=0}^{p-1} \omega^{jlS} \right|$$

we get that this quantity is $\le \frac{\pi j p}{2q}$. If we restrict to only those $j < q/\log S$ (we have already assumed $j$ of the form $\lfloor kq/S \rceil$) then we get

$$|a_j| = \left| \frac{1}{\sqrt{pq}} \sum_{l=0}^{p-1} \omega^{j[lS]} \right| \ge \left| \frac{1}{\sqrt{pq}} \sum_{l=0}^{p-1} \omega^{jlS} \right| - \text{amplitude difference} = \Omega(1/\sqrt{S}) - O\!\left(\frac{1}{\sqrt{S}\,\log S}\right) = \Omega(1/\sqrt{S})$$

or $|a_j|^2 = \Omega(1/S)$. Because there are $S/\log S$ such $j$ (satisfying both constraints), we obtain such a $j$ with probability $\Omega(1/S \times S/\log S) = \Omega(1/\log S)$ on measuring the first register. Restricting to $j < q/\log S$ thus costs only a factor of $1/\log S$, which does not affect the probability badly.

The reference for the above amplitude and probability calculation is the lecture notes on quantum algorithms by Andrew Childs (2008 version), which simplify the calculation considerably.

We run steps 1-5 twice and get two such values of $j$, $c = \lfloor kq/S \rceil$ and $d = \lfloor lq/S \rceil$. In the next section we will see how to extract the integer part of $S$ using $c$ and $d$.
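The claim that the outcomes $j = \lfloor kq/S \rceil$ carry probability of order $1/S$ can be checked numerically on a small instance. The sketch below evaluates the amplitude sum directly on a classical computer; the toy period $S = \sqrt{200}$, the choice $q = 1024$, the offset $k = 0$ and the decision to always round $lS$ up are all assumptions made only for this illustration.

```python
import cmath
from math import isqrt, pi, sqrt

D = 200                      # toy period S = sqrt(200) ~ 14.142
S = sqrt(D)
q = 1024                     # power of two with q >= 3 * S**2 = 600
p = int(q / S)               # number of terms in the measured superposition

# Positions [l*S] of the basis states, always rounding up, with offset k = 0.
positions = [isqrt(l * l * D) + 1 if l > 0 else 0 for l in range(p)]

def probability(j):
    """|a_j|^2 for a_j = (1/sqrt(p*q)) * sum over l of omega^(j*[lS]), omega = exp(2*pi*i/q)."""
    total = sum(cmath.rect(1.0, 2 * pi * ((j * pos) % q) / q) for pos in positions)
    return abs(total) ** 2 / (p * q)

peaks = [round(k * q / S) for k in range(1, 6)]      # nearest integers to k*q/S
for j in peaks:
    print(j, probability(j), "compare with 1/S =", 1 / S)
```

Summing `probability(j)` over all `j` in `range(q)` gives 1 up to rounding error, as it must for a normalised state, so the printed values can be read directly as measurement probabilities.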

0.4.3 Classical Post Processing

From steps 1-5 of the algorithm we get two integers $c = \lfloor kq/S \rceil$ and $d = \lfloor lq/S \rceil$ with $(k, l) = 1$. Such $c$ and $d$ for coprime $k$ and $l$ can be found with good probability by running the algorithm twice (with probability $1/\mathrm{poly}(\log S)$, using the prime number theorem). Now the question is: how do we extract $S$ from $c$ and $d$?

A result on continued fraction expansions (see a standard number theory text such as [HW79]) says that the fraction $k/l$ appears as a convergent in the continued fraction expansion of $x$ if $|x - k/l| \le 1/(2l^2)$.

Claim 0.2. If $q \ge 3S^2$ then $k/l$ is a convergent in the continued fraction expansion of $c/d$.

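Proof. We just need to show that $|c/d - k/l| \le 1/(2l^2)$. Writing $c = kq/S + \epsilon_1$ and $d = lq/S + \epsilon_2$ we get

$$\left| \frac{c}{d} - \frac{k}{l} \right| = \left| \frac{kq/S + \epsilon_1}{lq/S + \epsilon_2} - \frac{k}{l} \right| = \left| \frac{S(\epsilon_1 l - \epsilon_2 k)}{l^2 q + \epsilon_2 S l} \right| \le \frac{S}{lq - S/2}$$

using $|\epsilon_1|, |\epsilon_2| \le 1/2$ and taking $k \le l$. If we pick $q \ge 3S^2$ then, since $l \le S$, we get $|c/d - k/l| \le 1/(2l^2)$.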

Now for each convergent $c_n/d_n$ we check whether $c_n = k$ by calculating $m = \lfloor c_n q / c \rceil$. It is easy to see that if $c_n = k$ then the corresponding $m$ is within 1 of the period $S$, since $c \approx kq/S$. For the other $c_n$ we use the verification procedure to check whether the corresponding $m$ is within 1 of some integer multiple of $S$; if it is, we collect such $m$s and return the smallest one. With probability $1/\mathrm{poly}(\log S)$ this smallest $m$ will be within 1 of the period $S$.

As promised, we show that there is a verification procedure for $h$. Given an integer $m$ we want to check whether $|jR - m| < 1$ for some integer $j$. We calculate the reduced ideal $I_m$ to the left of $m$. If $m$ is within 1 of a multiple of $R$, then one of the ideals $I_{m-4}, I_{m-3}, \dots, I_{m+3}, I_{m+4}$ (the reduced ideals up to four steps away from $I_m$ in the cycle) must be $O$: the distance between $J_i$ and $J_{i+2}$ is at least $\ln 2$ and $2\ln 2 > 1$, so four steps along the cycle cover a distance greater than 1. It is straightforward to see that if we have an integer within 1 of the period ($NR$) of $\hat{h}$ we can get an integer within 1 of the period ($R$) of $h$. That solves our problem.
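Steps 6-8 of the algorithm can be sketched classically as follows. This is a toy illustration under stated assumptions: the "unknown" period is taken to be $S = \sqrt{200}$, the samples $c$ and $d$ are generated directly from $k = 2$ and $l = 3$ instead of by a quantum measurement, and the `verify` function cheats by using $S$ itself, standing in for the ideal-based verification procedure described above. The names `convergents` and `recover_period` are hypothetical.

```python
from math import floor, sqrt

def convergents(num, den):
    """Yield the convergents (h, k) of the continued fraction expansion of num/den."""
    h_prev, h = 0, 1
    k_prev, k = 1, 0
    while den:
        a, r = divmod(num, den)
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        yield h, k
        num, den = den, r

def recover_period(c, d, q, verify):
    """Steps 6-8: smallest m = round(c_n * q / c) accepted by `verify`, or None."""
    accepted = []
    for c_n, _ in convergents(c, d):
        m = (2 * c_n * q + c) // (2 * c)      # nearest integer to c_n * q / c
        if verify(m):
            accepted.append(m)
    return min(accepted) if accepted else None

# Toy instance: the period to be recovered is S = sqrt(200) ~ 14.142.
S, q = sqrt(200), 1024

def verify(m):
    """Stand-in check: is m within 1 of a positive integer multiple of S?"""
    j = floor(m / S + 0.5)                    # index of the nearest multiple
    return j >= 1 and abs(m - j * S) < 1

c = floor(2 * q / S + 0.5)                    # sample for k = 2, equals 145
d = floor(3 * q / S + 0.5)                    # sample for l = 3, equals 217
print(recover_period(c, d, q, verify))        # prints 14
```

Here $2/3$ appears among the convergents of $145/217$, and the corresponding candidate $m = 14$ is within 1 of $S$, as Claim 0.2 predicts for $q \ge 3S^2$.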

0.5 Summary

In this report we saw how ideas from algebraic number theory are used to define a periodic function $h$ whose period encodes the solution to Pell’s equation $x^2 - dy^2 = 1$. Then, to find the period of such a function with irrational period $R$, Hallgren used the concepts of discretization and pseudo-periodicity to modify the function $h$. Discretization and pseudo-periodicity allowed him to apply a period finding procedure similar to the one used in Shor’s algorithm. In this way we get the logarithm of the smallest solution to Pell’s equation efficiently in the quantum computation model.

Bibliography

[Hal07] Sean Hallgren. Polynomial-time quantum algorithms for Pell’s equation and the principal ideal problem. Journal of the ACM (JACM), 54(1):4, 2007.

[HW79] Godfrey Harold Hardy and Edward Maitland Wright. An introduction to the theory of numbers. Oxford University Press, 1979.

[Joz03] Richard Jozsa. Notes on Hallgren’s efficient quantum algorithm for solving Pell’s equation. arXiv preprint quant-ph/0302134, 2003.

[LJ02] Hendrik W. Lenstra Jr. Solving the Pell equation. Notices of the AMS, 49(2):182–192, 2002.
