Masaryk University Faculty of Informatics

Baillie-PSW pseudoprimes

Master’s Thesis

Ondřej Krčma

Brno, Spring 2021

This is where a copy of the official signed thesis assignment and a copy of the Statement of an Author is located in the printed version of the document.

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Ondřej Krčma

Advisor: Mgr. Marek Sýs, Ph.D.

Acknowledgements

I would like to thank Mgr. Marek Sýs, Ph.D. for his invaluable advice during my work on this thesis. Computational resources were supplied by the project "e-Infrastruktura CZ" (e-INFRA LM2018140) provided within the program Projects of Large Research, Development and Innovations Infrastructures.

Abstract

The goal of this thesis is to examine and look for a weaker version of the Baillie-PSW pseudoprimes. First we examine the Fermat and the Lucas pseudoprimes and the divisibility properties of the order and the rank of appearance. We present the known theory and complete some missing proofs and algorithms. We also examine the notion of admissibility, which is the main necessary condition for the sought pseudoprimes. In the last chapter, we apply the theory and attempt to find the pseudoprimes in several special cases.

Keywords

primality tests, pseudoprimes, Baillie-PSW, PSW-challenge pseudoprimes, rank of appearance, admissibility

Contents

Introduction

1 The Fermat test
    1.1 Order of a modulo n
    1.2 Fermat pseudoprimes to base 2

2 Lucas sequences
    2.1 Computing Lucas sequences
    2.2 Lucas pseudoprimes
    2.3 Rank of appearance of n
    2.4 Lucas pseudoprimes in the Fibonacci sequence

3 The Baillie-PSW test
    3.1 Challenge pseudoprimes
        3.1.1 Admissibility
        3.1.2 Computing last prime

4 Search for challenge pseudoprimes
    4.1 Small prime factors
        4.1.1 Admissibility from order and rank
        4.1.2 Full version of the algorithm
    4.2 Pseudoprimes of the form $2^{n-1} - 1$
        4.2.1 Primes with rank equal to a ...
        4.2.2 Mersenne primes
    4.3 Two prime factors
        4.3.1 Admissible primes p with e(p) = 1
        4.3.2 Computing the other prime
    4.4 Even number of prime factors
        4.4.1 Case of p ≥ 245
        4.4.2 Case of p < 245
    4.5 Lattice based solutions
        4.5.1 Searching for the pseudoprimes
        4.5.2 Transformation of the problem
        4.5.3 Lattice for Fermat pseudoprimes
        4.5.4 Example for Fermat pseudoprimes
        4.5.5 Lattice for both Fermat and Lucas pseudoprimes

Conclusion

Bibliography

Introduction

Generating large random primes is a key component of cryptosystems such as RSA. However, deciding whether a random large integer (in the order of thousands of bits) is prime or composite is computationally expensive. There are two basic kinds of primality tests: deterministic and probabilistic. Deterministic tests give a definitive answer on whether the input integer is prime, but they are often too slow. Probabilistic tests are generally fast, but some composite integers also pass them. Composite integers which pass some probabilistic test are called pseudoprimes, and the more pseudoprimes there are for a given test, the worse the test is. In this thesis we are looking for pseudoprimes for a weaker version of the Baillie-PSW test, because, so far, no pseudoprimes for this test have been found.

The text of this thesis is split into four chapters. In the first chapter we introduce Fermat pseudoprimes and examine some of their properties, for example the order. In the second chapter we look at Lucas sequences and Lucas pseudoprimes. We briefly discuss how to compute Lucas sequences and then examine Lucas pseudoprimes and their properties, especially the rank of appearance. There are many similarities between the Fermat and the Lucas pseudoprimes, and the lemmas in the second chapter often parallel lemmas from the first chapter. In the third chapter we combine the two tests into the Baillie-PSW test and examine PSW-challenge pseudoprimes, which are composite integers passing the weaker version of the test. We mostly focus on the notion of admissibility, a strong necessary condition which all PSW-challenge pseudoprimes must satisfy. Finally, in the fourth chapter we describe our attempts to find the PSW-challenge pseudoprimes.

The reader is expected to be familiar with basic algebraic concepts such as the order of a group element, modular arithmetic or the Chinese Remainder Theorem. The first two chapters mostly cover the known theory, so a reader who is confident in their knowledge of Fermat and Lucas pseudoprimes may skip directly to the third chapter. However, we present some proofs and algorithms missing from the literature, and so we recommend starting from the beginning.

1 The Fermat test

One of the simplest primality tests is the Fermat test. Fermat’s little theorem gives the following necessary condition which all primes must satisfy.

Theorem 1.1 (Fermat's little theorem). Let $p$ be a prime and $a$ an integer such that $a \not\equiv 0 \pmod{p}$, then

$$a^{p-1} \equiv 1 \pmod{p}.$$

To test whether a given $n$ is prime, we choose $a$ coprime to $n$ (for example $a = 2$) and calculate $a^{n-1} \bmod n$. If the resulting value is not 1, then we are sure that $n$ is not prime. Whereas if the result is 1, then $n$ may or may not be prime. For example $2^{340} \equiv 1 \pmod{341}$, however $341 = 11 \times 31$ is a composite integer. Composite integers which pass a primality test such as the Fermat test are called pseudoprimes. There are many different primality tests, and an integer may be a pseudoprime relative to one test and not a pseudoprime relative to other tests. This gives us different types and definitions of pseudoprimes, the first of which is the Fermat pseudoprime. Note that different authors may have slightly different definitions of the individual types of pseudoprimes. We will follow the terminology from [1].

Definition 1. An odd composite integer $n$ is a (Fermat) pseudoprime to base $a$ if $a^{n-1} \equiv 1 \pmod{n}$.
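The congruence in Definition 1 is cheap to check with modular exponentiation. The following is a minimal sketch in Python (the function name `passes_fermat` is ours, not from the literature); it only evaluates the congruence and does not check that $n$ is composite.

```python
def passes_fermat(n: int, a: int = 2) -> bool:
    """Return True if a^(n-1) ≡ 1 (mod n), i.e. n passes the Fermat test to base a."""
    # Three-argument pow performs modular exponentiation without
    # ever building the huge integer a^(n-1).
    return pow(a, n - 1, n) == 1


print(passes_fermat(341))     # 341 = 11 * 31 passes to base 2
print(passes_fermat(91, 3))   # 91 = 7 * 13 passes to base 3 ...
print(passes_fermat(91, 2))   # ... but is exposed as composite by base 2
```

The last two calls illustrate that an integer may be a pseudoprime to one base and not to another.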

An integer $n$ may be pseudoprime to some bases and not others, so if the test fails for one base, we may try another. Note that the set of bases to which $n$ is pseudoprime forms a subgroup of $\mathbb{Z}_n^*$, so trying multiple bases one after another may lead to diminishing returns. The main problem, however, is that there exist Carmichael numbers, which are integers that are Fermat pseudoprimes to all possible bases. In 1994 Alford, Granville and Pomerance showed that there are infinitely many Carmichael numbers [2]. To get around this problem we may define stronger necessary conditions and stronger tests.

Definition 2. An odd composite integer $n$ is an Euler pseudoprime to base $a$ if $\gcd(a, n) = 1$ and

$$a^{\frac{n-1}{2}} \equiv \left(\frac{a}{n}\right) \pmod{n},$$

where $\left(\frac{a}{n}\right)$ is the Jacobi symbol.

Clearly, an Euler pseudoprime is also a Fermat pseudoprime. Because of the multiplicative property of Jacobi symbols, the bases to which $n$ is an Euler pseudoprime still form a multiplicative group. However, Lehmer showed that there are no integers which are Euler pseudoprimes to all possible bases [3] (i.e. there is no equivalent of Carmichael numbers for Euler pseudoprimes). Since the bases to which $n$ is an Euler pseudoprime form a subgroup of $\mathbb{Z}_n^*$ and $n$ cannot be pseudoprime to all the bases, $n$ is pseudoprime to at most half of the possible bases. The test can be strengthened even further.

Definition 3. An odd composite integer $n = d \cdot 2^s + 1$, where $d$ is odd, is a strong (Fermat) pseudoprime to base $a$ if

$$a^d \equiv 1 \pmod{n}$$

or

$$a^{d \cdot 2^r} \equiv -1 \pmod{n}$$

for some $0 \le r < s$.
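The two conditions of Definition 3 can be checked by computing $a^d$ once and then squaring repeatedly. The sketch below is one straightforward way to do it in Python; the function name is our own and the code tests only the congruences, not compositeness.

```python
def passes_strong_fermat(n: int, a: int = 2) -> bool:
    """Check the strong Fermat conditions of Definition 3 for an odd n > 2."""
    # Write n - 1 = d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:      # a^d ≡ 1, or the case r = 0
        return True
    for _ in range(s - 1):        # the cases r = 1, ..., s - 1
        x = x * x % n
        if x == n - 1:
            return True
    return False


print(passes_strong_fermat(341))    # the Fermat pseudoprime 341 is rejected
print(passes_strong_fermat(2047))   # 2047 = 23 * 89 still passes to base 2
```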

A strong (Fermat) pseudoprime is both an Euler and a Fermat pseudoprime. Since it is an Euler pseudoprime, there is no equivalent of Carmichael numbers, but the bases no longer form a group. Unfortunately, there are still infinitely many strong pseudoprimes to any base [1].

1.1 Order of a modulo n

In the rest of this thesis we will focus mainly on the basic Fermat test (in the second and the third chapters in combination with the Lucas test). In this chapter we present some properties of Fermat pseudoprimes and the order of base a modulo n.

Definition 4. Given an odd positive integer $n$ and a base $a$, the order of $a$ modulo $n$ is the least positive integer $k$ such that

$$a^k \equiv 1 \pmod{n}.$$

We will follow the notation of Pomerance, Selfridge and Wagstaff [1] and denote the order as $l_a(n)$. The reason we do not follow the standard notation $\mathrm{ord}_n(a)$ is that in this context the order is a property of the modulus $n$ and not of the base $a$.

It is well known [4] that if $a^k \equiv 1 \pmod{n}$, then $l_a(n) \mid k$. This means that if $n$ is a Fermat pseudoprime, then $l_a(n) \mid (n - 1)$. The order $l_a(n)$ also divides the group order $\varphi(n)$, where $\varphi$ denotes Euler's totient function. This leads to the following result.

Lemma 1.2. Let $p$ be a prime and $n = pk$ be a Fermat pseudoprime to base $a$, then $l_a(p) \mid p - 1$ and $l_a(p) \mid k - 1$.

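Proof. Since $p$ is a prime, we immediately get $l_a(p) \mid \varphi(p) = p - 1$. Moreover, $a^{n-1} \equiv 1 \pmod{p}$, so $l_a(p) \mid n - 1$. Together with $p \equiv 1 \pmod{l_a(p)}$ this gives the congruence $0 \equiv n - 1 = pk - 1 \equiv k - 1 \pmod{l_a(p)}$.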

The previous lemma formulates a necessary condition for $n$ being a Fermat pseudoprime, and if $n$ has exactly two prime factors, then the condition is also sufficient, as demonstrated by the following lemma.

Lemma 1.3. Let $n = pq$ where $p, q$ are distinct primes and $a$ be a base, then $n$ is a Fermat pseudoprime to base $a$ if and only if

$$l_a(p) \mid q - 1 \quad\text{and}\quad l_a(q) \mid p - 1.$$

Proof. One direction of the proof is given by Lemma 1.2, so let us assume $l_a(p) \mid q - 1$ and $l_a(q) \mid p - 1$. Then

$$n - 1 = (p - 1)(q - 1) + (p - 1) + (q - 1) \equiv 0 \pmod{l_a(p)},$$

which means that

$$a^{n-1} \equiv 1 \pmod{p}.$$

Similarly

$$a^{n-1} \equiv 1 \pmod{q},$$

and by the Chinese Remainder Theorem

$$a^{n-1} \equiv 1 \pmod{pq = n}.$$

In the proof we used the fact that if $p$ is prime, then $l_a(p) \mid p - 1$. Fermat pseudoprimes have the same property, so we can generalize the lemma in the following way.

Lemma 1.4. Let $n = kl$ where $k, l$ are either primes or pseudoprimes to base $a$ and $\gcd(k, l) = 1$. Then $n$ is a Fermat pseudoprime to base $a$ if and only if

$$l_a(k) \mid l - 1 \quad\text{and}\quad l_a(l) \mid k - 1.$$

Proof. The proof is analogous to the proof of Lemma 1.3.

If $l_a(k) \mid l - 1$, then $a^{l-1} \equiv 1 \pmod{k}$ and so $a^{l-1} \ge k + 1$. This inequality is not particularly strong; however, if for example $a = 2$ and $n = pq$ where $p < q$ are primes, we get $p \cdot 2^{p-1} \ge pq + p \ge n$, and so to find all pseudoprimes with two prime factors, one of which is $p$, we only have to check positive integers up to $p \cdot 2^{p-1}$. This method tells us, for example, that there are no Fermat pseudoprimes to base 2 with exactly two prime factors one of which is 3, 5 or 7. For 11 there is exactly one such pseudoprime, 341.

As we have seen, the order $l_a(n)$ gives us some necessary or even sufficient conditions for $n$ being a pseudoprime, so the natural question is: how do we compute $l_a(n)$? In the Handbook of Applied Cryptography [4] the authors give the following algorithm for computing the order of an element in a finite group. Later we will derive a similar algorithm for the rank of appearance in Lucas sequences.

Algorithm 1: Computing order $l_a(n)$ [4]

Input: An odd positive integer $n$ with prime factorization of the order of the group $\varphi(n) = \prod p_i^{\alpha_i}$ and an integer $2 \le a < n$ s.t. $\gcd(a, n) = 1$.

Output: Order of $a$ modulo $n$.

1  $t \leftarrow \varphi(n)$
2  forall $i$ do
3      $t \leftarrow t / p_i^{\alpha_i}$
4      $a_1 \leftarrow a^t \bmod n$
5      while $a_1 \neq 1$ do
6          $a_1 \leftarrow a_1^{p_i} \bmod n$
7          $t \leftarrow t \cdot p_i$
8  return $t$
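Algorithm 1 translates almost line by line into Python. The sketch below is illustrative only: the helper names `factorize` and `euler_phi` are ours, and the trial-division factorization is an assumption that limits it to small inputs (the algorithm itself expects the factorization of $\varphi(n)$ as input).

```python
from math import gcd


def factorize(m: int) -> dict[int, int]:
    """Trial-division factorization {prime: exponent}; suitable for small m only."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def euler_phi(n: int) -> int:
    """Euler's totient function computed from the factorization of n."""
    phi = n
    for p in factorize(n):
        phi -= phi // p
    return phi


def order(a: int, n: int) -> int:
    """Order l_a(n) of a modulo n, following the structure of Algorithm 1."""
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    t = euler_phi(n)
    for p, alpha in factorize(t).items():
        t //= p ** alpha              # strip the full power of p ...
        a1 = pow(a, t, n)
        while a1 != 1:                # ... and put back only what is needed
            a1 = pow(a1, p, n)
            t *= p
    return t


print(order(2, 11), order(2, 31), order(2, 341))
```

With these values the criterion of Lemma 1.3 can be checked directly for $341 = 11 \cdot 31$: the order of 2 modulo 11 divides $31 - 1$ and the order of 2 modulo 31 divides $11 - 1$.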

1.2 Fermat pseudoprimes to base 2

In the rest of this work we will consider Fermat pseudoprimes mostly to base 2, since that is the base in the Baillie-PSW test. The following lemma gives a way of constructing an infinite sequence of Fermat pseudoprimes.

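Lemma 1.5. Let $n$ be a positive integer such that

$$2^{n-1} \equiv 1 \pmod{n},$$

then

$$2^{(2^n - 1) - 1} \equiv 1 \pmod{2^n - 1}.$$

Proof. Let $2^{n-1} \equiv 1 \pmod{n}$, then there exists an integer $k$ such that $2^{n-1} - 1 = kn$. Since $2^n \equiv 1 \pmod{2^n - 1}$, we get

$$2^{(2^n - 1) - 1} = 2^{2(2^{n-1} - 1)} = 2^{2kn} \equiv 1^{2k} \equiv 1 \pmod{2^n - 1}.$$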
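As a quick numerical illustration of Lemma 1.5, the following Python sketch iterates the map $n \mapsto 2^n - 1$ starting from the prime 11 and checks the base-2 Fermat congruence at every step. The starting value and the helper name are our own choices.

```python
def passes_fermat_base2(n: int) -> bool:
    """Return True if 2^(n-1) ≡ 1 (mod n)."""
    return pow(2, n - 1, n) == 1


# Start from a prime and repeatedly apply n -> 2^n - 1.
chain = [11]
for _ in range(2):
    chain.append(2 ** chain[-1] - 1)

# chain[1] = 2047 = 23 * 89 is composite, chain[2] = 2^2047 - 1 has 617 digits.
for n in chain:
    print(n.bit_length(), passes_fermat_base2(n))
```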

In other words, if $n$ is prime or pseudoprime, then $2^n - 1$ is a Mersenne prime or a pseudoprime. For Mersenne primes, however, it holds that if $2^n - 1$ is prime, then $n$ is also prime. This means that if $n$ is a pseudoprime, then $2^n - 1$ is also a pseudoprime.

Lemma 1.6. Let $p$ be an odd prime and $n = p^{\alpha} k$ be a Fermat pseudoprime, then

$$2^{p^{\alpha} - 1} \equiv 1 \pmod{p^{\alpha}}.$$

Proof. Since $p$ is a prime, $l_2(p^{\alpha}) \mid \varphi(p^{\alpha}) = p^{\alpha - 1}(p - 1)$. And because $p^{\alpha} \mid n$, it holds that $l_2(p^{\alpha}) \mid l_2(n) \mid n - 1 = p^{\alpha} k - 1$. This means that $p \nmid l_2(p^{\alpha})$ and therefore

$$l_2(p^{\alpha}) \mid p - 1 \mid p^{\alpha} - 1.$$

A prime $p$ is called a Wieferich prime if $l_2(p^2) \mid p - 1$. The proof of the previous lemma shows that, for a pseudoprime $n$, if $p^{\alpha} \mid n$ with $\alpha \ge 2$, then $p$ has to be a Wieferich prime. The only two known Wieferich primes are 1093 and 3511 and there are no other Wieferich primes up to $10^{17}$ [5].

2 Lucas sequences

In this chapter we discuss Lucas sequences, their properties and primality tests, which are analogous to the Fermat test. We start with the definition of Lucas sequences.

Definition 5. Let $P, Q$ be integers such that $D = P^2 - 4Q \neq 0$, then we define the Lucas sequence of the first kind $U_n(P, Q)$ as

$$U_0(P, Q) = 0, \quad U_1(P, Q) = 1, \quad U_n(P, Q) = P\,U_{n-1}(P, Q) - Q\,U_{n-2}(P, Q)$$

and the Lucas sequence of the second kind $V_n(P, Q)$ as

$$V_0(P, Q) = 2, \quad V_1(P, Q) = P, \quad V_n(P, Q) = P\,V_{n-1}(P, Q) - Q\,V_{n-2}(P, Q).$$

If $P, Q$ are clear from the context or they are not relevant at the moment, we will omit them and write $U_n$ or $V_n$ instead of $U_n(P, Q)$ or $V_n(P, Q)$.

Lucas sequences are sometimes defined equivalently [6] in the following way. Given the characteristic equation $x^2 - Px + Q = 0$, where $D = P^2 - 4Q \neq 0$, with roots $a = \frac{P + \sqrt{D}}{2}$ and $b = \frac{P - \sqrt{D}}{2}$, we define the Lucas sequences as

$$U_n = \frac{a^n - b^n}{a - b}$$

and

$$V_n = a^n + b^n.$$

We will always use the Lucas sequence of the first kind, unless stated otherwise. If $P = 1$ and $Q = -1$, we get a sequence where each element is the sum of the two previous elements, i.e. $U_n = U_{n-1} + U_{n-2}$. This sequence is known as the Fibonacci sequence and later we will focus mainly on tests based on this sequence.
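The recurrences of Definition 5 can be evaluated directly, which is useful as a reference when checking the faster algorithms. A minimal sketch (the function name `lucas_sequences` is ours):

```python
def lucas_sequences(P: int, Q: int, count: int) -> tuple[list[int], list[int]]:
    """Return ([U_0, ..., U_{count-1}], [V_0, ..., V_{count-1}]) from the recurrences."""
    U, V = [0, 1], [2, P]
    for _ in range(2, count):
        U.append(P * U[-1] - Q * U[-2])
        V.append(P * V[-1] - Q * V[-2])
    return U[:count], V[:count]


# P = 1, Q = -1 gives the Fibonacci numbers (U) and the Lucas numbers (V).
U, V = lucas_sequences(1, -1, 12)
print(U)
print(V)
```

For $P = 3$, $Q = 2$ the characteristic roots are $a = 2$ and $b = 1$, so the closed forms give $U_n = 2^n - 1$ and $V_n = 2^n + 1$, which the recurrence reproduces.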

2.1 Computing Lucas sequences

In 1995, Joye and Quisquater [7] gave an efficient algorithm for computing Lucas sequences. Koval [8] gave the following slightly faster version of their algorithm.

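Algorithm 2: Computing $U_n$ and $V_n$

Input: Integers $P, Q$ such that $P^2 - 4Q \neq 0$ and $n$ with binary representation $n = \sum b_i 2^i$, $b_i \in \{0, 1\}$.

Output: Elements of Lucas sequences $U_n$ and $V_n$.

1   $V_l \leftarrow 2$, $V_h \leftarrow P$
2   $Q_l \leftarrow 1$, $Q_h \leftarrow 1$
3   for $i \leftarrow \lfloor \log_2 n \rfloor$ to $0$ do
4       $Q_l \leftarrow Q_l \cdot Q_h$
5       if $b_i = 1$ then
6           $Q_h \leftarrow Q_l \cdot Q$
7           $V_l \leftarrow V_h \cdot V_l - P \cdot Q_l$
8           $V_h \leftarrow V_h \cdot V_h - 2 Q_h$
9       else
10          $Q_h \leftarrow Q_l$
11          $V_h \leftarrow V_h \cdot V_l - P \cdot Q_l$
12          $V_l \leftarrow V_l \cdot V_l - 2 Q_h$
13  $U_n \leftarrow (2 V_h - P \cdot V_l) / (P^2 - 4Q)$
14  return $U_n$, $V_l$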
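Algorithm 2 can be sketched in Python as follows. This is an illustrative transcription over the integers, not the implementation used for the measurements; the function name is ours, and the loop runs over all bits of $n$ starting from the most significant one (obtained with `int.bit_length`).

```python
def lucas_uv(n: int, P: int, Q: int) -> tuple[int, int]:
    """Return (U_n, V_n) for the Lucas sequences with parameters P, Q."""
    D = P * P - 4 * Q
    vl, vh = 2, P          # invariant: vl = V_k, vh = V_{k+1}
    ql, qh = 1, 1          # invariant: ql * qh = Q^k
    for i in range(n.bit_length() - 1, -1, -1):
        ql *= qh
        if (n >> i) & 1:   # k -> 2k + 1
            qh = ql * Q
            vl = vh * vl - P * ql
            vh = vh * vh - 2 * qh
        else:              # k -> 2k
            qh = ql
            vh = vh * vl - P * ql
            vl = vl * vl - 2 * qh
    # The division is exact, because D * U_n = 2 V_{n+1} - P V_n.
    return (2 * vh - P * vl) // D, vl


print(lucas_uv(10, 1, -1))   # Fibonacci number F_10 and Lucas number L_10
```

The final division is the step that has no direct counterpart modulo $m$ when $\gcd(m, D) \neq 1$.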

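We will derive an algorithm (for computing Lucas sequences of the first kind) from the fact that elements of Lucas sequences can be represented as matrices [9], which allows us to compute $U_n$ via the corresponding matrix $A^n$ (as proven by the following lemma).

Lemma 2.1. Given a Lucas sequence $U_n(P, Q)$ and a matrix

$$A = \begin{pmatrix} 0 & 1 \\ -Q & P \end{pmatrix},$$

it holds that

$$A^n = \begin{pmatrix} -Q U_{n-1} & U_n \\ -Q U_n & U_{n+1} \end{pmatrix}.$$

Proof. By induction. For $n = 1$ the claim follows from $U_0 = 0$, $U_1 = 1$ and $U_2 = P$. For the induction step,

$$A^n A = \begin{pmatrix} -Q U_{n-1} & U_n \\ -Q U_n & U_{n+1} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -Q & P \end{pmatrix} = \begin{pmatrix} -Q U_n & -Q U_{n-1} + P U_n \\ -Q U_{n+1} & -Q U_n + P U_{n+1} \end{pmatrix} = \begin{pmatrix} -Q U_n & U_{n+1} \\ -Q U_{n+1} & U_{n+2} \end{pmatrix} = A^{n+1}.$$

So, to compute $U_n$, we can compute $A^n$ and take the upper right element. Matrix multiplication is associative and so we can use the double-and-add method to compute $A^n$. From

$$A^{2n} = \begin{pmatrix} -Q U_{n-1} & U_n \\ -Q U_n & U_{n+1} \end{pmatrix}^2 = \begin{pmatrix} Q^2 U_{n-1}^2 - Q U_n^2 & -Q U_{n-1} U_n + U_n U_{n+1} \\ Q^2 U_{n-1} U_n - Q U_n U_{n+1} & -Q U_n^2 + U_{n+1}^2 \end{pmatrix}$$

we get

$$U_{2n} = -Q U_{n-1} U_n + U_n U_{n+1} = (U_{n+1} - P U_n) U_n + U_n U_{n+1} = 2 U_n U_{n+1} - P U_n^2$$

and

$$U_{2n+1} = -Q U_n^2 + U_{n+1}^2,$$

which gives us the double step in the following algorithm.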

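Algorithm 3: Computing $U_n$

Input: Integers $P, Q$ and $n = \sum b_i 2^i$, $b_i \in \{0, 1\}$.

Output: Element of the Lucas sequence $U_n$.

1   $x \leftarrow 0$
2   $y \leftarrow 1$
3   for $i \leftarrow \lfloor \log_2 n \rfloor$ to $0$ do
4       $z \leftarrow x$                        (double step, lines 4-6)
5       $x \leftarrow 2xy - P x^2$
6       $y \leftarrow -Q z^2 + y^2$
7       if $b_i = 1$ then
8           $z \leftarrow x$                    (add step, lines 8-10)
9           $x \leftarrow y$
10          $y \leftarrow P y - Q z$
11  return $x$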

Algorithm 3 illustrates the double-and-add method, but it can be improved. Instead of using the temporary variable $z$, we can precompute the squares $x^2$ and $y^2$ and join both the double and the add steps into one, as in Algorithm 4. Algorithm 4 is slightly faster than Algorithm 3 and, as an added benefit, both branches of the if statement take about the same time to compute, which might make this algorithm more resistant to timing attacks.

We have implemented the four algorithms for parameters $P = 1$, $Q = -1$ in C++ using the NTL [10] library for big integers and we have briefly examined their relative performance. The original algorithm by Joye and Quisquater is about 20% slower than the other three. We have also implemented modular versions of the algorithms using the modular integer package of NTL. For small moduli (up to about $2^{50}$), Algorithms 2, 3 and 4 are similar in performance, but for larger moduli Algorithm 2 starts to be slightly faster. Algorithm 2, however, uses division on line 13, which is not defined in the modular version if the modulus is not coprime to $P^2 - 4Q$. For this reason we mostly use Algorithm 4.

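Algorithm 4: Computing $U_n$

Input: Integers $P, Q$ and $n = \sum b_i 2^i$, $b_i \in \{0, 1\}$.

Output: Element of the Lucas sequence $U_n$.

1   $x \leftarrow 0$
2   $y \leftarrow 1$
3   for $i \leftarrow \lfloor \log_2 n \rfloor$ to $0$ do
4       $x_2 \leftarrow x^2$
5       $y_2 \leftarrow y^2$
6       if $b_i = 1$ then
7           $y \leftarrow -2Qxy + P y_2$
8           $x \leftarrow -Q x_2 + y_2$
9       else
10          $x \leftarrow 2xy - P x_2$
11          $y \leftarrow -Q x_2 + y_2$
12  return $x$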

2.2 Lucas pseudoprimes

Lucas [11] and Lehmer [12] discovered the following property of Lucas sequences, which is analogous to Fermat’s little theorem.

Theorem 2.2. Let $p$ be an odd prime and $P, Q$ integers such that $p \nmid Q$ and $D = P^2 - 4Q \neq 0$, then

$$U_{p - e(p)} \equiv 0 \pmod{p},$$

where $e(p)$ is the Legendre symbol $\left(\frac{D}{p}\right)$.

Proof. For a concise proof see Theorem 3 in [9].

Based on this theorem, Baillie and Wagstaff [6] defined Lucas pseudoprimes in the following way.

Definition 6. Let $n$ be an odd composite integer and $P, Q$ integers such that $\gcd(n, Q) = 1$ and $D = P^2 - 4Q \neq 0$, then $n$ is a Lucas pseudoprime with parameters $P, Q$ if

$$U_{n - e(n)} \equiv 0 \pmod{n},$$

where $e(n)$ denotes the Jacobi symbol $\left(\frac{D}{n}\right)$.

Similarly to Fermat pseudoprimes, an integer can be a Lucas pseudoprime to some parameters $P, Q$ and not a pseudoprime to others, which means that we may define numbers analogous to Carmichael numbers. Williams [13] defined these numbers as positive integers $n$ having the following property: given a positive integer $D$, for all integers $P, Q$ such that $\gcd(P, Q) = 1$, $P^2 - 4Q = D$ and $\gcd(n, QD) = 1$, it holds that $U_{n - e(n)}(P, Q) \equiv 0 \pmod{n}$. These numbers are called absolute Lucas pseudoprimes to discriminant $D$. For example, for $D = 5$ we get $n = 323 = 17 \cdot 19$. And as with Fermat pseudoprimes, we may define stronger versions.

Definition 7. An odd composite integer $n$ is an Euler Lucas pseudoprime with parameters $P, Q$ if $\gcd(n, QD) = 1$ and

$$U_{(n - e(n))/2} \equiv 0 \pmod{n} \quad\text{for } \left(\frac{Q}{n}\right) = +1$$

or

$$V_{(n - e(n))/2} \equiv 0 \pmod{n} \quad\text{for } \left(\frac{Q}{n}\right) = -1.$$

Since $U_{2n} = U_n V_n$ (see [14]), Euler Lucas pseudoprimes are also Lucas pseudoprimes. And just as there is no equivalent of Carmichael numbers for Euler pseudoprimes, Williams [13] showed that there are no absolute Euler Lucas pseudoprimes. Finally, we define strong Lucas pseudoprimes.

Definition 8. An odd composite integer $n$ is a strong Lucas pseudoprime with parameters $P, Q$ if $\gcd(n, D) = 1$ and for $n - e(n) = d \cdot 2^s$, where $d$ is odd, either

$$U_d \equiv 0 \pmod{n}$$

or

$$V_{d \cdot 2^r} \equiv 0 \pmod{n}$$

for some $0 \le r < s$. Strong Lucas pseudoprimes are Euler Lucas pseudoprimes.
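The basic Lucas congruence of Definition 6 needs only a Jacobi symbol and a modular evaluation of $U_k$. The sketch below combines a standard binary Jacobi symbol routine with the double-and-add method of Section 2.1; all function names are ours, and the code checks only the congruence, not that $n$ is composite.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:             # pull out factors of two: (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                   # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def lucas_u_mod(k: int, P: int, Q: int, m: int) -> int:
    """U_k(P, Q) mod m by double-and-add on the pair (U_j, U_{j+1})."""
    x, y = 0, 1 % m
    for i in range(k.bit_length() - 1, -1, -1):
        x2, y2 = x * x, y * y
        if (k >> i) & 1:
            x, y = (-Q * x2 + y2) % m, (-2 * Q * x * y + P * y2) % m
        else:
            x, y = (2 * x * y - P * x2) % m, (-Q * x2 + y2) % m
    return x


def satisfies_lucas_congruence(n: int, P: int = 1, Q: int = -1) -> bool:
    """Return True if U_{n - e(n)} ≡ 0 (mod n) with e(n) = (D/n)."""
    D = P * P - 4 * Q
    return lucas_u_mod(n - jacobi(D, n), P, Q, n) == 0


print(satisfies_lucas_congruence(323))   # 323 = 17 * 19, Fibonacci parameters
print(satisfies_lucas_congruence(21))
```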

2.3 Rank of appearance of n

In this section we will examine the so called rank of appearance (or rank of apparition), which is a notion analogous to order in Fermat pseudoprimes.

Definition 9. Given a positive integer $n$ and parameters $P, Q$ defining a Lucas sequence, the rank of appearance of $n$ is the least positive integer $k$ such that $U_k \equiv 0 \pmod{n}$.

We will denote it as $\omega_{P,Q}(n)$ or simply as $\omega(n)$.

As a corollary of Theorem 2.2, for a prime $p \nmid Q$, the rank of appearance $\omega(p)$ exists. In general, the rank $\omega(n)$ exists if $\gcd(n, Q) = 1$, which can be shown using the correspondence between $U_k$ and the matrix $A^k$ (Lemma 2.1). Matrix $A$ is invertible modulo $n$ if and only if $\det(A) = Q$ is a unit modulo $n$, i.e. $\gcd(n, Q) = 1$. This means that if $\gcd(n, Q) = 1$, then there exists $k$ such that $A^k \equiv I \pmod{n}$, where $I$ is the identity matrix, and hence $U_k \equiv 0 \pmod{n}$.

The order $l_a(n)$ of $a$ modulo $n$ has the property that $a^k \equiv 1 \pmod{n}$ if and only if $l_a(n) \mid k$. We will show that the rank behaves analogously.

Lemma 2.3. Let $k$ be an integer such that $U_k \equiv 0 \pmod{n}$, then for any integer $l$, it holds that $U_{kl} \equiv 0 \pmod{n}$.

Proof. Let $A^{kl}$ be the matrix corresponding to $U_{kl}$. It can be computed modulo $n$ as

$$A^{kl} = \left(A^k\right)^l \equiv \begin{pmatrix} -Q U_{k-1} & 0 \\ 0 & U_{k+1} \end{pmatrix}^l = \begin{pmatrix} (-Q U_{k-1})^l & 0 \\ 0 & U_{k+1}^l \end{pmatrix} \pmod{n},$$

which means that $U_{kl} \equiv 0 \pmod{n}$.

Lemma 2.3 shows that for any integer multiple $k\,\omega(n)$ of the rank of appearance, it holds that $U_{k\omega(n)} \equiv 0 \pmod{n}$. The following lemma shows the opposite direction.

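Lemma 2.4. Let $k$ be an integer such that $U_k \equiv 0 \pmod{n}$, then $\omega(n) \mid k$.

Proof. Let $U_k \equiv 0 \pmod{n}$, then the corresponding matrix $A^k$ is congruent to $\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$ modulo $n$, and similarly for $U_{\omega(n)} \equiv 0 \pmod{n}$, the corresponding matrix $A^{\omega(n)}$ is congruent to $\begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix}$ modulo $n$, where $\gcd(abcd, n) = 1$. And so, we can compute $U_{k - \omega(n)} \bmod n$ as

$$A^{k - \omega(n)} \equiv \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix}^{-1} \equiv \begin{pmatrix} a c^{-1} & 0 \\ 0 & b d^{-1} \end{pmatrix} \pmod{n},$$

which means that $U_{k - \omega(n)} \equiv 0 \pmod{n}$. By repeatedly subtracting $\omega(n)$ from $k$ we get $U_{k \bmod \omega(n)} \equiv 0 \pmod{n}$. If $\omega(n) \nmid k$, then $0 < k \bmod \omega(n) < \omega(n)$. However, $\omega(n)$ is by definition the least positive index with this property, so $\omega(n) \le k \bmod \omega(n)$, a contradiction. Therefore $\omega(n) \mid k$.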

So overall, $U_k \equiv 0 \pmod{n}$ if and only if $\omega(n) \mid k$. Note that this does not mean that the rank is the period of the sequence. For example for the Fibonacci sequence, as can be seen from the following table, $\omega(3) = 4$, but $U_3 = 2 \not\equiv 1 \equiv 13 = U_7 \pmod{3}$.

n            0   1   2   3   4   5   6   7   8   9  10  11
U_n          0   1   1   2   3   5   8  13  21  34  55  89
U_n mod 3    0   1   1   2   0   2   2   1   0   1   1   2

The rank $\omega(n)$ divides the period $\pi(n)$; however, the rank of appearance is more suitable for examining the pseudoprimes, because $U_k \equiv 0 \pmod{n}$ does not necessarily mean that $\pi(n) \mid k$. For more details about the period see [9].

Another property of the order is that, given some base $a$ and a modulus $n$ with prime factorization $n = \prod p_i^{\alpha_i}$, the order $l_a(n)$ is equal to the least common multiple of the individual orders $l_a(p_i^{\alpha_i})$. The following lemma shows that the rank has the same property.

Lemma 2.5. Let $n$ be a positive integer with prime factorization $n = \prod p_i^{\alpha_i}$, then $\omega(n) = \mathrm{lcm}_i(\omega(p_i^{\alpha_i}))$.

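Proof. Let $q_i = p_i^{\alpha_i}$. By definition $U_{\omega(q_i)} \equiv 0 \pmod{q_i}$. Since $\omega(q_i) \mid \mathrm{lcm}_i(\omega(q_i))$, from Lemma 2.3 we get $U_{\mathrm{lcm}_i(\omega(q_i))} \equiv 0 \pmod{q_i}$. By the Chinese Remainder Theorem $U_{\mathrm{lcm}_i(\omega(q_i))} \equiv 0 \pmod{n}$, and so $\omega(n) \mid \mathrm{lcm}_i(\omega(q_i))$ by Lemma 2.4. Since $q_i \mid n$ and $U_{\omega(n)} \equiv 0 \pmod{n}$, it holds that $U_{\omega(n)} \equiv 0 \pmod{q_i}$, so $\omega(q_i) \mid \omega(n)$ for each $i$. Therefore overall, $\mathrm{lcm}_i(\omega(q_i)) \mid \omega(n) \mid \mathrm{lcm}_i(\omega(q_i))$, which means that $\omega(n) = \mathrm{lcm}_i(\omega(q_i))$.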

Renault [9] shows that for a prime power $p^{\alpha}$, it holds that $\omega(p^{\alpha}) \mid p^{\alpha - 1}(p - e(p))$. This means that for $n = \prod p_i^{\alpha_i}$ we may define a function $\psi(n) = \prod p_i^{\alpha_i - 1}(p_i - e(p_i))$ such that $\omega(n) \mid \psi(n)$. The function $\psi$ is analogous to Euler's totient function $\varphi(n) = \prod p_i^{\alpha_i - 1}(p_i - 1)$.

From now on we will consider only pseudoprimes $n$ such that $e(n) = -1$, because in the Baillie-PSW test the parameters $P, Q$ are such that $e(n) = -1$. As we will see later, this makes the test stronger. Similarly as for $l_a(n)$ we can derive the following three lemmas for $\omega(n)$.

Lemma 2.6. Let $p$ be a prime and $n = pk$ be a Lucas pseudoprime (such that $e(n) = -1$), then $\omega(p) \mid p - e(p)$ and $\omega(p) \mid k - e(k)$.

Proof. We already know from Theorem 2.2 that $\omega(p) \mid p - e(p)$. Using $e(n) = e(p)e(k) = -1$, i.e. $e(k) = -e(p)$, we get

$$0 \equiv n - e(n) = (p - e(p))(k + e(k)) - p\,e(k) + k\,e(p) \equiv p\,e(p) + k\,e(p) \equiv e(p)(k - e(k)) \pmod{\omega(p)},$$

and since $e(p) = \pm 1$, it follows that $\omega(p) \mid k - e(k)$.

Lemma 2.7. Let $n = pq$, where $p, q$ are distinct primes (such that $e(n) = -1$), then $n$ is a Lucas pseudoprime if and only if

$$\omega(p) \mid q - e(q)$$

and

$$\omega(q) \mid p - e(p).$$

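Proof. One direction of the proof is given by Lemma 2.6, so assume $\omega(p) \mid q - e(q)$ and $\omega(q) \mid p - e(p)$. Recall that $\omega(p) \mid p - e(p)$ by Theorem 2.2. Then

$$n - e(n) = (p - e(p))(q + e(q)) + q\,e(p) - p\,e(q) \equiv e(q)e(p) - e(p)e(q) = 0 \pmod{\omega(p)}.$$

Similarly, $n - e(n) \equiv 0 \pmod{\omega(q)}$, and so

$$n - e(n) \equiv 0 \pmod{\mathrm{lcm}(\omega(p), \omega(q)) = \omega(n)},$$

which by Lemma 2.3 means that $U_{n - e(n)} \equiv 0 \pmod{n}$.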

Lemma 2.8. Let $n = kl$, where $k, l$ are either primes or Lucas pseudoprimes (such that $e(n) = -1$) and $\gcd(k, l) = 1$. Then $n$ is a Lucas pseudoprime if and only if $\omega(k) \mid l - e(l)$ and $\omega(l) \mid k - e(k)$.

Proof. The proof is analogous to the proof of Lemma 2.7.

To compute the rank, we may employ the following algorithm, which is built on similar principles as Algorithm 1 which computes the order.

Algorithm 5: Computing $\omega(n)$

Input: An odd positive integer $n$ with prime factorization of $\psi(n) = \prod p_i^{\alpha_i}$ and integers $P, Q$ such that $n \nmid Q$ and $P^2 - 4Q \neq 0$.

Output: Rank of appearance $\omega(n)$.

1  $t \leftarrow \psi(n)$
2  forall $i$ do
3      $t \leftarrow t / p_i^{\alpha_i}$
4      $a_1 \leftarrow U_t \bmod n$
5      while $a_1 \neq 0$ do
6          $a_1 \leftarrow U_{t p_i} \bmod n$
7          $t \leftarrow t \cdot p_i$
8  return $t$
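The following Python sketch follows the structure of Algorithm 5. It is illustrative only: the helper names are ours, factorization is done by trial division (so it is suitable for small $n$ only), and it recomputes $U_{t p_i}$ from scratch in every step rather than using the matrix shortcut.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def lucas_u_mod(k: int, P: int, Q: int, m: int) -> int:
    """U_k(P, Q) mod m by double-and-add on the pair (U_j, U_{j+1})."""
    x, y = 0, 1 % m
    for i in range(k.bit_length() - 1, -1, -1):
        x2, y2 = x * x, y * y
        if (k >> i) & 1:
            x, y = (-Q * x2 + y2) % m, (-2 * Q * x * y + P * y2) % m
        else:
            x, y = (2 * x * y - P * x2) % m, (-Q * x2 + y2) % m
    return x


def factorize(m: int) -> dict[int, int]:
    """Trial-division factorization {prime: exponent}; small inputs only."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def psi(n: int, D: int) -> int:
    """psi(n) = product of p^(alpha-1) * (p - e(p)) over prime powers of odd n."""
    result = 1
    for p, alpha in factorize(n).items():
        result *= p ** (alpha - 1) * (p - jacobi(D, p))
    return result


def rank_of_appearance(n: int, P: int = 1, Q: int = -1) -> int:
    """Rank omega(n) for odd n > 1 with gcd(n, Q) = 1, as in Algorithm 5."""
    t = psi(n, P * P - 4 * Q)
    for p, alpha in factorize(t).items():
        t //= p ** alpha
        while lucas_u_mod(t, P, Q, n) != 0:
            t *= p
    return t


print([rank_of_appearance(n) for n in (3, 5, 17, 19, 323)])
```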

The values $U_{t p_i} \bmod n$ on line 6 can be computed either by one of the algorithms from Section 2.1 with time complexity $O(\log(t p_i))$, or the whole algorithm can be slightly modified to store and compute $U_t$ as the corresponding matrix $A^t$, and then $U_{t p_i}$ can be computed as $A^{t p_i} = \left(A^t\right)^{p_i}$ in $O(\log(p_i))$ time.

2.4 Lucas pseudoprimes in the Fibonacci sequence

In the rest of this work we will consider Lucas pseudoprimes mostly with parameters $P = 1$, $Q = -1$, i.e. to the Fibonacci sequence.

Let $n$ be a Fermat pseudoprime to base 2 and $p$ some prime, then if $p^2 \mid n$, we know that $p$ has to be a Wieferich prime, i.e. it has to satisfy $l_2(p^2) \mid p - 1$. There are only two known Wieferich primes, and so when searching for pseudoprimes we can mostly limit our search to square-free integers. For Lucas pseudoprimes (to the Fibonacci sequence) there is an analogous type of primes: Wall-Sun-Sun primes.

Let $p$ be a prime and $n = p^{\alpha} k$ with $\alpha \ge 2$ a Lucas pseudoprime such that $e(n) \neq 0$. Then $\omega(p^{\alpha}) \mid \omega(n) \mid p^{\alpha} k - e(n)$, which means that $p \nmid \omega(p^{\alpha})$. Since $\omega(p^{\alpha}) \mid \psi(p^{\alpha}) = p^{\alpha - 1}(p - e(p))$ and $p \nmid \omega(p^{\alpha})$, we get

$$\omega(p^2) \mid \omega(p^{\alpha}) \mid p - e(p).$$

Primes $p$ such that $\omega_{1,-1}(p^2) \mid p - e_{1,-1}(p)$ are called Wall-Sun-Sun primes. There are no known Wall-Sun-Sun primes and all primes have been checked up to $28 \cdot 10^{15}$ [5].

3 The Baillie-PSW test

In 1980 Baillie, Pomerance, Selfridge and Wagstaff examined Fermat pseudoprimes [1] and Lucas pseudoprimes [6] and noticed that if the parameters are chosen properly, then the two tests seem to be independent. Based on this observation they devised the following primality test [6]. Given an odd integer n which we want to test for primality, perform the following steps:

1. Optionally perform trial division test to some convenient limit. In other words, try dividing n by all primes up to the limit. If some such prime divides n, then n is composite, otherwise continue.

2. Perform strong Fermat test to base 2. If n fails, then it is composite, otherwise continue.

3. Choose parameters $P, Q$ for the strong Lucas test by one of these two methods:

• method A: Let $D$ be the first element of the sequence $5, -7, 9, -11, 13, \ldots$ such that $\left(\frac{D}{n}\right) = -1$. Set $P = 1$ and $Q = \frac{1 - D}{4}$.

• method B: Let $D$ be the first element of the sequence $5, 9, 13, 17, 21, \ldots$ such that $\left(\frac{D}{n}\right) = -1$. Let $P$ be the least odd number exceeding $\sqrt{D}$ and let $Q = \frac{P^2 - D}{4}$.

4. Perform strong Lucas test with parameters $P, Q$. If $n$ fails, then it is composite, otherwise it is almost certainly a prime.
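The parameter selection of method A in step 3 can be sketched as follows. This is a minimal illustration under our own conventions: the function name and the return value `None` (meaning that $n$ was recognized as composite during the search) are assumptions, the perfect-square check is done up front with `math.isqrt` instead of only after several failed attempts, and $n$ is assumed to be odd and greater than 1.

```python
from math import gcd, isqrt


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def choose_params_method_a(n: int):
    """Return (D, P, Q) chosen by method A, or None if n is seen to be composite."""
    if isqrt(n) ** 2 == n:
        return None                    # perfect square: (D/n) is never -1
    d = 5
    while True:
        j = jacobi(d, n)
        if j == -1:
            return d, 1, (1 - d) // 4  # D ≡ 1 (mod 4), so the division is exact
        if j == 0 and 1 < gcd(d, n) < n:
            return None                # proper common factor of D and n
        d = -(d + 2) if d > 0 else -d + 2   # 5, -7, 9, -11, 13, ...


print(choose_params_method_a(7))
print(choose_params_method_a(19))
```

The check `gcd(d, n) < n` only matters for very small $n$, where $D$ itself may be a multiple of a prime $n$.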

Note that in the search for parameters $P, Q$ in step 3, if we encounter $D$ such that $\left(\frac{D}{n}\right) = 0$, then $n$ is composite and we can stop the algorithm. If $n$ is a perfect square, then $\left(\frac{D}{n}\right) \ge 0$ for all $D$ and so we never get $-1$. To solve this problem, if we do not find a suitable $D$ after trying several values, we may employ Newton's method to test for $n$ being a perfect square. The authors show [6] that the average number of values that have to be tried before finding $D$ such that $\left(\frac{D}{n}\right) = -1$ is 1.790479091 for method A and 1.922741874 for method B. Method A is used most often, because it is simpler and seems to result in fewer Lucas pseudoprimes than method B.

The reason why we need parameters $P, Q$ such that $\left(\frac{D}{n}\right) = -1$ is that if $\left(\frac{D}{n}\right) = 0$, then $\gcd(n, D) \neq 1$, and so we know $n$ is composite, and if $\left(\frac{D}{n}\right) = 1$, then the Lucas and Fermat tests are somehow related. As we have seen throughout the first two chapters, Fermat and Lucas pseudoprimes have many of the same properties. Grantham [15] noticed these similarities and formulated a definition of Frobenius pseudoprimes, which generalizes Fermat, Lucas and other kinds of pseudoprimes. He presented theorems which describe the relations between the different kinds of pseudoprimes. As a corollary of the theorems, if $n$ is a Lucas pseudoprime and $\left(\frac{D}{n}\right) = 1$, then $n$ is also a Fermat pseudoprime (for certain parameters $P, Q$ and base $a$).

At the end of paper [1] the authors offer \$30 for a Baillie-PSW pseudoprime, but so far no one has found one. They found no Baillie-PSW pseudoprimes up to $10^8$. Gilchrist [16] further showed that there are no Baillie-PSW pseudoprimes up to $2^{64}$, which means that for $n \le 2^{64}$ the test is deterministic. However, according to Pomerance's heuristic argument [17] there should be infinitely many Baillie-PSW pseudoprimes.

3.1 Challenge pseudoprimes

In this thesis we are mainly interested in a somewhat weaker version of the test, whose pseudoprimes are defined as follows.

Definition 10. A composite integer $n$ is a PSW-challenge pseudoprime if it satisfies the following three conditions:

(1) $n \equiv \pm 2 \pmod{5}$,

(2) $2^{n-1} \equiv 1 \pmod{n}$,

(3) $U_{n+1}(1, -1) \equiv 0 \pmod{n}$.
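The three conditions of Definition 10 are easy to test for a given $n$. The sketch below is a small illustration (function names are ours; the primality check by trial division is only meant for the tiny range scanned here). Every odd prime $p \equiv \pm 2 \pmod{5}$ satisfies all three conditions, so a pseudoprime would show up as a composite $n$ for which the predicate holds.

```python
def fibonacci_mod(k: int, m: int) -> int:
    """Fibonacci number U_k(1, -1) mod m by double-and-add."""
    x, y = 0, 1 % m                     # x = U_j, y = U_{j+1}
    for i in range(k.bit_length() - 1, -1, -1):
        x2, y2 = x * x, y * y
        if (k >> i) & 1:
            x, y = (x2 + y2) % m, (2 * x * y + y2) % m
        else:
            x, y = (2 * x * y - x2) % m, (x2 + y2) % m
    return x


def satisfies_challenge_conditions(n: int) -> bool:
    """Conditions (1)-(3) of Definition 10, without the compositeness check."""
    return (n % 5 in (2, 3)
            and pow(2, n - 1, n) == 1
            and fibonacci_mod(n + 1, n) == 0)


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small range used below."""
    if n < 2:
        return False
    q = 2
    while q * q <= n:
        if n % q == 0:
            return False
        q += 1
    return True


# Composite integers below 20000 that satisfy all three conditions.
found = [n for n in range(3, 20000, 2)
         if satisfies_challenge_conditions(n) and not is_prime(n)]
print(found)
```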

The term PSW-challenge pseudoprime is not well established (that is also why the title of this thesis is Baillie-PSW pseudoprimes); however, we follow the terminology of Shallue and Webster [18], who named the pseudoprimes as such because allegedly [18, 19, 15] there is an outstanding offer of \$620 by Pomerance, Selfridge and Wagstaff for these pseudoprimes.

The first condition is equivalent to $\left(\frac{5}{n}\right) = -1$, and so the three conditions together are basically the Baillie-PSW test with weak (as opposed to strong) versions of the Fermat and Lucas tests and only for the Fibonacci sequence. For $D = 5$ the Baillie-PSW test is strictly stronger, and so any Baillie-PSW pseudoprime will also be a PSW-challenge pseudoprime. However, for $D \neq 5$ a Baillie-PSW pseudoprime $n$ will most likely not be a PSW-challenge pseudoprime, and in fact, if $D$ is chosen by method A, then $\left(\frac{5}{n}\right) \neq -1$, and so $n$ cannot be a PSW-challenge pseudoprime.

Gilchrist also checked the weaker version of the Baillie-PSW test and showed that there are no PSW-challenge pseudoprimes up to $2^{64}$ [16]. Shallue and Webster moved the bound to $2^{80}$ for pseudoprimes with two or three prime factors [18]. No one has found a PSW-challenge pseudoprime yet, even though, heuristically, there should be infinitely many [19].

3.1.1 Admissibility

If $n$ is a PSW-challenge pseudoprime, then $l_2(n) \mid n - 1$ and $\omega(n) \mid n + 1$, therefore $\gcd(l_2(n), \omega(n)) \le \gcd(n - 1, n + 1) \le 2$. We will call odd integers $n$ such that

$$\gcd(l_2(n), \omega(n)) \le 2$$

admissible.

If $k$ divides $n$, then $l_2(k)$ divides $l_2(n)$ and $\omega(k)$ divides $\omega(n)$, therefore $\gcd(l_2(k), \omega(k)) \le \gcd(l_2(n), \omega(n))$, and so if $n$ is admissible, then $k$ is also admissible.

In other words, a necessary condition for $n$ being a PSW-challenge pseudoprime is that all divisors of $n$ have to be admissible. With this condition we can, for example, rule out some prime divisors of $n$. For instance 11, 19, 29, 31, 41, 59, 61, 71, 79, 89 are all the inadmissible primes less than 100.

We will use this necessary condition prominently later on, and so we might wonder: how strong is it? Figure 1 shows a comparison of the prime counting function $\pi(x)$ and a function $\pi_a(x)$ counting all admissible primes up to $x$. The ratio $\frac{\pi_a(x)}{\pi(x)}$ seems to tend to $\frac{1}{2}$ as $x$ increases, so we raise a conjecture that $\lim_{x \to \infty} \frac{\pi_a(x)}{\pi(x)} = \frac{1}{2}$.

Figure 1: Comparison of distribution of all primes and distribution of admissible primes. The top graph shows the prime counting function π and a function πa counting admissible primes in the interval [0, x]. The bottom graph shows their ratio.
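The list of inadmissible primes below 100 can be reproduced with a brute-force sketch. The helper names `order2`, `fib_rank` and `is_admissible` are hypothetical; the order and the rank are found by plain iteration, which is only practical for small n.

```python
from math import gcd


def order2(n):
    """Multiplicative order of 2 modulo an odd n > 1 (brute force)."""
    k, x = 1, 2 % n
    while x != 1:
        x = 2 * x % n
        k += 1
    return k


def fib_rank(n):
    """Rank of appearance: least t > 0 with n | F_t (brute force, n > 1)."""
    a, b, t = 0, 1, 0                  # (F_t, F_{t+1})
    while True:
        a, b = b, (a + b) % n
        t += 1
        if a == 0:
            return t


def is_admissible(n):
    return gcd(order2(n), fib_rank(n)) <= 2


odd_primes = [p for p in range(3, 100, 2)
              if all(p % d for d in range(3, int(p ** 0.5) + 1, 2))]
inadmissible = [p for p in odd_primes if not is_admissible(p)]
print(inadmissible)
```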

Prime numbers can be found with an algorithm called sieve of Eratosthenes, which works as follows.

1. Given a bound $B^2$, allocate $B^2 - 1$ bits of memory and set them all to 1. The first bit will be indexed as $b_2$.

2. For n from 2 to B do the following. If $b_n$ is set to 0, then continue the loop; otherwise n is a prime. Set bits indexed with a multiple of n to 0.

3. For n from B + 1 to $B^2$, if $b_n$ is set to 1, then n is a prime.
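The three steps can be sketched as follows. The function name `eratosthenes` is hypothetical, and a Python list of booleans stands in for the bit array. One small deviation, stated as an assumption: only proper multiples of n are cleared, so that the primes up to B can also be read from the array at the end.

```python
def eratosthenes(B):
    """Return all primes up to B**2 using a sieve over a boolean array."""
    limit = B * B
    bits = [True] * (limit + 1)        # bits[n] plays the role of b_n
    bits[0] = bits[1] = False          # indices 0 and 1 are not used by the text
    for n in range(2, B + 1):
        if not bits[n]:
            continue                   # n is composite
        for multiple in range(2 * n, limit + 1, n):
            bits[multiple] = False     # clear proper multiples of the prime n
    return [n for n in range(2, limit + 1) if bits[n]]


print(eratosthenes(5))
```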

The algorithm is built on the fact that n is a prime if and only if n is not divisible by any prime up to $\sqrt{n}$. Admissible integers have a similar property: if n is divisible by an inadmissible integer, then n is also inadmissible. And so we can sieve for admissible integers with the following algorithm.

1. Given a bound $B^2$, allocate $B^2 - 1$ bits of memory. The first bit will be indexed as $b_2$. Set bits with even indices to 0 and bits with odd indices to 1.

2. For n from 3 to B do the following. If $b_n$ is set to 0, then n is inadmissible, so continue the loop. Otherwise test admissibility of n. If it is inadmissible, then set bits indexed with a multiple of n to 0.

3. For n from B + 1 to $B^2$, if $b_n$ is set to 0, then n is inadmissible. Otherwise test admissibility of n.
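A minimal sketch of this sieve, assuming a brute-force admissibility test (the helper names `order2`, `fib_rank`, `is_admissible` and `admissible_sieve` are hypothetical). In practice the admissibility test would be the expensive part and would be replaced by something faster.

```python
from math import gcd


def order2(n):
    """Multiplicative order of 2 modulo an odd n > 1 (brute force)."""
    k, x = 1, 2 % n
    while x != 1:
        x = 2 * x % n
        k += 1
    return k


def fib_rank(n):
    """Least t > 0 with n | F_t (brute force, n > 1)."""
    a, b, t = 0, 1, 0
    while True:
        a, b = b, (a + b) % n
        t += 1
        if a == 0:
            return t


def is_admissible(n):
    return gcd(order2(n), fib_rank(n)) <= 2


def admissible_sieve(B):
    """Return all admissible integers in [3, B**2]."""
    limit = B * B
    bits = [n >= 3 and n % 2 == 1 for n in range(limit + 1)]
    for n in range(3, limit + 1, 2):
        if not bits[n]:
            continue                   # already ruled out by a divisor
        if not is_admissible(n):
            bits[n] = False
            if n <= B:                 # step 2: rule out the odd multiples
                for multiple in range(3 * n, limit + 1, 2 * n):
                    bits[multiple] = False
    return [n for n in range(3, limit + 1, 2) if bits[n]]


admissible = admissible_sieve(10)
print(admissible[:5])
```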

With this algorithm, for every inadmissible integer we find, we rule out all its multiples. Unfortunately, since an integer which has only admissible divisors can still be inadmissible, we have to test all the integers which we did not rule out. Another way to find admissible integers is to construct them as products of admissible primes as follows.

1. Given a bound $B^2$, get a set S of admissible primes (possibly by testing primes obtained by the sieve of Eratosthenes) up to B. Initialize the set of admissible integers M to a copy of S.

2. For each m ∈ M and p ∈ S compute n = mp. If $n \le B^2$ and n is admissible, then add it to M'.

3. Store integers of M in $M_{all}$ and set M = M'. Repeat steps 2 and 3 as long as there are new products in M'.

In both algorithms only the integers which have all divisors admissible are tested for admissibility. So, the main difference between the algorithms is in how they store the integers. The first algorithm stores admissibility of the integers as a bit array of size n. This includes the zero bits for inadmissible integers. The second algorithm stores the admissible integers directly, and so the memory requirement depends on how common the admissible integers are. As we can see in Figure 2, admissible integers are about as common as primes, and so the memory requirements of the two algorithms are very close. The second algorithm needs about 20% more memory (possibly more, depending on the variable type used in the underlying programming language).

Overall, both algorithms are close in performance; however, the second algorithm allows us to work directly with the integers and make some adjustments, such as storing the orders and ranks alongside the integers for faster computation of admissibility, or ignoring integers which are divisible by perfect squares.

Figure 2: Comparison of distribution of all primes and distribution of admissible integers. The top graph shows the prime counting function π and a function α counting all admissible integers in the interval [0, x]. The bottom graph shows their ratio. The comparison shows that there are about as many admissible integers as there are primes, although the two distributions may not be related.


And finally, we present an algorithm (Algorithm 6) to test admissibility of integers. We could simply compute $\gcd(l_2(n), \omega(n))$ and then test whether the result is less than or equal to 2; however, computing the order and especially the rank is somewhat time demanding. We do not actually need to compute the order nor the rank; we just need to find primes which divide the order and then test whether any of them (the prime 2 needs to be treated separately) divides the rank.

Algorithm 6: Deciding admissibility of integers

Input: An odd positive integer n > 2.
Output: True if n is admissible, False otherwise.

    forbidden_primes ← new empty list
    // find primes which cannot divide the rank
    foreach (p, e) ∈ factor(ϕ(n)) do
        f ← e if p ≠ 2 else e − 1
        if 2^(ϕ(n)/p^f) ≢ 1 (mod n) then
            add p to forbidden_primes
    // test if the rank needs any of the forbidden primes
    t ← 1
    foreach (p, e) ∈ factor(ψ(n)) do
        if p ∉ forbidden_primes then
            t ← t · p^e
    if 2 ∤ t then
        t ← 2t
    return U_t ≡ 0 (mod n)

The function factor factors the given integer and returns a list of tuples (prime, power). The function ϕ is called multiple times; however, this is just for simplicity of the pseudocode. It should be called only once and the result stored in a variable.

As a final note, the admissibility depends on the base a in the Fermat test and the parameters P, Q in the Lucas test. We can examine admissibility for any (meaningful) combination of the parameters. However, in the full Baillie-PSW test, the parameters P, Q are chosen individually for each tested integer n, and so for different integers we are interested in different admissibilities, which makes the examination considerably more complicated.
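Algorithm 6 can be sketched in Python for base 2 and the Fibonacci sequence. This is an illustrative sketch, not the thesis implementation. It assumes that $\psi(n) = \prod p^{k-1}(p - e(p))$ over the prime powers $p^k$ dividing n, with $e(p) = \left(\frac{5}{p}\right)$, and uses trial division. All function names are hypothetical, and `brute_force_admissible` is included only as a cross-check.

```python
from math import gcd


def factor(n):
    """Trial division; returns a list of (prime, exponent) pairs."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            out.append((p, e))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out


def phi_psi(n):
    """Euler phi and the assumed psi(n) = prod p^(k-1) * (p - e(p))."""
    phi = psi = 1
    for p, k in factor(n):
        e = 0 if p == 5 else (1 if p % 5 in (1, 4) else -1)
        phi *= p ** (k - 1) * (p - 1)
        psi *= p ** (k - 1) * (p - e)
    return phi, psi


def fib_mod(t, n):
    a, b = 0, 1
    for bit in bin(t)[2:]:
        c, d = a * (2 * b - a) % n, (a * a + b * b) % n
        a, b = (d, (c + d) % n) if bit == "1" else (c, d)
    return a


def is_admissible(n):
    phi, psi = phi_psi(n)              # computed once, as the text advises
    forbidden = set()
    for p, e in factor(phi):
        f = e if p != 2 else e - 1
        if pow(2, phi // p ** f, n) != 1:
            forbidden.add(p)           # p (or 4, when p = 2) divides the order
    t = 1
    for p, e in factor(psi):
        if p not in forbidden:
            t *= p ** e
    if t % 2:
        t *= 2
    return fib_mod(t, n) == 0          # true iff the rank divides t


def brute_force_admissible(n):
    k, x = 1, 2 % n
    while x != 1:
        x, k = 2 * x % n, k + 1
    a, b, t = 0, 1, 0
    while True:
        a, b, t = b, (a + b) % n, t + 1
        if a == 0:
            return gcd(k, t) <= 2
```

Comparing the two functions on small odd integers is a quick sanity check of the sketch.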

3.1.2 Computing last prime

Given an admissible integer k we may attempt to compute a prime p such that n = pk is a challenge pseudoprime. Shallue and Webster [18] used the following two methods.

In the first method we use the necessary condition that if n is a Fermat pseudoprime, then $l_2(p) \mid k - 1$, and similarly if n is a Lucas pseudoprime, then $\omega(p) \mid k - e(k)$. This means that $p \mid 2^{k-1} - 1$ and $p \mid U_{k-e(k)}$, and so we can compute p as

$$p \mid \gcd(2^{k-1} - 1,\ U_{k-e(k)}).$$

This condition is only necessary, so n = pk may still not be a challenge pseudoprime. On the other hand, this method gives us all possible p for which n = pk can be a challenge pseudoprime. It works best for small k, because both $2^{k-1} - 1$ and $U_{k-e(k)}$ grow exponentially, and computing with such integers quickly becomes too slow (for k > 270 it starts to be infeasible). Note that the Fibonacci sequence grows slightly slower than exponential with base 2, i.e. $U_{k-e(k)} \le 2^{k-1} - 1$, and so we may compute the greatest common divisor as $\gcd(2^{k-1} - 1 \bmod U_{k-e(k)},\ U_{k-e(k)})$; nevertheless, for large k the following second method may be more suitable.

The second method is based on the fact that if n = pk is a challenge pseudoprime, then $n = pk \equiv 1 \pmod{l_2(n)}$ and $n = pk \equiv -1 \pmod{\omega(n)}$, so

$$p \equiv k^{-1} \pmod{l_2(k)}, \qquad p \equiv -k^{-1} \pmod{\omega(k)}.$$

The inverse $k^{-1}$ modulo $l_2(k)$ always exists, because $l_2(n) \mid n - 1$, therefore $\gcd(l_2(k), k) \le \gcd(l_2(n), n) = 1$. Similarly, the inverse $k^{-1}$ modulo $\omega(k)$ always exists, since $\omega(n) \mid n + 1$, so $\gcd(\omega(k), k) \le \gcd(\omega(n), n) = 1$. By the Chinese Remainder Theorem we can compute $a = p \bmod \operatorname{lcm}(l_2(k), \omega(k))$. And then we get possible values for p as

$$a + b \cdot \operatorname{lcm}(l_2(k), \omega(k))$$

for integer b from 0 up to some bound. For example, if we want to check all p such that $n = pk = (a + b \cdot \operatorname{lcm}(l_2(k), \omega(k))) \cdot k \le B$ for some bound B, then we need to try all b from 0 to $\frac{B}{k \cdot \operatorname{lcm}(l_2(k), \omega(k))}$. Since k is admissible, $\operatorname{lcm}(l_2(k), \omega(k))$ is equal to $l_2(k) \cdot \omega(k)$ or $(l_2(k) \cdot \omega(k))/2$. This method is suitable for large k; however, its disadvantage is that it does not give us all the possible values of p.
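Both methods can be sketched for a small k. The function names below are hypothetical, and the order, the rank and e(k) are passed in as known values. The modular inverse `pow(k, -1, m)` requires Python 3.8 or newer. For k = 7 we have $l_2(7) = 3$, $\omega(7) = 8$ and e(7) = −1.

```python
from math import gcd


def fib(t):
    a, b = 0, 1
    for _ in range(t):
        a, b = b, a + b
    return a


def gcd_method(k, e_k):
    """First method: every candidate prime p divides this gcd."""
    return gcd(2 ** (k - 1) - 1, fib(k - e_k))


def crt_pair(r1, m1, r2, m2):
    """Solve x = r1 (mod m1), x = r2 (mod m2); returns (x, lcm) or None."""
    g = gcd(m1, m2)
    if (r2 - r1) % g:
        return None                    # the congruences are incompatible
    lcm = m1 // g * m2
    s = (r2 - r1) // g * pow(m1 // g, -1, m2 // g) % (m2 // g)
    return (r1 + m1 * s) % lcm, lcm


def crt_method(k, order_k, rank_k, bound):
    """Second method: candidates a + b * lcm with p * k <= bound."""
    r1 = pow(k, -1, order_k)           # p =  k^-1 (mod l_2(k))
    r2 = -pow(k, -1, rank_k) % rank_k  # p = -k^-1 (mod omega(k))
    solution = crt_pair(r1, order_k, r2, rank_k)
    if solution is None:
        return []
    a, lcm = solution
    return [a + b * lcm for b in range(bound // (k * lcm) + 1)]


print(gcd_method(7, -1))               # gcd(2^6 - 1, F_8)
print(crt_method(7, 3, 8, 1000))
```

The candidates of the second method still have to be tested for primality and for the pseudoprime conditions.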


4 Search for challenge pseudoprimes

In this chapter we present our attempts to find the PSW-challenge pseudoprimes. We already know there are no pseudoprimes up to $2^{64}$ [16] and no pseudoprimes with two or three prime divisors up to $2^{80}$ [18]. These bounds were checked using a considerable amount of computational power, and so increasing the bounds using the same algorithms is not likely to yield new results. For this reason we focused on examination of the pseudoprimes in several special cases.

1. Firstly we examined pseudoprimes which have all prime divisors lower than some given bound B. The reason we may be interested in pseudoprimes with small prime divisors is that large primes are likely to have large order or rank, and so the conditions that $l_2(p) \mid k - 1$ and $\omega(p) \mid k - e(k)$ are more restrictive. To search for these pseudoprimes, we constructed a set of all admissible square-free products of the small primes using the second algorithm from Section 3.1.1 and applied the Fermat and the Lucas tests.

2. Another case we examined was pseudoprimes in the form $2^{n-1} - 1$. By Lemma 1.5, if n is a Fermat pseudoprime (to base 2) or a prime, then $2^{n-1} - 1$ is also a Fermat pseudoprime or a prime, and so many of the integers in the form $2^{n-1} - 1$ will be Fermat pseudoprimes. We search for PSW-challenge pseudoprimes in the form $2^{n-1} - 1$ as products of primes which have rank equal to a power of two.

3. If n = pq is a PSW-challenge pseudoprime, where p, q are distinct primes, then e(n) = e(p)e(q) = −1, so either e(p) = 1 or e(q) = 1. We searched for pseudoprimes with two prime factors by first finding admissible primes p with e(p) = 1 and then using divisibility properties of the order and the rank to find the other prime q.

4. Similarly to the previous case, if n is a PSW-challenge pseudoprime with an even number of prime divisors, then at least one of the primes has e(p) = 1. We used the primes p with e(p) = 1, which we found in the previous case, to search for pseudoprimes with an even number of prime divisors. Previously it was known that there are no such pseudoprimes up to $2^{64}$. We have moved this bound to $2^{70}$.

5. Lastly, we examined the set of primes given by Chen and Greene [20]. The set contains 1248 admissible primes, such that each square-free product of these primes is also admissible. It is estimated that there should be about 748 pseudoprimes among the $2^{1248}$ square-free products. We transformed the problem of finding the pseudoprimes to a Knapsack problem, which we attempted to solve by finding approximate solutions to the shortest vector problem using LLL and BKZ algorithms.

4.1 Small prime factors

In general, we may search for PSW-challenge pseudoprimes by finding admissible integers and then applying the Fermat and the Lucas tests. Using the second algorithm from Section 3.1.1 we may construct admissible integers as products of admissible primes; however, as we have seen, there are a lot of admissible integers, and so we need to somehow limit the construction. In [18], Shallue and Webster limit it to two or three prime divisors and the bound $2^{80}$. Here we will not limit the number of prime divisors nor the maximum size of the pseudoprimes, but instead we will limit the maximum size of the prime divisors. The general idea of the algorithm is as follows:

1. Create a set S of admissible primes up to the given bound B. Copy set S to set M of integers to be tested.

2. Compute the set M' of new admissible products p · m for each p ∈ S and m ∈ M such that p ∤ m.

3. Set M = M' and test each m ∈ M with the Fermat and Lucas tests.

4. Repeat steps 2 and 3 until there are no new admissible integers in M'.


In step 2 we require that p ∤ m (i.e. that the new products are square-free), because we assume that p is neither a Wieferich nor a Wall-Sun-Sun prime. The algorithm eventually stops, since there is a finite number of square-free admissible products of primes from S. To test the admissibility of the primes and the products we can use Algorithm 6; however, it is more efficient to use the order and the rank instead, as we show next.

4.1.1 Admissibility from order and rank

If we compute the order and the rank of each prime p and product m, then we can test admissibility of p · m using the following lemma.

Lemma 4.1. Let p, m be admissible integers such that gcd(p, m) = 1. Then n = pm is admissible if and only if

$$\gcd(l_2(p), \omega(m)) \le 2 \quad \text{and} \quad \gcd(l_2(m), \omega(p)) \le 2.$$

Proof. First assume n is admissible, i.e. $\gcd(l_2(n), \omega(n)) \le 2$. Then, because $l_2(p) \mid l_2(n)$ and $\omega(m) \mid \omega(n)$, we get $\gcd(l_2(p), \omega(m)) \le \gcd(l_2(n), \omega(n)) \le 2$, and similarly, because $l_2(m) \mid l_2(n)$ and $\omega(p) \mid \omega(n)$, we get $\gcd(l_2(m), \omega(p)) \le \gcd(l_2(n), \omega(n)) \le 2$.

In the other direction, assume that $\gcd(l_2(p), \omega(m)) \le 2$ and $\gcd(l_2(m), \omega(p)) \le 2$. Also p, m are admissible, so $\gcd(l_2(p), \omega(p)) \le 2$ and $\gcd(l_2(m), \omega(m)) \le 2$. Let $d = \gcd(l_2(n), \omega(n))$ and assume for contradiction that d > 2. If d > 2, then d is divisible by some q which is either an odd prime or q = 4. Since p, m are coprime, it holds that $l_2(n) = \operatorname{lcm}(l_2(p), l_2(m))$ and $\omega(n) = \operatorname{lcm}(\omega(p), \omega(m))$. Because q is a prime power, $q \mid l_2(p)$ or $q \mid l_2(m)$, and $q \mid \omega(p)$ or $q \mid \omega(m)$, and since q > 2, this contradicts either one of the two assumed conditions or the admissibility of p or m.

So, in step 2 of the algorithm, given $p, l_2(p), \omega(p)$ and $m, l_2(m), \omega(m)$, we test admissibility of p · m by checking that $\gcd(l_2(p), \omega(m)) \le 2$ and $\gcd(l_2(m), \omega(p)) \le 2$. The order and rank of n can be computed as $l_2(n) = \operatorname{lcm}(l_2(p), l_2(m))$ and $\omega(n) = \operatorname{lcm}(\omega(p), \omega(m))$. This method is much faster than repeatedly using Algorithm 6, which relies on factorization of ϕ(n) and ψ(n) and computation of a modular power of two and a modular Lucas sequence.
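Lemma 4.1 and the two lcm identities can be checked numerically for small integers. The sketch below uses brute-force order and rank computations; the function names are hypothetical and the bound 40 is an arbitrary choice.

```python
from math import gcd


def order2(n):
    k, x = 1, 2 % n
    while x != 1:
        x, k = 2 * x % n, k + 1
    return k


def fib_rank(n):
    a, b, t = 0, 1, 0
    while True:
        a, b, t = b, (a + b) % n, t + 1
        if a == 0:
            return t


def lcm(a, b):
    return a // gcd(a, b) * b


def lemma_holds_up_to(limit):
    """Check Lemma 4.1 for all coprime admissible pairs p, m <= limit."""
    info = {n: (order2(n), fib_rank(n)) for n in range(3, limit + 1, 2)}
    admissible = [n for n, (l, w) in info.items() if gcd(l, w) <= 2]
    for p in admissible:
        for m in admissible:
            if gcd(p, m) != 1:
                continue
            (lp, wp), (lm, wm) = info[p], info[m]
            ln, wn = order2(p * m), fib_rank(p * m)
            if (ln, wn) != (lcm(lp, lm), lcm(wp, wm)):
                return False           # lcm identities violated
            cross_ok = gcd(lp, wm) <= 2 and gcd(lm, wp) <= 2
            if (gcd(ln, wn) <= 2) != cross_ok:
                return False           # lemma violated
    return True


print(lemma_holds_up_to(40))
```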

In step 3, to see if n is a challenge pseudoprime, we check whether $l_2(n) \mid n - 1$ and $\omega(n) \mid n + 1$, which is faster than checking $2^{n-1} \equiv 1 \pmod{n}$ and $U_{n+1} \equiv 0 \pmod{n}$.

4.1.2 Full version of the algorithm

1. Create a set S of triples $(p, l_2(p), \omega(p))$ for each odd prime p up to the given bound B. Test admissibility of the primes by checking $\gcd(l_2(p), \omega(p)) \le 2$ and remove all triples belonging to inadmissible primes. Copy S to set M of integers to be tested.

2. Compute the set M' of triples $(m', l_2(m'), \omega(m'))$ for new admissible products m' = p · m in the following way. For each p ∈ S and m ∈ M such that p ∤ m, $\gcd(l_2(p), \omega(m)) \le 2$ and $\gcd(l_2(m), \omega(p)) \le 2$, compute the new product m' = p · m, its order $l_2(m') = \operatorname{lcm}(l_2(p), l_2(m))$ and its rank $\omega(m') = \operatorname{lcm}(\omega(p), \omega(m))$.

3. Set M = M' and test each m ∈ M with the Fermat and Lucas tests by checking whether $l_2(m) \mid m - 1$ and $\omega(m) \mid m + 1$.

4. Repeat steps 2 and 3 until there are no new admissible integers in M'.
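The four steps can be sketched as follows for a very small bound. This is an illustrative sketch and not the implementation that produced the results reported in this section. The function names are hypothetical, the order and rank of the primes are found by brute force, and a dictionary keyed by the product is used to avoid storing the same product twice.

```python
from math import gcd


def order2(n):
    k, x = 1, 2 % n
    while x != 1:
        x, k = 2 * x % n, k + 1
    return k


def fib_rank(n):
    a, b, t = 0, 1, 0
    while True:
        a, b, t = b, (a + b) % n, t + 1
        if a == 0:
            return t


def lcm(a, b):
    return a // gcd(a, b) * b


def search(B):
    """Return (all admissible square-free products, products passing step 3)."""
    primes = [p for p in range(3, B + 1, 2)
              if all(p % d for d in range(3, int(p ** 0.5) + 1, 2))]
    S = [(p, order2(p), fib_rank(p)) for p in primes]
    S = [(p, l, w) for p, l, w in S if gcd(l, w) <= 2]       # step 1
    M = {p: (l, w) for p, l, w in S}
    products, hits = {}, []
    while M:
        new = {}
        for p, lp, wp in S:                                  # step 2
            for m, (lm, wm) in M.items():
                if m % p == 0 or p * m in new:
                    continue
                if gcd(lp, wm) <= 2 and gcd(lm, wp) <= 2:
                    new[p * m] = (lcm(lp, lm), lcm(wp, wm))
        for n, (l, w) in new.items():                        # step 3
            if (n - 1) % l == 0 and (n + 1) % w == 0:
                hits.append(n)
        products.update(new)
        M = new                                              # step 4
    return products, hits


products, hits = search(20)
print(len(products), hits)
```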

We implemented the above algorithm in Python and ran it for B = 1400. The computation required 56 GB of RAM and ran for about 30 hours. For higher values of B the computation exceeded our limit of 64 GB of RAM. The set S contained 114 admissible primes and the algorithm ended after 25 iterations. Overall, the algorithm constructed and checked 770854835 admissible products, none of which is a challenge pseudoprime. The highest product had 218 bits.

4.2 Pseudoprimes of the form $2^{n-1} - 1$

We started the search for pseudoprimes in the form $2^{n-1} - 1$ simply by applying the PSW-challenge test to all integers $2^i - 1$ for i up to $2^{15}$. We implemented the test in C++ using the big integer class of the NTL library [10]. To compute the Fibonacci sequence we implemented a slightly modified version of the double-and-add algorithm. Since $n + 1 = 2^i - 1 + 1 = 2^i$, the binary representation has one leading 1 and the rest are 0s, so we do one add step at the beginning and then do only double steps. The only integers n which passed the test were those with i = 2, 3, 7, 19, 31, 107, 127, 607, 1279, 2203, 4423. All these n are Mersenne primes [21]. Testing the integers one by one quickly became too slow, so instead we tried constructing them as products of suitable primes.
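The doubling-only computation can be sketched in Python (the thesis implementation is in C++ with NTL; the function name `passes` and the range of i below are illustrative choices). Starting from the pair $(F_1, F_2)$, which corresponds to the single add step, i doubling steps reach the index $2^i = n + 1$.

```python
def passes(i):
    """PSW-challenge test for n = 2^i - 1 using only doubling steps."""
    n = 2 ** i - 1
    if n % 5 not in (2, 3):
        return False
    if pow(2, n - 1, n) != 1:
        return False
    a, b = 1, 1                        # (F_1, F_2): the single add step
    for _ in range(i):                 # (F_m, F_{m+1}) -> (F_{2m}, F_{2m+1})
        a, b = a * (2 * b - a) % n, (a * a + b * b) % n
    return a == 0                      # a = F_{2^i} mod n


exponents = [i for i in range(2, 130) if passes(i)]
print(exponents)
```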

4.2.1 Primes with rank equal to a power of two

If $n = 2^i - 1$ is a challenge pseudoprime and p is a prime such that p | n, then $\omega(p) \mid \omega(n) \mid 2^i$. We searched, in several ways, for primes whose rank is a power of two, and overall we obtained 28 such primes. Some of the primes were too large, therefore we used only primes $p \le 2^{4423}$, which left us with 19 primes. We tested all $2^{19}$ square-free products of these primes and none of them were a challenge pseudoprime, nor were they in the form $2^i - 1$.

To search for these primes, we first checked all primes up to $2^{30}$, which resulted in 13 primes, of which 9 were admissible. Our next approach was to factor $U_{2^i}$. Lucas [11] shows that $U_{2n} = U_n V_n$, so

$$U_{2^i} = \prod_{m=0}^{i-1} V_{2^m}.$$

Also, $V_{2n} = V_n^2 - 2Q^n$, which gives us a way to compute $V_{2^m}$ as $V_{2^{m-1}}^2 - 2$. To factor $U_{2^i}$, we have to factor each $V_{2^m}$, which unfortunately still grows too fast. We computed and factored $V_{2^m}$ for m ≤ 10. The last value $V_{2^{10}}$ has 356 bits and to factor it we used an online tool [22]. This method gave us another 3 admissible primes.
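The product formula and the squaring recurrence can be checked for the Fibonacci case P = 1, Q = −1. In this sketch (function names are hypothetical) the first step uses $V_2 = V_1^2 - 2Q = 3$, and the simplified recurrence $V_{2^m} = V_{2^{m-1}}^2 - 2$ is applied from m = 2 onwards, where the exponent of Q is even.

```python
from math import prod


def lucas_v_powers(count):
    """Return [V_(2^0), ..., V_(2^(count-1))] for P = 1, Q = -1."""
    values = [1]                       # V_1 = P = 1
    if count > 1:
        values.append(3)               # V_2 = V_1^2 - 2Q = 3
    while len(values) < count:
        values.append(values[-1] ** 2 - 2)
    return values[:count]


def fib(t):
    a, b = 0, 1
    for _ in range(t):
        a, b = b, a + b
    return a


v = lucas_v_powers(10)
print(v[:4])
print(prod(v) == fib(2 ** 10))         # U_(2^10) as a product of V_(2^m)
```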

4.2.2 Mersenne primes

Our third approach was to examine Mersenne primes, since these primes $p = 2^j - 1$ with e(p) = −1 naturally have rank $\omega(p) \mid 2^j$. Mersenne primes are generally too large for Algorithm 6 or the computation of $\gcd(l_2(p), \omega(p))$, but the following lemma shows that e(p) = −1 is a sufficient condition for the admissibility of Mersenne primes.

Lemma 4.2. Let $p = 2^j - 1$ be a Mersenne prime, then p is admissible if e(p) = −1.

Proof. Let $p = 2^j - 1$ be a Mersenne prime, then j is the least positive integer k such that $2^k \equiv 1 \pmod{p}$, and so by definition $l_2(p) = j$. It is known that if $2^j - 1$ is a prime, then j is also a prime. If e(p) = −1, then $\omega(p) \mid p + 1 = 2^j$, so ω(p) is a power of two, and so $\gcd(l_2(p), \omega(p)) \le 2$, i.e. p is admissible.

We require that $\omega(p) \mid \omega(n) \mid 2^i$, i.e. the rank of p has to be a power of two, which is not possible if e(p) = 1. So, overall, we are interested precisely in Mersenne primes with e(p) = −1. Out of the 51 known Mersenne primes [21], 20 have e(p) = −1 (4 of which we have obtained by previous methods).

As a side note, if n = pq is a challenge pseudoprime, then p, q are not Mersenne primes. We can show this by contradiction. Assume $p = 2^j - 1$ is a Mersenne prime. Then $j = l_2(p) \mid q - 1$, so $2^{l_2(p)} = 2^j \equiv 1 \pmod{q}$, i.e. $q \mid 2^j - 1 = p$, and therefore q = p. Then $e(n) = e(p^2) = 1 \ne -1$, which contradicts n being a challenge pseudoprime.

4.3 Two prime factors

Let n = pq, where p, q are distinct primes, be a challenge pseudoprime, then e(n) = e(p)e(q) = −1, so either e(p) = 1 or e(q) = 1 (without loss of generality we assume that e(p) = 1). Admissible primes p with e(p) = 1 are quite rare. The reason is that on the one hand both the order $l_2(p)$ and the rank ω(p) divide p − 1, but on the other hand, since p is admissible, the greatest common divisor of the order and the rank is at most two (i.e. $\gcd(l_2(p), \omega(p)) \le 2$). At the end of their paper [18] Shallue and Webster noted that they found 7 such primes when generating primes up to $2^{40}$; however, they do not say whether they used these primes in any way to speed up their algorithm or not. In this and the next section we present several ways these primes can be used. To find a pseudoprime with two prime factors, we first searched for the admissible primes p with e(p) = 1 and then tried to find suitable primes q.


4.3.1 Admissible primes p with e(p) = 1

We implemented an algorithm which tested the admissibility of primes with e(p) = 1 up to $2^{45}$ and we found 3 other primes on top of the 7 already known [18]. We implemented a slightly adjusted version of Algorithm 6 in C++. To factor ϕ(p) = ψ(p) = p − 1 we used trial division and to generate the primes we used an open source library primesieve [23]. We ran the code in MetaCentrum [24] on 1024 machines in parallel and the overall computation took about three CPU years. The following table contains the three newly found admissible primes p with e(p) = 1 and their orders and ranks.

| p | $\log_2(p)$ | $l_2(p)$ | $\omega(p)$ |
|---|---|---|---|
| 2847994170041 | 41.37 | 806587 | 1765460 |
| 10052678938039 | 43.19 | 69 | 145690999102 |
| 20474359556069 | 44.21 | 145084 | 141120727 |

4.3.2 Computing the other prime

To find q such that n = pq is a PSW-challenge pseudoprime we could use one of the two methods described in Section 3.1.2. However, as we already discussed, the second method does not give us all the possibilities and the first method calculates with $U_{p-1}$, which in this case is too large. We present another way of computing the second prime.

Instead of computing q as a factor of $\gcd(2^{p-1} - 1, U_{p-1})$, we will compute q as a factor of $\gcd(2^{l_2(q)} - 1, U_{\omega(q)})$. The problem of course is that since we are looking for q, we do not know $l_2(q)$ and ω(q); however, we know that $l_2(q) \mid p - 1$, $\omega(q) \mid p - 1$ and $\gcd(l_2(q), \omega(q)) \le 2$. So, we factor p − 1 and distribute the factors among the order and the rank of q, such that q is admissible, which gives us several candidates for $l_2(q)$ and ω(q). Let a be the number of factors of p − 1, then there are $2^a$ ways to distribute the factors of p − 1 such that the order and the rank of q are coprime. The order and rank can also have gcd = 2, and so overall there are at most $2^{a+2}$ possibilities. Among the ten primes, 4562284561 and 10052678938039 have the largest number of distinct prime factors of p − 1: seven. So, in the worst case we have to try only 512 different combinations of the order and the rank.

Of course, among these possibilities there is a combination where the order or the rank is almost as large as p − 1, which means $2^{l_2(q)} - 1$ or $U_{\omega(q)}$ can still be too large. However, since both the order and the rank divide p − 1, one of them is at most $2\sqrt{p - 1}$. So, if the order is lower than the rank, then we compute $\gcd(2^{l_2(q)} - 1,\ U_{\omega(q)} \bmod (2^{l_2(q)} - 1))$, and if the rank is lower than the order, then we compute $\gcd(2^{l_2(q)} - 1 \bmod U_{\omega(q)},\ U_{\omega(q)})$. In either case, both arguments of the gcd are at most $2^{2\sqrt{p-1}}$, which is considerably lower than $U_{p-1}$.

We have implemented this method in Python and ran it for all the 10 primes. We found no q for which n = pq is a challenge pseudoprime. Since we tested all primes p with e(p) = 1 up to $2^{45}$, if there is a challenge pseudoprime n = pq, then the prime p with e(p) = 1 must be larger than $2^{45}$.
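The enumeration of candidate pairs and the reduced gcd can be sketched as follows. This is one possible reading of the text and not the thesis code: it assumes that each prime power of p − 1 is assigned wholly to the order candidate or to the rank candidate, and that the only shared factor allowed is a single 2. All function names are hypothetical.

```python
from math import gcd
from itertools import product


def factor(n):
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n, e = n // p, e + 1
            out.append((p, e))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out


def fib(t, mod=None):
    """F_t by fast doubling, optionally reduced modulo mod."""
    a, b = 0, 1
    for bit in bin(t)[2:]:
        c, d = a * (2 * b - a), a * a + b * b
        a, b = (d, c + d) if bit == "1" else (c, d)
        if mod is not None:
            a, b = a % mod, b % mod
    return a


def order_rank_candidates(p):
    """Pairs (L, W) with L | p-1, W | p-1 and gcd(L, W) <= 2."""
    factors = factor(p - 1)
    odd = [q ** e for q, e in factors if q != 2]
    two = [2 ** e for q, e in factors if q == 2]
    options = [(1, 1)]
    if two:
        t = two[0]
        options = sorted({(t, 1), (1, t), (t, 2), (2, t)})
    pairs = []
    for choice in product((0, 1), repeat=len(odd)):
        L = W = 1
        for side, q in zip(choice, odd):
            L, W = (L * q, W) if side == 0 else (L, W * q)
        pairs.extend((L * x, W * y) for x, y in options)
    return pairs


def candidate_gcd(L, W):
    """gcd(2^L - 1, F_W), reducing the larger argument first."""
    if L <= W:
        m = 2 ** L - 1
        return gcd(m, fib(W, m))
    f = fib(W)
    return gcd((pow(2, L, f) - 1) % f, f)


pairs = order_rank_candidates(61681)
print(len(pairs))
```

Each prime factor of `candidate_gcd(L, W)` is then a candidate for q and still has to be tested.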

4.4 Even number of prime factors

Let n have an even number of prime factors (counted with multiplicity); then if n is a PSW-challenge pseudoprime, at least one of the prime divisors p has e(p) = 1. As in the case of two prime factors, we can use the primes with e(p) = 1 to search for the challenge pseudoprimes. In this section we describe the method we used to check that there are no challenge pseudoprimes up to $2^{70}$ with an even number of prime factors. Since there are no Wall-Sun-Sun primes up to $2^{54}$ [5], any challenge pseudoprime lower than $2^{70}$ has to be square-free, and so we can write it as n = pk, where p is a prime with e(p) = 1 and p ∤ k. We will treat two different cases: either $p < 2^{45}$ and it is one of the ten known primes, or $p \ge 2^{45}$ and it is unknown to us. We will discuss the latter case first.

4.4.1 Case of $p \ge 2^{45}$

Assume $p \ge 2^{45}$, then $k \le 2^{25}$. We have generated a list of all 1142315 admissible integers $k \le 2^{25}$ with e(k) = −1. We removed primes from this list, because the case of two prime factors has been checked by Shallue and Webster [18] up to $2^{80}$, which reduced the list considerably to only 110191 composite integers. And finally, we applied the first method from Section 3.1.2 to each integer k on this list, i.e. we searched for p as $p \mid \gcd(2^{k-1} - 1, U_{k+1})$. We have found no primes p such that n = pk would be a challenge pseudoprime.

4.4.2 Case of $p < 2^{45}$

For the second case, assume that p is one of the ten known primes with e(p) = 1. We try to find k in a similar way to the second method of computing the last prime in Section 3.1.2. If n = pk is a challenge pseudoprime, then $pk \equiv 1 \pmod{l_2(n)}$ and $pk \equiv -1 \pmod{\omega(n)}$, so

$$k \equiv p^{-1} \equiv 1 \pmod{l_2(p)} \quad \text{and} \quad k \equiv -p^{-1} \equiv -1 \pmod{\omega(p)}.$$

This means that by the Chinese Remainder Theorem we can compute $a = k \bmod \operatorname{lcm}(l_2(p), \omega(p))$ and search for k as $a + b \cdot \operatorname{lcm}(l_2(p), \omega(p))$ for integer b from 0 up to some bound. To check pseudoprimes n = pk up to the bound $2^{70}$, we need to try $2^{70 - \log_2(p \cdot \operatorname{lcm}(l_2(p), \omega(p)))}$ potential values of k. The larger p is, the fewer values of k we need to try. The following table shows the approximate values of $\log_2(p \cdot \operatorname{lcm}(l_2(p), \omega(p)))$ for the ten known primes p.

| p | $l_2(p)$ | $\omega(p)$ | $\log_2(p \cdot \operatorname{lcm}(l_2(p), \omega(p)))$ |
|---|---|---|---|
| 61681 | 40 | 1542 | 30.82 |
| 363101449 | 171436 | 1059 | 55.87 |
| 4278255361 | 80 | 6684774 | 59.98 |
| 4562284561 | 120 | 147934 | 55.16 |
| 4582537681 | 160453 | 1428 | 59.86 |
| 26509131221 | 748 | 14176006 | 66.92 |
| 422013019339 | 290442546 | 2906 | 77.23 |
| 2847994170041 | 806587 | 1765460 | 81.74 |
| 10052678938039 | 69 | 145690999102 | 86.38 |
| 20474359556069 | 145084 | 141120727 | 88.43 |

As is evident from the table, the bottleneck of this method is the prime p = 61681, for which we had to test about $2^{39}$ values of k. We have implemented this method and ran it for all the ten primes, searching for n = pk up to $2^{70}$, and we have found no k such that n = pk would be a challenge pseudoprime. So overall, we have confirmed that there are no PSW-challenge pseudoprimes with an even number of prime divisors up to $2^{70}$.
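The size of the search for each prime follows directly from the order and rank values in the table. The sketch below recomputes the last column and the resulting exponent $70 - \log_2(p \cdot \operatorname{lcm}(l_2(p), \omega(p)))$; the names `KNOWN` and `search_size_bits` are hypothetical.

```python
from math import gcd, log2

# (p, l_2(p), omega(p)) as listed in the table above
KNOWN = [
    (61681, 40, 1542),
    (363101449, 171436, 1059),
    (4278255361, 80, 6684774),
    (4562284561, 120, 147934),
    (4582537681, 160453, 1428),
    (26509131221, 748, 14176006),
    (422013019339, 290442546, 2906),
    (2847994170041, 806587, 1765460),
    (10052678938039, 69, 145690999102),
    (20474359556069, 145084, 141120727),
]


def search_size_bits(p, order, rank, bound_bits=70):
    """Return (log2(p * lcm), log2 of the number of k to try)."""
    lcm = order // gcd(order, rank) * rank
    size = log2(p * lcm)
    return size, bound_bits - size


for p, order, rank in KNOWN:
    size, work = search_size_bits(p, order, rank)
    # negative work means only the smallest candidate k needs checking
    print(p, round(size, 2), round(max(work, 0), 2))
```

For p = 61681 the exponent is a little above 39, which matches the bottleneck discussed above.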

4.5 Lattice based solutions

Chen and Greene [19] attempted to find a PSW-challenge pseudoprime by constructing the following system of primes. Let M, N be integers such that gcd(M, N) ≤ 2 and let P be a set of primes p such that

$$p \nmid MN, \quad l_2(p) \mid M \quad \text{and} \quad \omega(p) \mid N.$$

Then for any subset A ⊆ P, the product $n = \prod_{p \in A} p$ is a challenge pseudoprime if

$$n \equiv 1 \pmod{M} \quad \text{and} \quad n \equiv -1 \pmod{N}. \tag{4.1}$$

Assuming congruence classes of n modulo M, N are approximately uniformly distributed, the expected number of challenge pseudoprimes n constructed as a product of a subset of primes in P will be about

$$\frac{2^{|P|}}{\varphi(MN)}.$$

Chen and Greene were constructing systems which contained as few primes as possible and at the same time had at least one expected pseudoprime. The smallest system they published [20] contains 1248 primes and $\frac{2^{|P|}}{\varphi(MN)} \approx 740$, meaning that out of the $2^{1248}$ possible products we can expect about 740 to be challenge pseudoprimes. They do not say whether they attempted to find the pseudoprimes in the system.

4.5.1 Searching for the pseudoprimes

Since Chen and Greene did not say whether they tested any of the products, we started by trying all products made of exactly three primes. None of these products were a PSW-challenge pseudoprime. All the primes in the system have e = −1, and so any challenge pseudoprime in the system has to be a product of an odd number of primes. So overall, none of the products of two, three or four primes from the system are challenge pseudoprimes. Clearly, testing all the possible products one by one is computationally infeasible, and so we attempted to find a solution to the congruences 4.1.

We transformed the problem of finding the pseudoprimes among the $2^{1248}$ possibilities to a multidimensional modular Knapsack problem, which we then attempted to solve by finding short vectors in a lattice using the LLL and BKZ algorithms.

4.5.2 Transformation of the problem

Let $m_i, n_i$ be prime powers such that $M = \prod m_i$ and $N = \prod n_i$, then by the Chinese Remainder Theorem equations 4.1 are equivalent to the set of equations

$$n \equiv 1 \pmod{m_i} \quad \text{and} \quad n \equiv z_i \pmod{n_i} \tag{4.2}$$

for some $z_i$. If $m_i, n_i$ are odd or equal to 2, the modular multiplicative groups $\mathbb{Z}^*_{m_i}, \mathbb{Z}^*_{n_i}$ are cyclic, meaning they contain some generators $a_i, b_i$, and so we may apply discrete logarithms to the equations 4.2 and obtain

$$\log_{a_i} n \equiv 0 \pmod{\operatorname{ord}(a_i)} \quad \text{and} \quad \log_{b_i} n \equiv \log_{b_i} z_i \pmod{\operatorname{ord}(b_i)}, \tag{4.3}$$

where $\operatorname{ord}(a_i), \operatorname{ord}(b_i)$ are orders of the generators in their respective groups. Note that N is divisible by $2^{13}$ and $\mathbb{Z}^*_{2^{13}}$ is not cyclic; however, it can be generated by two generators (e.g. 8191 and 5), which gives us two logarithmic equations in 4.3 instead of one. To find the pseudoprimes, we compute $\log_{a_i} p$ and $\log_{b_i} p$ for each p ∈ P and try to compute A ⊆ P such that it has at least two primes and

$$\sum_{p \in A} \log_{a_i} p \equiv 0 \pmod{\operatorname{ord}(a_i)} \quad \text{and} \quad \sum_{p \in A} \log_{b_i} p \equiv \log_{b_i} z_i \pmod{\operatorname{ord}(b_i)}.$$

This problem is similar to the Knapsack problem, which has many variants. This variant is multidimensional and in addition each dimension is computed in a different modular group (corresponding to the different generators). In general, these kinds of problems are NP-complete, so we suspect this variant is also NP-complete, although we do not have a proof.

Assuming the problem is NP-complete (and NP ≠ P), there is no polynomial algorithm which would give us a definitive answer; however, there are still some methods which may produce some solutions. We used lattice reduction algorithms, which have been previously shown to solve similar problems (for example in [25]). We converted our set of equations to a lattice and then used the LLL and BKZ algorithms to find the shortest vectors, which in some situations could correspond to the pseudoprimes.

4.5.3 Lattice for Fermat pseudoprimes

To describe the method, we will start with only the equations necessary for n being a Fermat pseudoprime. For each generator $a_i$ and prime $p_j$ we have the values of $\log_{a_i} p_j$. We may write these values in a matrix

$$X = (\log_{a_i} p_j)_{j,i},$$

where the rows correspond to the primes and the columns correspond to the generators. The rows are vectors over $\mathbb{Z}$ and their linear combinations over $\mathbb{Z}$ form a lattice. We are looking for a linear combination of the rows such that each row is included at most once and the result is a multiple of the vector $(\operatorname{ord}(a_i))_i$. To find the combination, we extend the matrix to

$$
\begin{pmatrix}
X & \begin{matrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{matrix} \\
\begin{matrix} \operatorname{ord}(a_1) & & 0 \\ & \ddots & \\ 0 & & \operatorname{ord}(a_k) \end{matrix} & 0
\end{pmatrix}
$$

This matrix forms the basis of the new lattice. The bottom left part was added because we want to compute each column modulo the order of the group. The upper right part was added so that we know how many times each row was added in the combination. A vector in the lattice with k leading zeros followed by a sequence of zeros or ones corresponds to a pseudoprime (or a prime if it contains only a single one). Such vectors are relatively small, and so to try to find them we may apply the LLL or BKZ algorithm, which returns a reduced basis with small vectors. To make sure the left part is reduced to zeros, we may multiply it by some large constant c (for example c = 1000).

As a proof of concept of this method, we have constructed a system of 77 primes which contains a Fermat pseudoprime. We started with a small system of 9 primes, such that their product is a Fermat pseudoprime, and then we added another 68 primes such that the congruences 4.1 still hold. In this system, the method with the BKZ algorithm finds the pseudoprime within a few seconds, which is much faster than trying all the $2^{77}$ possible integers. However, in general, the method is highly unreliable and unlikely to discover the pseudoprime. The main problem is that the LLL and BKZ algorithms find only an approximately smallest basis, as the problem to find the actual smallest basis is again NP-complete. So, the larger the input matrix, the more likely it is that the reduced basis vectors will contain not only zeros and ones but also other integers, which makes it less likely the method will produce a pseudoprime.

4.5.4 Example for Fermat pseudoprimes

We present a small example of the method for Fermat pseudoprimes. Let P = {7, 13, 23, 31, 41} be the system of primes. It contains the Fermat pseudoprime n = 7 · 23 · 41. We set M = 660 and c = 10. Then the matrix of the lattice basis as described above looks as follows.

$$
\begin{pmatrix}
10 & 0 & 10 & 70 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 30 & 10 & 0 & 1 & 0 & 0 & 0 \\
10 & 10 & 30 & 0 & 0 & 0 & 1 & 0 & 0 \\
10 & 0 & 0 & 60 & 0 & 0 & 0 & 1 & 0 \\
0 & 10 & 0 & 30 & 0 & 0 & 0 & 0 & 1 \\
20 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 20 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 40 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 100 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
$$


To find the pseudoprime, we apply the BKZ algorithm (implemented in SageMath) and get the following result.

\[
\begin{pmatrix}
0 & 0 & 0 & 0 & -1 & 0 & -1 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & 1 & -1 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & -1 & -2 & -1 & 0 \\
0 & 0 & 0 & 0 & 1 & 2 & -1 & -2 & 1 \\
0 & 0 & 0 & 0 & -1 & 0 & -1 & -2 & 3 \\
10 & 0 & 0 & 0 & -1 & 0 & -1 & -1 & 1 \\
0 & 0 & -10 & -10 & -1 & 0 & 0 & 1 & 0 \\
0 & 10 & -10 & 0 & -1 & 0 & 0 & -1 & 1 \\
0 & 10 & 10 & 0 & 1 & 0 & 0 & 1 & -1
\end{pmatrix}
\]

As we can see, the top left part is completely zeroed out. This means that the top right part tells us the possible combinations of the primes. We are looking for a square-free product of the primes, i.e., a row with only zeros and ones in the right part. The first row contains only zeros and negative ones, and so we can multiply it by −1 and get the solution n = 7 · 23 · 41.
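Reading the solution off the reduced basis can be sketched as follows. The helper name `decode_rows` is hypothetical, and the rows are copied from the reduced matrix printed above. A row qualifies if its first k = 4 entries vanish and the remaining entries are, up to a global sign, only zeros and ones with at least two ones.

```python
from math import prod


def decode_rows(reduced, k, primes):
    """Return the square-free products encoded by rows of a reduced basis."""
    found = []
    for row in reduced:
        head, tail = row[:k], row[k:]
        if any(head):
            continue  # the left part must be zero
        if all(x in (0, -1) for x in tail):
            tail = [-x for x in tail]  # a row and its negative encode the same product
        if all(x in (0, 1) for x in tail) and sum(tail) >= 2:
            found.append(prod(p for p, x in zip(primes, tail) if x))
    return found


primes = [7, 13, 23, 31, 41]
reduced = [
    [0, 0, 0, 0, -1, 0, -1, 0, -1],
    [0, 0, 0, 0, 0, 1, -1, 1, 1],
    [0, 0, 0, 0, 1, -1, -2, -1, 0],
    [0, 0, 0, 0, 1, 2, -1, -2, 1],
    [0, 0, 0, 0, -1, 0, -1, -2, 3],
    [10, 0, 0, 0, -1, 0, -1, -1, 1],
    [0, 0, -10, -10, -1, 0, 0, 1, 0],
    [0, 10, -10, 0, -1, 0, 0, -1, 1],
    [0, 10, 10, 0, 1, 0, 0, 1, -1],
]

candidates = decode_rows(reduced, k=4, primes=primes)
print(candidates)  # [6601]

# Independent check of the Fermat condition to base 2 for each candidate.
print([pow(2, n - 1, n) == 1 for n in candidates])  # [True]
```

The final check is independent of the lattice: it confirms directly that 6601 = 7 · 23 · 41 satisfies the Fermat congruence to base 2.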

4.5.5 Lattice for both Fermat and Lucas pseudoprimes

To use this method to look also for Lucas pseudoprimes, we need to extend the matrix to

\[
\left(\begin{array}{c|c|c|c}
cX & cY & \begin{matrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{matrix} & 0 \\ \hline
\begin{matrix} c \cdot \operatorname{ord}(a_1) & & 0 \\ & \ddots & \\ 0 & & c \cdot \operatorname{ord}(a_k) \end{matrix} & 0 & 0 & 0 \\ \hline
0 & \begin{matrix} c \cdot \operatorname{ord}(b_1) & & 0 \\ & \ddots & \\ 0 & & c \cdot \operatorname{ord}(b_l) \end{matrix} & 0 & 0 \\ \hline
0 & \begin{matrix} \log_{b_1} z_1 & \dots & \log_{b_l} z_l \end{matrix} & 0 & c^2
\end{array}\right)
\]

where $Y = (\log_{b_i} p_j)_{j,i}$. First we added the columns for the matrix Y (multiplied by c), which correspond to the equations necessary for n to be a Lucas pseudoprime. These columns also contain the orders $c \cdot \operatorname{ord}(b_i)$ of the generators, as in the case of Fermat pseudoprimes alone. We also had to add a last row consisting of k zeros followed by $\log_{b_i} z_i$. To make sure this last row is not used by the LLL or BKZ algorithms more than once, we also added a large constant $c^2$ at the bottom right. After the reduction, a row corresponds to a challenge pseudoprime if it contains k + l leading zeros followed by a sequence of zeros and ones and ends with $c^2$.

We have implemented this method using the LLL and BKZ functions of SageMath (in Python) and applied it to the system of the 1248 primes. Unfortunately, even though the algorithms are polynomial, the matrix is still too large and the computation never finished. For this method to be effective, the system needs to contain only about 100 primes, and at the same time the expected number of pseudoprimes $\frac{2^{|P|}}{\varphi(MN)}$ needs to be sufficiently large. The search for such a system may be a possible direction for further research into PSW-pseudoprimes.
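The block structure of the extended matrix can be sketched in the same way. The function names `challenge_lattice_basis` and `is_challenge_row` are hypothetical, and the tiny input values below are placeholders without number-theoretic meaning, chosen only to show the layout. The last row is filled in exactly as described in the text, with unscaled logarithms followed by $c^2$; this is an assumption of the sketch rather than a verified detail of the implementation used in the thesis.

```python
def challenge_lattice_basis(X, Y, ord_a, ord_b, z_logs, c):
    """Assemble the rows [cX | cY | I | 0], the two scaled diagonal order
    blocks, and the final row [0 | z_logs | 0 | c^2]."""
    n, k, l = len(X), len(ord_a), len(ord_b)
    width = k + l + n + 1
    basis = []
    for j in range(n):
        row = [c * x for x in X[j]] + [c * y for y in Y[j]]
        row += [1 if t == j else 0 for t in range(n)] + [0]
        basis.append(row)
    for i, order in enumerate(ord_a):
        row = [0] * width
        row[i] = c * order
        basis.append(row)
    for i, order in enumerate(ord_b):
        row = [0] * width
        row[k + i] = c * order
        basis.append(row)
    # Last row as described in the text: k zeros, the logarithms, zeros, c^2.
    basis.append([0] * k + list(z_logs) + [0] * n + [c * c])
    return basis


def is_challenge_row(row, k, l, c):
    """k + l leading zeros, then only zeros and ones, then c^2 at the end."""
    head, tail, last = row[:k + l], row[k + l:-1], row[-1]
    return not any(head) and all(x in (0, 1) for x in tail) and last == c * c


# Placeholder input: two "primes", one generator on each side.
toy = challenge_lattice_basis(
    X=[[1], [0]], Y=[[2], [1]], ord_a=[2], ord_b=[3], z_logs=[1], c=10
)
for row in toy:
    print(row)
```

Depending on the signs chosen by the reduction, a qualifying row may appear negated, so in practice one would test both a row and its negative with `is_challenge_row`.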


Conclusion

In this thesis we examined the PSW-challenge pseudoprimes. We summarized the known relevant theory of the Fermat and the Lucas pseudoprimes and completed some missing lemmas and proofs. We described some algorithms which are often omitted in the literature, namely the algorithm to compute the rank of appearance. We also examined admissibility, the main necessary condition that all PSW-pseudoprimes must satisfy. We gave a new algorithm to test the admissibility of integers, which for individual integers is faster than computing the order, the rank and their greatest common divisor.

In the last chapter we searched for the pseudoprimes in several special cases. Firstly, we examined pseudoprimes with small prime divisors. We tested integers with prime divisors lower than 1400 and found that none of them are PSW-challenge pseudoprimes. Next, we looked for pseudoprimes of the form $2^{n-1} - 1$. Primes dividing such pseudoprimes must have rank equal to a power of two. We found 28 such admissible primes, constructed the square-free products of the first 19 of them and verified that none of these products is a PSW-challenge pseudoprime.

We also examined pseudoprimes with an even number of prime divisors. These pseudoprimes have to be divisible by an admissible prime p with e(p) = 1. Previously, seven such primes were known [18]. We tested all primes up to $2^{45}$ and found three other suitable primes. Then, we implemented algorithms to search for integers k such that n = pk is a pseudoprime. We found no such integers k, and so we have shown that up to $2^{70}$ there are no PSW-pseudoprimes with an even number of prime divisors.

Finally, we examined the system of primes given by Chen and Greene [19]. We transformed the problem of finding the pseudoprimes in the system into a multidimensional modular knapsack problem. We attempted to solve this problem by implementing a known heuristic method, which further transforms the problem into finding short vectors in a lattice.
Unfortunately, the system of primes was too large for this method. However, Chen and Greene conducted their research in 2000, so with greater computational resources a smaller system might be found. Other methods for finding pseudoprimes as solutions to this knapsack problem may be a possible avenue for further research.

Another potential direction for further research is examining pseudoprimes for a different base a and parameters P, Q. We examined the weaker version of the Baillie-PSW test with parameters P = 1, Q = −1, which allowed us to use the condition of admissibility. However, admissibility can be defined for any parameters a, P, Q, and so the methods described in this text can be applied to pseudoprimes with other parameters.

Bibliography

1. POMERANCE, Carl; SELFRIDGE, John L; WAGSTAFF, Samuel S. The pseudoprimes to $25 \cdot 10^9$. Mathematics of Computation. 1980, vol. 35, no. 151, pp. 1003–1026.
2. ALFORD, William R; GRANVILLE, Andrew; POMERANCE, Carl. There are infinitely many Carmichael numbers. Annals of Mathematics. 1994, pp. 703–722.
3. LEHMER, Derrick H. Strong Carmichael numbers. Journal of the Australian Mathematical Society. 1976, vol. 21, no. 4, pp. 508–510.
4. MENEZES, Alfred J; VAN OORSCHOT, Paul C; VANSTONE, Scott A. Handbook of applied cryptography. CRC press, 2018.
5. PrimeGrid. Available also from: https://www.primegrid.com/forum_thread.php?id=9436.
6. BAILLIE, Robert; WAGSTAFF, Samuel S. Lucas pseudoprimes. Mathematics of Computation. 1980, vol. 35, no. 152, pp. 1391–1417.
7. JOYE, Marc; QUISQUATER, J-J. Efficient computation of full Lucas sequences. Electronics Letters. 1996, vol. 32, no. 6, pp. 537–538.
8. KOVAL, Aleksey et al. On Lucas Sequences Computation. IJCNS. 2010, vol. 3, no. 12, pp. 943–944.
9. RENAULT, Marc. The period, rank, and order of the (a, b)-Fibonacci sequence mod m. Mathematics Magazine. 2013, vol. 86, no. 5, pp. 372–380.
10. Number Theory Library (NTL). Available also from: https://libntl.org/.
11. LUCAS, Edouard. Théorie des fonctions numériques simplement périodiques. American Journal of Mathematics. 1878, pp. 289–321.
12. LEHMER, Derrick Henry. An extended theory of Lucas' functions. Annals of Mathematics. 1930, pp. 419–448.
13. WILLIAMS, Hugh C. On numbers analogous to the Carmichael numbers. Canadian Mathematical Bulletin. 1977, vol. 20, no. 1, pp. 133–143.


14. CARMICHAEL, Robert D. On the numerical factors of the arithmetic forms $\alpha^n \pm \beta^n$. The Annals of Mathematics. 1913, vol. 15, no. 1/4, pp. 49–70.
15. GRANTHAM, Jon. Frobenius pseudoprimes. Mathematics of Computation. 2001, vol. 70, no. 234, pp. 873–891.
16. GILCHRIST, Jeff. Pseudoprime Enumeration with Probabilistic Primality Tests. Available also from: http://gilchrist.ca/jeff/factoring/pseudoprimes.html.
17. POMERANCE, Carl. Are there counter-examples to the Baillie-PSW primality test. Dopo Le Parole aangeboden aan Dr. AK Lenstra. Privately published Amsterdam. 1984.
18. SHALLUE, Andrew; WEBSTER, Jonathan. Fast tabulation of challenge pseudoprimes. The Open Book Series. 2019, vol. 2, no. 1, pp. 411–423.
19. CHEN, Zhuo; GREENE, John. Some comments on Baillie-PSW pseudoprimes. Fibonacci Quarterly. 2003, vol. 41, no. 4, pp. 334–344.
20. CHEN, Zhuo; GREENE, John. Available also from: https://www.d.umn.edu/~jgreene/baillie/Baillie-PSW.html.
21. List of known Mersenne primes. Available also from: https://www.mersenne.org/primes/.
22. Integer factorization calculator. Available also from: https://www.alpertron.com.ar/ECM.HTM.
23. WALISCH, Kim. primesieve. Available also from: https://github.com/kimwalisch/primesieve.
24. Metacentrum. Available also from: https://metavo.metacentrum.cz/.
25. PLANTARD, Thomas; SUSILO, Willy; ZHANG, Zhenfei. Lattice reduction for modular knapsack. In: International Conference on Selected Areas in Cryptography. 2012, pp. 275–286.
