Twenty Years of Attacks on the RSA Cryptosystem

Dan Boneh

dab [email protected]

1 Introduction

The RSA cryptosystem, invented by Ron Rivest, Adi Shamir, and Len Adleman [21], was first
publicized in the August 1977 issue of Scientific American. The cryptosystem is most commonly
used for providing privacy and ensuring authenticity of digital data. These days RSA is deployed
in many commercial systems. It is used by web servers and browsers to secure web traffic, it is
used to ensure privacy and authenticity of email, it is used to secure remote login sessions, and
it is at the heart of electronic credit-card payment systems. In short, RSA is frequently used in
applications where security of digital data is a concern.

Since its initial publication, the RSA system has been analyzed for vulnerability by many
researchers. Although twenty years of research have led to a number of fascinating attacks, none of
them is devastating. They mostly illustrate the dangers of improper use of RSA. Indeed, securely
implementing RSA is a nontrivial task. Our goal is to survey some of these attacks and describe
the underlying mathematical tools they use. Throughout the survey we follow standard naming
conventions and use Alice and Bob to denote two generic parties wishing to communicate with
each other. We use Marvin to denote a malicious attacker wishing to eavesdrop or tamper with the
communication between Alice and Bob.

We begin by describing a simplified version of RSA encryption. Let $N = pq$ be the product of
two large primes of the same size ($n/2$ bits each). A typical size for $N$ is $n = 1024$ bits, i.e., 309
decimal digits. Each of the factors is 512 bits. Let $e, d$ be two integers satisfying
$ed = 1 \bmod \varphi(N)$, where $\varphi(N) = (p-1)(q-1)$ is the order of the multiplicative group
$\mathbb{Z}_N^*$. We call $N$ the RSA modulus, $e$ the encryption exponent, and $d$ the decryption
exponent. The pair $\langle N, e \rangle$ is the public key. As its name suggests, it is public and is
used to encrypt messages. The pair $\langle N, d \rangle$ is called the secret key or private key and is
known only to the recipient of encrypted messages. The secret key enables decryption of ciphertexts.

A message is an integer $M \in \mathbb{Z}_N^*$. To encrypt $M$, one computes $C = M^e \bmod N$. To
decrypt the ciphertext, the legitimate receiver computes $C^d \bmod N$. Indeed,

$$C^d = M^{ed} = M \pmod{N},$$

where the last equality follows by Euler's theorem (see footnote 1). One defines the RSA function as
$x \mapsto x^e \bmod N$. If $d$ is given, the function can be easily inverted using the above equality.
We refer to $d$ as a trapdoor enabling one to invert the function. In this survey we study the
difficulty of inverting the RSA function without the trapdoor. We refer to this as breaking RSA. More
precisely, given the triple $\langle N, e, C \rangle$, we ask how hard it is to compute the $e$-th root
of $C$ modulo $N = pq$ when the factorization of $N$ is unknown. Since $\mathbb{Z}_N^*$ is a finite
set, one may enumerate all elements of $\mathbb{Z}_N^*$ until the correct $M$ is found. Unfortunately,
this results in an algorithm with running time of order $N$, namely exponential in the size of its
input, which is of the order $\log_2 N$. We are interested mostly in algorithms with a substantially
lower running time, namely on the order of $n^c$ where $n = \log_2 N$ and $c$ is some small constant
(less than 5, say). Such algorithms often perform well in practice on the inputs in question.
Throughout the paper we refer to such algorithms as efficient.

Footnote 1: Our description slightly oversimplifies RSA encryption. In practice, messages are padded
prior to encryption using some randomness [1]. For instance, a simple (but insufficient) padding
algorithm may pad a plaintext $M$ by appending a few random bits to one of the ends prior to
encryption. Adding randomness to the encryption process is necessary for proper security.
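
As a concrete illustration of the scheme just described, the following Python sketch runs textbook RSA on toy parameters and also shows the exhaustive search mentioned above. The primes 61 and 53, the exponent 17, and all function names are illustrative assumptions of ours; the modular inverse via `pow(e, -1, phi)` assumes Python 3.8 or later. The sketch omits padding and is not suitable for real use.

```python
# Toy parameters for illustration only; real moduli are 1024 bits or more.
p, q = 61, 53
N = p * q                    # RSA modulus
phi = (p - 1) * (q - 1)      # phi(N), the order of Z_N^*
e = 17                       # encryption exponent, gcd(e, phi) = 1
d = pow(e, -1, phi)          # decryption exponent: e*d = 1 mod phi(N)


def encrypt(M: int) -> int:
    """The RSA function x -> x^e mod N."""
    return pow(M, e, N)


def decrypt(C: int) -> int:
    """Invert the RSA function using the trapdoor d."""
    return pow(C, d, N)


def brute_force_root(C: int) -> int:
    """Invert without the trapdoor by enumeration: time on the order of N."""
    for x in range(N):
        if pow(x, e, N) == C:
            return x
    raise ValueError("no e-th root found")


M = 65
C = encrypt(M)
assert decrypt(C) == M
assert brute_force_root(C) == M
```

The enumeration is instantaneous here only because $N$ has 12 bits; its cost doubles with every additional bit of the modulus.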

In this survey we mainly study the RSA function as opposed to the RSA cryptosystem. Loosely
speaking, the difficulty of inverting the RSA function on random inputs implies that given
$\langle N, e, C \rangle$ an attacker cannot recover the plaintext $M$. However, a cryptosystem must
resist more subtle attacks. If $\langle N, e, C \rangle$ is given, it should be intractable to recover
any information about $M$. This is known as semantic security (see footnote 2). We do not discuss
these subtle attacks, but point out that RSA as described above is not semantically secure: given
$\langle N, e, C \rangle$, one can easily deduce some information about the plaintext $M$ (for
instance, the Jacobi symbol of $M$ over $N$ can be easily deduced from $C$). RSA can be made
semantically secure by adding randomness to the encryption process [1].
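
The Jacobi symbol leak can be checked directly. Because $e$ is odd and the symbol is multiplicative, $(C/N) = (M/N)^e = (M/N)$. The sketch below uses the standard binary algorithm for the Jacobi symbol and verifies the equality for every message under a toy key; the key ($N = 3233$, $e = 17$) and the function name `jacobi` are assumptions made for illustration.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n, computed without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:            # pull out factors of 2 using (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


# Toy public key (illustrative assumption): N = 61 * 53, e = 17.
N, e = 3233, 17
leak_holds = all(jacobi(pow(M, e, N), N) == jacobi(M, N) for M in range(1, N))
assert leak_holds
```

An eavesdropper therefore learns one bit of information about every plaintext from the ciphertext alone, which is exactly what semantic security forbids.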

The RSA function $x \mapsto x^e \bmod N$ is an example of a trapdoor one-way function. It can be
easily computed, but (as far as we know) cannot be efficiently inverted without the trapdoor $d$
except in special circumstances. Trapdoor one-way functions can be used for digital signatures [19].
Digital signatures provide authenticity and nonrepudiation of electronic legal documents. For
instance, they are used for signing digital checks or electronic purchase orders. To sign a message
$M \in \mathbb{Z}_N^*$ using RSA, Alice applies her private key $\langle N, d \rangle$ to $M$ and
obtains a signature $S = M^d \bmod N$. Given $\langle M, S \rangle$, anyone can verify Alice's
signature on $M$ by checking that $S^e = M \bmod N$. Since only Alice can generate $S$, one may
suspect that an adversary cannot forge Alice's signature. Unfortunately, things are not so simple;
extra measures are needed for proper security. Digital signatures are an important application of
RSA. Some of the attacks we survey specifically target RSA digital signatures.

An RSA key pair is generated by picking two random $n/2$-bit primes and multiplying them to
obtain $N$. Then, for a given encryption exponent $e < \varphi(N)$, one computes
$d = e^{-1} \bmod \varphi(N)$ using the extended Euclidean algorithm. Since the set of primes is
sufficiently dense, a random $n/2$-bit prime can be quickly generated by repeatedly picking random
$n/2$-bit integers and testing each one for primality using a probabilistic primality test [19].
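
The key generation procedure can be sketched as follows. We assume the Miller-Rabin test as the probabilistic primality test (the text does not name one), 40 test rounds, and the default exponent 65537; all function names are ours. The sketch retries when $e$ is not invertible modulo $\varphi(N)$, a detail the description above leaves implicit.

```python
import math
import secrets


def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin test (assumed choice of probabilistic primality test)."""
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % sp == 0:
            return n == sp
    s, r = 0, n - 1
    while r % 2 == 0:                       # write n - 1 = 2^s * r with r odd
        s, r = s + 1, r // 2
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2    # random base in [2, n-2]
        x = pow(a, r, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                    # a witnesses that n is composite
    return True


def random_prime(bits: int) -> int:
    """Pick random odd integers of exactly `bits` bits until one is prime."""
    while True:
        cand = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(cand):
            return cand


def generate_key(n_bits: int, e: int = 65537):
    """Return (N, e, d) for a modulus of roughly n_bits bits."""
    while True:
        p, q = random_prime(n_bits // 2), random_prime(n_bits // 2)
        phi = (p - 1) * (q - 1)
        if p != q and math.gcd(e, phi) == 1:
            return p * q, e, pow(e, -1, phi)   # d = e^{-1} mod phi(N)
```

For example, `N, e, d = generate_key(1024)` yields a key for which `pow(pow(M, e, N), d, N) == M`.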

1.1 Factoring Large Integers

The first attack on an RSA public key $\langle N, e \rangle$ to consider is factoring the modulus
$N$. Given the factorization of $N$, an attacker can easily construct $\varphi(N)$, from which the
decryption exponent $d = e^{-1} \bmod \varphi(N)$ can be found. We refer to factoring the modulus as a
brute-force attack on RSA. Although factoring algorithms have been steadily improving, the current
state of the art is still far from posing a threat to the security of RSA when RSA is used properly.
Factoring large integers is one of the most beautiful problems of computational mathematics [18, 20],
but it is not the topic of this article. For completeness we note that the current fastest factoring
algorithm is the General Number Field Sieve. Its running time on $n$-bit integers is
$\exp\left((c + o(1))\, n^{1/3} \log^{2/3} n\right)$ for some $c < 2$. Attacks on RSA that take longer
than this time bound are not interesting. These include attacks such as exhaustive search for $M$ and
some older attacks published right after the initial publication of RSA.

Footnote 2: A source that explains semantic security and gives examples of semantically secure
ciphers is [11].

Our objective is to survey attacks on RSA that decrypt messages without directly factoring the
RSA modulus $N$. Nevertheless, it is worth noting that some sparse sets of RSA moduli, $N = pq$,
can be easily factored. For instance, if $p - 1$ is a product of prime factors less than $B$, then
$N$ can be factored in time less than $B^3$. Some implementations explicitly reject primes $p$ for
which $p - 1$ is a product of small primes.
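
The text does not name the method for factoring such moduli; we assume it is Pollard's $p-1$ algorithm, sketched below in its simplest form. The function name, the base 2, and the toy modulus $421 \cdot 1019$ (where $420 = 2^2 \cdot 3 \cdot 5 \cdot 7$ is smooth while $1018 = 2 \cdot 509$ is not) are our own illustrative choices.

```python
from math import gcd


def pollard_p_minus_1(N: int, B: int, base: int = 2):
    """Try to split N assuming some prime p | N has p - 1 dividing B!.

    Returns a nontrivial factor of N, or None if the bound B is too small
    (or too large, in which case every prime factor is caught at once).
    """
    a = base
    for j in range(2, B + 1):
        a = pow(a, j, N)          # now a = base^(j!) mod N
        g = gcd(a - 1, N)         # p divides a - 1 once (p - 1) divides j!
        if 1 < g < N:
            return g
    return None


# Toy modulus (illustrative assumption): p - 1 = 420 is 7-smooth.
factor = pollard_p_minus_1(421 * 1019, B=10)
assert factor == 421
```

The loop succeeds as soon as $j!$ is a multiple of the order of the base modulo $p$, which is why primes with smooth $p - 1$ are rejected by careful implementations.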

As noted above, if an efficient factoring algorithm exists, then RSA is insecure. The converse is
a long-standing open problem: must one factor $N$ in order to efficiently compute $e$-th roots modulo
$N$? Is breaking RSA as hard as factoring? We state the concrete open problem below.

Open Problem 1 Given integers $N$ and $e$ satisfying $\gcd(e, \varphi(N)) = 1$, define the function
$f_{e,N} : \mathbb{Z}_N^* \to \mathbb{Z}_N^*$ by $f_{e,N}(x) = x^{1/e} \bmod N$. Is there a
polynomial-time algorithm $A$ that computes the factorization of $N$ given $N$ and access to an
"oracle" $f_{e,N}(x)$ for some $e$?

An oracle for $f_{e,N}(x)$ evaluates the function on any input $x$ in unit time. Recently Boneh and
Venkatesan [6] provided evidence that for small $e$ the answer to the above problem may be "no". In
other words, for small $e$ there may not exist a polynomial-time reduction from factoring to breaking
RSA. They do so by showing that in a certain model, a positive answer to the problem for small
$e$ yields an efficient factoring algorithm. We note that a positive answer to Open Problem 1 gives
rise to a "chosen ciphertext attack" on RSA (see footnote 3). Therefore, a negative answer may be
welcome.

Next we show that exposing the private key $d$ and factoring $N$ are equivalent. Hence there is
no point in hiding the factorization of $N$ from any party who knows $d$.

Fact 1 Let $\langle N, e \rangle$ be an RSA public key. Given the private key $d$, one can efficiently
factor the modulus $N = pq$. Conversely, given the factorization of $N$, one can efficiently recover $d$.

Proof A factorization of $N$ yields $\varphi(N)$. Since $e$ is known, one can recover $d$. This
proves the converse statement. We now show that given $d$ one can factor $N$. Given $d$, compute
$k = de - 1$. By definition of $d$ and $e$ we know that $k$ is a multiple of $\varphi(N)$. Since
$\varphi(N)$ is even, $k = 2^t r$ with $r$ odd and $t \ge 1$. We have $g^k = 1$ for every
$g \in \mathbb{Z}_N^*$, and therefore $g^{k/2}$ is a square root of unity modulo $N$. By the Chinese
Remainder Theorem, 1 has four square roots modulo $N = pq$. Two of these square roots are $\pm 1$.
The other two are $\pm x$ where $x$ satisfies $x = 1 \bmod p$ and $x = -1 \bmod q$. Using either
one of these last two square roots, the factorization of $N$ is revealed by computing
$\gcd(x - 1, N)$. A straightforward argument shows that if $g$ is chosen at random from
$\mathbb{Z}_N^*$ then with probability at least $1/2$ (over the choice of $g$) one of the elements in
the sequence $g^{k/2}, g^{k/4}, \ldots, g^{k/2^t} \bmod N$ is a square root of unity that reveals the
factorization of $N$. All elements in the sequence can be efficiently computed in time $O(n^3)$ where
$n = \log_2 N$.
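
The proof of Fact 1 translates almost line by line into code. The sketch below is one way to organize it: instead of dividing the exponent by 2 repeatedly, it starts from $g^{k/2^t}$ and squares upward, which visits the same sequence in reverse order. The function name and the toy key are illustrative assumptions.

```python
import random
from math import gcd


def factor_from_d(N: int, e: int, d: int):
    """Factor N = pq given a valid exponent pair (e, d), following Fact 1."""
    k = d * e - 1                     # a multiple of phi(N)
    t, r = 0, k
    while r % 2 == 0:                 # write k = 2^t * r with r odd
        t, r = t + 1, r // 2
    while True:                       # each g succeeds with probability >= 1/2
        g = random.randrange(2, N - 1)
        if gcd(g, N) > 1:             # lucky guess: g already shares a factor
            p = gcd(g, N)
            return p, N // p
        x = pow(g, r, N)              # x = g^(k / 2^t)
        for _ in range(t):            # x runs over g^(k/2^t), ..., g^(k/2)
            y = pow(x, 2, N)
            if y == 1 and x not in (1, N - 1):
                p = gcd(x - 1, N)     # x is a nontrivial square root of 1
                return p, N // p
            x = y


# Toy key (illustrative assumption): N = 61 * 53, e = 17, d = 2753.
assert sorted(factor_from_d(3233, 17, 2753)) == [53, 61]
```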

2 Elementary Attacks

We begin by describing some old elementary attacks. These attacks illustrate blatant misuse of
RSA. Although many such attacks exist, we give only two examples.

Footnote 3: In this context, "chosen ciphertext attack" refers to an attacker Marvin, who is given a
public key $\langle N, e \rangle$ and access to a black box that decrypts messages of his choice.
Marvin succeeds in mounting the chosen ciphertext attack if by using the black box he can recover the
private key $\langle N, d \rangle$.

2.1 Common Modulus

To avoid generating a different modulus $N = pq$ for each user one may wish to fix $N$ once and
for all. The same $N$ is used by all users. A trusted central authority could provide user $i$ with a
unique pair $e_i, d_i$ from which user $i$ forms a public key $\langle N, e_i \rangle$ and a secret
key $\langle N, d_i \rangle$.

At first glance this may seem to work: a ciphertext $C = M^{e_a} \bmod N$ intended for Alice cannot
be decrypted by Bob since Bob does not possess $d_a$. However, this is incorrect and the resulting
system is insecure. By Fact 1 Bob can use his own exponents $e_b, d_b$ to factor the modulus $N$. Once
$N$ is factored Bob can recover Alice's private key $d_a$ from her public key $e_a$. This observation,
due to Simmons, shows that an RSA modulus should never be used by more than one entity.

2.2 Blinding

Let $\langle N, d \rangle$ be Bob's private key and $\langle N, e \rangle$ be his corresponding public
key. Suppose an adversary Marvin wants Bob's signature on a message $M \in \mathbb{Z}_N^*$. Being no
fool, Bob refuses to sign $M$. Marvin can try the following: he picks a random
$r \in \mathbb{Z}_N^*$ and sets $M' = r^e M \bmod N$. He then asks Bob to sign the random message
$M'$. Bob may be willing to provide his signature $S'$ on the innocent-looking $M'$. But recall that
$S' = (M')^d \bmod N$. Marvin now simply computes $S = S'/r \bmod N$ and obtains Bob's signature $S$
on the original $M$. Indeed,

$$S^e = (S')^e / r^e = (M')^{ed} / r^e = M' / r^e = M \pmod{N}.$$

This technique, called blinding, enables Marvin to obtain a valid signature on a message of his
choice by asking Bob to sign a random "blinded" message. Bob has no information as to what
message he is actually signing. Since most signature schemes apply a "one-way hash" to the message
$M$ prior to signing [19], the attack is not a serious concern. Although we presented blinding as
an attack, it is actually a useful property of RSA needed for implementing anonymous digital cash
(cash that can be used to purchase goods, but does not reveal the identity of the person making
the purchase).
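
The blinding computation can be sketched in a few lines. Bob is modeled as a signing oracle passed in as a function; the names `blind_sign` and `bob_sign` and the toy key are illustrative assumptions, and no hashing is applied, which is precisely the situation in which the attack works.

```python
import secrets
from math import gcd


def blind_sign(M: int, N: int, e: int, sign) -> int:
    """Obtain a signature on M from the oracle `sign` without showing it M."""
    while True:
        r = secrets.randbelow(N - 2) + 2       # random r in [2, N-1]
        if gcd(r, N) == 1:                     # r must be invertible mod N
            break
    M_blind = (pow(r, e, N) * M) % N           # M' = r^e * M mod N
    S_blind = sign(M_blind)                    # Bob signs the blinded message
    return (S_blind * pow(r, -1, N)) % N       # S = S' / r mod N


# Toy key (illustrative assumption): N = 61 * 53, e = 17, d = 2753.
N, e, d = 3233, 17, 2753
bob_sign = lambda m: pow(m, d, N)              # Bob's raw RSA signing

M = 1234
S = blind_sign(M, N, e, bob_sign)
assert pow(S, e, N) == M                       # S verifies as a signature on M
```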

3 Low Private Exponent

To reduce decryption time (or signature-generation time), one may wish to use a small value of $d$
rather than a random $d$. Since modular exponentiation takes time linear in $\log_2 d$, a small $d$
can improve performance by at least a factor of 10 (for a 1024-bit modulus). Unfortunately, a clever
attack due to M. Wiener [22] shows that a small $d$ results in a total break of the cryptosystem.

Theorem 2 (M. Wiener) Let $N = pq$ with $q < p < 2q$. Let $d < \frac{1}{3} N^{1/4}$. Given
$\langle N, e \rangle$ with $ed = 1 \bmod \varphi(N)$, Marvin can efficiently recover $d$.

Proof The proof is based on approximations using continued fractions. Since
$ed = 1 \bmod \varphi(N)$, there exists a $k$ such that $ed - k\varphi(N) = 1$. Therefore,

$$\left| \frac{e}{\varphi(N)} - \frac{k}{d} \right| = \frac{1}{d\,\varphi(N)}.$$

Hence, $k/d$ is an approximation of $e/\varphi(N)$. Although Marvin does not know $\varphi(N)$, he
may use $N$ to approximate it. Indeed, since $\varphi(N) = N - p - q + 1$ and $p + q - 1 < 3\sqrt{N}$,
we have $|N - \varphi(N)| < 3\sqrt{N}$.

Using $N$ in place of $\varphi(N)$, we obtain:

$$\left| \frac{e}{N} - \frac{k}{d} \right|
  = \left| \frac{ed - k\varphi(N) - kN + k\varphi(N)}{Nd} \right|
  = \left| \frac{1 - k(N - \varphi(N))}{Nd} \right|
  \le \left| \frac{3k\sqrt{N}}{Nd} \right|
  = \frac{3k}{d\sqrt{N}}.$$

Now, $k\varphi(N) = ed - 1 < ed$. Since $e < \varphi(N)$, we see that
$k < d < \frac{1}{3} N^{1/4}$. Hence we obtain:

$$\left| \frac{e}{N} - \frac{k}{d} \right| \le \frac{1}{d N^{1/4}} < \frac{1}{2d^2}.$$

This is a classic approximation relation. The number of fractions $k/d$ with $d < N$ approximating
$e/N$ so closely is bounded by $\log_2 N$. In fact, all such fractions are obtained as convergents of
the continued fraction expansion of $e/N$ [12, Th. 177]. All one has to do is compute the $\log N$
convergents of the continued fraction for $e/N$. One of these will equal $k/d$. Since
$ed - k\varphi(N) = 1$, we have $\gcd(k, d) = 1$, and hence $k/d$ is a reduced fraction. This is a
linear-time algorithm for recovering the secret key $d$.
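
The attack in the proof can be sketched as follows. The proof only says that one of the convergents equals $k/d$; to recognize the right one, the sketch uses a common test that we add as an assumption: a candidate $(k, d)$ gives a candidate $\varphi(N) = (ed - 1)/k$, and it is accepted when $x^2 - (N - \varphi(N) + 1)x + N$ has integer roots $p, q$. The toy key $N = 239 \cdot 379$, $e = 17993$ (with $d = 5$) is an illustrative choice satisfying the hypotheses of Theorem 2.

```python
from math import isqrt


def convergents(a: int, b: int):
    """Yield the convergents (numerator, denominator) of a/b."""
    h0, h1 = 0, 1                     # numerators h_{-2}, h_{-1}
    k0, k1 = 1, 0                     # denominators k_{-2}, k_{-1}
    while b:
        qt = a // b                   # next partial quotient
        a, b = b, a - qt * b
        h0, h1 = h1, qt * h1 + h0
        k0, k1 = k1, qt * k1 + k0
        yield h1, k1


def wiener(N: int, e: int):
    """Return (d, p, q) if some convergent k/d of e/N reveals the key."""
    for k, d in convergents(e, N):
        if k == 0 or (e * d - 1) % k != 0:
            continue
        phi = (e * d - 1) // k        # candidate for phi(N)
        s = N - phi + 1               # candidate for p + q
        disc = s * s - 4 * N          # discriminant of x^2 - s*x + N
        if disc >= 0:
            t = isqrt(disc)
            if t * t == disc and (s + t) % 2 == 0:
                return d, (s + t) // 2, (s - t) // 2
    return None


assert wiener(90581, 17993) == (5, 379, 239)
```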



Since typically $N$ is 1024 bits, it follows that $d$ must be at least 256 bits long in order to
avoid this attack. This is unfortunate for low-power devices such as "smartcards", where a small $d$
would result in big savings. All is not lost however. Wiener describes a number of techniques that
enable fast decryption and are not susceptible to his attack:

Large e: Suppose instead of reducing $e$ modulo $\varphi(N)$, one uses $\langle N, e' \rangle$ for
the public key, where $e' = e + t \cdot \varphi(N)$ for some large $t$. Clearly $e'$ can be used in
place of $e$ for message encryption. However, when a large value of $e$ is used, the $k$ in the above
proof is no longer small. A simple calculation shows that if $e' > N^{1.5}$ then no matter how small
$d$ is, the above attack cannot be mounted. Unfortunately, large values of $e$ result in increased
encryption time.

Using CRT: An alternate approach is to use the Chinese Remainder Theorem (CRT). Suppose one chooses
$d$ such that both $d_p = d \bmod (p - 1)$ and $d_q = d \bmod (q - 1)$ are small, say 128 bits each.
Then fast decryption of a ciphertext $C$ can be carried out as follows: first compute
$M_p = C^{d_p} \bmod p$ and $M_q = C^{d_q} \bmod q$. Then use the CRT to compute the unique value
$M \in \mathbb{Z}_N$ satisfying $M = M_p \bmod p$ and $M = M_q \bmod q$. The resulting $M$ satisfies
$M = C^d \bmod N$ as required. The point is that although $d_p$ and $d_q$ are small, the value of
$d \bmod \varphi(N)$ can be large, i.e., on the order of $\varphi(N)$. As a result, the attack of
Theorem 2 does not apply. We note that if $\langle N, e \rangle$ is given, there exists an attack
enabling an adversary to factor $N$ in time $O(\min(\sqrt{d_p}, \sqrt{d_q}))$. Hence, $d_p$ and $d_q$
cannot be made too small.
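
CRT decryption as described in the second technique can be sketched as follows. The recombination step uses Garner's formula, which is one standard way to apply the CRT and is our choice rather than something the text prescribes. The toy key is an ordinary one, so $d_p$ and $d_q$ here are not specially chosen to be small; the sketch only illustrates the mechanics.

```python
def crt_decrypt(C: int, p: int, q: int, d_p: int, d_q: int) -> int:
    """Decrypt C using the half-size exponents d_p = d mod (p-1), d_q = d mod (q-1)."""
    M_p = pow(C, d_p, p)                  # M mod p
    M_q = pow(C, d_q, q)                  # M mod q
    q_inv = pow(q, -1, p)                 # could be precomputed with the key
    h = (q_inv * (M_p - M_q)) % p         # Garner's recombination
    return M_q + h * q                    # unique M in Z_N with both residues


# Toy key (illustrative assumption): p = 61, q = 53, e = 17, d = 2753.
p, q, e, d = 61, 53, 17, 2753
N = p * q
C = pow(65, e, N)
M = crt_decrypt(C, p, q, d % (p - 1), d % (q - 1))
assert M == pow(C, d, N) == 65
```

Both exponentiations work with half-size moduli and half-size exponents, which is where the speedup comes from.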

We do not know whether either of these methods is secure. All we know is that Wiener's attack
is ineffective against them. Theorem 2 was recently improved by Boneh and Durfee [4], who show
that as long as $d < N^{0.292}$, an adversary can efficiently recover $d$ from
$\langle N, e \rangle$. These results show that Wiener's bound is not tight. It is likely that the
correct bound is $d < N^{0.5}$. At the time of this writing, this is an open problem.

Open Problem 2 Let $N = pq$ and $d < N^{0.5}$. If Marvin is given $\langle N, e \rangle$ with
$ed = 1 \bmod \varphi(N)$ and $e < \varphi(N)$, can he efficiently recover $d$?

4 Low Public Exponent

To reduce encryption or signature-verification time, it is customary to use a small public exponent
$e$. The smallest possible value for $e$ is 3, but to defeat certain attacks the value
$e = 2^{16} + 1 = 65537$ is recommended. When the value $2^{16} + 1$ is used, signature verification
requires 17 multiplications, as opposed to roughly 1000 when a random $e \le \varphi(N)$ is used.
Unlike the attack of the previous section, attacks that apply when a small $e$ is used are far from
a total break.
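
The count of 17 multiplications can be reproduced with a left-to-right square-and-multiply routine that tallies its modular multiplications. We assume this is the exponentiation method the figure refers to; the function name is ours.

```python
def modexp_count(x: int, e: int, N: int):
    """Square-and-multiply for e >= 1; returns (x^e mod N, multiplications used)."""
    result = x % N
    mults = 0
    for bit in bin(e)[3:]:                 # skip '0b' and the leading 1 bit
        result = (result * result) % N     # one squaring per remaining bit
        mults += 1
        if bit == "1":
            result = (result * x) % N      # one extra multiplication per 1 bit
            mults += 1
    return result, mults


value, count = modexp_count(5, 65537, 3233)
assert value == pow(5, 65537, 3233)
assert count == 17                         # 16 squarings + 1 multiplication
```

Since $65537$ is a 1 followed by fifteen 0 bits and a final 1, the routine performs 16 squarings and a single multiplication. For a random exponent the count grows with the bit length of $e$, one squaring per bit plus one multiplication per 1 bit.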

4.1 Coppersmith's Theorem

The most powerful attacks on low public exponent RSA are based on a theorem due to
Coppersmith [7]. Coppersmith's theorem has many applications, only some of which will be covered here.
The proof uses the LLL lattice basis reduction algorithm [17] as explained below.

Theorem 3 (Coppersmith) Let $N$ be an integer and $f \in \mathbb{Z}[x]$ be a monic polynomial of
degree $d$. Set $X = N^{\frac{1}{d} - \epsilon}$ for some $\epsilon \ge 0$. Then, given
$\langle N, f \rangle$ Marvin can efficiently find all integers $|x_0| < X$ satisfying
$f(x_0) = 0 \bmod N$. The running time is dominated by the time it takes to run the LLL algorithm on
a lattice of dimension $O(w)$ with $w = \min(1/\epsilon, \log_2 N)$.

The theorem provides an algorithm for efficiently finding all roots of $f$ modulo $N$ that are
less than $X = N^{1/d}$. As $X$ gets smaller, the algorithm's running time decreases. The theorem's
strength is its ability to find small roots of polynomials modulo a composite $N$. When working
modulo a prime, there is no reason to use Coppersmith's theorem since other, far better, root-finding
algorithms exist.

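We sketch the main ideas behind the proof of Coppersmith's theorem. We follow a simplified
approach due to Howgrave-Graham [14]. Given a polynomial $h(x) = \sum a_i x^i \in \mathbb{Z}[x]$,
define $\|h\|^2 = \sum_i |a_i|^2$. The proof relies on the following observation.

Lemma 4 Let $h(x) \in \mathbb{Z}[x]$ be a polynomial of degree $d$ and let $X$ be a positive integer.
Suppose $\|h(xX)\| < N/\sqrt{d}$. If $|x_0| < X$ satisfies $h(x_0) = 0 \bmod N$, then $h(x_0) = 0$
holds over the integers.

Proof Observe from the Schwarz inequality that

$$|h(x_0)| = \left| \sum a_i x_0^i \right|
  = \left| \sum a_i X^i \left( \frac{x_0}{X} \right)^i \right|
  \le \sum \left| a_i X^i \left( \frac{x_0}{X} \right)^i \right|
  \le \sum \left| a_i X^i \right|
  \le \sqrt{d}\, \|h(xX)\| < N.$$

Since $h(x_0) = 0 \bmod N$, we conclude that $h(x_0) = 0$.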

The lemma states that if $h$ is a polynomial with low norm, then all small roots of $h$ mod $N$ are
also roots of $h$ over the integers. The lemma suggests that to find a small root $x_0$ of
$f(x) \bmod N$ we should look for another polynomial $h \in \mathbb{Z}[x]$ with small norm having the
same roots as $f$ modulo $N$. Then $x_0$ will be a root of $h$ over the integers and can be easily
found. To do so, we may search for a polynomial $g \in \mathbb{Z}[x]$ such that $h = gf$ has low norm,
i.e., norm less than $N$. This amounts to searching for an integer linear combination of the
polynomials $f, xf, x^2 f, \ldots, x^r f$ with low norm. Unfortunately, most often there is no
nontrivial linear combination with sufficiently small norm.

Coppersmith found a trick to solve the problem: if $f(x_0) = 0 \bmod N$, then
$f(x_0)^k = 0 \bmod N^k$ for any $k$. More generally, define the following polynomials:

$$g_{u,v}(x) = N^{m-v} x^u f(x)^v$$

for some predefined $m$. Then $x_0$ is a root of $g_{u,v}(x)$ modulo $N^m$ for any $u \ge 0$ and
$0 \le v \le m$. To use Lemma 4 we must find an integer linear combination $h(x)$ of the polynomials
$g_{u,v}(x)$ such that $h(xX)$ has norm less than $N^m$ (recall that $X$ is an upper bound on $x_0$
satisfying $X \le N^{1/d}$). Thanks to the relaxed upper bound on the norm ($N^m$ rather than $N$),
one can show that for sufficiently large $m$, there always exists a linear combination $h(x)$
satisfying the required bound. Once $h(x)$ is found, Lemma 4 implies that it has $x_0$ as a root over
the integers. Consequently $x_0$ can be easily found.
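
The key property of the polynomials $g_{u,v}$, namely that they all vanish at $x_0$ modulo $N^m$, is easy to check numerically. The sketch below does only that; it does not perform the lattice reduction. The modulus, the quadratic $f$, the root $x_0 = 123$, and the function names are illustrative assumptions.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial given as low-to-high integer coefficients (Horner)."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def g(u, v, m, f, N, x):
    """Value at x of g_{u,v}(x) = N^(m-v) * x^u * f(x)^v over the integers."""
    return N ** (m - v) * x ** u * poly_eval(f, x) ** v


# Illustrative setup: a monic quadratic f with a known small root x0 mod N.
N = 10007 * 10009
x0 = 123
f = [-(x0 ** 2 + 45 * x0) % N, 45, 1]       # f(x) = x^2 + 45x + c, f(x0) = 0 mod N
assert poly_eval(f, x0) % N == 0

m, d = 3, 2                                  # parameters of the example in the text
all_vanish = all(
    g(u, v, m, f, N, x0) % N ** m == 0
    for v in range(m + 1)
    for u in range(d)
)
assert all_vanish
```

Each $g_{u,v}(x_0)$ contains $v$ factors of $f(x_0)$, each divisible by $N$, and $m - v$ explicit factors of $N$, which together give divisibility by $N^m$.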

It remains to show how to find $h(x)$ efficiently. To do so, we must first state a few basic facts
about lattices in $\mathbb{Z}^w$. We refer to [17] for a concise introduction to the topic. Let
$u_1, \ldots, u_w \in \mathbb{Z}^w$ be linearly independent vectors. A (full-rank) lattice $L$ spanned
by $\langle u_1, \ldots, u_w \rangle$ is the set of all integer linear combinations of
$u_1, \ldots, u_w$. The determinant of $L$ is defined as the determinant of the $w \times w$ square
matrix whose rows are the vectors $u_1, \ldots, u_w$.

In our case, we view the polynomials $g_{u,v}(xX)$ as vectors and study the lattice $L$ spanned by
them. We let $v = 0, \ldots, m$ and $u = 0, \ldots, d - 1$, and hence the lattice has dimension
$w = d(m + 1)$. For example, when $f$ is a quadratic monic polynomial and $m = 3$, the resulting
lattice is spanned by the rows of the following matrix:

$$
\begin{array}{l|cccccccc}
            & 1   & x    & x^2     & x^3     & x^4   & x^5   & x^6 & x^7 \\ \hline
g_{0,0}(xX) & N^3 &      &         &         &       &       &     &     \\
g_{1,0}(xX) &     & XN^3 &         &         &       &       &     &     \\
g_{0,1}(xX) & *   & *    & X^2 N^2 &         &       &       &     &     \\
g_{1,1}(xX) &     & *    & *       & X^3 N^2 &       &       &     &     \\
g_{0,2}(xX) & *   & *    & *       & *       & X^4 N &       &     &     \\
g_{1,2}(xX) &     & *    & *       & *       & *     & X^5 N &     &     \\
g_{0,3}(xX) & *   & *    & *       & *       & *     & *     & X^6 &     \\
g_{1,3}(xX) &     & *    & *       & *       & *     & *     & *   & X^7
\end{array}
$$

The entries $*$ correspond to coefficients of the polynomials whose value we ignore. All empty
entries are zero. Since the matrix is triangular, its determinant is the product of the elements on
the diagonal (which are explicitly given above). Our objective is to find short vectors in this lattice.

A classic result of Hermite states that any lattice $L$ of dimension $w$ contains a nonzero point
$v \in L$ whose $L_2$ norm satisfies $\|v\| \le \gamma_w \det(L)^{1/w}$, where $\gamma_w$ is a
constant depending only on $w$. Hermite's bound can be used to show that for large enough $m$ our
lattice contains vectors of norm less than $N^m$, as required. The question is whether we can
efficiently construct a short vector in $L$ whose length is not much larger than the Hermite bound.
The LLL algorithm is an efficient algorithm that does precisely that.

Fact 5 (LLL) Let $L$ be a lattice spanned by $\langle u_1, \ldots, u_w \rangle$. When
$\langle u_1, \ldots, u_w \rangle$ are given as input, then the LLL algorithm outputs a point
$v \in L$ satisfying

$$\|v\| \le 2^{w/4} \det(L)^{1/w}.$$

The running time for LLL is quartic in the length of the input.

The LLL algorithm (named after its inventors L. Lovász, A. Lenstra, and H. Lenstra Jr.) has
many applications in both computational number theory and cryptography. Its discovery in 1982
provided an efficient algorithm for factoring polynomials over the integers and, more generally,
over number rings. LLL is frequently used to attack various cryptosystems. For instance, many
cryptosystems based on the "knapsack problem" have been broken using LLL.

Using LLL, we can complete the proof of Coppersmith's theorem. To ensure that the vector
produced by LLL satisfies the bound of Lemma 4 we need

$$2^{w/4} \det(L)^{1/w} < N^m / \sqrt{w},$$

where $w = d(m + 1)$ is the dimension of $L$. A routine calculation shows that for large enough $m$
the bound is satisfied. Indeed, when $X = N^{\frac{1}{d} - \epsilon}$, it suffices to take
$m = O(k/d)$ with $k = \min(1/\epsilon, \log N)$. Consequently, the running time is dominated by
running LLL on a lattice of dimension $O(k)$, as required.

A natural question is whether Coppersmith's theorem can be applied to bivariate and multivariate
polynomials. If $f(x, y) \in \mathbb{Z}_N[x, y]$ is given for which there exists a root
$(x_0, y_0)$ with $|x_0 y_0|$ suitably bounded, can Marvin efficiently find $(x_0, y_0)$? Although the
same technique appears to work for some bivariate polynomials, it is currently an open problem to
prove it. As an increasing number of results depend on a bivariate extension of Coppersmith's
theorem, a rigorous algorithm will be very useful.

Open Problem 3 Find general conditions under which Coppersmith's theorem can be generalized
to bivariate polynomials.

4.2 Håstad's Broadcast Attack

As a first application of Coppersmith's theorem, we present an improvement to an old attack due
to Håstad [13]. Suppose Bob wishes to send an encrypted message $M$ to a number of parties
$P_1, P_2, \ldots, P_k$. Each party has its own RSA key $\langle N_i, e_i \rangle$. We assume $M$ is
less than all the $N_i$'s. Naively, to send $M$, Bob encrypts it using each of the public keys and
sends out the $i$-th ciphertext to $P_i$. An attacker Marvin can eavesdrop on the connection out of
Bob's sight and collect the $k$ transmitted ciphertexts.

For simplicity, suppose all public exponents $e_i$ are equal to 3. A simple argument shows that
Marvin can recover $M$ if $k \ge 3$. Indeed, Marvin obtains $C_1, C_2, C_3$, where

$$C_1 = M^3 \bmod N_1, \quad C_2 = M^3 \bmod N_2, \quad C_3 = M^3 \bmod N_3.$$

We may assume that $\gcd(N_i, N_j) = 1$ for all $i \ne j$ since otherwise Marvin can factor some of
the $N_i$'s. Hence, applying the Chinese Remainder Theorem (CRT) to $C_1, C_2, C_3$ gives a
$C' \in \mathbb{Z}_{N_1 N_2 N_3}$ satisfying $C' = M^3 \bmod N_1 N_2 N_3$. Since $M$ is less than all
the $N_i$'s, we have $M^3 < N_1 N_2 N_3$. Then $C' = M^3$ holds over the integers. Thus, Marvin may
recover $M$ by computing the real cube root of $C'$. More generally, if all public exponents are
equal to $e$, Marvin can recover $M$ as soon as $k \ge e$. The attack is feasible only when a small
$e$ is used.
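
The simple broadcast attack (unpadded $M$, common exponent $e$, at least $e$ recipients) can be sketched as follows. The helper names and the three toy moduli are illustrative assumptions; the integer root is computed by binary search to avoid floating-point error on large numbers.

```python
from math import prod


def crt(residues, moduli):
    """Combine residues modulo pairwise coprime moduli into one value mod their product."""
    total = prod(moduli)
    x = 0
    for c, n in zip(residues, moduli):
        rest = total // n
        x += c * rest * pow(rest, -1, n)   # term is c mod n and 0 mod the others
    return x % total


def iroot(n: int, e: int) -> int:
    """Largest integer r with r**e <= n, found by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // e + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** e <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def broadcast_attack(ciphertexts, moduli, e):
    """Recover M from e encryptions of the same M under exponent e."""
    C = crt(ciphertexts[:e], moduli[:e])   # C' = M^e mod N_1...N_e, hence = M^e
    M = iroot(C, e)
    return M if M ** e == C else None


# Toy moduli (illustrative assumption): 83*89, 101*107, 113*131.
moduli = [7387, 10807, 14803]
M = 4321
ciphertexts = [pow(M, 3, n) for n in moduli]
assert broadcast_attack(ciphertexts, moduli, 3) == M
```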

Håstad [13] describes a far stronger attack. To motivate Håstad's result, consider a naive defense
against the above attack. Rather than broadcasting the encryption of $M$, Bob could "pad" the
message prior to encryption. For instance, if $M$ is $m$ bits long, Bob could send
$M_i = i 2^m + M$ to party $P_i$. Since Marvin obtains encryptions of different messages, he cannot
mount the attack. Unfortunately, Håstad showed that this linear padding is insecure. In fact, he
proved that applying any fixed polynomial to the message prior to encryption does not prevent the
attack.

Suppose that for each of the participants $P_1, \ldots, P_k$, Bob has a fixed public polynomial
$f_i \in \mathbb{Z}_{N_i}[x]$. To broadcast a message $M$, Bob sends the encryption of $f_i(M)$ to
party $P_i$. By eavesdropping, Marvin learns $C_i = f_i(M)^{e_i} \bmod N_i$ for $i = 1, \ldots, k$.
Håstad showed that if enough parties are involved, Marvin can recover the plaintext $M$ from all the
ciphertexts. The following theorem is a stronger version of Håstad's original result.

Theorem 6 (Håstad) Let $N_1, \ldots, N_k$ be pairwise relatively prime integers and set
$N_{\min} = \min_i(N_i)$. Let $g_i \in \mathbb{Z}_{N_i}[x]$ be $k$ polynomials of maximum degree $d$.
Suppose there exists a unique $M < N_{\min}$ satisfying

$$g_i(M) = 0 \bmod N_i \quad \text{for all } i = 1, \ldots, k.$$

Under the assumption that $k > d$, one can efficiently find $M$ given
$\langle N_i, g_i \rangle_{i=1}^{k}$.

Proof Let $\bar{N} = N_1 \cdots N_k$. We may assume that all $g_i$'s are monic. (Indeed if, for some
$i$, the leading coefficient of $g_i$ is not invertible in $\mathbb{Z}_{N_i}^*$, then the
factorization of $N_i$ is exposed.) By multiplying each $g_i$ by the appropriate power of $x$, we may
assume they all have degree $d$. Construct the polynomial

$$g(x) = \sum_{i=1}^{k} T_i\, g_i(x), \quad \text{where} \quad
  T_i = \begin{cases} 1 \bmod N_j & \text{if } i = j \\ 0 \bmod N_j & \text{if } i \ne j. \end{cases}$$

The $T_i$'s are integers known as the Chinese Remainder Coefficients. Then $g(x)$ must be monic
since it is monic modulo all the $N_i$. Its degree is $d$. Furthermore, we know that
$g(M) = 0 \bmod \bar{N}$. Theorem 6 now follows from Theorem 3 since
$M < N_{\min} \le \bar{N}^{1/k} < \bar{N}^{1/d}$.

The theorem shows that a system of univariate equations modulo relatively prime composites
can be efficiently solved, assuming sufficiently many equations are provided. By setting
$g_i = f_i^{e_i} - C_i \bmod N_i$, we see that Marvin can recover $M$ from the given ciphertexts
whenever the number of parties is at least $d$, where $d$ is the maximum of $e_i \deg(f_i)$ over all
$i = 1, \ldots, k$. In particular, if all $e_i$'s are equal to $e$ and Bob sends out linearly related
messages, then Marvin can recover the plaintext as soon as $k > e$.

Håstad's original theorem is weaker than the one stated above. Rather than $d$ polynomials,
Håstad required $d(d + 1)/2$ polynomials. Håstad's proof is similar to the proof of Coppersmith's
theorem described in the previous section. However, Håstad does not use powers of $g$ in the lattice
and consequently obtains a weaker bound.

To conclude this section we note that to properly defend against the broadcast attack above,
one must use a randomized pad [1] rather than a fixed one.

4.3 Franklin-Reiter Related Message Attack

Franklin and Reiter [8] found a clever attack when Bob sends Alice related encrypted messages
using the same modulus. Let $\langle N, e \rangle$ be Alice's public key. Suppose
$M_1, M_2 \in \mathbb{Z}_N^*$ are two distinct messages satisfying $M_1 = f(M_2) \bmod N$ for some
publicly known polynomial $f \in \mathbb{Z}_N[x]$. To send $M_1$ and $M_2$ to Alice, Bob may naively
encrypt the messages and transmit the resulting ciphertexts $C_1, C_2$. We show that given
$C_1, C_2$, Marvin can easily recover $M_1, M_2$. Although the attack works for any small $e$, we
state the following lemma for $e = 3$ in order to simplify the proof.

Lemma 7 (FR) Set $e = 3$ and let $\langle N, e \rangle$ be an RSA public key. Let
$M_1 \ne M_2 \in \mathbb{Z}_N^*$ satisfy $M_1 = f(M_2) \bmod N$ for some linear polynomial
$f = ax + b \in \mathbb{Z}_N[x]$ with $b \ne 0$. Then, given $\langle N, e, C_1, C_2, f \rangle$,
Marvin can recover $M_1, M_2$ in time quadratic in $\log N$.

Proof To keep this part of the proof general, we state it using an arbitrary $e$ (rather than
restricting to $e = 3$). Since $C_1 = M_1^e \bmod N$, we know that $M_2$ is a root of the polynomial
$g_1(x) = f(x)^e - C_1 \in \mathbb{Z}_N[x]$. Similarly, $M_2$ is a root of
$g_2(x) = x^e - C_2 \in \mathbb{Z}_N[x]$. The linear factor $x - M_2$ divides both polynomials.
Therefore, Marvin may use the Euclidean algorithm (see footnote 4) to compute the gcd of $g_1$ and
$g_2$. If the gcd turns out to be linear, $M_2$ is found. The gcd can be computed in quadratic time
in $e$ and $\log N$.

We show that when $e = 3$ the gcd must be linear. The polynomial $x^3 - C_2$ factors modulo
both $p$ and $q$ into a linear factor and an irreducible quadratic factor (recall that
$\gcd(e, \varphi(N)) = 1$ and hence $x^3 - C_2$ has only one root in $\mathbb{Z}_N$). Since $g_2$
cannot divide $g_1$, the gcd must be linear. For $e > 3$ the gcd is almost always linear. However,
for some rare $M_1, M_2$, and $f$, it is possible to obtain a nonlinear gcd, in which case the attack
fails.
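
The gcd computation in the proof can be sketched with a small polynomial arithmetic layer over $\mathbb{Z}_N$. Polynomials are lists of coefficients from low to high degree; all function names and the toy parameters ($N = 1019 \cdot 1013$, $f = 2x + 5$) are illustrative assumptions. If a leading coefficient turns out not to be invertible, `pow(..., -1, N)` raises `ValueError`; this is the case in which the Euclidean algorithm "breaks" and a factor of $N$ is exposed, which the sketch does not handle.

```python
def poly_mul(a, b, N):
    """Product of two coefficient lists modulo N."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % N
    return out


def poly_mod(a, b, N):
    """Remainder of a divided by b in Z_N[x]; b's leading coefficient must be a unit."""
    a = a[:]
    inv = pow(b[-1], -1, N)
    while len(a) >= len(b):
        coef = a[-1] * inv % N
        shift = len(a) - len(b)
        for i, bc in enumerate(b):
            a[shift + i] = (a[shift + i] - coef * bc) % N
        while a and a[-1] == 0:          # drop the cancelled leading terms
            a.pop()
    return a


def poly_gcd(a, b, N):
    """Monic gcd of a and b via the Euclidean algorithm in Z_N[x]."""
    while b:
        a, b = b, poly_mod(a, b, N)
    inv = pow(a[-1], -1, N)
    return [c * inv % N for c in a]


def franklin_reiter(N, e, C1, C2, a, b):
    """Recover (M1, M2) where M1 = a*M2 + b mod N, or None if the gcd is not linear."""
    g1 = [1]
    for _ in range(e):                    # g1 = (a*x + b)^e - C1
        g1 = poly_mul(g1, [b % N, a % N], N)
    g1[0] = (g1[0] - C1) % N
    g2 = [(-C2) % N] + [0] * (e - 1) + [1]   # g2 = x^e - C2
    g = poly_gcd(g1, g2, N)
    if len(g) != 2:
        return None
    M2 = (-g[0]) % N                      # gcd is x - M2
    return (a * M2 + b) % N, M2


# Toy instance (illustrative assumption): both primes are 2 mod 3, so e = 3 is valid.
N, e = 1019 * 1013, 3
M2 = 424242
M1 = (2 * M2 + 5) % N
recovered = franklin_reiter(N, e, pow(M1, e, N), pow(M2, e, N), 2, 5)
assert recovered == (M1, M2)
```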

For $e > 3$ the attack takes time quadratic in $e$. Consequently, it can be applied only when a
small public exponent $e$ is used. For large $e$ the work in computing the gcd is prohibitive. It is
an interesting question (though likely to be difficult) to devise such an attack for arbitrary $e$.
In particular, can the gcd of $g_1$ and $g_2$ above be found in time polynomial in $\log e$?

4.4 Coppersmith's Short Pad Attack

The Franklin-Reiter attack might seem a bit artificial. After all, why should Bob send Alice the
encryption of related messages? Coppersmith strengthened the attack and proved an important
result on padding [7].

A naive random padding algorithm might pad a plaintext $M$ by appending a few random bits to
one of the ends. The following attack points out the danger of such simplistic padding. Suppose Bob
sends a properly padded encryption of $M$ to Alice. An attacker, Marvin, intercepts the ciphertext
and prevents it from reaching its destination. Bob notices that Alice did not respond to his message
and decides to resend $M$ to Alice. He randomly pads $M$ and transmits the resulting ciphertext.
Marvin now has two ciphertexts corresponding to two encryptions of the same message using two
different random pads. The following theorem shows that although he does not know the pads
used, Marvin is able to recover the plaintext.

2 

Theorem 8 Let hN; ei be a public RSA key where N is n-bits long. Set m = bn=e c. Let M 2 Z

N

m m

be a message of length at most n m bits. De ne M =2 M + r and M =2 M + r , where

1 1 2 2

m

r and r are distinct integers with 0  r ;r < 2 . If Marvin is given hN; ei and the encryptions

1 2 1 2

C ;C of M ;M but is not given r or r , he can eciently recover M .

1 2 1 2 1 2

Proof Define $g_1(x, y) = x^e - C_1$ and $g_2(x, y) = (x + y)^e - C_2$. We know that when $y = r_2 - r_1$, these polynomials have $M_1$ as a common root. In other words, $\Delta = r_2 - r_1$ is a root of the "resultant" $h(y) = \mathrm{res}_x(g_1, g_2) \in \mathbb{Z}_N[y]$. The degree of $h$ is at most $e^2$. Furthermore, $|\Delta| < 2^m < N^{1/e^2}$. Hence, $\Delta$ is a small root of $h$ modulo $N$, and Marvin can efficiently find it using Coppersmith's theorem (Theorem 3). Once $\Delta$ is known, the Franklin-Reiter attack of the previous section can be used to recover $M_2$ and consequently $M$. $\square$

$^4$Although $\mathbb{Z}_N[x]$ is not a Euclidean ring, the standard Euclidean algorithm can still be applied to polynomials in $\mathbb{Z}_N[x]$. One can show that if the algorithm "breaks" in any way, then the factorization of $N$ is exposed.

When $e = 3$ the attack can be mounted as long as the pad length is less than 1/9th the message length. This is an important result. Note that for the recommended value of $e = 65537$, the attack is useless against standard moduli sizes.
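
The bound of Theorem 8 and the shape of the attack can be illustrated with a toy sketch for $e = 3$. Two things here are assumptions rather than part of the survey: the 60-bit modulus is hypothetical and far too small for real use, and a brute-force search over $\Delta$ stands in for Coppersmith's lattice step, which is only feasible because the pad is a few bits long. For $e = 3$ the Franklin-Reiter gcd has a closed form, which the helper `franklin_reiter_e3` uses.

```python
def max_pad_bits(n_bits, e):
    """m = floor(n / e^2) from Theorem 8: pads of up to m bits are attackable."""
    return n_bits // (e * e)

def franklin_reiter_e3(c1, c2, delta, n):
    """Return x with x^3 = c1 and (x + delta)^3 = c2 (mod n), or None."""
    num = delta * (c2 + 2 * c1 - delta**3) % n
    den = (c2 - c1 + 2 * delta**3) % n
    try:
        x = num * pow(den, -1, n) % n    # modular inverse (Python 3.8+)
    except ValueError:                    # den is not invertible mod n
        return None
    if pow(x, 3, n) == c1 and pow(x + delta, 3, n) == c2:
        return x
    return None

def short_pad_attack(c1, c2, n, m):
    """Recover M from two encryptions of M padded with m random low bits."""
    for delta in range(-(2**m) + 1, 2**m):   # brute force replaces Coppersmith
        if delta == 0:
            continue
        m1 = franklin_reiter_e3(c1, c2, delta, n)
        if m1 is not None:
            return m1 >> m                   # strip the pad r_1
    return None

# Hypothetical toy parameters; both primes are 2 mod 3, so e = 3 is valid.
p, q = 1000000007, 998244353
N = p * q
m = max_pad_bits(N.bit_length(), 3)          # 60 // 9 = 6 pad bits
M, r1, r2 = 123456789, 17, 42
C1 = pow((M << m) + r1, 3, N)
C2 = pow((M << m) + r2, 3, N)
recovered = short_pad_attack(C1, C2, N, m)
```

For a 1024-bit modulus, `max_pad_bits(1024, 3)` gives 113 bits while `max_pad_bits(1024, 65537)` gives 0, which matches the remark that the attack is useless for $e = 65537$.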

4.5 Partial Key Exposure Attack

Let $\langle N, d \rangle$ be a private RSA key. Suppose by some means Marvin is able to expose a fraction of the bits of $d$, say a quarter of them. Can he reconstruct the rest of $d$? Surprisingly, the answer is positive when the corresponding public key is small. Recently Boneh, Durfee, and Frankel [5] showed that as long as $e < \sqrt{N}$, it is possible to reconstruct all of $d$ from just a fraction of its bits. These results illustrate the importance of safeguarding the entire private RSA key.

Theorem 9 (BDF) Let $\langle N, d \rangle$ be a private RSA key in which $N$ is $n$ bits long. Given the $\lceil n/4 \rceil$ least significant bits of $d$, Marvin can reconstruct all of $d$ in time linear in $e \log_2 e$.

The proof relies on yet another beautiful theorem due to Coppersmith [7].

Theorem 10 (Coppersmith) Let $N = pq$ be an $n$-bit RSA modulus. Then given the $n/4$ least significant bits of $p$ or the $n/4$ most significant bits of $p$, one can efficiently factor $N$.

Theorem 9 readily follows from Theorem 10. In fact, by definition of $e$ and $d$, there exists an integer $k$ such that

$$ed - k(N - p - q + 1) = 1.$$

Since $d < \varphi(N)$, we must have $0 < k \le e$. Reducing the equation modulo $2^{n/4}$ and setting $q = N/p$, we obtain

$$(ed)p - kp(N - p + 1) + kN = p \pmod{2^{n/4}}.$$

Since Marvin is given the $n/4$ least significant bits of $d$, he knows the value of $ed \bmod 2^{n/4}$. Consequently, he obtains an equation in $k$ and $p$. For each of the $e$ possible values of $k$, Marvin solves the quadratic equation in $p$ and obtains a number of candidate values for $p \bmod 2^{n/4}$. For each of these candidate values, he runs the algorithm of Theorem 10 to attempt to factor $N$. One can show that the total number of candidate values for $p \bmod 2^{n/4}$ is at most $e \log_2 e$. Hence after at most $e \log_2 e$ attempts, $N$ will be factored.
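
The congruence in $k$ and $p$ can be checked numerically. The sketch below uses a hypothetical 60-bit toy key (an assumption for illustration only) and confirms that the true pair $(k, p \bmod 2^{n/4})$ satisfies the equation when only the low quarter of the bits of $d$ is used. It does not implement the quadratic solving or the Coppersmith step of Theorem 10.

```python
# Hypothetical toy key; real moduli are 1024 bits or more.
p, q, e = 1000000007, 998244353, 3
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)                 # private exponent (Python 3.8+)

k = (e * d - 1) // phi              # the integer k with ed - k*phi(N) = 1

quarter = N.bit_length() // 4
mod = 1 << quarter                  # 2^(n/4)
d_low = d % mod                     # the bits of d that Marvin is given

def congruence_holds(k_guess, p_guess):
    """Check (e*d_low)*p - k*p*(N - p + 1) + k*N = p (mod 2^(n/4))."""
    lhs = (e * d_low * p_guess
           - k_guess * p_guess * (N - p_guess + 1)
           + k_guess * N)
    return (lhs - p_guess) % mod == 0

true_pair_ok = congruence_holds(k, p % mod)
```

In an actual attack Marvin would loop over every `k_guess` from 1 to `e` and solve for `p_guess`, rather than test a known value.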

Theorem 9 is known as a partial key-exposure attack. Similar attacks exist for larger values of $e$ as long as $e < \sqrt{N}$. However the techniques are a bit more complex [5]. It is interesting that discrete log-based cryptosystems, such as the ElGamal public key system, do not seem susceptible to partial key exposure. Indeed, if $g^x \bmod p$ and a constant fraction of the bits of $x$ are given, there is no known polynomial-time algorithm to compute the rest of $x$.

To conclude the section we show that when the encryption exponent $e$ is small, the RSA system leaks half the most significant bits of the corresponding private key $d$. To see this, consider once again the equation $ed - k(N - p - q + 1) = 1$ for an integer $0 < k \le e$. Given $k$, Marvin can easily compute

$$\hat{d} = \lfloor (kN + 1)/e \rfloor.$$

Then

$$|\hat{d} - d| \le k(p + q)/e \le 3k\sqrt{N}/e < 3\sqrt{N}.$$

Hence, $\hat{d}$ is a good approximation for $d$. The bound shows that, for most $d$, half the most significant bits of $\hat{d}$ are equal to those of $d$. Since there are only $e$ possible values for $k$, Marvin can construct a small set of size $e$ such that one of the elements in the set is equal to half the most significant bits of $d$. The case $e = 3$ is especially interesting. In this case one can show that always $k = 2$ and hence the system completely leaks half the most significant bits of $d$.
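
A small numeric check of the approximation $\hat{d}$ may help. The key below is a hypothetical toy key (an assumption, not from the survey); the sketch builds the $e$ candidates and measures how far each is from the true $d$.

```python
from math import isqrt

p, q, e = 1000000007, 998244353, 3   # hypothetical toy key
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)                  # Python 3.8+

def approximations(N, e):
    """One candidate d_hat = floor((k*N + 1) / e) for each k in 1..e."""
    return [(k * N + 1) // e for k in range(1, e + 1)]

candidates = approximations(N, e)    # computed from public values only
errors = [abs(d_hat - d) for d_hat in candidates]
best_k = errors.index(min(errors)) + 1
```

Only the comparison against `d` uses secret data; Marvin himself would simply keep all `e` candidates.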

5 Implementation Attacks

We turn our attention to an entirely different class of attacks. Rather than attacking the underlying structure of the RSA function, these attacks focus on the implementation of RSA.

5.1 Timing Attacks

Consider a smartcard that stores a private RSA key. Since the card is tamper resistant, an attacker Marvin may not be able to examine its contents and expose the key. However, a clever attack due to Kocher [16] shows that by precisely measuring the time it takes the smartcard to perform an RSA decryption or signature, Marvin can quickly discover the private decryption exponent $d$.

We explain how to mount the attack against a simple implementation of RSA using the "repeated squaring algorithm". Let $d = d_n d_{n-1} \ldots d_0$ be the binary representation of $d$ (i.e., $d = \sum_{i=0}^{n} 2^i d_i$ with $d_i \in \{0, 1\}$). The repeated squaring algorithm computes $C = M^d \bmod N$, using at most $2n$ modular multiplications. It is based on the observation that $C = \prod_{i=0}^{n} M^{2^i d_i} \bmod N$. The algorithm works as follows:

Set $z$ equal to $M$ and $C$ equal to 1. For $i = 0, \ldots, n$, do these steps:

(1) if $d_i = 1$ set $C$ equal to $C \cdot z \bmod N$,

(2) set $z$ equal to $z^2 \bmod N$.

At the end, $C$ has the value $M^d \bmod N$.

The variable $z$ runs through the set of values $M^{2^i} \bmod N$ for $i = 0, \ldots, n$. The variable $C$ "collects" the appropriate powers in the set to obtain $M^d \bmod N$.
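
The two steps of the repeated squaring algorithm translate almost directly into code. The following is a minimal sketch; the function name is ours, and the loop reads the bits of $d$ from the least significant end, as in the description.

```python
def repeated_squaring(M, d, N):
    """Compute M^d mod N, scanning the bits of d from least significant."""
    z, C = M % N, 1
    while d > 0:
        if d & 1:            # step (1): d_i = 1, so multiply C by z
            C = (C * z) % N
        z = (z * z) % N      # step (2): z becomes M^(2^(i+1)) mod N
        d >>= 1
    return C
```

Note that the multiplication in step (1) happens only when the current bit is 1, so the running time depends on the bits of $d$.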

To mount the attack, Marvin asks the smartcard to generate signatures on a large number of random messages $M_1, \ldots, M_k \in \mathbb{Z}_N^*$ and measures the time $T_i$ it takes the card to generate each of the signatures.

The attack recovers bits of $d$ one at a time beginning with the least significant bit. We know $d$ is odd. Thus $d_0 = 1$. Consider the second iteration. Initially $z = M^2 \bmod N$ and $C = M$. If $d_1 = 1$, the smartcard computes the product $C \cdot z = M \cdot M^2 \bmod N$. Otherwise, it does not. Let $t_i$ be the time it takes the smartcard to compute $M_i \cdot M_i^2 \bmod N$. The $t_i$'s differ from each other since the time to compute $M_i \cdot M_i^2 \bmod N$ depends on the value of $M_i$ (simple modular reduction algorithms take a different amount of time depending on the value being reduced). Marvin measures the $t_i$'s offline (prior to mounting the attack) once he obtains the physical specifications of the card.

Kocher observed that when $d_1 = 1$, the two ensembles $\{t_i\}$ and $\{T_i\}$ are correlated. For instance, if, for some $i$, $t_i$ is much larger than its expectation, then $T_i$ is also likely to be larger than its expectation. On the other hand, if $d_1 = 0$, the two ensembles $\{t_i\}$ and $\{T_i\}$ behave as independent random variables. By measuring the correlation, Marvin can determine whether $d_1$ is 0 or 1. Continuing in this way, he can recover $d_2$, $d_3$, and so on. Note that when a low public exponent $e$ is used, the partial key exposure attack of the previous section shows that Kocher's timing attack need only be employed until a quarter of the bits of $d$ are discovered.
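
The correlation test can be simulated. Everything in the sketch below is an assumption made for illustration: the leak model `mul_time` (cost equals the Hamming weight of the product), the toy modulus, and the two 17-bit exponents, which differ only in $d_1$ and are not claimed to be valid private keys. It is not a model of any real smartcard.

```python
import random

N = 1000000007 * 998244353               # hypothetical toy modulus

def mul_time(a, b):
    """Toy leak model (an assumption): cost = Hamming weight of a*b mod N."""
    return bin(a * b % N).count("1")

def sign_time(M, d):
    """Total simulated time T of repeated squaring on message M."""
    z, C, total = M, 1, 0
    while d > 0:
        if d & 1:
            total += mul_time(C, z)       # data-dependent multiplication
            C = C * z % N
        total += mul_time(z, z)           # squaring happens every iteration
        z = z * z % N
        d >>= 1
    return total

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def bit1_correlation(d, samples=4000, seed=1):
    """Correlate t_i = time(M_i * M_i^2) with the total signing time T_i."""
    rng = random.Random(seed)
    msgs = [rng.randrange(2, N) for _ in range(samples)]
    t = [mul_time(m, m * m % N) for m in msgs]
    T = [sign_time(m, d) for m in msgs]
    return pearson(t, T)

corr_one = bit1_correlation(0b10110100110100011)   # exponent with d_1 = 1
corr_zero = bit1_correlation(0b10110100110100001)  # same exponent, d_1 = 0
```

Under this model `corr_one` should be clearly positive while `corr_zero` should stay near zero, which is the signal Marvin uses to decide the bit.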

There are two ways to defend against the attack. The simplest is to add appropriate delay so that modular exponentiation always takes a fixed amount of time. The second approach, due to Rivest, is based on blinding. Prior to decryption of $M$ the smartcard picks a random $r \in \mathbb{Z}_N^*$ and computes $M' = M \cdot r^e \bmod N$. It then applies $d$ to $M'$ and obtains $C' = (M')^d \bmod N$. Finally, the smartcard sets $C = C'/r \bmod N$. With this approach, the smartcard is applying $d$ to a random message $M'$ unknown to Marvin. As a result, Marvin cannot mount the attack.
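
Blinding is short enough to sketch in full. The toy key is hypothetical, and Python's `pow` stands in for the card's exponentiation routine; the function name is ours.

```python
import secrets
from math import gcd

p, q, e = 1000000007, 998244353, 3       # hypothetical toy key
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))        # Python 3.8+

def blinded_private_op(M):
    """Apply d to M via a random multiple M' = M * r^e, then unblind."""
    while True:
        r = secrets.randbelow(N - 2) + 2  # r in [2, N-1]
        if gcd(r, N) == 1:                # r must lie in Z_N^*
            break
    M_blind = M * pow(r, e, N) % N
    C_blind = pow(M_blind, d, N)          # the exponentiation only sees M'
    return C_blind * pow(r, -1, N) % N    # C = C' / r mod N
```

Unblinding works because $(M r^e)^d = M^d \cdot r \bmod N$, so dividing by $r$ leaves $M^d \bmod N$ whatever $r$ was chosen.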

Kocher recently discovered another attack along these lines called power cryptanalysis. Kocher showed that by precisely measuring the smartcard's power consumption during signature generation, Marvin can often easily discover the secret key. As it turns out, during a multi-precision multiplication the card's power consumption is higher than normal. By measuring the length of high consumption periods, Marvin can easily determine if in a given iteration the card performs one or two multiplications, thus exposing the bits of $d$.

5.2 Random Faults

Implementations of RSA decryption and signatures frequently use the Chinese Remainder Theorem to speed up the computation of $M^d \bmod N$. Instead of working modulo $N$, the signer Bob first computes the signatures modulo $p$ and $q$ and then combines the results using the Chinese Remainder Theorem. More precisely, Bob first computes

$$C_p = M^{d_p} \bmod p \quad\text{and}\quad C_q = M^{d_q} \bmod q,$$

where $d_p = d \bmod (p - 1)$ and $d_q = d \bmod (q - 1)$. He then obtains the signature $C$ by setting

$$C = T_1 C_p + T_2 C_q \bmod N,$$

where

$$T_1 = \begin{cases} 1 \bmod p \\ 0 \bmod q \end{cases} \quad\text{and}\quad T_2 = \begin{cases} 0 \bmod p \\ 1 \bmod q \end{cases}.$$

The running time of the last CRT step is negligible compared to the two exponentiations. Note that $p$ and $q$ are half the length of $N$. Since simple implementations of multiplication take quadratic time, multiplication modulo $p$ is four times faster than modulo $N$. Furthermore, $d_p$ is half the length of $d$ and consequently computing $M^{d_p} \bmod p$ is eight times faster than computing $M^d \bmod N$. Overall signature time is thus reduced by a factor of four. Many implementations use this method to improve performance.
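
A minimal sketch of CRT signing follows. The toy key is hypothetical, and the constants $T_1, T_2$ are built in one standard way, as $q \cdot (q^{-1} \bmod p)$ and $p \cdot (p^{-1} \bmod q)$; the survey does not prescribe how they are computed.

```python
p, q, e = 1000000007, 998244353, 3       # hypothetical toy key
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))        # Python 3.8+

d_p, d_q = d % (p - 1), d % (q - 1)
T1 = q * pow(q, -1, p)                   # T1 = 1 mod p and 0 mod q
T2 = p * pow(p, -1, q)                   # T2 = 0 mod p and 1 mod q

def sign_crt(M):
    """Sign M with two half-size exponentiations and one CRT combination."""
    C_p = pow(M, d_p, p)
    C_q = pow(M, d_q, q)
    return (T1 * C_p + T2 * C_q) % N
```

The result agrees with the direct computation `pow(M, d, N)`; only the cost differs.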

Boneh, DeMillo, and Lipton [3] observed that there is an inherent danger in using the CRT method. Suppose that while generating a signature, a glitch on Bob's computer causes it to miscalculate in a single instruction. Say, while copying a register from one location to another, one of the bits is flipped. (A glitch may be caused by ambient electromagnetic interference or perhaps by a rare hardware bug, like the one found in an early version of the Pentium chip.) Given an invalid signature, an adversary Marvin can easily factor Bob's modulus $N$.

We present a version of the attack as described by A. K. Lenstra. Suppose a single error occurs while Bob is generating a signature. As a result, exactly one of $C_p$ or $C_q$ will be computed incorrectly. Say $C_p$ is correct, but $\hat{C}_q$ is not. The resulting signature is $\hat{C} = T_1 C_p + T_2 \hat{C}_q$. Once Marvin receives $\hat{C}$, he knows it is a false signature since $\hat{C}^e \ne M \bmod N$. However, notice that

$$\hat{C}^e = M \bmod p \quad\text{while}\quad \hat{C}^e \ne M \bmod q.$$

As a result, $\gcd(N, \hat{C}^e - M)$ exposes a nontrivial factor of $N$.
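
The gcd step can be reproduced on a toy key. In the sketch below the key is hypothetical and the glitch is simulated by flipping the low bit of $C_q$; both are assumptions for illustration. Marvin's function uses only the public values $N$, $e$, $M$ and the faulty signature.

```python
from math import gcd

p, q, e = 1000000007, 998244353, 3       # hypothetical toy key
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))        # Python 3.8+
T1, T2 = q * pow(q, -1, p), p * pow(p, -1, q)

def faulty_sign_crt(M):
    """CRT signature in which a simulated glitch flips one bit of C_q."""
    C_p = pow(M, d % (p - 1), p)
    C_q = pow(M, d % (q - 1), q) ^ 1      # injected fault (low bit flipped)
    return (T1 * C_p + T2 * C_q) % N

def factor_from_fault(M, C_bad):
    """Marvin's step: gcd(N, C_bad^e - M), computed from public values."""
    return gcd(N, (pow(C_bad, e, N) - M) % N)

M = 123456789
factor = factor_from_fault(M, faulty_sign_crt(M))
```

Here the fault is in $C_q$, so the gcd reveals $p$; a fault in $C_p$ would reveal $q$ instead.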

For the attack to work, Marvin must have full knowledge of $M$. Namely, we are assuming Bob does not use any random padding procedure. Random padding prior to signing defeats the attack. A simpler defense is for Bob to check the generated signature before sending it out to the world. Checking is especially important when using the CRT speedup method. Random faults are hazardous to many cryptographic systems. Many systems, including a non-CRT implementation of RSA, can be attacked using random faults [3]. However, these results are far more theoretical.

5.3 Bleichenbacher's Attack on PKCS 1

Let $N$ be an $n$-bit RSA modulus and $M$ be an $m$-bit message with $m < n$. Before applying RSA encryption it is natural to pad the message $M$ to $n$ bits by appending random bits to it. An old version of a standard known as Public Key Cryptography Standard 1 (PKCS 1) uses this approach. After padding, the message looks as follows:

    02 | Random | 00 | M

The resulting message is $n$ bits long and is directly encrypted using RSA. The initial block containing "02" is 16 bits long and is there to indicate that a random pad has been added to the message.
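
The layout above can be sketched as a byte-level encoder. The function name and the parameter `k` (modulus length in bytes) are ours. Two details go beyond the simplified description in the text and follow the PKCS #1 v1.5 encoding rules: the random bytes are nonzero, so that the 00 separator is unambiguous, and the pad is at least 8 bytes long.

```python
import secrets

def pkcs1_pad(message, k):
    """Build the k-byte block 00 02 || random nonzero bytes || 00 || message."""
    pad_len = k - len(message) - 3
    if pad_len < 8:
        raise ValueError("message too long for this modulus size")
    # Each pad byte is drawn from 1..255 so that it can never be 00.
    pad = bytes(secrets.randbelow(255) + 1 for _ in range(pad_len))
    return b"\x00\x02" + pad + b"\x00" + message

block = pkcs1_pad(b"hi", 128)            # 128 bytes = 1024-bit modulus
```

The block is then interpreted as an integer and encrypted with the public key.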

When a PKCS 1 message is received by Bob's machine, an application (e.g., a web browser) decrypts it, checks the initial block, and strips off the random pad. However, some applications check for the "02" initial block and if it is not present they send back an error message saying "invalid ciphertext". Bleichenbacher [2] showed that this error message can lead to disastrous consequences: using the error message, an attacker Marvin can decrypt ciphertexts of his choice.

Suppose Marvin intercepts a ciphertext $C$ intended for Bob and wants to decrypt it. To mount the attack, Marvin picks a random $r \in \mathbb{Z}_N^*$, computes $C' = rC \bmod N$, and sends $C'$ to Bob's machine. An application running on Bob's machine receives $C'$ and attempts to decrypt it. It either responds with an error message or does not respond at all (if $C'$ happens to be properly formatted). Hence, Marvin learns whether the most significant 16 bits of the decryption of $C'$ are equal to 02. In effect, Marvin has an oracle that tests for him whether the 16 most significant bits of the decryption of $rC \bmod N$ are equal to 02, for any $r$ of his choice. Bleichenbacher showed that such an oracle is sufficient for decrypting $C$.
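
The oracle itself is easy to model, even though the full decryption procedure is not described in this survey. The sketch below shows only the interface Marvin exploits; the toy key and the function names are hypothetical, and the search strategy that turns oracle answers into a plaintext is left out.

```python
p, q, e = 1000000007, 998244353, 3       # hypothetical toy key
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))        # Python 3.8+
K_BITS = 8 * ((N.bit_length() + 7) // 8) # modulus length rounded up to bytes

def oracle(C):
    """Bob's machine: True iff the decryption starts with the 16-bit block 0002."""
    return pow(C, d, N) >> (K_BITS - 16) == 0x0002

def marvin_query(C, r):
    """Marvin's probe: submit C' = r*C mod N and observe accept or reject."""
    return oracle(r * C % N)

# A conforming plaintext (top 16 bits equal to 0002) and one that is not.
padded = (0x0002 << (K_BITS - 16)) | 0x1234567800AB
accepts = oracle(pow(padded, e, N))
rejects = oracle(pow(5, e, N))
```

Each call to `marvin_query` leaks one yes/no answer about the decryption of a ciphertext related to $C$, which is all the attack requires.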

6 Conclusions

Two decades of research into inverting the RSA function produced some insightful attacks, but no devastating attack has ever been found. The attacks discovered so far mainly illustrate the pitfalls to be avoided when implementing RSA. At the moment it appears that proper implementations can be trusted to provide security in the digital world.

We categorized attacks on RSA into four categories: (1) elementary attacks that exploit blatant misuse of the system, (2) low private exponent attacks serious enough that a low private exponent should never be used, (3) low public exponent attacks, and (4) attacks on the implementation. These last attacks illustrate that a study of the underlying mathematical structure is insufficient. Desmedt and Odlyzko [10], Joye and Quisquater [15], and de Jonge and Chaum [9] describe some additional attacks. Throughout the paper we observed that many attacks can be defeated by properly padding the message prior to encryption or signing.

Acknowledgments

I thank Susan Landau for encouraging me to write the survey and Tony Knapp for his help in

editing the manuscript. I am also grateful to Mihir Bellare, Igor Shparlinski, and R. Venkatesan

for comments on an earlier draft.

References

[1] M. Bellare and P. Rogaway. Optimal asymmetric encryption. In EUROCRYPT '94, volume 950 of Lecture Notes in Computer Science, pages 92-111. Springer-Verlag, 1994.

[2] D. Bleichenbacher. Chosen ciphertext attacks against protocols based on the RSA encryption standard PKCS 1. In CRYPTO '98, volume 1462 of Lecture Notes in Computer Science, pages 1-12. Springer-Verlag, 1998.

[3] D. Boneh, R. DeMillo, and R. Lipton. On the importance of checking cryptographic protocols for faults. In EUROCRYPT '97, volume 1233 of Lecture Notes in Computer Science, pages 37-51. Springer-Verlag, 1997.

[4] D. Boneh and G. Durfee. New results on cryptanalysis of low private exponent RSA. Preprint, 1998.

[5] D. Boneh, G. Durfee, and Y. Frankel. An attack on RSA given a fraction of the private key bits. In AsiaCrypt '98, volume 1514 of Lecture Notes in Computer Science, pages 25-34. Springer-Verlag, 1998.

[6] D. Boneh and R. Venkatesan. Breaking RSA may not be equivalent to factoring. In EUROCRYPT '98, volume 1403 of Lecture Notes in Computer Science, pages 59-71. Springer-Verlag, 1998.

[7] D. Coppersmith. Small solutions to polynomial equations, and low exponent RSA vulnerabilities. Journal of Cryptology, 10:233-260, 1997.

[8] D. Coppersmith, M. Franklin, J. Patarin, and M. Reiter. Low-exponent RSA with related messages. In EUROCRYPT '96, volume 1070 of Lecture Notes in Computer Science, pages 1-9. Springer-Verlag, 1996.

[9] W. de Jonge and D. Chaum. Attacks on some RSA signatures. In CRYPTO '85, volume 218 of Lecture Notes in Computer Science, pages 18-27. Springer-Verlag, 1986.

[10] Y. Desmedt and A. Odlyzko. A chosen text attack on the RSA cryptosystem and some discrete logarithm schemes. In CRYPTO '85, Lecture Notes in Computer Science, pages 516-522. Springer-Verlag, 1985.

[11] S. Goldwasser. The search for provably secure cryptosystems. In Cryptology and Computational Number Theory, volume 42 of Proceedings of the 42nd Symposium in Applied Mathematics. American Mathematical Society, 1990.

[12] G. H. Hardy and E. M. Wright. An Introduction to the Theory of Numbers. Oxford Clarendon Press, 1975. Fourth edition.

[13] J. Hastad. Solving simultaneous modular equations of low degree. SIAM J. of Computing, 17:336-341, 1988.

[14] N. Howgrave-Graham. Finding small roots of univariate modular equations revisited. In Cryptography and Coding, volume 1355 of Lecture Notes in Computer Science, pages 131-142. Springer-Verlag, 1997.

[15] M. Joye and J.-J. Quisquater. On the importance of securing your bins: The garbage-man-in-the-middle attack. In 4th ACM Conference on Computer and Communications Security, pages 135-141. ACM Press, 1997.

[16] P. Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In CRYPTO '96, volume 1109 of Lecture Notes in Computer Science, pages 104-113. Springer-Verlag, 1996.

[17] L. Lovász. An Algorithmic Theory of Numbers, Graphs and Convexity. SIAM Publications, 1986.

[18] A. K. Lenstra and H. W. Lenstra, Jr. Algorithms in number theory. In Handbook of Theoretical Computer Science (Volume A: Algorithms and Complexity), chapter 12, pages 673-715. Elsevier and MIT Press, 1990.

[19] A. Menezes, P. van Oorschot, and S. Vanstone. Handbook of Applied Cryptography. CRC, 1996.

[20] C. Pomerance. A tale of two sieves. Notices Amer. Math. Soc., 43:1473-1485, 1996.

[21] R. L. Rivest, A. Shamir, and L. Adleman. A method for obtaining digital signatures and public key cryptosystems. Commun. of the ACM, 21:120-126, 1978.

[22] M. Wiener. Cryptanalysis of short RSA secret exponents. IEEE Transactions on Information Theory, 36:553-558, 1990.