<<

Timing Attacks on Implementations of

Die-Hellman, RSA, DSS, and Other Systems

Paul C. Ko cher

Cryptography Research, Inc.

607 Market Street, 5th Flo or, San Francisco, CA 94105, USA.

E-mail: paul@.com.

Abstract. By carefully measuring the amount of time required to p er-

form private op erations, attackers may b e able to nd xed Die-

Hellman exp onents, factor RSA keys, and break other .

Against a vulnerable system, the attack is computationally inexp ensive

and often requires only known . Actual systems are p otentially

at risk, including cryptographic tokens, network-based cryptosystems,

and other applications where attackers can make reasonably accurate

timing measurements. Techniques for preventing the attack for RSA and

Die-Hellman are presented. Some cryptosystems will need to be re-

vised to protect against the attack, and new proto cols and algorithms

may need to incorp orate measures to prevent timing attacks.

Keywords: timing attack, , RSA, Die-Hellman, DSS.

1 Intro duction

Cryptosystems often take slightly di erent amounts of time to pro cess di erent

inputs. Reasons include p erformance optimizations to bypass unnecessary op-

erations, branching and conditional statements, RAM cache hits, pro cessor in-

structions (suchasmultiplication and division) that run in non- xed time, and

awidevariety of other causes. Performance characteristics typically dep end on

b oth the key and the input data (e.g., plaintext or ciphertext). While

it is known that timing channels can leak data or keys across a controlled p erime-

ter, intuition might suggest that unintentional timing characteristics would only

reveal a small amount of information from a (such as the Ham-

ming weightofthekey). However, attacks are presented which can exploit timing

measurements from vulnerable systems to nd the entire secret key.

2 Cryptanalysis of a Simple Mo dular Exp onentiator

Die-Hellman[2] and RSA[8] private-key op erations consist of computing R =

x

y mo d n, where n is public and y can be found byan eavesdropp er. The at-

tacker's goal is to nd x, the secret key.For the attack, the victim must com-

x

pute y mo d n for several values of y ,wherey , n, and the computation time are

known to the attacker. (If a new secret exp onent x is chosen for each op eration,

the attackdoesnotwork.) The necessary information and timing measurements

might b e obtained by passively eavesdropping on an interactive proto col, since

an attacker could record the messages received by the target and measure the

amount of time taken to resp ond to each y . The attack assumes that the attacker

knows the design of the target system, although in practice this could probably

b e inferred from timing information.

The attack can b e tailored to work with virtually any implementation that

do es not run in xed time, but is rst outlined using the simple mo dular exp o-

x

nentiation algorithm b elowwhich computes R = y mo d n, where x is w bits

long:

Let s =1.

0

For k =0 upto w 1:

If (bit k of x) is 1 then

Let R =(s  y )mod n.

k k

Else

Let R = s .

k k

2

Let s = R mo d n.

k +1

k

EndFor.

Return (R ).

w 1

The attackallows someone who knows exp onentbits0::(b-1) to nd bit b.To

obtain the entire exp onent, start with b equal to 0 and rep eat the attackuntil

the entire exp onentisknown.

Because the rst b exp onent bits are known, the attacker can compute the

rst b iterations of the For lo op to nd the value of s . The next iteration requires

b

the rst unknown exp onent bit. If this bit is set, R = (s  y )modn will be

b b

computed. If it is zero, the op eration will b e skipp ed.

The attack will be describ ed rst in an extreme hyp othetical case. Sup-

p ose the target system uses a mo dular multiplication function that is nor-

mally extremely fast but o ccasionally takes much more time than an entire

normal mo dular exp onentiation. For a few s and y values the calculation of

b

R =(s  y )mod n will b e extremely slow, and by using knowledge ab out the

b b

target system's design the attacker can determine which these are. If the total

mo dular exp onentiation time is ever fast when R =(s  y )modn is slow, exp o-

b b

nentbit b must b e zero. Conversely, if slow R =(s  y )modn op erations always

b b

result in slow total mo dular exp onentiation times, the exp onent bit is probably

set. Once exp onentbitb is known, the attacker can verify that the overall op er-

2

ationtimeisslow whenever s = R mo d n is exp ected to b e slow. The same

b+1

b

set of timing measurements can then b e reused to nd the following exp onent

bits.

3 Error Correction

If exp onent bit b is guessed incorrectly, the values computed for R will be

k b

incorrect and, so far as the attack is concerned, essentially random. The time 2

required for multiplies following the error will not be re ected in the overall

exp onentiation time. The attackthus has an error-detection prop erty; after an

incorrect exp onent bit guess, no more meaningful correlations are observed.

The error detection prop erty can b e used for error correction. For example,

the attacker can maintain a list of the most likely exp onentintermediates along

with a value corresp onding to the probability each is correct. The attack is

continued for only the most likely candidate. If the currently-favored value is

incorrect, it will tend to fall in ranking, while correct values will tend to rise.

Error correction techniques increase the memory and pro cessing requirements

for the attack, but can greatly reduce the numb er of samples required.

4 The General Attack

The attack can b e treated as a signal detection problem. The \signal" consists

of the timing variation due to the target exp onent bit, and \noise" results from

measurement inaccuracies and timing variations due to unknown exp onent bits.

The prop erties of the signal and noise determine the numb er of timing measure-

ments required to for the attack.

Given j messages y ;y ;:::;y with corresp onding timing measurements

0 1 j 1

T ;T ; :::; T , the probability that a guess x for the rst b exp onent bits is

0 1 j 1 b

correct is prop ortional to

j 1

Y

P (x ) / F (T t(y ;x ))

b i i b

i=0

where t(y ;x ) is the amount of time required for the rst b iterations of the

i b

x

y mo d n computation using exp onentbitsx , and F is the exp ected probability

b

i

distribution function of T t(y; x )over all y values and correct x . Because F

b b

is de ned as the probability distribution of T t(y ;x )if x is correct, it is the

i i b b

best function for predicting T t(y ;x ). Note that the timing measurements

i i b

and intermediate s values can b e used improve the estimate of F .

Given a correct guess for x , there are two p ossible values for x . The

b1 b

0

probabilitythat x is correct and x is incorrect can b e found as

b

b

Q

j 1

F (T t(y ;x ))

i i b

i=0

:

Q Q

j 1 j 1

0

F (T t(y ;x )) + F (T t(y ;x ))

i i b i i

i=0 i=0 b

In practice, this formula is not very useful b ecause nding F would require

extraordinary e ort.

5 Simplifying the Attack

Fortunately it is generally not necessary to compute F . Each timing observation

P

w 1

consists of T = e + t , where t is the time required for the multiplication

i i

i=0

and squaring steps for bit i and e includes measurement error, lo op overhead, 3

P

b1

etc. Given guess x , the attacker can nd t for each sample y . If x is

b i b

i=0

P P P

w 1 b1 w 1

t . Since t = e + t correct, subtracting from T yields e +

i i i

i=b i=0 i=0

the mo dular multiplication times are e ectively indep endent from each other

P

w 1

t over all observed and from the measurement error, the variance of e +

i

i=b

samples is exp ected to b e Var(e)+(w b)Var(t). However if only the rst c

bits of the exp onent guess are correct, the exp ected variance will be Var(e)+

(w b +2c)Var(t). Correctly-emulated iterations decrease the exp ected variance

byVar(t), while iterations following an incorrect exp onent bit each increase the

variance byVar(t). Computing the variances is easy and provides a go o d wayto

identify correct exp onent bit guesses.

It is now p ossible to estimate the numb er of samples required for the attack.

Supp ose an attacker has j accurate timing measurements and has two guesses

for the rst b bits of a w -bit exp onent, one correct and the other incorrect with

the rst error at bit c.For each guess the timing measurements can b e adjusted

P

b1

by t . The correct guess will b e identi ed successfully if its adjusted values

i

i=0

have the smaller variance.

It is p ossible to approximate t using indep endent standard normal variables.

i

If Var(e) is negligible, the exp ected probability of a correct guess is

!

j 1 j 1

   

X X

2 2 p

p p

P > w bX + 2(b c)Y w bX

i i i

i=0 i=0

!

j 1 j 1

X X

p

2

2(b c)(w b) X Y +2(b c) Y > 0 = P 2

i i

i

i=0 i=0

where X and Y are normal random variables with  =0 and  = 1. Because j

P P

j 1 j 1

2

is relatively large, Y  j and X Y is approximately normal with

i i

i

i=0 i=0

p

j , yielding  = 0 and  =

!

p

 

p

p

j (b c)

p

P 2 2(b c)(w b)( jZ )+2(b c)j>0 = P Z>

2(w b)

where Z is a standard normal random variable. Finally,integrating to nd the

q

 

j (bc)

, where (x) is the area under probability of a correct guess yields 

2(w b)

the standard normal curvefrom1 to x. The required numb er of samples (j )is

thus prop ortional to the exp onent size (w ). The numb er of measurements might

b e reduced if attackers cho ose inputs known to have extreme timing character-

istics at exp onent lo cations of interest.

6 Exp erimental Results

6

Figure 1 shows the distribution of 10 mo dular multiplication times observed

TM

using the RSAREF to olkit[10] on a 120-MHz Pentium computer running

TM

MSDOS . The distribution was prepared by timing one million (a  b mo d n)

calculations using a and b values from actual mo dular exp onentiation op erations 4

#

with random inputs. The 512-bit sample prime 1 from the RSAREF Die-

Hellman demonstration program was used for n. A few wildly ab errant samples

(whichtookover 1300s) were discarded. The Figure 1 distribution has mean 

= 1167.8s and standard deviation  =12:01s. The measurement error is small;

the tests were run twice and the average measurement di erence was found to

be under 1s. RSAREF uses the same function for squaring and multiplication,

so squaring and multiplication times haveidentical distributions.

2 3

RSAREF precomputes y and y mo d n and pro cesses two exp onent bits

at a time. In total, a 512-bit mo dular exp onentiation with a random 256-bit

exp onent requires 128 iterations of the mo dular exp onentiation lo op and a total

of ab out 352 mo dular multiplication and squaring op erations. Each iteration

of the mo dular exp onentiation lo op do es two squaring op erations and, if either

exp onent bit is nonzero, one multiply. The attack can be adjusted to app end

pairs of exp onent bits and to evaluate four candidate values at each exp onent

p osition instead of two.

Since mo dular multiplications consume most of the total mo dular exp onen-

tiation time, it is exp ected that the distribution of mo dular exp onentiation

times will b e approximately normal with   (1167:8)(352) = 411; 065:6s and

p

  12:01 352 = 225:3s. Figure 2 shows measurements from 5000 actual mo d-

ular exp onentiation op erations using the same computer and mo dulus, which

yielded  = 419; 901s and  =235s.

With 250 timing measurements, the probability that subtracting the time for

a correct mo dexp lo op iteration from each sample will reduce the total variance

q

 

j (bc)

, more than subtracting an incorrect iteration is estimated to be 

2(w b)

where j = 250, b = 1, c = 0, and w = 127. (There are 128 iterations of the

RSAREF mo dexp lo op for a 256-bit exp onent, but the rst iteration is ignored.)

q

 

250(10)

 0:84. The Correct guesses are thus exp ected with probability 

2(126) 5

5000 samples from Figure 2 were divided into 20 groups of 250 samples each,

and variances from subtracting the time for incorrect and correct mo dexp lo op

iterations were compared at each of the 127 exp onent bit pairs. Of the 2450

trials, 2168 pro duced a larger variance after subtracting an incorrect mo dexp

lo op time than after subtracting the time for a correct mo dexp lo op, yielding a

probability of 0.885. The rst exp onent bits are most dicult, since b b ecomes

larger as more exp onent bits b ecome known and the probabilities should improve.

(The test ab ove did not take advantageofthisproperty.) It is imp ortanttonote

that accurate timing measurements were used; measurement errors which are

large relative to the total mo dular exp onentiation time standard deviation will

increase the numb er of samples needed.

The attack is computationally quite easy. With RSAREF, the attacker has

to evaluate four choices p er pair of bits. Thus the attacker only has to do four

times the numb er of op erations done by the victim, not counting e ort wasted

by incorrect guesses.

7 Montgomery Multiplication and the CRT

Mo dular reduction steps usually cause most of the timing variation in a mo du-

lar multiplication op eration. Montgomery multiplication[6] eliminates the mo d n

reduction steps and, as a result, tends to reduce the size of the timing character-

istics. However, some variation usually remains. If the remaining \signal" is not

P

w 1

dwarfed by measurement errors, the variance in t and the variance of t

b i

i=b+1

would b e reduced prop ortionally and the attackwould still work. However if the

measurement error e is large, the required number of samples will increase in

1

. prop ortion to

Var(t )

i

The Chinese Remainder Theorem (CRT) is also often used to optimize RSA

private key op erations. With CRT, (y mo d p)and(y mo d q ) are computed rst,

where y is the message. These initial mo dular reduction steps can b e vulnerable

to timing attacks. The simplest such attack is to cho ose values of y that are

close to p or q , then use timing measurements to determine whether the guessed

value is larger or smaller than the actual value of p (or q ). If y is less than

p, computing y mo d p has no e ect, while if y is larger than p,itis necessary

to subtract p from y at least once. Also, if the message is very slightly larger

than p, y mo d p will have leading zero digits, whichmay reduce the amountof

time required for the rst multiplication step. The sp eci c timing characteristics

dep end on the implementation. RSAREF's mo dular reduction function with a

512-bit mo dulus the Pentium computer with y chosen randomly b etween 0 and

2p takes an average of 42.1sif y p. Timing

measurements from many y could b e combined to successively approximate p.

In some cases it may b e p ossible to improve the Chinese Remainder Theorem

RSA attack to use known (not chosen) , reducing the numb er of mes-

sages required and making it p ossible to attack RSA digital signatures. Mo dular

reduction is done by subtracting multiples of the mo dulus, and exploitable timing

variations can b e caused byvariations in the numb er of compare-and-subtract 6

steps. For example, RSAREF's division lo op integer-divides the upp ermost two

digits of y by one more than the upp er digit of p,multiplies p by the quotient,

shifts left the appropriate numb er of digits, then subtracts the result from y .If

the result is larger than p (shifted left), a extra subtraction is p erformed. The

decision whether to perform an extra subtraction step in the rst lo op of the

division algorithm usually dep ends only on y (which is known) and the upp er

two digits of p. A timing attack could be used to determine the upp er digits

of p. For example, an exhaustive search over all p ossible values for the upp er

two digits of p (or more ecient techniques) could identify value for whichthe

observed times correlate most closely with the exp ected numb er of subtraction

op erations. As with the Die-Hellman/non-CRTattack, once one digit of p has

b een found, the timing measurements could b e reused to nd subsequent digits.

It is not yet known whether timing attacks can b e adapted to directly attack

the mo d p and mo d q mo dular exp onentiations p erformed with the Chinese

Remainder Theorem.

8 Timing Cryptanalysis of DSS

1

The Standard[5] computes s = (k (H (m)+x  r )) mo d q ,

1

where r and q are known to attackers, k is usually precomputed, H (m)isthe

hash of the message, and x is the private key. In practice, (H (m)+x  r )modq

1

would normally b e computed rst, then is multiplied by k (mo d q ).

If the mo dular reduction function runs in non- xed time, the overall signa-

ture time should b e correlated with the time for the (x  r mo d q ) computation.

The attacker can calculate and comp ensate for the time required to compute

H (m). Since H (m)isofapproximately the same size as q , its addition has little

e ect on the reduction time. The most signi cant bits of x  r are typically the

rst used in the mo dular reduction. These dep end on r , which is known, and

the most signi cantbits of the secret value x. There would thus be a correla-

tion between values of the upp er bits of x and the total time for the mo dular

reduction. By lo oking for the strongest probabilities over the samples, the at-

tacker would try to identify the upp er bits of x. As more upp er bits of x become

known, more of x  r b ecomes known, allowing the attacker to pro ceed through

1

more iterations of the mo dular reduction lo op to attack new bits of x.Ifk is

precomputated, DSS signatures require just two mo dular multiplication op era-

tions, p otentially making the amount of additional timing noise whichmust b e

ltered out relatively small.

9 Masking Timing Characteristics

The most obvious waytoprevent timing attacks is to make all op erations take

exactly the same amount of time. Unfortunately this is often dicult. Making

software run in xed time, esp ecially in a platform-indep endent manner, is hard

b ecause compiler optimizations, RAM cache hits, instruction timings, and other 7

factors can intro duce unexp ected timing variations. If a timer is used to delay

returning results until a pre-sp eci ed time, factors suchasthe system resp on-

siveness or p ower consumption may still change detectably when the op eration

nishes. Some op erating systems also reveal pro cesses' CPU usage. Fixed time

implementations are also likely to be slow; many p erformance optimizations

cannot b e used since all op erations must take as long as the slowest op eration.

(Note: Always p erforming the optional R =(s  y )modn step do es not make

i i

an implementation run in constant time, since timing characteristics from the

squaring op eration and subsequent lo op iterations can b e exploited.)

Another approach is to make timing measurements so inaccurate that the

attack b ecomes unfeasible. Random delays added to the pro cessing time do in-

crease the numb er of ciphertexts required, but attackers can comp ensate by col-

lecting more measurements. The numb er of samples required increases roughly

as the of the timing noise. For example, if a mo dular exp onentiator whose

timing characteristics have a standard deviation of 10 ms can b e broken success-

fully with 1000 timing measurements, adding a random normally distributed

delay with 1 second standard deviation will make the attack require approxi-



2

1000ms

7

mately (1000) = 10 samples. (Note: The mean delaywould haveto

10ms

7

be several seconds to get a standard deviation of 1 second.) While 10 samples

7

is probably more than most attackers can gather, a security factor of 10 is not

usually considered adequate.

10 Preventing the Attack

Fortunately there is a b etter solution. Techniques used for blinding signatures[1]

can b e adapted to preventattackers from knowing the input to the mo dular ex-

p onentiation function. Before computing the mo dular exp onentiation op eration,

1

x

cho ose a random pair (v ;v ) such that v = v mo d n. For Die-Hellman,

i f i

f

1

x

it is simplest to cho ose a random v then compute v = (v ) mo d n. For

i f

i

RSA it is faster to cho ose a random v relatively prime to n then compute

f

1

e

v =(v ) mo d n, where e is the public exp onent. Before the mo dular exp o-

i

f

nentiation op eration, the input message should b e multiplied by v (mo d n), and

i

afterward the result is corrected by multiplying with v (mo d n). The system

f

should reject messages equal to 0 (mo d n).

Computing inverses mo d n is slow, so it is often not practical to generate a

1

x

new random (v ;v ) pair for each new exp onentiation. The v =(v ) mo d n

i f f

i

calculation itself mighteven b e sub ject to timing attacks. However (v ;v ) pairs

i f

should not be reused, since they themselves might be compromised by timing

attacks, leaving the secret exp onent vulnerable. An ecient solution to this

problem is up date v and v b efore each mo dular exp onentiation step by com-

i f

0 2 0 2

puting v = v and v = v . The total p erformance cost is small (2 mo dular

i i

f f

squarings, which can be precomputed, plus 2 mo dular multiplications). More

sophisticated up date op erations using exp onents other than 2, multiplication

with other (v ;v ) pairs, etc. can also b e used, but do not app ear to o er any

i f

advantages. 8

If (v ;v ) is secret, attackers have no useful knowledge ab out the input to

i f

the mo dular exp onentiator. Consequently the most an attacker can learn is the

general timing distribution for exp onentiation op erations. In practice, distribu-

w

tions are close to normal and the 2 exp onents cannot p ossibly b e distinguished.

However, a maliciously-designed mo dular exp onentiator could theoretically have

a distribution with sharp spikes corresp onding to exp onent bits, so blinding do es

not provably prevent timing attacks.

Even with blinding, the distribution will reveal the average time per op-

eration, which can be used to infer the Hamming weight of the exp onent. If

anonymity is imp ortant or if further masking is required, a random multiple of

'(n) can b e added to the exp onent b efore each mo dular exp onentiation. If this is

done, care must b e taken to ensure that the addition pro cess itself do es not have

timing characteristics whichreveal '(n). This technique may b e helpful in pre-

venting attacks that gain information leaked during the mo dular exp onentiation

op eration due to electromagnetic radiation, system p erformance uctuations,

changes in power consumption, etc. since the exp onent bits change with each

op eration.

11 Further Work

Timing attacks can potentially be used against other cryptosystems, includ-

ing symmetric functions. For example, in software the 28-bit C and D values

in the DES[4] are often rotated using a conditional which tests

whether a one-bit must be wrapp ed around. The additional time required to

move nonzero bits could slightly degrade the cipher's throughput or key setup

time. The cipher's p erformance can thus reveal the Hamming weight of the key,



 

56

P

56

56

n

2



log whichprovides an average of  3:95 bits of key infor-

56

56

2

n=0

2

n

16

mation. IDEA[3] uses an f () function with a mo dulo (2 +1) multiplication

op eration, which will usually run in non-constant time. RC5[7] is at risk on

platforms where rotates run in non-constant time. RAM cache hits can pro duce

timing characteristics in implementations of Blow sh[11], SEAL[9], DES, and

other ciphers if tables in memory are not used identically in every encryption.

Additional research is needed to determine whether sp eci c implementations

are at risk and, if so, the degree of their vulnerability. So far, only a few sp eci c

systems have b een studied in detail and the attacks against CRT/Montgomery

RSA and DSS are currently theoretical.

Further re nements to the attack may also be p ossible. A direct attack

against p and q in RSA with the Chinese Remainder Theorem would b e partic-

ularly imp ortant.

12 Conclusions

In general, anychannel which can carry information from a secure area to the

outside should be studied as a p otential risk. Implementation-sp eci c timing 9

characteristics provide one such channel and can sometimes be used to com-

promise secret keys. Vulnerable algorithms, proto cols, and systems need to b e

revised to incorp orate measures to resist timing cryptanalysis and related at-

tacks.

13 Acknowledgements

I am grateful to Matt Blaze, Joan Feigenbaum, Martin Hellman, Phil Karn,

Ron Rivest, and Bruce Schneier for their encouragement, helpful comments, and

suggestions for improving the manuscript.

References

1. D. Chaum, \Blind Signatures for Untraceable Payments," Advances in Cryptology:

Proceedings of Crypto 82, Plenum Press, 1983, pp. 199-203.

2. W. Die and M.E. Hellman, \New Directions in Cryptography," IEEE Transac-

tions on Information Theory, IT-22, n. 6, Nov 1976, pp. 644-654.

3. X. Lai, On the Design and Security of Block Ciphers, ETH Series in Information

Pro cessing, v. 1, Konstanz: Hartung-Gorre Verlag, 1992.

4. National Bureau of Standards, \," Federal Information

Pro cessing Standards Publication 46, January 1977.

5. National Institute of Standards and Technology, \Digital Signature Standard,"

Federal Information Pro cessing Standards Publication 186, May 1994.

6. P.L. Montgomery, \Mo dular Multiplication without Trial Division," Mathematics

of Computation, v. 44, n. 170, 1985, pp. 519-521.

7. R.L. Rivest, \The RC5 Encryption Algorithm," Fast Software Encryption: Second

International Workshop, Leuven, Belgium, December 1994, Proceedings, Springer-

Verlag, 1994, pp. 86-96.

8. R.L. Rivest, A. Shamir, and L.M. Adleman, \A metho d for obtaining digital sig-

natures and public-key cryptosystems," Communications of the ACM, 21, 1978,

pp. 120-126.

9. P.R. Rogaway and D. Copp ersmith, \A Software-Optimized Encryption Algo-

rithm," Fast Software Encryption: Cambridge Security Workshop, Cambridge,

U.K., December 1993, Proceedings, Springer-Verlag, 1993, pp. 56-63.

10. RSA Lab oratories, \RSAREF: A Cryptographic To olkit," Version 2.0, 1994, avail-

able via FTP from .com.

11. B. Schneier, \Description of a New Variable-Length Key, 64-bit Blo ck Cipher

(Blow sh)," Fast Software Encryption: Second International Workshop, Leuven,

Belgium, December 1994, Proceedings, Springer-Verlag, 1994, pp. 191-204. 10