<<

Strength of Two Data Standard

Implementations under Timing Attacks

Alejandro Hevia

Dept de Ciencias de la Computacion U Chile

and

Marcos Kiwi

Dept de Ingeniera Matematica U Chile

We study the vulnerability of two implementations of the DES cryp

tosystem under a timing attack A timing attack is a metho d designed to break cryptographic

systems that was recently prop osed by Paul Ko cher It exploits the engineering asp ects involved

in the implementation of and might succeed even against cryptosystems that re

main imp ervious to sophisticated cryptanalytic techniques A timing attack is essentially a way

of obtaining some users private information by carefully measuring the time it takes the user to

carry out cryptographic op erations

In this work we analyze two implementations of DES We show that a timing attack yields

the Hamming weight of the used by b oth DES implementations Moreover the attack is

computationally inexp ensive We also show that all the design characteristics of the target system

necessary to carry out the timing attack can b e inferred from timing measurements

Categories and Sub ject Descriptors E Data Data Encryptioncode breakingstandards C

Computer Systems Organization Sp ecialPurpose and ApplicationBased SystemsSmart

cards

General Terms Security

Additional Key Words and Phrases Timing attack data encryption standard

A preliminary version of this pap er app eared in the Pro ceedings of the rd Latin American

Symp osium on Theoretical Informatics Campinas Brazil pages April

The research of A Hevia was partially supp orted by FONDAP in Applied Mathematics and

by FONDECYT No

The research of M Kiwi was partially supp orted by FONDECYT No Fundacion Andes

and FONDAP in Applied Mathematics

Name Alejandro Hevia

Address Dept de Cs de la Computacion Fac de Cs Fsicas y Matematicas Av Blanco Encal

ada Casilla Santiago Chile aheviadccuchilecl

Name Marcos Kiwi

Address Dept de Ing Matematica Fac de Cs Fsicas y Matematicas Casilla Santiago

Chile mkiwidimuchilecl

Permission to make digital or hard copies of part or all of this work for p ersonal or classro om use is

granted without fee provided that copies are not made or distributed for prot or direct commercial

advantage and that copies show this notice on the rst page or initial screen of a display along

with the full citation Copyrights for comp onents of this work owned by others than ACM must

b e honored Abstracting with credit is p ermitted To copy otherwise to republish to p ost on

servers to redistribute to lists or to use any comp onent of this work in other works requires prior

sp ecic p ermission andor a fee Permissions may b e requested from Publications Dept ACM

Inc Broadway New York NY USA fax or permissionsacmorg

 Alejandro Hevia and Marcos Kiwi

INTRODUCTION

An ingenious new typ e of cryptanalytic attack was intro duced by Ko cher in Ko cher

This new attack is called timing attack It exploits the fact that cryptosys

tems often take slightly dierent amounts of time on dierent inputs Ko cher gave

several p ossible explanations for this b ehavior among these branching and con

ditional statements RAM cache hits pro cessor instructions that run in nonxed

time etc Ko chers most signicant contribution was to show that running time

dierentials can b e exploited in order to nd some of a target systems private infor

mation Indeed in Ko cher it is shown how to cryptanalyze a simple mo dular

exp onentiator Mo dular exp onentiation is a key op eration in DieHellmans key

exchange proto col Die and Hellman and the RSA Rivest

et al A mo dular exp onentiator is a pro cedure that on inputs k n N

k

n and y Zcomputes y mo d n In the cryptographic proto cols mentioned

ab ove n is public and k is private Ko cher rep orts that if a passive eavesdropp er can

k

measure the time it takes a target system to compute y mo d n for several inputs

y then he can recover the secret exp onent k Moreover the overall computational

eort involved in the attack is prop ortional to the amount of work done by the vic

tim For concreteness sake and clarity of exp osition we now describ e the essence of

Ko chers metho d for recovering the secret exp onent of the xedexp onent mo dular

exp onentiator shown in Fig

Input y Z

Code z

Let k k b e k in binary

0

l

For i l down to do

2

z z mo d n

If k then z z y mo d n

i

Output z

Fig Mo dular exp onentiator

The attack allows someone who knows k k to recover k To obtain the

l t t

entire exp onent the attacker starts with t l and rep eats the attack until t

The attacker rst computes l t iterations of the for lo op The next iteration

requires the rst unknown bit k If the bit is set then the op eration z z y mo d

t

n is p erformed otherwise it is skipp ed Assume that each timing observation

P

l

corresp onds to an observation of a random variable T e T where T

li li

i

is the time required for the multiplication and squaring steps corresp onding to the

bit k and e is a random variable representing measurement error lo op overhead

li

etc An attacker that correctly guesses k may factor out of T the eect of

t

T T and obtain an adjusted random variable of known variance provided

l t

the times needed to p erform mo dular multiplications are indep endent from each

other and from the measurement error Incorrect guesses will pro duce an adjusted

random variable with a higher variance than the one exp ected Computing the

variance is easy provided the attacker collects enough timing measurements The

correct guess will b e identied successfully if its adjusted values have the smaller

variance

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

In theory timing attacks can yield some of a target systems private information

In practice in order to successfull y mount a timing attack on a remote cryptosystem

a prohibitively large numb er of timing measurements may b e required in order to

comp ensate for the increased uncertainty caused by random network delays Nev

ertheless there are situations where we feel it is realistic to mount a timing attack

We now describ e one of them Challengeresp onse proto cols are used to establish

whether two entities involved in communication are indeed genuine entities and can

thus b e allowed to continue communication with each other In these proto cols one

entity challenges the other with a random numb er on which a predetermined cal

culation must b e p erformed often including a secret key In order to generate the

correct result for the computation the other device must p osses the correct secret

key and therefore can b e assumed to b e authentic Many smart cards in partic

ular dynamic password generators tokens and electronic wallet cards implement

challengeresp onse proto cols eg the message authentication co de generated ac

cording to the ANSI X Menezes et al page standard It is exp ected

that extensive use will b e made of smart cards based in general purp ose program

mable integrated circuit chips Thus the sp ecic functionality of each smart card

will b e achieved through programming The security of these smart cards will b e

provided using tamp erpro of technology and cryptographic techniques The ab ove

describ ed scenario is an ideal setting in which to carry out a timing attack The

widespread availability of a particular typ e of card will make it easy and inexp ensive

to determine the timing characteristics of the system on which to mount the attack

Later the obtaining of precise timing measurements eg by monitoring or altering

a card reader or by gaining p ossession of a card could b e used to retrieve some of

the secret information stored in the card by means of a timing attack Thus cards

that implement challengeresp onse proto cols where master keys are involved could

give rise to a security problem See Dhem et al for a discussion of a practi

cal implementation of a timing attack against an earlier version of the CASCADE

smart card

New unanticipated strains of timing attacks might arise Hence timing attacks

should b e given some serious consideration This work contributes ultimately in

furthering our understanding of the strengths of this new cryptanalytic technique

the weaknesses it exploits and the ways of eliminating the p ossibility of it b ecoming

practical

Ko cher implemented the attack against the DieHellman proto

col He also observed that timing attacks could p otentially b e used against other

cryptosystems in particular against the Data Encryption Standard DES This

claim is the motivation for this work

SUMMARY OF RESULTS AND ORGANIZATION

We study the vulnerability of one of the most widely used cryptosystems in the

world DES against a timing attack The starting p oint of this work is the ob

servation of Ko cher Ko cher that in DESs generation pro cess

moving nonzero bit C and D values using a conditional statement which tests

whether a onebit must b e wrapp ed around could b e a source of nonconstant en

cryption running times Hence he conjectured that a timing attack against DES

 Alejandro Hevia and Marcos Kiwi

could reveal the Hamming weight of the key We show that although Ko chers

observation is incorrect for the DES implementations that we analyzed his con

jecture is true But we do more

In Sect we give a brief description of DES

In Sect we describ e a timing attack against DES that assumes the attacker

knows the target systems design characteristics We rst discuss exp erimental

results that show that a computationally inexp ensive timing attack against two

implementations of DES could yield enough information to recover the Hamming

weight of the DES key b eing used Hence assuming the DES keys are randomly

chosen an attacker can recover approximately bits of key information To the

b est of our knowledge this is the rst implementation of a timing attack against a

symmetric cryptosystem Since the preliminary version of this work app eared two

timing attacks against RC have b een rep orted Handschuh and Heys In

Sect we describ e computational exp eriments that measure the threat implied

by an actual implementation of a timing attack against DES

Recovering bits of a DES key is a mo dest improvement over brute force

key search But recovering the Hamming weight of the key is p otentially more

threatening In particular an adversary can restrict attention to keys determined

to have either a signicantly low or high Hamming weight Although such keys may

b e rare once the adversary determines that one such key is b eing used the ensuing

key search may b e signicantly sp ed up Thus the adversary can balance the time

to nd such rare keys with the time needed for key recovery In some systems even

the recovery of a single although rare key may b e of serious concern

In Sect we identify the sources of the dep endencies b etween the encryption

time and the keys Hamming weight in the implementations of DES that we studied

The most relevant are conditional statements

In b oth DES implementations that we analyzed the encryption time T is roughly

equal to a linear function of the keys Hamming weight X plus some normally

distributed noise e Since a DES key is a bit long string and keys are chosen

uniformly at random in the key space we have that X B inom Thus

for some and

T X e X B inom e N or m

In Sect we show that it is not necessary in order to p erform a timing attack

against DES to assume that the design characteristics of the target system are

known Indeed we prop ose two statistical metho ds whereby a passive eavesdropp er

can infer from timing measurements all the target systems design information

required to successfully mount a timing attack against DES To the b est of our

knowledge this is the rst pro of that it is p ossible to infer a target systems design

characteristics through timing measurements

We would like to stress that all of the timing attacks describ ed in this work

only require precise measurements of encryption times but no knowledge of the

encrypted plaintexts or pro duced

1

Recall that the Hamming weight of a bitstring equals the numb er of its bits that are nonzero

2

Recall that the distribution B inom N p corresp onds to the distribution of the sum of N

indep endent identically distributed f grandom variables with exp ectation p

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

In Sect we prop ose a blinding technique that can b e used to eliminate almost

all of the execution time dierentials in the analyzed DES implementations This

blinding technique makes b oth DES implementations that we study imp ervious to

the sort of timing attack we describ e in this work Finally we discuss under which

conditions all and not only the Hamming weight of a DES key might b e recovered

through a timing attack

Related Work

Mo dern cryptography advo cates the design of cryptosystems based on sound math

ematical principles Thus many of the cryptosystems designed over the last two

decades can b e proved to resist many sophisticated mathematically based cryptan

alytic techniques provided one is willing to accept some reasonable assumptions

Traditionally the techniques used to attack such cryptosystems exploit the algorith

mic design weaknesses of the cryptosystem On the other hand timing attacks take

advantage of the decisions made when implementing the cryptosystems sp ecially

those that pro duce nonxed running times But timing attacks are not the only

typ e of attacks that exploit the engineering asp ects involved in the implementation

of cryptosystems Indeed recently Boneh Lipton and DeMillo Boneh et al

intro duced the concept of fault tolerant attacks These attacks take advantage of

p ossibly induced hardware faults Boneh et al p oint out that their attacks show

the danger that hardware faults p ose to various cryptographic proto cols They con

clude that even sophisticated cryptographic schemes sealed inside tamp erresistant

devices might leak secret information

A new strain of fault tolerant attacks dierential fault analysis DFA was pro

p osed by Biham and Shamir Biham and Shamir Their attack is applicable

to almost any secret key cryptosystem prop osed so far in the op en literature DFA

works under various fault mo dels and uses cryptanalytic techniques to recover the

secret information stored in tamp erresistant devices In particular Biham and

Shamir show that under the same hardware fault mo del considered by Boneh et al

the full DES key can b e extracted from a sealed tamp erresistant DES encryptor

by analyzing b etween and ciphertexts generated from unknown but related

plaintexts Furthermore in Biham and Shamir techniques are develop ed to

identify the keys of completely unknown ciphers sealed in tamp erresistant devices

The new typ e of attacks describ ed ab ove have received widespread attention see

for example English and Hamilton Marko

THE DATA ENCRYPTION STANDARD

DES is the most widely used cryptosystem in the world sp ecially among nancial

institutions It was develop ed at IBM and adopted as a standard in NBS

It has b een reviewed every ve years since its adoption

DES has held up remarkably well against years of cryptanalysis But faster and

cheap er pro cessors allow using current technology to build a reasonably priced

sp ecial purp ose machine that can recover a DES key within hours Stinson

pp For concreteness sake we provide b elow a brief description of DES For

a detailed description see NBS More easily accessible descriptions of DES

can b e found in Schneier Stinson

DES is a symmetric or privatekey cryptosystem ie a cryptosystem where the

 Alejandro Hevia and Marcos Kiwi





Plaintext

- - - -





Encryption

Perm Perm

6 6 6

-

Key Key schedule

Fig DES encryption pro cess

parties that wish to use it must agree in advance on a common secret key which

must b e kept private DES encrypts a message plaintext bitstring of length

using a bitstring key of length and obtains a ciphertext bitstring of length

It has three main stages In the rst stage the bits of the plaintext are p ermuted

according to a xed initial p ermutation In the second stage iterations of a

certain function are successivel y applied to the bitstring resulting from the rst

stage In the nal stage the inverse of the initial p ermutation is applied to the

bitstring obtained in the second stage

The strength of DES resides on the function that is iterated during the encryption

pro cess We now give a brief description of this iteration pro cess The input to

iteration i is the output bitstring of iteration i and a bit long string K

i

Actually each K is a p ermuted selection of bits from the DES key The strings

i

K K comprise what is called the key schedule During each iteration a

bit long output string is computed by applying a xed rule to the two input strings

The encryption pro cess is depicted in Fig

Decryption is done with the same encryption algorithm but using the key schedule

in reverse order K K

The b est traditional cryptanalytic attacks known against DES are due to Biham

and Shamir Biham and Shamir Biham and Shamir and Matsui Matsui

a Matsui b However they are not considered a threat to DES in practical

environments see Menezes et al pp

TIMING ATTACK OF DES

We now consider the problem of recovering the Hamming weight of the DES key

of a target system by means of a timing attack We rst address the problem in

Sect assuming the attacker knows the design of the target system We then

show in Sect that this assumption can b e removed

Timing Characteristics of Two Implementations of DES

We studied the timing characteristics of two implementations of DES The rst

one was obtained from the RSAEuro cryptographic to olkit Kapp henceforth

referred to as RSADES The other implementation of DES that we lo oked at was

one due to Louko Louko henceforth referred to as LDES We studied b oth

TM TM

implementations on a MHz Pentium computer running MSDOS The

TM

advantage of working on an MSDOS environment is that it is a single pro cess

op erating system This facilitates carrying out timing measurements since there are

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

550

500

450 Encryption

400

350

300 Time [microseconds] Key Schedule

250

200

150 0 10 20 30 40 50

Hamming weight of key

Fig RSADES

400

350

300

250

200

150 Encryption Time [microseconds]

100 Key Schedule

50

0 0 10 20 30 40 50

Hamming weight of key

Fig LDES

no other interfering pro cesses running and there are less op erating system mainte

nance tasks b eing p erformed We measured time in microseconds s

In our rst exp eriment we xed the input message to b e the bitstring of length

all of whose bits are set to For each i f g we randomly chose

keys of Hamming weight i For each selected key we encrypted the message a total

of times During each encryption we measured the time it to ok to generate the

key schedule and the total time it to ok to encrypt the message The plots for each

of the implementations that we lo oked at of the average for each key encryption

and key schedule generation times are shown in Fig and Fig

Only obvious outliers were eliminated In fact the only outliers that we noticed

app eared at xed intervals of clo ck ticks These outliers were caused by system

maintenance tasks

 Alejandro Hevia and Marcos Kiwi

A randomly chosen DES key has a Hamming weight b etween and with

probability approximately Thus the most relevant data p oints shown in

Fig and Fig are those close to the middle of the plots

For various keys chosen at random we p erformed time measurements for each

key of the encryption and key schedule generation times After discarding obvious

outliers we graphed the empirical frequency distributions of the collected data The

empirical distributions we observed were roughly symmetric and concentrated in a

few contiguous values usually three or four This concentration of values is due to

the fact that we were only able to p erform time measurements with an accuracy of

s and that time dierentials among p erformed under the same

key were rarely larger than s For an explanation of how to measure time with

TM

this precision on an MSDOS environment see App endix A The ab ove suggests

as one would exp ect that the variations on the running time observed when the

same pro cess is executed many times over the same input are due to the eect of

normally distributed random noise

For dierent values of i f g we randomly chose keys of Hamming

weight i After throwing away outliers we graphed the empirical frequency dis

tributions of the collected data The empirical frequencies observed lo oked like

normal distributions with small deviations typically s for LDES and s

for RSADES We conclude that the variations on the encryption and key schedule

generations times observed among keys of same Hamming weight are mostly due to

the total number of bits of the key that are set and not by the position where these

set bits o ccur Thus the eect of which bits are set among keys of same Hamming

weight is negligible

We rep eated all the exp eriments describ ed so far but instead of leaving the input

message xed we chose a new randomly selected message at the start of each en

cryption pro cess All the results rep orted ab ove remained essentially unchanged

There was only a negligible increase in the measured deviations

Assuming that the attacker knows the design of the target system he can build

on his own a table of the average encryption time versus the Hamming weight of

the key The clear monotonically increasing relation b etween the encryption time

and the Hamming weight of the key elicited by our exp eriments is a signicant

implementation aw It allows an attacker to determine the Hamming weight of the

DES key Indeed the attacker has to obtain a few encryption time measurements

and lo ok in the table he has built to determine the keys Hamming weight from

which such time measurements could have come Thus the attacker can recover

H wtK bits of key information H denotes the binary entropy function

Remark A precise estimation of the Hamming weight of the DES key can be

achieved by means of a timing attack if two situations hold First accurate time

measurements can be obtained Second the variations in the encryption and key

schedule generation time produced by dierent keys with identical Hamming weight

is smal l compared to the time variations produced by keys with one more or one less

set bit We have noticed that the latter situation approximately holds An exact

estimation of Hamming weight of the DES key can be achieved if the attacker can

accurately perform time measurements of several encryptions of the same plaintext

But this requires a more powerful attacker one that should be capable of xing the

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

64 64

Input M f g C f g R

Where is the time it takes DES to generate ciphertext C from message M

Code For i up to

3

Let l b e such that jf j jT j jT j gj i

j

l

56

Let K f K f g wtK l g

l

Randomly cho ose m in f jK j g

l

For j up to jK j

l

Let K b e the m j mo d jK j lexicographically rst elem of K

l l

If DES encryption of M under key K yields C then returnK

Fig Key recovery pro cedure based on a timing attack that reveals the Hamming weight of

the key

input message fed into the encryption process

More remarkable than the established monotonically increasing relation b etween

the encryption times and the Hamming weight of the key is the linear dep endency

that exists b etween the two measured quantities The correlation factors for the

data shown in Fig and Fig are and resp ectivel y The sharp linear

dep endency b etween encryption times and Hamming weight allows an attacker to

infer the target systems information which is required to carry out the attack

describ ed ab ove This topic is discussed in the next section

Experimental Results In this section we describ e a computational exp er

iment that shows the exp ected reduction in the size of the key space search that

would b e achieved by the implementation of the timing attack describ ed in the

previous section

Assume that for every i f g we have a T R corresp onding to the

i

exp ected time it takes the target DES implementation to encrypt a message with

a key of Hamming weight i Furthermore assume that T T as supp orted

i i

by our exp erimental observations Consider the pro cedure of Fig for recovering

the DES encryption key through a timing attack that exploits the facts rep orted

in Sect Note that it is p ossible to exp erimentally determine the exp ected

numb er of keys that this pro cedure would try without having to actually execute it

Indeed if the DES encryption of plaintext M under key K generates the ciphertext

C in s then the exp ected size of the key space searched by the given pro cedure is

f K f g jT j jT j g

wtK wtK

f K f g T T g

wt K wt K

For b oth DES implementations we studied we randomly chose DES messagekey

pairs measured the encryption time and computed the exp ected numb er of keys

that the pro cedure of Fig would have tried b efore nding the correct encryption

key From our discussion of Sect it follows that the b est that one can hop e

for is to have to try half of the keys whose Hamming weight equals that of the

correct encryption key This corresp onds to p ercent of all the key space since

P

p log p

k k

k

We found that for RSADES then if p

k

k

3

In the unlikely event that l is not uniquely dened p erturb by a value uniformly chosen in

the interval where is tiny compared to the precision of the timing measurements

 Alejandro Hevia and Marcos Kiwi

k

p

k

RSA

Louko

k

p

k

RSA

Louko

Table Results of computational exp eriment

p ercent of the key space would have b een searched in average b efore nding the

correct encryption key For LDES the p ercentage go es down to p ercent

Table shows in more detail some of the data collected in our exp eriments

Columns are lab eled according to the weight of the DES key We denote the weight

of a key by k The second row represents the p ercentage of the total key space

corresp onding to DES keys of Hamming weight k with a precision of We

denote this value by p For each DES key of weight k we estimated p

k k

times the exp ected p ercentage of the key space that would have b een searched

b efore nding the encryption key Each of theses estimates was based on p

k

measurements in order to insure that at least measurements were considered for

every estimate asso ciated to nonzero p s The last two rows of Table show for

k

each DES implementation and some key weights the average of the values obtained

Recovering bits of a DES key gives a mo dest improvement in the time

needed to recover the key But Table implies that a timing attack that reveals

the Hamming weight of the key is p otentially more threatening In particular an

adversary can restrict attention to keys determined to have either a signicantly

low or high Hamming weight The adversary can do this by p erforming timing

measurements until one is found to b e either signicantly low or high Once the

adversary detects such a rare key the subsequent key search can b e much less than

the usual amount Thus the adversary can balance the time to nd such rare keys

with the time needed for key recovery In some systems the recovery of even a

single key may cause total disruption andor forward vulnerability

Sources of the dependency between DES encryption time and keys Ham

ming weight The key schedule generation in LDES is carried out by a pro cedure

set key This pro cedure computes the resulting key schedule bitstring called des

by p erforming a bitwise OR with some precomputed constants For each bit of the

key such bitwise ors are computed if and only if the key bit is set For that purp ose

it uses a piece of co de of the following form If condit then instr instr

else instr The numb er of times condit is true turns out to b e exactly the Ham

ming weight of the DES key This is the main source of running time dierentials

in LDES

In RSADESs key schedule generation co de there is also a pro cedure that con

tains two conditional statements These conditional statements are used in the

computation of the subkeys More precisely they implement a xed p ermutation

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

PC of some bits of the key Their co de is of the following form If condit then

instr The total numb er of times condit is true is equal to the sum of the Ham

ming weight of all subkeys Thus the numb er of times instr is executed is directly

prop ortional to the Hamming weight of the DES key

As mentioned in Sect Ko cher Ko cher conjectured that in DESs key

schedule the rotation of nonzero bits using conditional statements could give rise

to running time dierentials In the implementations of DES we analyzed we found

no evidence to supp ort this conjecture

Finally note that it is clear from Fig that in LDES there is a source of non

xed running times which do es not dep end on the key schedule generation pro cess

This is evidenced by the nonconstant distance b etween the two curves shown in

Fig The source of these time dierentials is not due to conditional statements

We were not able to identify the cause of this dep endency nor able to exploit it in

order to recover all of the DES key

Derivation of the Timing Characteristic s of the Target System

As discussed in Sect in b oth DES implementations that we studied the en

cryption time was roughly equal to a linear function of the keys Hamming weight

plus some normally distributed random noise In this section we exploit this fact

in order to derive all the necessary information needed to p erform a timing attack

that reveals the Hamming weight of the target systems DES key

First we need to intro duce some notation Assume we have m measurements on

the time it takes the target system to p erform a DES encryption The time mea

surements might corresp ond to encryptions p erformed under dierent DES keys

For i f k g denote by K the ith key that is used by the target system dur

i

ing the p erio d that timing measurements are p erformed We make the realistic

assumption that K K are chosen at random in f g and indep endent of

k

i

each other Let X denote the Hamming weight of key K Thus the distribution

i

i

of X is a B inom Since we are assuming that the K s are chosen in

i

k

dep endently we have that X X are indep endent random variables Note

that successive time measurements can corresp ond to encryptions of the message

under the same key For i f k g let f m g b e the index of the

i

last measurement corresp onding to an encryption p erformed with key K For con

i

veniences sake let Hence m Denote by

k k

I the set of indices that corresp ond to time measurements under key K ie for

i i

def

i f k g let I f n N n g For i f k g and j I

i i i i

i

let T b e the random variable representing the time it takes the target system to

j

i

p erform the j th encryption of the message with key K Finally for j I let e

i i

j

b e a random variable representing the eect of random noise on the j th encryption

i

with key K Thus the e s represent measurement inaccuracies and the target

i

j

systems running time uctuations

We now have all the notation necessary to formally state the problem we want

to address Indeed the linear dep endency b etween the encryption time and the

Hamming weight of the key in b oth DES implementations that we studied implies

 Alejandro Hevia and Marcos Kiwi

that there exists and such that for all i f k g and j I

i

i i i

i i

T X e X B inom e N or m

j j j

Our problem is to infer from timing measurements the parameters and for

which holds We address two variations of this problem In Sect we show

how to with the case where the s are known In Sect we show how to

i

handle the case where the s are unknown The former case is the most realistic

i

one Indeed a standard cryptanalytic assumption is that the attacker knows the

key management pro cedure of the target system

Known s We prop ose two alternative statistical metho ds for deducing

i

the parameters and for which holds One metho d is based on maximum

likeliho o d estimators and the other one on asymptotically unbiased estimators

Since the following discussion heavily relies on standard concepts and results from

probability and statistics we refer the reader unfamiliar with these sub jects to Feller

Ross Zacks for background material and terminology

i

i k i

and Maximum Likelihood Estimators Let X X T T

j I

i

i

j

i k k

T T Thus X T T and T denote random variables Further

i

i

i k k i

and t t b e the actual values taken more let x x t t

i j I

i

i i j

i

by X T and T resp ectivel y

Let f t b e the marginal distribution of T given and For a

T

xed collection of time measurements t the values of and that maximize

f t are the maximum likeliho o d estimators we are lo oking for The maxi

T

mum likeliho o d estimators are the values most likely to have pro duced the observed

time measurements They can also b e regarded as the values minimizing the loss

function log f t This explains why maximum likeliho o d estimators are

T

thought to b e go o d predictors Thus in order to determine go o d estimators for

and we rst compute f t

T

Proposition The marginal distribution of T given and is

m

k

i h

P

Y

i

i

t X

j I j

i

e E f t

i

T

X

i

Proof Let f f and f denote the joint

XT X

T X x

density function of X and T the density function of T given X x and the

probability distribution of X resp ectivel y For conveniences sake we henceforth

omit and from the expressions for f f and f

XT X

T X x

i

i i

Observe that the indep endence of the X s and e s imply that the T s are

j

indep endent Thus the joint density function of X and T given and is

k

Y

i

t f x f f x t f x f t

i i i

i XT X

T X x

X T X x

i

i

i i

where the last equality follows since the X s are indep endent and the T s are

indep endent

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

i

i

From we get that T given X x distributes like a N or m x

i i

j

i

Moreover for xed i the T s are indep endent random variables Hence

j

Y

i

t x

i

i

j

e f t

i i

T X x

i

j I

i

jI j

i

P

i

t x

i

j I j

i

e

i

Thus Since X B inom we know that f x

i

i

X

x

i

m

k

P

Y

i

t x

i

j I j

i

e f x t

XT

x

i

i

The marginal distribution of T given and equals the sum over all values

taken by x of f x t Hence

XT

m

k

P

Y X

i

t x

i

j I j

i

f t e

T

x

i

x

i

i

i

The conclusion follows directly from the previous equality and the fact that X

B inom

For a given t the values of and that maximize the right hand side of the

expression in Prop osition are the maximum likeliho o d estimators sought As is of

ten the case when dealing with maximum likeliho o d estimators it is dicult to solve

explicitly for them See Zacks Ch x for a discussion of computational

routines that can b e used to calculate maximum likeliho o d estimators

The advantage of the ab ove describ ed approach for determining the parameters

relevant for carrying out the timing attack is that it uses all the available timing

measurements But it do es not allow us to determine how many measurements are

sucient in order to obtain accurate estimations of the parameters sought The

alternative approach describ ed b elow solves this problem

b

Asymptotic Estimators Our goal is to nd go o d estimators b and b for

and Moreover we are interested in determining the asymptotic on the

numb er of timing measurements b ehavior of such estimators In particular their

asymptotic distributions their limiting values and their rate of convergence

We will now derive go o d predictors for and We start with a key observa

tion Since the exp ectation and variance of a B inom are and resp ec

tively taking the exp ectation and variance in yields that for all i f k g

and j I

i

i h i h

def def

i i

V T E T

T

T

j j

Hence if we knew and we could solve for and in This suggests

T

T

that if we can nd go o d estimators for and then we can derive go o d

T

T

c

c

estimators for and We now provide candidates for c and the esti

T

T

mators for and resp ectively But we rst need to intro duce additional

T T

 Alejandro Hevia and Marcos Kiwi

notation Let

k

X X X

i i

def def def

i i

i

T T T k e e jI j T jI j

i i

j j

i

j I j I

i i

Dene

k k

X X X X

i

def def def

i i

c

c

T T T c T T

T

j j

T

k jI j k jI j

i i

i i

j I j I

i i

b

Solving for and in yields that the two natural candidates for b and

the estimators for and are

def def

c

c

b

p

c b b

T

T

We now prove that b is well dened

k

X

i

c

c

Proposition T T

T

k

i

Proof Just note that

k k

X X X

i i i

i

c

c

T T T T T T

j

T

k jI j k

i

i i

j I

i

We henceforth denote a chi distribution with l degrees of freedom by

l

c

c

is Proposition If jI j jI j n then the distribution of

k

T

approximately

k

k n

i

i i

i

i i

T X e Proof Since T X e we have that

j j

i i

Since random variables e is the average of n indep endent N or m e

N or m n In addition the de MoivreLaplace Theorem Hazewinkel

pp states that the B inom m p distribution can b e expressed in terms of

the standard normal distribution Moreover if m then such an expression

is exact and if mp p then the expression provides a go o d approxima

i

tion of the Binomial distribution Ross pp Thus since X

i

B inom the distribution of X is well approximated by a N or m

i

i

e and the sum of indep endent normal distrib Hence since X is indep endent of

i

utions is a normal distribution it follows that T is approximately distributed as

a N or m The desired conclusion follows from a classical statistics

T

n

result Hogg and Tanis Theorem and Prop osition

p

Proposition If jI j jI j n and n is negligible then k b

k

converges in distribution to a N or m plus some smal l constant error term

when k

4

Recall that when X X X are random variables on some probability space F P it is

1 2

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

Proof First note that if we neglect n then Prop osition and our denition

c

c

is approximately a of b imply that the distribution of b

k

T

k

Second recall that the sum of the squares of l indep endent identically distributed

normal random variables with zero mean and variance equal to is distributed

according to a Equivalently the sum of l indep endently distributed random

l

variables is distributed according to a Hence since the exp ectation and variance

l

of a random variable are and resp ectivel y the Central Limit Theorem implies

p

that converges in distribution to a N or m k

k

k

p

Putting the two observations together shows that k b converges in

distribution to a N or m plus some small constant term when k The

stated result follows immediately

Theorem If jI j jI j n n is negligible and k is suciently

k

large then the distribution of b is approximately a N or m

k

c

c

and converge almost Proof The Law of Large Numb ers implies that

T

c

c

p

surely to and resp ectivel y Hence by continuity b

T

T

p

when k This fact and converges almost surely to

T

p

p

kb

k b Prop osition yield that if k then converges in dis

b

plus some small error term The desired conclusion tribution to a N or m

follows immediately

Remark Theorem provides an approximation to the distribution of b The

approximation error arises from three sources The rst one is the use of the de

MoivreLaplace Theorem to express a B inom in terms of a N or m

The second one is due to the use of the Central Limit Theorem to approximate the

distribution of an estimator by its limit distribution The nal source of error

is due to the use of the Law of Large Numbers to approximate an estimator by

its asymptotic value These three sources of approximation error can be bounded

through the de MoivreLaplace Theorem BerryEssens inequalityHazewinkel

pp and Chebyshevs inequalityRoss pp respectively A bound on

the accumulated approximation error shows that Theorem is fairly accurate

n is negligible and k is suciently Corollary If jI j jI j n

k

large then

h i

b

O P jb j jj P j j j j

k

Proof The b ound concerning b follows from Theorem and Chebyshevs in

b

equalityRoss pp In order to prove the other b ound recall that

said that X converges in distribution to X as n if P X x P X x as n for

n n

all p oints x at which F x P X x is continuous

X

5

Recall that when X X X are random variables on some probability space F P it is

1 2

said that X converges almost surely to X as n if f X X as n g

n n

is an event whose probability is

 Alejandro Hevia and Marcos Kiwi

c b and thus

T T

h i

b

P j j j j P jc b j j j

T T T

P jc j j j P jb j jj

T T T

V b V c

T

T

where the last inequality is a consequence of applying Chebyshevs inequality twice

Moreover V c Note that from Theorem we have that V b

T

k

h i

i

V n The result follows T

k k

Corollary tells us that with probability at least it suces to take n time

measurements for each of O dierent keys to approximate and to within

a multiplicative factor of

Unknown s The assumption that the s are known made in the previous

i

i

section is not strictly necessary since an attacker may alternate b etween p erforming

several timing measurements over a short p erio d of time and resting for an appro

priately long p erio d of time Hence the problem of deducing the target systems

design characteristic s reduces to the case in which the s are known provided that

i

the keys are not changed to o often and the attackers resting p erio d is longer than

a keys lifetime Changing keys to o often creates a key management problem for

the cryptosystems user Thus it is reasonable to assume that a keys lifetime is

not excessively short

We now discuss another approach for handling the case of unknown s under

i

the assumption that the attacker has access to several identical copies of the target

system eg several copies of a smart card supp orting a DES based challenge

resp onse proto col Lets make the reasonable assumption that the target systems

keys are indep endently generated In this case the attacker may p erform over

a short p erio d of time several timing measurements for each copy of the target

system If the keys are not changed to o often the attacker can deduce the target

systems relevant timing characteristics as in Sect Indeed the attacker can

assume that all the timing measurements arising from the same copy of the system

come from encryptions p erformed under the same key Since keys corresp onding

to dierent copies of the target system are indep endently generated and the copies

of the system are identical the problem of deducing the target systems design

characteristics reduces to the case in which the s are known

i

Tests of statistical hyp othesis give rise to another alternative for handling the case

of unknown s Indeed consider the situation in which an attacker determines m

i

timing measurements t t arising from random variables satisfying As

m

sume keys are not changed to o often ie at least n m timing measurements

come from encryptions p erformed under the same key Thus for each j such that

n j m n the attacker can p erform a test of equality of two normal distri

butionsHogg and Tanis pp on the samples of t t and

j n j

t t The signicance level of such tests allows the attacker to determine

j j n

the measurements around where a change of key o ccurs Discarding the measure

ments around where the attacker susp ects a change of key o ccurs yields a sequence

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

of timing measurements from which the target systems design characteristics can

b e deduced as in the case of known s

i

FINAL COMMENTS

In Ko cher a blinding technique similar to that used for blind signatures

Chaum is prop osed in order to prevent a timing attack against a mo dular

exp onentiator For b oth implementations of DES we studied blinding techniques

can b e adapted to pro duce almost xed running time for the key schedule genera

tion pro cesses Indeed let K b e the DES key of Hamming weight wtK whose key

schedule we want to generate Let K b e a bitstring of length generated as fol

wtK wtK

c resp ectivel y d e of the bits of K which are lows randomly cho ose b

set to resp ectivel y and set the corresp onding bits of K to resp ectively

Denote the bitwise xor of K and K by K K Note that wtK wtK K

when the Hamming weight of K is even and wtK and wt K K when

the Hamming weight of K is o dd Mo dify the key schedule generation pro cesses so

key schedules for keys K and K K are generated Note that the work required

for this is indep endent of the Hamming weight of K Hence no sources of nonxed

running time are intro duced during this step Let K K and K K b e

the key schedules obtained Recall that K resp ectivel y K is a p ermuted se

i

i

lection of bits from the key K K resp ectivel y K Thus the key schedule of

K is K K K K Figure plots the encryption times of RSADES

as previously explained Note the very clear reduction in time dierentials The

reduction is achieved at the exp ense of increasing the encryption time by a factor

of approximately Unfortunately this blinding technique still leaks the parity

of the weight of the original DES key ie bit of information A careful lo ok at

Fig conrms this fact This fact can b e xed using the idea develop ed ab ove

Indeed for a given DES key K one can generate three DES keys K K and K

two of them with Hamming weight and one with Hamming weight Of the

three keys one will b e spurious meaning that its key schedule should b e generated

and the results discarded The xor of the key schedules generated by the two non

spurious keys will give rise to the key schedule sought When the Hamming weight

of the original DES key is even resp ectivel y o dd the spurious key will b e the one

of Hamming weight resp ective ly one of the keys of Hamming weight

We have seen that the main source of nonxed running times were caused by

the key schedule generation pro cedure In many fast software implementations

key setup is an op eration which is separated from encryption This would thwart a

timing attack if encryption time is constant But in several systems it is impractical

to precompute the key schedule For example in smart cards precomputations are

undesirable due to memory constraints

Overall b oth DES implementations we studied are fairly resistant to a timing

attack This leads us to the question of whether a timing attack can nd all of the

DES key and not only its Hamming weight Although we did not succeed in tuning

the timing attack technique in order to recover all the bits of a DES key we identied

in LDES a source of nonxed running time that is not due to the key generation

pro cess Indeed the dierence in the slop es of the curves plotted in Fig shows

that the encryption time not counting the key generation pro cess dep ends on the

key used This fact is a weakness that could p otentially b e exploited in order to

 Alejandro Hevia and Marcos Kiwi

750

Blinded RSA−DES 700

650

600

550

RSA−DES 500 Time [microseconds]

450

400

350 0 10 20 30 40 50

Hamming weight of key

Fig RSADES and mo died RSADES encryption times

recover all of the DES key It op ens the p ossibility that the time it takes to encrypt

a message M with a key K is a nonlinear function of b oth M and K eg it is a

monotonically increasing function in the Hamming weight of M K This would

allow a timing attack to recover a DES key by carefully cho osing the messages to b e

encrypted We were not able to identify clear sources of nonlinear dep endencies

b etween time dierentials and the inputs to the DES encryption pro cess in either

of the DES implementations that we studied Nevertheless we feel that the partial

information leaked by b oth implementations of DES that we analyzed suggests that

care must b e taken in the implementation of DES otherwise all of the key could

b e compromised through a timing attack

ACKNOWLEDGMENTS

We are grateful to ShangHua Teng for calling to our attention the work of Ko cher

We thank Raul Gouet Luis Mateu Alejandro Murua and Jaime San Martin for

helpful discussions We also thank Paul Ko cher for advise on how to measure run

TM

ning times accurately on an MSDOS environment Finally we thank an anony

mous referee for p ointing out that the blinding technique of Sect was leaking

information ab out the parity of the Hamming weight of the key

APPENDIX

A APPENDIX

TM

Standard C routines allow to measure time events in an MSDOS environment

with an accuracy of s Heidenstrom In order to measure with a time

TM TM

precision of s on a Pentium computer running MSDOS we followed

Ko chers advice Ko cher He suggested reading the value of a highprecision

timer by accessing p ort Whenever this timer overows to it generates

one interrupt Interrupts o ccur once every s Hence one can measure time

intervals with a precision of s s It is also a go o d idea

to run from a RAM disk For more information on how to p erform accurate time

Strength of Two Data Encryption Standard Implementations under Timing Attacks 

measurements on the PC family under DOS the reader is referred to Heidenstrom

Performing the timing measurements in a Windows or Unix environment is

clearly a bad idea since b oth of them are multipro cess op erating systems

REFERENCES

Biham E and Shamir A Dierential cryptanalysis of DESlike cryptosystems

Journal of Cryptology

Biham E and Shamir A Dierential cryptanalysis of the full round DES In

E F Brickell Ed Advances in Cryptology CRYPTO Numb er in Lecture

Notes in Computer Science Santa Barbara California pp Springer

Verlag

Biham E and Shamir A Dierential fault analysis of secret key cryptosystems

Technical Rep ort CS Technion Computer Science Department

Boneh D Demillo R A and Lipton R J On the imp ortance of checking

cryptographic proto cols for faults In Advances in Cryptology EUROCRYPT Lecture

Notes in Computer Science pp SpringerVerlag

Chaum D Blind signatures for untraceable payments In D Chaum R L Rivest

and A T Sherman Eds Advances in Cryptology CRYPTO Santa Barbara Cali

fornia pp Plenum Press

Dhem JF Koeune F Leroux PA Mestre P Quisquater JJ and Willems

JL A practical implementation of the timing attack In JJ Quisquater and

B Schneier Eds Proc CARDIS Smart Card Research and Advanced Applications

LNCS SpringerVerlag

Diffie W and Hellman M E New directions in cryptography IEEE Transactions

on Information Theory IT Nov

English E and Hamilton S Network security under siege The timing attack

Computer March

Feller W An introduction to probability theory and its applications Volume I

I I John Wiley Sons Inc second printing

Handschuh H and Heys H A timing attack on RC In S Tavares and H Meijer

Eds Proc SAC Selected Areas in Cryptography Numb er in LNCS pp

SpringerVerlag

Hazewinkel M Ed Encyclopaedia of Mathematics Volume Kluwer Academic

Press An up dated and annotated translation of the Soviet Mathematical Encyclopaedia

Heidenstrom K FAQapplication notes Timing on the PC family under DOS

Ver Rel ftpgarbouwasafipcprogrammingpctimzip

Hogg R and Tanis E Probability and statistical inference fth ed Prentice Hall

Kapp J S A RSAEuro A cryptographic to olkit Ver Internet Rel Distrib

Kocher P Timing attacks on implementations of DieHellman RSA DSS and

other systems In N Koblitz Ed Advances in Cryptology CRYPTO Numb er

in Lecture Notes in Computer Science Santa Barbara California pp

SpringerVerlag

Kocher P Private communication

Louko A DES package Helsinki University of Technology Computing Centre

Ver ftpkampihutfi

Markoff J Potential aw seen in cash card security New York Times

Matsui M a The rst exp erimental cryptanalysis of the data encryption standard

In Y G Desmedt Ed Advances in Cryptology CRYPTO Numb er in Lecture

Notes in Computer Science Santa Barbara California pp SpringerVerlag

Matsui M b metho d for DES cipher In T Helleseth Ed

Advances in Cryptology EUROCRYPT Numb er in Lecture Notes in Computer

Science Lofthus Norway pp SpringerVerlag

 Alejandro Hevia and Marcos Kiwi

Menezes A J van Oorschot P C and Vanstone S A Handbook of Applied

Cryptography rst ed CRC Press

N B of Standards Data encryption standard DES FIPS Publication

Rivest R L Shamir A and Adleman L A metho d for obtaining digital signa

tures and publickey cryptosystems Communications of the ACM

Ross S A rst course in probability third ed Macmillan Publishing Company

Schneier B Applied Cryptography protocols algorithms and source code in C sec

ond ed John Wiley Sons Inc

Stinson D R Cryptography Theory and Practice rst ed CRC Press

Zacks S The theory of statistical inference John Wiley Sons Inc