<<

An Efficient Cryptographic Protocol Verifier Based on Prolog Rules

Bruno Blanchet

INRIA Rocquencourt Domaine de Voluceau B.P. 105 78153 Le Chesnay Cedex, France [email protected]

Abstract protocol to guarantee the termination of the verification pro- cess. If there exists an attack that only appears with more We a new automatic cryptographic protocol veri- runs of the protocol, it will not be discovered. Our solution fier based on a simple representation of the protocol by Pro- to these problems relies on two ideas: log rules, and on a new efficient algorithm that determines whether a fact can be proved from these rules or not. This a simple intermediate representation of the protocols; verifier proves secrecy properties of the protocols. Thanks

a new efficient solving algorithm. to its use of unification, it avoids the problem of the state space explosion. Another advantage is that we do not need We use Prolog rules to represent the protocol and the at-

to limit the number of runs of the protocol to analyze it. We tacker. Messages and channels are represented by terms; the

er M have proved the correctness of our algorithm, and have im- fact attack means that the attacker has the message

plemented it. The experimental results show that many ex- M ; rules give implications between such facts. This can be amples of protocols of the literature, including Skeme [24], considered as an abstraction of the multiset rewriting [16] can be analyzed by our tool with very small resources: the or of the linear logic representation [19]. We perform two analysis takes from less than 0.1 s for simple protocols to interesting abstractions: 23 s for the main mode of Skeme. It uses less than 2 Mb of memory in our tests. Fresh values considered as functions other messages in the protocol. To give an intuition, when the attacker does not modify messages, different values are used for each pair of participants of the protocol, instead of 1. Introduction per session.

The design of cryptographic protocols is difficult and We forget the number of times a message appears to re- error-prone. This can be illustrated by flaws found in ex- member only that it has appeared. A step of the proto- isting protocols [1, 6, 11, 25]. It is therefore important to col can be executed several times instead of only once have tools to verify the properties of cryptographic proto- in each session. cols. Several techniques can be used to build such tools: These are keys to avoid limiting the number of runs of pro- logics, such as the BAN logic [11] used in [23], theo- tocols: the number of repetitions is simply forgotten. Our rem proving, used in Isabelle [33], rank functions [21], approximations are safe, in the sense that if the verifier does typing [2, 12, 22], abstract interpretation [7, 8, 20, 29], not find a flaw in the protocol, then there is no flaw. The ver- model checking, rewriting, and related techniques, used in ifier therefore provides real security guarantees. In contrast, Elan [13], Brutus [14], Maude [16], FDR [25], NRL [26], it may give a false attack against the protocol. However, the Interrogator [27], Mur [28], Athena [35]. Most exist- false attacks are rare, and we have been able to prove the ing protocol verifiers based on model checking suffer from secrecy properties of all the protocols that we have consid- the problem of the state space explosion, and they need very ered. Therefore, we believe that our abstractions could also large resources to verify even relatively simple protocols. be useful in other protocol tools. Various protocol repre- Moreover, in general, they limit the number of runs of the sentations can be translated into our simple representation.

This work was partly done while the author was at Bell Labs Research, We have built an automatic translator from a restricted ver- Lucent Technologies. sion of the applied pi calculus [5], that only handles certain equational theories, including the theories used to repre- M N terms

sent shared- and public- ( and x variable

M M

signatures), hash functions, the Diffie-Hellman key agree- a n

name

M M

ment. It is also possible for the user to enter directly the f n function application

rules representing the protocol, since the representation is

simple enough. F fact

pM M n Using this representation, we have built a tool to prove predicate application

secrecy properties of protocols. Indeed, the attacker may

R rule

attackerm

have a given message m only if the fact can

F F F n be proved from the rules representing the protocol and the implication abilities of the attacker. However, the usual Prolog solving algorithm loops, due to rules that appear in the description of the attacker. Therefore, we have designed a solving al- Figure 1. Syntax of our protocol representa- gorithm. This algorithm is novel as far as we know, and it tion appears to be very efficient in practice. We have applied it to prove secrecy properties of several protocols of the liter- ature, including Skeme [24], or to find attacks against them. Bodei et al. [7, 8] is not relational: when a variable ap- pears several times in a message, each occurrence can take different values (in the set of values of the variable). All Related work Prolog rules and similar formalisms have nonces generated by the same restriction are also considered already been used in a number of works on cryptographic as equal. These are causes of imprecision that our analy- protocols, for example [16, 26, 27]. We propose a more ab- sis solves. Moreover, Bodei’s analysis only handles shared- stract representation of the protocols, that enables us to de- key cryptography. The main difference between Monniaux’ sign a simpler and faster analysis, and to avoid limiting the analysis [29] and ours is that Monniaux represents sets of number of runs of the protocol, thus improving over most messages by tree automata, whereas we represent rules that model checkers. Of course, the additional approximations generate these sets. This enables us to gain in efficiency. imply that our analysis may not be able to prove that cer- Goubault [20] extends and improves the efficiency of [29], tain protocols are secure (it may give false attacks), but in also using ideas of Bolignano [9]. However, [20] does not our experiments our analysis was precise enough to prove allow any term to be used in place of a key. We do not have secrecy properties of the protocols we have considered. A this limitation. In our experimental results, we obtain an key problem of previous tools using Prolog rules is termina- even faster analysis with a simpler framework. tion. Our solving algorithm is a big step towards a solution Theorem provers are not fully automatic tools: the user of this problem. has to intervene to provide information on the proof. Previ- Two works have already tackled the problems met by ous works using typing are also better suited for human use classical model checkers. Broadfoot, Lowe and Roscoe [10, than for automatic verifiers: type inference is sometimes 34] do not limit the number of runs of protocols. They re- difficult [4] and types are in general human-readable. Types cycle nonces, to use only a finite number of them in an infi- provide constraints that can help the designers of new pro- nite number of runs. We achieve the same result by directly tocols ensure the desired security properties, but existing reusing the same values for nonces. However, they limit the protocols may not satisfy these constraints even if they are number of parallel runs of protocols. We avoid this limita- correct. In contrast, our analysis yields a fully automatic tion. They also allow the attacker to simulate honest agents, protocol verifier. but use this technique only for servers. We generalize it to all agents involved in the protocols. In their work as well as in ours, the deduction rules for the attacker must have Overview Section 2 details our protocol representation. only equalities in their hypotheses, no inequalities (that is, Section 3 describes our solving algorithm, and sketches its they are positive deduction systems). Song [35] avoids the proof of correctness. Several extensions and optimizations state space explosion problem by using the strand space to this algorithm are detailed in Section 5. Section 6 gives model to verify protocols. This model captures the causal experimental results and Section 7 concludes. information of messages, in a way similar to our deduction rules. However, our model is more abstract than Song’s. 2. Protocol representation She sometimes limits the number of runs of the protocol to guarantee termination, whereas we avoid this limitation. A protocol is represented by a set of Prolog rules Automatic protocol verifiers have already been built by (clauses), whose syntax is given in Figure 1. The terms rep- using abstract interpretation [7, 8, 20, 29]. The analysis of resent messages that are exchanged between participants of

sk A

the protocol. A variable can represent any term. Names are is, there exists only one copy of this name) A for and

sk B pk pksk pk pksk

A B

used to represent atomic values, such as keys and nonces. B for . Then and .

A B Each principal has the ability of creating new names. Here, More generally, we consider two kinds of functions: the created names are considered as functions of the mes- constructors and destructors. The constructors are the

sages previously received by the principal that creates the functions that explicitly appear in the terms that repre- pk name. Thus, a different name is created when the preced- sent messages. For instance, p encrypt and are con-

ing messages are different. This is slightly weaker than the structors. Destructors manipulate terms. A destructor g

fact that a new name is created at each run of the proto- can be defined by one or several equations of the form

g M M M M M M

n n col. As noticed by M. Abadi (personal communication), where are terms this approximation is in fact similar to the approximation that contain only variables and constructors. For in-

done in some type systems (such as [4]): the type of the stance, the decryption p decrypt is a destructor, defined by

p encrypt m pksk sk m new name depends on the types in the environment. It is p decrypt . Other functions enough to handle many protocols, and can be enriched by are defined similarly:

adding other parameters to the name. The function appli-

signm sk cation is used to build terms: examples of functions are the For signatures, there is a constructor

encryption, or hash functions. Predicates are used to repre- that is used to represent the message m signed un- getmess

sent facts about these messages. Several predicates can be der the secret key sk . A destructor de-

signm sk m

used, but for a first example, we are going to use only one fined by getmess returns the mes-

signm sk

sage without its signature, and checksign

er M

predicate attack , meaning “the attacker may have the

sk m

pk only returns the message if the signature

M F F F n message ”. A rule means that if all

is valid.

F F F n

facts are true, then is also true. A rule with

F F no hypothesis is written simply .

For shared-key encryption, we have a construc-

We can illustrate the coding of a protocol on the follow- sdecrypt tor sencrypt and a destructor , defined by

ing simple example (this is a simplification of the Denning-

sencryptm k km sdecrypt .

Sacco key distribution protocol [17], omitting certificates h and timestamps): A hash function is represented by a constructor (and

no destructor).

g A B ffk g pk

Message 1. sk

A

B

B A fsg

k n

Message 2. Tuples of arity are represented by a construc-

n ith

tor and destructors defined by

A B sk

There are two principals and . A is the secret key of

ithx x x i f ng

n i

, .

A pk sk pk B

, its public key. Similarly, B and for . The

A B

A A key k is a new key created by . sends this key signed

2.2. Representation of the abilities of the attacker sk

with its private key A and encrypted under its public key B pk . When receives this messages, it decrypts the mes- B We assume that the protocol is executed in the presence sage, and assumes, seeing the signature, that the key k has

of an attacker that can intercept all messages, compute new s been generated by A. Then it sends a secret encrypted

messages from the messages it has received, and send any A under k (this is a shared-key encryption). Only should message it can build. We first present the encoding of the be able to decrypt the message and get the secret s. (The second message is not really part of the protocol, we use it computation abilities of the attacker. The encoding of the protocol will be detailed below. to check if the key k can really be used to exchange secrets

During its computations, the attacker can apply all con- B

between A and . In fact, there is an attack against this n structors and destructors. If f is a constructor of arity , protocol [11], so s will not remain secret.) this leads to the rule:

2.1. Representation of primitives

attackerx attackerx

n

attackerf x x n Cryptographic primitives are represented by functions.

For instance, we represent the public-key encryption by a

g g M M M n

If is a destructor defined by , this

m pk

function pencrypt which takes two arguments: the leads to the rule: pk

message m to encrypt and the public key . There is a

pk

erM attackerM attackerM

function that builds the public key from the secret key. attack

n sk (We could also have two functions pk and to build respec-

tively the public and secret keys from a secret.) The secret If g is defined by several equations, there are several rules, key is represented by a name which has no arguments (that one for each equation. The destructors never appear in the k

rules, they are coded by pattern-matching on their param- the generated keys k are certainly different: depends on

M M pkx n eters (here ) in the hypothesis of the rule and . The rule becomes:

generating their result in the conclusion. In the particular

er pkx

case of the public-key encryption, this yields: attack

attacker p encrypt signk pkx sk pkx

A

erm attackerpk

attack (2)

attacker p encryptm pk

A

ersk attacker pksk attack Remark. It would also be possible for to initiate the

protocol itself, by choosing randomly the other principal to

erp encrypt m pksk attackersk

attack (1) which it talks, instead of letting the attacker initiate the pro-

attacker m tocol. In this case, the rule above would be

where the first two rules correspond to the constructors

attackerp encrypt signk pkx sk pkx

A pk p encrypt and , the last rule corresponds to the destruc-

tor pdecrypt. When the attacker has an encrypted message

where x is a variable that represents the secret key of the

m pk sk p encrypt and the decryption key , then it also has principal talking with A. However, if we want to represent the plaintext m. (We assume that the cryptography is per- the protocol by a closed process in the applied pi calculus,

fect, hence the attacker can only obtain the plaintext from A the variable x must come from an input. That is, cannot the encrypted message if it has the key.) choose randomly the principal to which it talks. If the pro- For signatures, we obtain the rules: cess is modeled in the applied pi calculus, the attacker sends

a message that indicates with which principal A should talk.

er m attacker sk attacker signm sk

attack This yields the rule (2) given above.

attacker signm sk attackerm

p encrypt signk sk B expects a message of the form

where the first rule corresponds to the constructor sign and

pksk

B . When such a message is received, it tests

the second one to the destructor getmess. (The rule for B

whether A has signed the message (that is, evaluates

checksign is removed, since it is implied by the rule for

signk sk pk

checksign , and this only succeeds when

A

getmess.)

sk sk k

A ). If so, it assumes that the key is only known

The rules above describe the abilities of the attacker.

s k by A, and sends a secret encrypted under . We assume Moreover, the attacker has the public keys. Therefore, we

that the attacker relays the message coming from A, and

attacker pksk attacker pksk B add the rules A and . intercepts the message sent by B . Hence the rule:

We also give a name a to the attacker, that will represent all

attackera

attackerp encrypt signk sk pksk B

the names it can generate: . A

attackersencryptsk

2.3. Representation of the protocol itself B With these rules, A cannot play the role of and vice-versa.

Now, we describe how the protocol itself is represented. If we want that, we can simply add the corresponding rules,

A B B

We consider that and are willing to talk to any princi- that are obtained by swapping A and in the above rules: B

pal, A, but also malicious principals that are represented

attackerpkx

by the attacker. Therefore, the first message sent by A can

p encryptsignk sk pkx x

attackerp encrypt signk pkx sk pkx

be A for any . We leave to

B B

the attacker the task to start the protocol with the principal

attackerp encrypt signk sk pksk

B A

it wants, that is the attacker will send a first message to A,

attackersencrypts k A mentioning the public key of principal with which A should

talk. This principal can be B , or another principal repre-

More generally, a protocol that contains n messages is

sented by the attacker. (The attacker can create public keys,

X i encoded by n sets of rules. If a principal sends the th by the rule for constructor pk.) Moreover, the attacker can

message, the ith set of rules contains rules that have as hy- intercept the message sent by A. This yields a rule of the potheses the patterns of the messages previously received form

by X in the protocol, and as conclusion the pattern of the

i

er pkx attack th message. There may be several possible patterns for the

previous messages as well as for the sent message, in partic-

attackerp encrypt signk sk pkx A

ular when the principal X uses a destructor which is defined

Moreover, a new key k is created each time the protocol is by several equalities. In this case, a rule must be generated

x A run. Hence, if two different keys pk are received by , for each combination of possible patterns. Moreover, notice that the hypotheses of the rules describe all the messages Freshness is modeled by letting new names be func- previously received, not only the last one. This is impor- tions of messages previously received by the creator of tant since in some protocols the fifth message for instance the name in the run of the protocol. When the attacker can contain elements received in the first message. The hy- does not alter messages, this means that different val- potheses summarize the history of the exchanged messages. are used per pair of principals running the proto- col instead of per session. When the attacker does al- Remark. When the protocol makes some communications ter messages, different values are used when different on private channels, on which the attacker cannot a priori messages are received.

listen or send messages, a second predicate can be used:

messC M M meaning “the message can appear on chan- A step of the protocol can be completed several times,

nel C ”. In this case, if the attacker manages to get the name as long as the previous steps have been completed at

of the channel C , it will be able to listen and send messages least once between the same principals. For instance,

on this channel. Thus, two new rules have to be added to de- in a session between the attacker and a principal A,

M A

scribe the behavior of the attacker. The attacker can listen the attacker sends the first message , replies with M on all the channels it has:  , the attacker can then send two messages in place

of the third message, as long as they correspond to the

messx y attackerx attackery

A M

pattern expected by . That is, the attacker sends 

M A M

and gets  from , then the attacker sends and

It can send all the messages it has on all the channels it has: 

A

gets M from . The attacker can also perform again



i

erx attacker y messx y attack the th step, even if further steps have already been per- formed. Therefore, the actions of the principals are not 2.4. Summary organized into runs.

To sum up, a protocol can be represented by three sets of But the important point is that the approximations are al- Prolog rules: ways performed in the safe direction: if an attack exists in a more precise model, such as multiset rewriting [16], or the

Rules representing the computation abilities of the applied pi calculus [5], then it also exists in our represen-

attackerx

attacker. There is one rule tation. (We are currently proving that our translation from

attackerx attackerf x x

n

n for each the applied pi calculus to this representation has this cor-

f attacker M

constructor , and one rule rectness property.) Performing approximations enables us

attackerM attackerM

n for each equation to build a much more efficient verifier, which will be able to

g M M M g n defining a destructor . handle larger and more complex protocols. Another advan- tage is that the verifier does not have to limit the number of

Facts corresponding to the initial knowledge of the at-

runs of the protocol. The price to pay for this is that false

era tacker. There is a fact attack giving a name to attacks may be found by the verifier: sequences of rule ap- the attacker. In general, there are also facts giving the plications that do not correspond to a protocol run. When public keys of the participants and/or their names to a false attack is found, we cannot know whether the proto- the attacker. col is secure or not: a real attack may also exist. A more

Rules representing the protocol itself. There is one precise analysis is required in this case. But our representa- set of rule for each message in the protocol. In the tion is precise enough so that false attacks are rare. (This is

set corresponding to the ith message, sent by prin- demonstrated by our experiments, see Section 6.)

X attacker M

cipal , the rules are of the form j

M attacker M attacker M

j j

i where , ,

n 2.6. Secrecy

M X

j are the patterns of the messages received by

n

i M

before sending the th message, and i is the pattern Our goal is to determine secrecy properties: for instance,

of the ith message.

can the attacker get the secret s ? That is, can the fact

ers attackers The rules representing the Denning-Sacco protocol are sum- attack be inferred from the rules ? If

marized in Figure 2. can be inferred, the sequence of rules applied to derive

ers attack will lead to the description of an attack. 2.5. Approximations Our notion of secrecy is similar to that of [4, 7, 12]: a

term M is secret if the attacker cannot get it by listening The reader can notice that our representation of protocols and sending messages, and performing computations. This is approximate: notion of secrecy is weaker than non-interference, but it is

Computation abilities of the attacker:

p encrypt attackerm attackerpk attacker p encryptm pk

pk attackersk attackerpksk

p decrypt attackerp encrypt m pksk attacker sk attackerm

sign attackerm attackersk attackersignm sk

getmess attackersignm sk attackerm getmess

checksign removed since implied by

sencrypt attackerm attackerk attackersencryptm k

sdecrypt attackersencryptm k attackerk attackerm

Initial knowledge of the attacker:

attackerpksk attackerpksk attackera

A B

Protocol:

attackerpkx attackerp encrypt signk pkx sk pkx

First message: A

attackerp encrypt signk sk pksk attackersencrypt sk B Second message: A

Figure 2. Summary of our representation of the Denning-Sacco protocol

adequate to with the secrecy of fresh names. Non- Definition 2 (Derivability) Let F be a closed fact, that is, F interference is better at excluding implicit information flows a fact without variable. Let B be a set of rules. is deriv-

or flows of parts of compound values. (See [3, Section 6] able from B if and only if there exists a finite tree defined as

for further discussion and references.) follows:

F F n

Technically, the hypotheses of a rule are

B 1. Its nodes (except the root) are labelled by rules R ; considered as a multiset. This means that the order of the hypotheses is irrelevant, but the number of times an hypoth- 2. Its edges are labelled by closed facts; esis is repeated is important. (This is not related to the ideas

of multiset rewriting: the semantics of a rule does not de- 3. If the tree contains a node labelled by R with one in-

F n

pend on the number of repetitions of its hypotheses, but coming edge labelled by  and outgoing edges la-

F F R fF F g F

n n  considering multisets is useful to make explicit the elimi- belled by , then . nation of duplicate hypotheses in our verifier. It will also

4. The root has one outgoing edge, labelled by F .

be useful in the proof of the algorithm.) Formally, a mul-

S B

tiset of facts is a function from facts to integers, such Such a tree is a derivation of F from .

F F S that S is the number of repetitions of in . The in-

clusion on multisets is the point-wise order on functions: In a derivation, if there is a node labelled by R with one

F n

incoming edge labelled by  and outgoing edges labelled

S F S F S F f

S .If is a function from

F F R F

n 

facts to facts, we can extend it to multisets of facts by by , then the rule can be used to infer from

P

F F F

n

. Therefore, there exists a derivation of from

f S F S F

. This applies in

f F F

F such that

F B B if and only if can be inferred from rules in . particular when f is a substitution. We determine whether a given formula can be implied by a given set of rules. This is more precisely defined as 3. Solving algorithm follows. Our representation is a set of Prolog rules, and our goal is simply to determine whether a given fact can be inferred Definition 1 We define rule implication by: from these rules or not. This is exactly the problem that is

solved by usual Prolog systems. However, we cannot use

H C H C

  if and only if

these systems here, because they would not terminate. For

C C H H

 

instance, the rule:

H H

attacker p encryptm pksk attackersk  where and are multisets of hypotheses, is a substi-

tution.

attacker m

R R R

 

We write when can be obtained by adding leads to considering more and more complex terms, with R

hypotheses to a particular instance of . In this case, all an unbounded number of . We could of course

R R facts that can be derived by  can also be derived by . limit arbitrarily the depth of terms to solve the problem, but

B B we can do much better than that. Indeed, even when limiting Let be the rule base,  be the set of rules representing the depth of terms, the complexity of the depth-first search the attacker and the protocol.

will be very large. There are many rules with conclusion We define

erx

attack that can always be applied, when we search a

addR B

attackerM

fact of the form . We can get better results with

R B R R

a more guided search. B if ,

R g fR B jR R g The main idea to guide the search is to combine pairs f otherwise.

of rules by unification, when the unified facts are not of

attackerx R

H C H C the form . When the consequence of a rule We also define elimdup ,

unifies with one hypothesis of another (or the same) rule where is the multiset which contains one copy of each

R

F F elimdup

, we can infer a new rule, that corresponds to applying fact: . The function eliminates the

R R and one after the other. Formally, this is defined as duplicate hypotheses from a rule.

follows:

R B B addelimdupR B

1. For each  , .

R R H C

Definition 3 Let R and be two rules, ,

B R H C R B R H

2. Let R , and ,

R H C F H

. Assume that there exists  such

C F H

. Assume that there exists  such that:

C F

that: and  are unifiable, and is the most general

C F



R unifier of and . R

(a) F is defined;

In this case, we define 

F H F S

(b) r ;

F S

R H H F C R

 r 

F (c) . 

In this case, we execute R

For example, if R is the rule (2), and is the rule (1), the

F F attacker p encryptm pksk R R

 F

fact  is ,



B addelimdupR R B F

is 

erpkx attacker x

attack This procedure is executed until a fixed point is

attackersignk pkx sk

reached.

A

B fH C B jF H F S g

3. Let r .

fsk x m signk pkx

with the substitution

sk g R R A

. In terms of logic programming, F is the



R R F result of resolving with upon  . Of course, if this op- Figure 3. First phase: completion of the rule eration is applied without limitations, it does not terminate base (consider the same rule as above). We specify conditions to

limit it. As far as we know, these conditions are new.

S F S

Let be a finite set of facts. We define r by:

R F the hypotheses of contain only facts which sat-

there exists a substitution mapping variables to variables

F S

isfy r (i.e. by default they are of the form

S S fattackerxg

such that F . By default, , but the

erx attack ),

algorithm is also correct with other values of S . The idea is

F F S

to only unify facts such that r . The precise formal

F R

and the hypothesis  of that we unify does not

condition is slightly more complex (see below).

F S F

r  satisfy  (i.e. by default is not of the form

The solving algorithm works in two phases, which are

erx attack ). described in Figures 3 and 4 respectively. The first phase transforms the rule base into a new one, that implies the When a rule is created by this composition, it is added to the same facts. The second one uses a depth-first search to de-

rule base B , after simplification (duplicate hypotheses are

termine whether a fact can be inferred or not from the rules.

R R removed and if a rule R implies a rule , is removed). The first phase (Figure 3) contains 3 steps. The first step

At last, the third step is to extract from B the new rule base

inserts in B the initial rules representing the protocol and F

B , by taking only the rules whose all hypotheses sat- B

the attacker (rules that are in  ). These rules are simpli-

F S F

isfy r (by default, this means that is of the form

fied by eliminating duplicate hypotheses, and if a rule R

erx

attack ). The following remarks can help understand

R add implies a rule R , is removed (definition of ). The the algorithm:

second step is a fixed point iteration, that adds rules created

R by resolution. The composition of rules R and is only This algorithm corresponds to a kind of forward added if search. In a forward search, a fact is unified with an

R B hypothesis of a rule, and a new rule is created that con- We define derivablerec by

tains one hypothesis less. This is performed until no

R B R B R R

new fact can be inferred. 1. derivablerec if ;

S R

C B fC g Here, assume . Hypothesis (b) means that 2. derivablerec otherwise;

has no hypothesis, that is, R is a fact. Hypothesis (c)

R B

3. derivablerec

C

is always true. Then a fact R is unified with an

fderivablerecelimdupR R fR g B jR

F 

hypothesis of the rule R . In this case, we have exactly

B F R g R  such that F is defined otherwise.

a forward search. 

S

F derivablerecfF g F When , the algorithm resembles a forward derivable .

search for a modified notion of facts. Let S -facts be the

H C F H F S S

rules , where r . Let -rules be Figure 4. Second phase: backward depth-first

H F F H F S F

r S

the rules S where and is search S

an S -fact. Notice that all rules are -rules, simply by

F S

writing first the hypotheses that satisfy r . Hy-

S

pothesis (b) means that R is an -fact. Hypothesis (c)

R R R

rules have been combined by F , yielding rule .In



F S R

means that  is an hypothesis of an -rule . The

R R F

this case, we replace R and by in the derivation of .

S S S -fact and -rule are combined, to give a new -rule When no more replacement can be done, we show that all

(that can be an S -fact). When all combinations have

F F S

 r

the hypotheses  of the remaining rules satisfy .

B been performed, only S -facts are kept in .

Then all these rules are in B , and we have built a derivation

B of F from . The converse is easy to prove.

This algorithm is similar to an unfolding of the logic

program [37], but there is one important difference: In The second phase (Figure 4) searches the facts that

R F R

the unfolding, a rule and an hypothesis  of are can be inferred from B . This is simply a backward

chosen, and the resolution is performed with all rules depth-first search. The search is performed by calling

R F

whose consequence is unifiable with  . Here, the

R B R

derivablerec with two parameters: a rule and a

resolution is only applied with rules R whose hypothe- R

set of rules B . The hypotheses of are the facts that we

F F S ses satisfy r . currently want to prove. Its conclusion is an instance of the

fact F that we initially wanted to prove. Moreover, the rule

The speed of the algorithm comes essentially from the

B

R is always a consequence of the rules in : the conclusion

C

fact that there are not many rules H such that

B

of R can be proved by rules of from the hypotheses of

F H fC gF S

r . (In our uses of this al-

B R . The set is the set of rules that we have already seen

gorithm, S is always a very small set.) For the other

fF g F B

during the search. derivablerec returns the

C S

rules, (b) implies r , therefore, using the default

set of instances of F that can be proved.

C attackerx C

definition of S , is not of the form :

B If R is implied by a rule in , the current branch of the cannot be unified with any hypothesis of any rule, only

search fails: this is a cycle, we are looking for instances of F

few hypotheses of rules will correspond. Similarly, 

facts that we have already looked for (first point in the defi-

er x is not of the form attack , so only few conclusions

nition of derivablerec). We backtrack to try finding another F

of rules can be unified with  . Therefore, in general,

derivation of the goal. If R has no hypothesis, the search few unifications are performed, and the algorithm is

succeeds: the conclusion of R is proved (second point of very fast.

the definition of derivablerec). Otherwise, we have to go

R B

We prove that the rules in B imply exactly the same facts on searching. We try to use rule to prove one of

R F derivablerec  B the hypotheses of , . That is, we call with

as the rules in  .

R F R 

the rule F (in which has been replaced by the



R R R

Lemma 1 (Correctness of phase 1) Let F be a closed hypotheses of , and being instantiated so that the

R F



B F F conclusion of and are unified).

fact. is derivable from the rules in  if and only if is derivable from the rules in B . The following theorem gives the correctness of the whole algorithm. It shows that we can use our algorithm to de- Proof sketch We only give a proof sketch here, a detailed termine whether a fact is derivable or not from the initial

proof can be found in Appendix A. rules. The first part of the theorem shows that when calling

F B derivable F

Assume that is derivable from  and consider a with a not necessarily closed fact , the instances

F B F

derivation of from  . The key idea of the proof is the of that can be derived from the rules are the instances of

R derivableF following. Assume that the rules R and are applied one the facts returned by . The second part deals

after the other in the derivation of F . Also assume that these with the particular case of closed facts.

F

Theorem 2 (Correctness) Let F be a closed fact. Let An example of such an exploding rule is:

F F

such that there exists a substitution such that .

attackerf x attackerf g x

F B F

is derivable from the rules in  if and only if

F F F

derivable . A solution to this non-termination case is of course to add

F B F

F S In particular, is derivable from  if and only if

 to the set , thus forbidding the above combinations

derivableF . of the rule R . Another solution is to add a new rule, that

implies all the previous rules, for n large enough. For ex- Proof sketch Using Lemma 1, we only have to prove

ample, let be a renaming of variables that has an image

F B F

that is derivable from if and only if

disjoint from the variables appearing in H . Then the rule

derivableF F F n

.



C H n n

implies all the previous rules for  .

Essentially, derivablerec performs a classical depth-first

When such a rule is present, the previous rules are automat- search of the rule base B to find the desired fact. This ically removed, and the non-termination is avoided. From search is stopped in case of cycle. F is derivable if and only the point of view of abstract interpretation, adding a new if it is found by the depth-first search. The detailed proof rule in this way can be considered as a widening [15], that can be found in Appendix A. we use to force the convergence of the fixed point iteration. This non-termination case can be detected automatically 4. Termination by the verifier.

4.1. Termination of the basic algorithm 4.3. Enforcing termination

The fixed point iteration of the first phase does not al- Termination can be enforced by limiting the depth of ways terminate, even if it terminates in most examples of terms. Each term that starts at a depth greater than a limit protocols. We will see below several ways to force its ter- fixed by the user is replaced by a new variable. mination. This way, if a fact can be generated by the system with-

The following proposition shows that the depth-first out depth limitation, it can also be generated by the sys- search of the second phase terminates on the rule base B tem with depth limitation. The converse is of course wrong. built by the first phase. The system remains correct (if it says that a protocol does

not leak certain secrets, then the protocol definitely does not

S fattackerxg

Proposition 3 If F is closed and , then leak these secrets), but some precision is lost.

F

derivable terminates. (Otherwise, the termination of In practice, the algorithm terminates for many protocols

F derivable is not guaranteed.) without limiting the depth of terms. That is why, by default,

the depth of terms is not limited in our tool. Moreover, lim-

Proof sketch The hypotheses of the rules in B are iting the depth of terms that appear in the rules does not smaller than the conclusion. Hence the depth-first search limit the depth of terms that can be built by the attacker, considers smaller and smaller terms, and thus terminates. since even with rules of bounded depth, the attacker can cre-

The proof can be found in Appendix B. ate terms of unbounded depth. Therefore, even if limiting the depth of terms in the rules leads to approximations, it is 4.2. Detecting (and solving) some non-termination more precise than limiting the depth of terms in the usual cases

depth-first search algorithm of Prolog.

R fF g C

Assume that a rule  is inferred, with

5. Optimizations and extensions

C F x x fv x x x

 , where is such that ,

M M where fv is the set of variables in the term . Assume

5.1. Tuples

F S r

that  , and there exists a derivation of an instance of

F H F 

 (more precisely, the verifier generates a rule ,

M M

n

The tuples are denoted by . Tuples of dif-

F H F S whose hypotheses satisfy r ). Then, in general, the completion process (phase 1) does ferent arity are considered as different functions. But the user does not have to define each of these functions: they

not terminate. Indeed, the rule R can be combined with the

are all built-in.

H F H C H F  rule  , yielding ,

The attacker rules:

H

then this rule can combined again with R , yielding



C H F

 . We can go on this way, and obtain

attacker M attacker M

n

n

n H F

for all integers :  , and in general none of

attackerM M n

these rules implies another, therefore all these rules will be

attacker M M attackerM

n i generated by the solving algorithm.

are also built-in, and treated in a specially optimized way. 5.4. Diffie-Hellman key agreement

attackerM M n

Indeed, these rules mean that

i f ng

is derivable if and only if , The Diffie-Hellman key agreement [18] enables two

attackerM

i is derivable. When a fact of the principals to build a . It is used as an elemen-

attacker M M n

form is met, it is replaced tary step in more complex protocols, such as Skeme [24].

attacker M attacker M n

by . If this Formally, the Diffie-Hellman key agreement can be mod- g

replacement is done in the conclusion of a rule eled by using two functions f and that satisfy the equation

H attackerM M n n

, rules are created:

y gx f x g y

f (3)

H attackerM i f ng

i for each . This M

replacement is of course done recursively: if i itself is a

x

x y y mo d p In practice, the functions are f and

tuple, it is replaced again.

x

x p

g where is prime and is a generator of

x y z x y z x y z

x y y x

Notice that , and are differ- 

f y gx mo d p Z . The equation

ent terms. Tuples are different from the concatenation. This p

p f x g y mo d is satisfied. In our verifier, follow- can have important consequences. For instance, the Otway- ing the ideas used in the applied pi calculus [5], we do not Rees protocol [32] is flawed when using concatenation, not consider the underlying number theory; we work abstractly when using tuples. Similarly, the simplified version of the with the equation (3). The Diffie-Hellman key agreement

Yahalom protocol of [11] is correct from the point of view

B A involves two principals A and . chooses a random

of secrecy when using tuples, whereas it is flawed when us-

x g x B B  name  , and sends to . Similarly, chooses a

ing concatenation [36, Attack 1]. (The second attack of [36]

x g x A A random name , and sends to . Then com-

exposes an flaw, not a secrecy problem.) Of

f x gx B f x gx

 putes  and computes . Both course, the implementation of the protocol must correspond values are equal by (3), and they are secret: assuming that

to the model of the verifier.

x x

the attacker cannot have  or , it can compute neither

f x gx f x gx

  nor . 5.2. Removing useless rules and useless hypotheses The equation (3) cannot be written directly in our frame- work that uses only constructors and destructors. Neverthe-

less, it can be encoded as follows: the constructors are g ,

If a rule has a conclusion which is already in the hy-

h h f  , and , and is a destructor defined by

potheses, this rule does not generate new facts. Such rules

f y gx h x y

are therefore removed as soon as they are encountered by

our verifier.

f x g y h x y

C attacker x

If a rule H contains in its hypotheses ,

f x y h x y 

where x is a variable that does not appear elsewhere in

attackerx

the rule, the hypothesis can be removed. In- Notice that this definition of f is non-deterministic: a term

a g b h a b h b a

deed, the attacker always has at least a message. Therefore f

such as can be reduced to , , and

attackerx

a g b M M

is always satisfied. h

  . When two terms and are equal accord-

ing to the equation (3) then there exists a common term M

M M M 

5.3. Secrecy assumptions such that both and reduce to . (Both sides reduce

h x y g

to when there is at least one in the second param-

f h x y

eter of , and to  otherwise.) The equation is then When the user knows that a fact will not be derivable, modeled correctly.

he can tell it to the verifier. (When this fact is of the form

erM M attack , the user tells that remains secret.) The tool 5.5. Key compromise then removes all rules which have this fact in their hypothe- ses. At the end of the computation, the program checks that The weakness of some protocols is that when an attacker the fact is indeed underivable from the obtained rules. If manages to get some session keys, then it can also get the the user has given erroneous information, an error message secrets of other sessions. Such a problem appears for exam- is displayed. Even in this case, the verifier never wrongly ple in the Needham-Schroeder shared-key protocol. It can claims that a protocol is secure. be detected by our protocol verifier. Mentioning such underivable facts prunes the search The strategy to model the compromise of some session space, by removing useless rules. This speeds up the search keys is as follows. We say that a name is a session name if process. In most cases, the secret keys of the principals can- it is created at each session of the protocol. (In general, all

not be known by the attacker. So, examples of underivable names except long-term secret keys are session names.) We

attacker sk attackersk a B facts are A , , add a parameter (session identifier) to each session name .

a s a

The session identifier of is a given constant  when has 6. Experimental results

been created during a session compromised by the attacker.

s a The session identifier is when has been created in a ses- We have implemented our verifier in Ocaml, and have

sion that has not been compromised. We define a predicate performed tests on a Pentium MMX 233 MHz, under Linux

compM

comp such that is true when all session names 2.0.32 (RedHat 5.0). The results are summarized in Fig-

M s in have session identifier  . This can be defined by the ure 5, with references to the papers that describe the proto- following rules: cols and the attacks. In these tests, the protocols are fully modeled, including interaction with the server for all ver-

For each constructor f , sions of the Needham-Schroeder, Denning-Sacco, Otway-

compx compx compf x x

k k Rees, and Yahalom protocols. We use secrecy assumptions to speed up the search. These assumptions say that the se- For each session name a,

cret keys of the principals, and the random values of the

compx compx compas x x

k  k Diffie-Hellman key agreement and the session keys in the

For each non-session name a, Skeme protocol, remain secret. Thanks to these secrecy as-

compx compx compax x

k k sumptions, the analysis time of Skeme is 23 s instead of

70 s. The column “#Rules” indicates the number of Prolog

attacker

We define a predicate  by the rules normally used rules in our representation of each protocol. The large num- s

to encode the protocol, with session identifier  , and a ber of rules for the Needham-Schroeder shared-key proto-

attacker

predicate by the same rules with session identi- col comes from the encoding of the compromise of session s fier . Then we add rules keys. In the Needham-Schroeder shared key protocol, the

last messages are

compx compx

k

B A fN g K

Message B

attacker as x x

  k

A B fN g K Message B

for each session name a. These rules express that the at-

N

attacker s

where B is a nonce. Representing this with a function 

tacker  has the names of session identifier .

xx Moreover, we add the rule minusone , this yields a loop in our verifier, since

it believes that 1 can be subtracted any number of times

attacker x attacker x

N

 from B . Moreover, the techniques mentioned previously

to force termination lead to a false attack. The purpose of B

The intuitive meaning of the predicates is the following: the subtraction is to distinguish the reply of A from ’s

attacker M M

 is true if and only if can be obtained by message. As mentioned in [6], it would be clearer to have: s

the attacker by compromising the sessions of identifier  ;

B A f N g

B K

er M M

attack Message Message

is true if can be obtained by the attacker in

A B f N g

a non-compromised session, using the knowledge obtained K Message Message B in the compromised sessions. We can then use our tool to

We use this encoding. Our tool then terminates, and the

attacker ss s query the fact , where is a session secret. If this fact is underivable, then the protocol does not have analysis is precise. There was no other termination problem the weakness mentioned above: the attacker cannot have the in the tests of Figure 5. These results show that our analysis can be used to verify secret s of a session that it has not compromised. In con-

secrecy properties of standard cryptographic protocols, in a

attacker ss  trast, it is normal that , since the attacker

small amount of time. It takes only a very small amount of s has compromised the sessions of identifier  . Our translation from the applied pi calculus can auto- memory (less than 2 Mb in all these tests). matically add the rules described above to model the com-

promise of session keys. Also notice that our solving algo- 7. Conclusion

S fcompx attacker x attacker xg rithm uses  in this case. We believe that our protocol verifier can provide new

possibilities to verify cryptographic protocols: it is very er

Remark. We could also use a single predicate attack in- efficient, and thus can handle complex protocols; it also

attacker attacker stead of  and . However, this would yield avoids limiting the number of runs of the protocol. This is a less precise model, leading to more false attacks. For ex- achieved by using a simple representation of the protocol, ample, we could not prove that the corrected version of the and a new solving algorithm. Needham-Schroeder shared key protocol [31] is secure with Directions for further work include generalizing the tool this model. to be able to handle general equational theories. A more Protocol Result #Rules Time (ms) Needham-Schroeder public key [30] Attack [25] 14 70 Needham-Schroeder public key corrected [25] Secure 14 60 Needham-Schroeder shared key [30, 11] Attack [17] 47 760 Needham-Schroeder shared key corrected [31] Secure 51 1190 Denning-Sacco [17] Attack [11] 15 40 Denning-Sacco corrected [11] Secure 15 40 Otway-Rees [32] Secure 9 270 Otway-Rees, variant of [33] Attack [33] 9 260 Yahalom [11] Secure 10 110 Simpler Yahalom [11] Secure 10 310 Main mode of Skeme [24] Secure 23 23070

Figure 5. Experimental results general study of the termination of the algorithm would also [7] C. Bodei. Security Issues in Process Calculi. PhD thesis, be interesting, to find conditions that guarantee the termina- Universit`a di Pisa, Jan. 2000. tion of the basic algorithm, and new ways of forcing termi- [8] C. Bodei, P. Degano, F. Nielson, and H. R. Nielson. Control nation when these conditions are not satisfied. Flow Analysis for the -calculus. In International Confer- ence on Concurrency Theory (Concur’98), volume 1466 of Lecture Notes on Computer Science, pages 84Ð98. Springer Acknowledgments Verlag, Sept. 1998. [9] D. Bolignano. Towards a Mechanization of Cryptographic This work owes much to discussions with Mart«õn Abadi. Protocol Verification. In O. Grumberg, editor, 9th In- I am very grateful to him for what he taught me. I would like ternational Conference on Computer Aided Verification to thank the anonymous reviewers for their helpful com- (CAV’97), volume 1254 of Lecture Notes on Computer Sci- ments and suggestions. ence, pages 131Ð142. Springer Verlag, 1997. [10] P. Broadfoot, G. Lowe, and B. Roscoe. Automating Data Independence. In 6th European Symposium on Research in References (ESORICS 2000), volume 1895 of Lec- ture Notes on Computer Science, pages 175Ð190, Toulouse, [1] M. Abadi. Explicit Communication Revisited: Two New France, Oct. 2000. Springer Verlag. Attacks on Authentication Protocols. IEEE Transactions on [11] M. Burrows, M. Abadi, and R. Needham. A Logic of Au- Software Engineering, 23(3):185Ð186, Mar. 1997. thentication. Proceedings of the Royal Society of London A, [2] M. Abadi. Secrecy by Typing in Security Protocols. Journal 426:233Ð271, 1989. A preliminary version appeared as Dig- of the ACM, 46(5):749Ð786, Sept. 1999. ital Equipment Corporation Systems Research Center report [3] M. Abadi. Security Protocols and their Properties. In No. 39, February 1989. F. Bauer and R. Steinbrueggen, editors, Foundations of [12] L. Cardelli, G. Ghelli, and A. D. Gordon. Secrecy and Group Secure Computation, NATO Science Series, pages 39Ð60. Creation. In C. Palamidessi, editor, CONCUR 2000: Con- IOS Press, 2000. Volume for the 20th International Sum- currency Theory, volume 1877 of Lecture Notes on Com- mer School on Foundations of Secure Computation, held in puter Science, pages 365Ð379. Springer Verlag, Aug. 2000. Marktoberdorf, Germany (1999). [13] H. Cirstea. Specifying Authentication Protocols Using [4] M. Abadi and B. Blanchet. Secrecy Types for Asymmet- Rewriting and Strategies. In I. Ramakrishnan, editor, Prac- ric Communication. In F. Honsell and M. Miculan, editors, tical Aspects of Declarative Languages (PADL’01), volume Foundations of Software Science and Computation Struc- 1990 of Lecture Notes on Computer Science, pages 138Ð tures (FoSSaCS 2001), volume 2030 of Lecture Notes on 152, Las Vegas, Nevada, Mar. 2001. Springer Verlag. Computer Science, pages 25Ð41, Genova, Italy, Apr. 2001. [14] E. M. Clarke, S. Jha, and W. Marrero. Using State Space Springer Verlag. Exploration and a Natural Deduction Style Message Deriva- [5] M. Abadi and C. Fournet. Mobile Values, News Names, and tion Engine to Verify Security Protocols. In Proceedings Secure Communication. In 28th Annual ACM SIGPLAN- of the IFIP Working Conference on Programming Concepts SIGACT Symposium on Principles of Programming Lan- and Methods (PROCOMET), June 1998. guages (POPL’01), pages 104Ð115, London, United King- [15] P. Cousot and R. Cousot. Comparing the Galois Connection dom, Jan. 2001. ACM Press. and Widening/Narrowing Approaches to Abstract Interpre- [6] M. Abadi and R. Needham. Prudent engineering practice tation. In M. Bruynooghe and M. Wirsing, editors, Proceed- for cryptographic protocols. IEEE Transactions on Software ings of the fourth international symposium PLILP’92 (Pro- Engineering, 22(1):6Ð15, Jan. 1996. gramming Language Implementation and Logic Program- ming), Lecture Notes on Computer Science, pages 269Ð295. [30] R. M. Needham and M. D. Schroeder. Using Encryption for Springer Verlag, Aug. 1992. Authentication in Large Networks of Computers. Commun. [16] G. Denker, J. Meseguer, and C. Talcott. Protocol Specifi- ACM, 21(12):993Ð999, Dec. 1978. cation and Analysis in Maude. In N. Heintze and J. Wing, [31] R. M. Needham and M. D. Schroeder. Authentication Re- editors, Proc. of Workshop on Formal Methods and Security visited. Operating Systems Review, 21(1):7, 1987. Protocols, Indianapolis, Indiana, 25 June 1998. [32] D. Otway and O. Rees. Efficient and Timely Mutual Au- [17] D. E. Denning and G. M. Sacco. Timestamps in Key Dis- thentication. Operating Systems Review, 21(1):8Ð10, 1987. tribution Protocols. Commun. ACM, 24(8):533Ð536, Aug. [33] L. C. Paulson. The Inductive Approach to Verifying Cryp- 1981. tographic Protocols. Journal of Computer Security, 6(1Ð [18] W. Diffie and M. Hellman. New Directions in Cryptography. 2):85Ð128, 1998. IEEE Transactions on Information Theory, IT-22(6):644Ð [34] A. W. Roscoe and P. J. Broadfoot. Proving Security Pro- 654, Nov. 1976. tocols with Model Checkers by Data Independence Tech- [19] N. A. Durgin, P. D. Lincoln, J. C. Mitchell, and A. Scedrov. niques. Journal of Computer Security, 7(2, 3):147Ð190, Undecidability of bounded security protocols. In Workshop 1999. on Formal Methods and Security Protocols (FMSP’99), [35] D. X. Song. Athena: a New Efficient Automatic Checker for Trento, Italy, 5 July 1999. Security Protocol Analysis. In Proc. of 12th IEEE Computer [20] J. Goubault-Larrecq. A Method for Automatic Crypto- Security Foundation Workshop (CSFW-12), Mordano, Italy, graphic Protocol Verification (Extended Abstract), invited June 1999. paper. In Fifth International Workshop on Formal Meth- [36] P. Syverson. A Taxonomy of Replay Attacks. In Proceed- ods for Parallel Programming: Theory and Applications ings of the 7th IEEE Computer Security Foundations Work- (FMPPTA’2000), Canc«un, Mexique, May 2000. Springer- shop (CSFW-94), pages 131Ð136, 1994. Verlag. [37] H. Tamaki and T. Sato. Unfold/Fold Transformation of [21] J. Heather and S. Schneider. Towards automatic verifi- Logic Programs. In S. Akeû T¬arnlund, editor, Proceedings cation of authentication protocols on an unbounded net- of the Second International Logic Programming Conference work. In 13th IEEE Computer Security Foundations Work- (ICLP’84), pages 127Ð138, Uppsala, Sweden, July 1984. shop (CSFW-13), pages 132Ð143, Cambridge, England, July 2000. A. Proof of correctness of our algorithm [22] M. Hennessy and J. Riely. Information Flow vs. Resource Access in the Asynchronous Pi-Calculus. In Proceedings

of the 27th International Colloquium on Automata, Lan- Lemma 4 At the end of the first phase, B satisfies the fol-

guages and Programming, Lecture Notes on Computer Sci- lowing properties:

ence, pages 415Ð427. Springer Verlag, 2000.

R B R B R R

1.  ;

[23] D. Kindred and J. M. Wing. Fast, Automatic Checking of

B R H C R B R H

Security Protocols. In USENIX 2nd Workshop on Electronic 2. Let R , and ,

Commerce, pages 41Ð52, Nov. 1996.

C F H

. Assume that there exists  such that:

[24] H. Krawczyk. SKEME: A Versatile Secure

R R

Mechanism for Internet. In Proceedings of the Internet So- (a) F is defined; 

ciety Symposium on Network and Distributed Systems Secu-

F H F S

(b) r ;

rity, Feb. 1996. Available at http://bilbo.isu.edu/

F S r

sndss/sndss96.html. (c)  .

[25] G. Lowe. Breaking and Fixing the Needham-Schroeder

R B R R R

In this case, there exists , F . Public-Key Protocol using FDR. In Tools and Algorithms 

for the Construction and Analysis of Systems, volume 1055

R B

Proof To prove the first property, let  . We show

of Lecture Notes on Computer Science, pages 147Ð166.

R B R that during the whole execution of phase 1, Springer Verlag, 1996.

R .

[26] C. Meadows. A Model of Computation for the NRL Pro- At the beginning, we execute the instruction B

tocol Analyzer. In Proceedings of 1994 Computer Security

elimdupR B R B

Foundations Workshop (CSFW-7), Franconia, New Hamp- add . If there exists no such that

elimdupR elimdupR B

shire, June 1994. IEEE Computer Society. R , is added to .Wehave

R R

[27] J. K. Millen, S. C. Clark, and S. B. Freedman. The Inter- elimdup (with the identity). Therefore, after the

R B R R

rogator: Protocol Security Analysis. IEEE Transactions on execution of this instruction .

addR B

Software Engineering, SE-13(2):274Ð288, Feb. 1987. Assume that we execute B , and before

[28] J. C. Mitchell, M. Mitchell, and U. Stern. Automated Analy-

R B R R R B

this execution . Either is kept in , 

sis of Cryptographic Protocols Using Mur .InProceedings R then this property is true after the execution of add.Or

of the 1997 IEEE Symposium on Security and Privacy, pages

R R R is removed, and R . Then ( is transitive) 141Ð151, 1997. [29] D. Monniaux. Abstracting Cryptographic Protocols with and the property is still satisfied.

Tree Automata. In Static Analysis Symposium (SAS’99), The second property simply means that the fixed point is

elimdupR R

reached at the end of phase 1 (using F

volume 1694 of Lecture Notes on Computer Science, pages 

R R

149Ð163. Springer Verlag, Sept. 1999. F ). 

R R R R R R n n H H C

Lemma 5 If F is defined, and to ) are labelled by elements of and .



F R R n C

F

then either there exists such that is defined And the incoming edge of is labelled by .

R R R R R R R

F F F

and ,or .

 

n

In the second case, we remove , and link directly its

n R

H C R H C R H

R incoming and outgoing edges to .Wehave

Proof Let , ,

H C H C H H C C

C

R H C

C ,

, . By renaming the variables, we

n R

R and outgoing edges of are now labelled by elements

can arrange such that the variables of and are dis-

H H C C

C C of , its incoming edge by .

tinct. Then there exists a substitution such that ,

H H C C H H

, , .

We perform this replacement process as long as there exist

R H H F C R  We have F .We  nodes on which it can be applied. Once the replacement

have two cases.

process is done, we show that the remaining rules are all in

R F H F F R

 F

First case: . Since is



B

.

F C

defined,  and are unifiable, let be the most general

F C F C

unifier. , then and are unifiable, B

The rules labelling leaves of the tree are all in since

R R

F

therefore is defined. Let be the most general

they have no hypotheses.

unifier. There exists such that .Wehave

n n

R H R H H F C

F

, Let be a node such that all sons of are labelled by

B

H F H H F H H F



, a rule in . Therefore, the hypothesis (b) is satisfied

n n F R

C C C R R R R

F

. Then F . for all sons of (the hypotheses of the rule la-



n F S n n

H H F H H H r

Second case:  . Then belling satisfy ). Since and cannot have

F C C R R R F

 and . Therefore . been merged with another node by the above replace-



ment process, hypothesis (c) is not satisfied for all sons

n n F n 

Lemma 1 (Correctness of phase 1) Let F be a closed of . Then all hypotheses of the rule labelling

F S n

 r

B F F satisfy . That is, is also labelled by a rule of

fact. is derivable from  if and only if is derivable

B from B . .

By induction, this proof shows that all nodes are labelled by

F B

Proof Assume that is derivable from  and consider

a rule of B , which is the expected result.

F B R B 

a derivation of from  . For each rule in , there

For the converse implication, notice that if a fact is deriv-

B R R

exists a rule R in such that (Lemma 4, Property

B 1). able from B then it is derivable from , and that all rules

added to B do not create new derivable terms: when com-

Assume that R is the label of a node with an incoming

R

posing two rules R and , the created rule can derive terms

F n F F n

edge labelled and outgoing edges labelled .

R

that could also by derived by R and .

R fF F g F R n

We have . Then

fF F g F n

( transitive).

F Theorem 2 Let F be a closed fact. Let such that

Therefore, we can replace the node labelled R by a node

F F F

there exists a substitution such that .

F B

labelled R . This way, we obtain a derivation of from .

B F

is derivable from the rules in  if and only if n

Assume that there are two nodes n and in this deriva-

F F F

derivable .

F n n C

tion of , linked by an edge from to labelled . As-

F B F

In particular, is derivable from  if and only if

R n R H

sume that n is labelled and is labelled . Let

F

derivable . H

be the set of labels of outgoing edges of n, the same

C n

for n , the label of the incoming edge of . Then

Proof Using Lemma 1, we only have to prove

H C H C C

is defined (with a substitu-

B F

that F is derivable from if and only if F

tion being the identity). By Lemma 5, there exists

derivableF F F

.

R R

such that F is defined, and two cases may arise:

Let us prove the direct implication. We consider a deriva-

R R H C H C

C

either F ,or

F B

tion of from . We cut this derivation on certain edges,

R H C H C C .

and remove the branches that start from these edges. We call

F F F

n

In the first case, assume that the hypotheses (b) and the remaining part a partial derivation of . Let

R R

(c) of Lemma 4, Property 2 are satisfied. Then there be the labels of the cut edges. We prove that

fF F gF derivablerec R B derivableF

R B R R R R

n

exists such that F . Then

R B R R

H C H C H H C

C

and . The proof is by induction on the

n n

C ( transitive). Then the two nodes and can number of nodes in the partial derivation.

R

be replaced by a node n labelled . Indeed, the If there are no nodes in the partial derivation, that is

n R outgoing edges of n and (excluding the edge from we have cut the edge starting from the root, let

F g F B derivablerecR B derivablerecR B R

f , .Wehave Proof is only called with

derivableF fF g F R attackerM attacker M F n

hence the result. or

M M n

For the induction step, consider a partial derivation with where are closed terms, or a variable that ap-

n

k nodes. Let be a node of this derivation whose pears only once. This is proved by induction in the fol-

all outgoing edges have been cut (a leaf of the partial lowing. Moreover, we prove that the pair p total

M M

n R n

derivation). Assume that is labelled by , that its in- size of the that are closed terms, number of

M M

F

n

coming edge is labelled by , its outgoing edges by that are variables ordered lexicographically

F F

. The other edges that have been cut to build strictly decreases. This decrease proves the termination.

n

R fF g F

F F n

the partial derivation are labelled by  . By in- At the beginning, the rule is indeed .For

derivablerec R R

duction hypothesis on the partial derivation without node recursive calls to , the rule is F , where



R attackerx attacker x F

n R R fF F g

k n

, there exists such that .

derivablerecR B derivableF R

F , and F

1. First case:  is a closed fact.

B R R R

. We show that there exists f

F F x i

After unification of and  , is substituted by a

R fF F F F g F

 n

such that f ,

n

N x F x

i i

closed term i if appears in . Otherwise, remains

derivablerecR B derivableF R

f and

N x i

unchanged, and we define i .

B R R R

f . By definition of a derivation,

R fF g F R R

fF F g F

F

If , the resulting rule is

. Notice that the composition 

n

attackerN attackerN F

fF F g F fF F g F

k

F n

.

n

fF F F F g F

n

 is defined (the unifier

n

R attackerM attacker M n Otherwise,

being the identity). Then by Lemma 5, two cases may

F F attackerM M

 i i

, . Assume that is a closed

F F g R fF F

n

arise. First case: 

n

attackerN

term. The resulting rule is

F R

, and the expected result is obvious with f

attackerN attacker M attackerM

k i

R F R

. Second case: there exists such that F

attackerM attackerM F

i n

. More-

R fF F R F F g F

 n

 . Let

n

N N

k

over, the that are closed terms are dis-

elimdupR R R F

. Then, by transitivity of , 

M

i

joint subterms of , therefore the total size of the

fF F F F g F

n

 . By the step (c) of the

n

N N

k

that are closed terms is strictly smaller

derivablerec derivablerecR fR gB

definition of , 

M R attackerx

i

than the size of (except when

derivableF R fR g B R R



.If ,we

attackerx R R R

, but in this case, F , and the



R R 

have the expected result with f . Otherwise,

derivablerecR R fR g B

F

call stops immedi-



R fR g B R R



. By transitivity of ,

R R R fR g B

F

ately because , by the first



F F g F F R fF

 n

. There is an older

n

point of the definition of derivablerec). Therefore the

derivablerec derivablerecR B call to , of the form , such

total size of the closed terms in the hypotheses strictly

derivablerecR B derivableF B

that .If satisfies

decreases. Hence the pair p ordered lexicographically

R B R R R R

 f  , we have the result with . strictly decreases.

Otherwise, we go on taking a previous call to derivablerec

B

attackerM as above. The process terminates, since is finite. R

2. Second case:

erM F F attacker M M x

We can apply the result we have just proved to the par- attack

 i i i

n , , .

ticular case when the partial derivation is in fact the whole If R has some hypotheses, the resulting rule is

F R R F

R attackerx attackerx

derivation of . We obtain and R

F k



derivablerecR B derivableF R B R

erM attackerM attacker M

, attack

i i

R R F F F

attackerM F R R R F

. Therefore , with . n . We clearly have .



derivablerecR B fF g

derivablerecR fR g B . (The case (a) of the defi- R

Therefore the call F stops



derivablerec

R R R fR g B nition of cannot be applied because of the con- R

immediately because F and .



R B R R F derivableF

R R R

dition .) Then . If has no hypothesis, the resulting rule is F



erM attackerM attacker M

We have the expected result. attack

i i

attackerM F

The proof of the converse inclusion is left to the reader. n , and the total size of closed terms

R R

(essentially, the rule F does not generate facts that in the hypotheses is constant, whereas the number of vari-

R R cannot be generated by applying and ). ables strictly decreases. Hence the pair p ordered lexico-

graphically strictly decreases.

B. Termination

attackerx attacker x

Remark. The cases R and

F attackerx i

 are removed by the optimizations of

S fattackerxg

Lemma 3 If F is closed and , then Section 5.2.

F derivable terminates.