Routing Information Protocol in HOL/SPIN

Karthikeyan Bhargavan, Carl A. Gunter, and Davor Obradovic

University of Pennsylvania

Abstract. We provide a proof, using HOL and SPIN, of convergence for the Routing Information Protocol (RIP), a protocol based on distance vector routing. We also calculate a sharp real-time bound for this convergence. This extends existing results to deal with the RIP standard itself, which has complexities not accounted for in theorems about abstract versions of the protocol. Our work also provides a case study in the combined use of a higher-order theorem prover and a model checker. The former is used to express abstraction properties and inductions and to structure the high-level proof, while the latter deals efficiently with case analysis of finitary properties.

Introduction

The high connectivity on which the Internet relies is enabled by scalable and robust protocols that allow routers connecting different physical networks to forward packets toward destinations described in a uniform addressing system. The first Internet routing protocols were based on distance vector routing, which uses information about distance and direction to a destination to route packets. The first such protocol standardized by the Internet Engineering Task Force (IETF) was the Routing Information Protocol (RIP), and this protocol remains in widespread use today. Although the correctness of distance vector routing has been proved for theoretical versions of the algorithm, the RIP standard itself has never been proved to have some of the properties it is expected to possess. Since there are nontrivial differences between the abstract version and the standard itself, proofs of some key properties of the standard are worthwhile. In this paper we carry out the proof of convergence using a combination of the HOL higher-order theorem prover and the SPIN model checker.

The automated assistance reduces the burden of case analysis in parts of the standard where manual analysis would prove tedious. Moreover, the HOL/SPIN proof provides high confidence for RIP and insights into the techniques needed to address other routing protocols, most of which are more complex than RIP.

Routing protocols are meant to be robust with respect to failures of links and routers. If there is a failure, the routers communicate this information and routing tables are updated to route around the failed link or router. This process takes some time, since routers cannot possess instantaneous global knowledge of network characteristics. They therefore pass information that is incomplete and, if the protocol has the right characteristics, they eventually converge on a suitable set of alternative routes. We have two results: we show that the RIP protocol will converge after a failure, and we calculate a sharp real-time bound on the time this will take as a function of the radius of the network. Both results are based on assumptions about network reliability and timing assumptions specified in the RIP protocol.

Email addresses: bkarthik@saul.cis.upenn.edu, gunter@cis.upenn.edu, davor@saul.cis.upenn.edu

The first proof concerns the convergence of the asynchronous distributed Bellman-Ford protocol as specified in the IETF RIP standard. The classic proof addresses a pure form of the protocol. Our result covers additional features included in the standard to improve real-time response times (e.g., split horizons and poison reverse). These features add cases to be considered in the proof, but the automated support reduces the impact of this complexity. Adding these extensions makes the theory better match the standard, and hence also its implementations. Our proof also uses a different technique from the classic one, providing some noteworthy properties about network stability.

Our second proof provides a sharp real-time convergence bound on RIP in terms of the radius of the network around its nodes. In the worst case, the Bellman-Ford protocol has a convergence time as bad as the number of nodes in the network. However, if the maximum number of hops any source needs to traverse to reach a destination is k (the radius around the destination) and there are no link changes, then RIP will converge in k timeout intervals for this destination. From our first proof of convergence it is easy to see that this occurs within k intervals, but the proof of the sharp bound of k is complicated by the number of cases that need to be checked; we show how to use automated support to do this verification, based on the approach developed in the previous case study supplemented by a new invariant. Thus if a network has a maximum radius of around each of its destinations, then it will converge in at most intervals even if the network has nodes. Assuming the timing intervals in the RIP standard, such a network will converge within minutes if there are no link changes.

The basis of our verification is the RIP standard. Early implementations of distance vector routing were incompatible, so all of the routers running RIP in a domain needed to use the same implementation. Users and implementors were led to correct this problem by providing a specification that would define precise protocols and packet formats, leading to the first version of the standard. In time this standard was revised to a second version. At the level of abstraction we use here, our proof is applicable to both of these versions.

There have been a variety of successful formal studies of communication protocols. However, most of the studies to date have focused on endpoint protocols (that is, protocols between pairs of hosts), using models that involve two or three processes representing, for instance, the endpoints, or the endpoints and an adversary. Studies of routing protocols must have a different flavor, since a proof that works for two or three routers is not interesting unless it can be generalized. Routing protocols generally have the following attributes, which influence the way formal verification techniques can be applied:

- An essentially unbounded number of replicated, simple processes execute concurrently.
- Dynamic connectivity is assumed and fault tolerance is required.
- Processes are reactive systems with a discrete interface of modest complexity.
- Real time is important, and many actions are carried out with some timeout limit or in response to a timeout.

Most routing protocols have other attributes, such as latencies of information flow (limiting, for example, the feasibility of a global concept of time) and the need to protect network resources. These attributes sometimes make the protocols more complex. For instance, the asynchronous version of the Bellman-Ford protocol is much harder to prove correct than the synchronous version, and the RIP standard is harder still because of the addition of complicating optimizations intended to reduce latencies.

Following this introduction, we give a description of the Routing Information Protocol as specified in its standard. We then describe our formalization of RIP in HOL and SPIN. In the fourth and fifth sections we show the convergence of RIP and derive a sharp real-time bound for the convergence. In the sixth section we provide some analysis of our methodology, including a discussion of the benefits of automation and some crude measurements of the complexity of the proofs as viewed by the automated tools and by the person carrying out the verification, respectively. Our final section summarizes the conclusions.

Routing Information Protocol

The RIP protocol is specified in its standard, and good expositions are available in the literature. We start by describing the general networking environment and the task of a routing protocol. Then we give a brief description of the RIP protocol, including its pseudocode (given in the appendices). Finally, we discuss differences between the standard and the underlying theory, and the way they affect protocol requirements.

Routing in Internetworks

An internet is a family of networks connected by routers. Figure 1 illustrates an internetwork with four networks (shown as clouds) and four routers (shown as black squares). The goal of the routers is to forward packets between hosts (shown as circles) that are attached to the networks. The routers use routing tables, which they develop by running a distributed routing protocol. Packets from hosts travel in hops across networks linked by routers. Each router chooses a link on which to forward the packet based on the packet's destination address and other parameters. In order to make good forwarding decisions, routers need to maintain partial topology information in the routing tables.

Fig. 1. An Internet (networks n1 to n4, routers r1 to r4, hosts h1 and h2, interfaces i1 to i3)

The aim of a routing protocol is to establish a procedure for updating these tables. In most cases routing information can be exchanged only locally, i.e., between neighboring routers. However, the overall goal of a routing protocol is to establish good global paths between distant hosts on the internet. An interface is the link between a router and a network. In this example, router r1 has interfaces i1, i2, and i3, which connect it to the networks n1, n2, and n3, respectively. Hosts h1 and h2 belong to the network n1. Routers are said to be neighbors if they have interfaces to a common network. In our example all routers are neighbors of r1, but not all pairs of routers are neighbors.

Routing Information Protocol

Each RIP router maintains a routing table. A routing table contains one entry per destination network, representing the current best route to the destination. An entry corresponding to destination d has the following fields:

- hops: number of hops to d, i.e., the total number of routers that a message sent along that route traverses before reaching the network d; this includes the router where this entry resides. This is sometimes called a metric for d.
- nextR: the next router along the route to d.
- nextIface: the interface that will be used to forward packets addressed to d. It uniquely identifies the next network along the route.

Routers periodically advertise their routing tables to their neighbors. Upon receiving an advertisement, the router checks whether any of the advertised routes can be used to improve its current routes. Whenever this is the case, the router updates its current route to go through the advertising neighbor. Routes are compared exclusively by their length, measured in the number of hops.

The value of hops must be an integer between 1 and 16, where 16 has the meaning of infinity: a destination with the hops attribute set to 16 is considered unreachable. Hence RIP is not appropriate for internets that contain a router and a destination network that are more than 15 hops apart.

The appendices show pseudocode for RIP. A router advertises its routes by broadcasting RIP packets to all of its neighbors. A RIP packet contains a list of (destination, hops) pairs. A receiving router compares its current metric for destination to hops + 1, which is the metric of the alternative route, and updates the corresponding routing entry if the alternative route is shorter. There is one exception to this rule: if the receiving router has the advertising router as nextR for the route, it adopts the alternative route regardless of its metric.
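The receive rule just described can be sketched as follows. This is a minimal illustration rather than the standard's pseudocode: the names `Route`, `on_advert`, and `INFINITY` are hypothetical, and the sketch assumes that the receiver adds one hop to the advertised metric and caps the result at 16.

```python
from dataclasses import dataclass

INFINITY = 16  # assumed "unreachable" metric


@dataclass
class Route:
    hops: int    # metric of the current best route
    next_r: str  # next router along that route


def on_advert(table, sender, dest, adv_hops):
    """Apply one (dest, hops) pair received from neighbor `sender`."""
    # Metric of the alternative route through the sender (assumed: one extra hop).
    alt = min(adv_hops + 1, INFINITY)
    cur = table.get(dest)
    if cur is None:
        # Unknown destination: adopt the route only if it is reachable.
        if alt < INFINITY:
            table[dest] = Route(alt, sender)
    elif alt < cur.hops or cur.next_r == sender:
        # Shorter route, or a "blind" update from the current next router.
        table[dest] = Route(alt, sender)
    return table
```

Note that the second branch accepts a worse metric when it comes from the current next router; this is the exception to the shortest-route rule.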

Normally a RIP packet contains information that reflects the advertising router's own routing table. This rule has an exception too: routers do not advertise routes on the interfaces through which they were learned. Precisely, if a route is learned over the interface i, it should be advertised on that interface with hops set to infinity. This rule is called split horizon with poisoned reverse, and its purpose is to prevent the creation of small routing loops.
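As a small illustration of split horizon with poisoned reverse, the following sketch builds the packet a router would send on one interface. The table layout (destination mapped to a `(hops, learned_on)` pair) and the names `build_packet` and `INFINITY` are assumptions made for the example.

```python
INFINITY = 16  # assumed "unreachable" metric


def build_packet(table, out_iface):
    """Return the (destination, hops) pairs to advertise on `out_iface`.

    `table` maps each destination to (hops, learned_on), where `learned_on`
    is the interface over which the route was learned (hypothetical layout).
    """
    packet = []
    for dest, (hops, learned_on) in sorted(table.items()):
        # Poisoned reverse: advertise infinity back onto the learning interface.
        packet.append((dest, INFINITY if learned_on == out_iface else hops))
    return packet
```

For a table `{"d1": (3, "i1"), "d2": (2, "i2")}`, the packet for interface `i1` poisons `d1` and advertises `d2` with its real metric.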

Each routing table entry has a timer, expire, associated with it. Every time an entry is updated or created, expire is reset to 180 seconds. Routers try to advertise every 30 seconds, but due to network failures and congestion some advertisements may not get through. If a route has not been refreshed for 180 seconds, the router will assume that there was a link failure; the destination will be marked as unreachable and a special garbageCollect timer will be set to 120 seconds. If this timer expires before the entry gets updated, the route is expunged from the table.
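The life cycle of a routing entry under these two timers can be sketched as below. The constants are assumptions taken from the usual RIP defaults (180-second timeout, 120-second garbage collection), and `TimedRoute`, `refresh`, and `age` are hypothetical names; this is not the pseudocode of the standard.

```python
from dataclasses import dataclass
from typing import Optional

EXPIRE_SECS = 180  # assumed route timeout
GC_SECS = 120      # assumed garbage-collection delay
INFINITY = 16      # assumed "unreachable" metric


@dataclass
class TimedRoute:
    hops: int
    expire_at: float               # deadline, reset on every update
    gc_at: Optional[float] = None  # set once the route is marked unreachable


def refresh(route, hops, now):
    """An update arrived: store the metric and restart the expire timer."""
    route.hops = hops
    route.expire_at = now + EXPIRE_SECS
    route.gc_at = None
    return route


def age(route, now):
    """Process timers at time `now`; return None once the route is expunged."""
    if route.gc_at is None and now >= route.expire_at:
        route.hops = INFINITY                    # assume a link failure
        route.gc_at = route.expire_at + GC_SECS  # start garbage collection
    if route.gc_at is not None and now >= route.gc_at:
        return None
    return route
```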

Standard vs Theory

The mathematical theory behind RIP is known as the Asynchronous Distributed Bellman-Ford algorithm (ADBF). In the ADBF model, at every point in time a router is either idle, sending an advertisement, or receiving an advertisement. The routing table is updated upon receiving an advertisement. ADBF has a published proof that it finds shortest routes.

An interesting question is: can we use essentially the same proof to show that the RIP protocol converges to the set of shortest routes? It turns out that the answer is almost certainly no. Although motivated by ADBF, the RIP standard differs from it in several important details:

- ADBF has more powerful bookkeeping. In RIP, routers keep track of only one current best route to each destination. ADBF nodes, on the other hand, keep for each destination the most recent routes through each of the neighbors. Correspondingly, this would be reflected in the pseudocode in the appendices by all subscript indices becoming (dest, neighbor) instead of just dest. This makes ADBF more flexible, at the expense of maintaining a larger data structure.
- RIP has blind updates. As a consequence of the previous difference, RIP routers need to handle separately the case when an advertisement is received from a neighbor which is already nextR for the route. In this case the receiving router can do nothing better than blindly accept the advertised route, regardless of its length. ADBF does not have this special case.
- RIP's route length is bounded. RIP can handle routes of at most 15 hops. Distances of 16 or more hops are all considered equivalent to infinity. This is a practical optimization intended to balance the tradeoff between quicker loop elimination and greater range for routing information propagation.
- RIP has the split horizon with poisoned reverse rule. This is another engineering optimization not present in ADBF.

The first of the above gaps alone would be enough to make proofs of convergence requirements for RIP substantially different from proofs for ADBF. Besides matching the RIP setting closely, our proof technique also gives useful insights about the speed of propagation of updates, which can be used for establishing timing bounds for convergence.

Formal Specification of RIP

In the previous section we gave a short description of the RIP standard along with its pseudocode. In this section we present a formal specification of the protocol that can be analyzed by HOL and SPIN. First we make some simplifications of the protocol:

- We observe that RIP (see the pseudocode in the appendices) operates independently for every destination, with no interaction between the state or events associated with different destinations. This means that we need to specify and prove the protocol only for a single destination, and the result will hold for the general version as well.
- We only analyze the protocol in between topology changes. When the protocol starts, it may have any sound state to begin with. However, once it has started, one must give the protocol a reasonable period of time to converge. So we assume that there are no topology changes in the lifetime of the analysis. Under this assumption the protocol indeed converges, as we show in Section 4. Moreover, in Section 5 we precisely characterize the time period for which there must be no topology changes to guarantee convergence.
- We abstract away from actual timing constraints. If topology changes are ruled out, routes cannot be expired ($\mathit{expire}_{\mathit{dest}}$) or deleted ($\mathit{garbageCollect}_{\mathit{dest}}$). So the only timing constraint left is the time interval between periodic broadcasts of advertisements. We model this by (a) enabling a router to broadcast advertisements at any time (a safe abstraction) and (b) adding a fairness assumption to the broadcast sequence.

We next specify RIP in HOL for analysis in the HOL theorem prover. Then we model the protocol in Promela, the specification language for the SPIN model checker. The Promela modeling is straightforward: it simply involves rewriting the pseudocode in terms of Promela's C-like syntax and SPIN's event semantics. The HOL specification is more involved, since we need to transform the pseudocode into a functional specification.

RIP in HOL

For a RIP router, the universe U is a bipartite connected graph whose nodes are partitioned into networks and routers, connected through interfaces. Routers are always connected to at least two networks. We specify networks and routers using distinct uninterpreted type variables, network and router. Any specific universe can then be described simply by a function $\mathit{conn} : \mathit{router} \to \mathit{network} \to \mathit{bool}$ that describes the interfaces, i.e., which routers and networks are connected with each other. A function conn describes a valid universe U if (a) conn connects each router to at least 2 networks and (b) conn describes a connected graph.
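The two validity conditions on conn can be checked mechanically for a finite universe. The sketch below is an illustration only: it encodes conn as a set of (router, network) pairs, assumes every pair mentions a listed router and network, and uses the hypothetical name `is_valid_universe`.

```python
from collections import deque


def is_valid_universe(routers, networks, conn):
    """Check conditions (a) and (b) for conn, a set of (router, network) pairs."""
    # (a) every router has interfaces to at least two networks
    for r in routers:
        if sum((r, n) in conn for n in networks) < 2:
            return False
    # (b) the bipartite graph of routers and networks is connected (BFS)
    nodes = [("R", r) for r in routers] + [("N", n) for n in networks]
    if not nodes:
        return True
    adj = {v: [] for v in nodes}
    for r, n in conn:
        adj[("R", r)].append(("N", n))
        adj[("N", n)].append(("R", r))
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)
```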

When the RIP protocol starts operating in a universe U, it is given as input a valid conn function describing the topology of U and an initial state $s_0$. The protocol then seeks to compute paths from each router to the destination d. We describe the HOL specification in three steps: (1) the state of the protocol along with an initial state assumption, (2) the processes that change the state and pass messages to each other, and (3) the semantics of these processes in HOL and typical properties they are expected to satisfy.

Protocol State. The goal of RIP is to compute an optimal path at each router r to the destination network d. The path is described by a routing entry: the number of hops to d (hops), the next router along this path (nextR), and the network between r and nextR (nextN). RIP only computes paths of length less than 16; destinations more than 15 hops away are considered unreachable.

The protocol state consists of a table of the current routing entries at each router r, which we call the routing table (rtable). A protocol state is defined as a tuple $s : \mathit{rtable}$ whose components are $\mathit{hops} : \mathit{router} \to \mathit{num}$, $\mathit{nextN} : \mathit{router} \to \mathit{network}$, and $\mathit{nextR} : \mathit{router} \to \mathit{router}$. In addition, we want all protocol states to be sound, where soundness is defined as follows.

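Definition (Soundness). A protocol state $s = (\mathit{hops}, \mathit{nextN}, \mathit{nextR})$ of a universe described by a valid conn is said to be sound with respect to d if

\[
\begin{aligned}
&\forall r : \mathit{router}.\ \mathit{conn}(r, \mathit{nextN}(r)) \land \mathit{conn}(\mathit{nextR}(r), \mathit{nextN}(r)) \\
&\forall r : \mathit{router}.\ \mathit{hops}(r) \ge 1 \\
&\forall r : \mathit{router}.\ \mathit{conn}(r, d) \Rightarrow \mathit{hops}(r) = 1 \land \mathit{nextN}(r) = d \land \mathit{nextR}(r) = r \\
&\forall r : \mathit{router}.\ \lnot\mathit{conn}(r, d) \Rightarrow \mathit{hops}(r) > 1 \land \mathit{nextN}(r) \ne d \land \mathit{nextR}(r) \ne r
\end{aligned}
\]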
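A soundness check along these lines can be sketched in Python. The sketch assumes the four conditions read as follows: (1) each router and its next router both attach to the next network; (2) every metric is at least 1; (3) a router attached to d has metric 1 and points at d and at itself; (4) any other router has a metric above 1 and points away from d and from itself. The dictionary encoding and the name `is_sound` are illustrative and are not taken from the HOL development.

```python
def is_sound(routers, conn, d, hops, next_n, next_r):
    """Check soundness of (hops, next_n, next_r) with respect to destination d.

    `conn` is a set of (router, network) pairs; the three maps are dicts
    keyed by router. The reading of the four conditions is an assumption.
    """
    for r in routers:
        # (1) r and its next router both have an interface to the next network
        if (r, next_n[r]) not in conn or (next_r[r], next_n[r]) not in conn:
            return False
        # (2) metrics start at 1
        if hops[r] < 1:
            return False
        if (r, d) in conn:
            # (3) directly connected: one hop, via d, next router is r itself
            if not (hops[r] == 1 and next_n[r] == d and next_r[r] == r):
                return False
        else:
            # (4) not directly connected: more than one hop, via another router
            if not (hops[r] > 1 and next_n[r] != d and next_r[r] != r):
                return False
    return True
```

Every condition is local to one router and its interfaces; none of them requires the metric of a remote router to be correct.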

We stipulate that the initial state of the protocol, $s_0$, must be sound. Observe that soundness really has to do with the local connections at a router, which are typically configured by mechanisms external to RIP. By stipulating that the initial state is sound, we require that the router is never deluded about its local topology; otherwise there is no guarantee that it will ever discover global path information. Put another way, if the system ever gets into an unsound state, convergence cannot be guaranteed. Note, however, that we can only assume the initial state to be sound; we need to prove that all succeeding states will remain sound under RIP.

Processes. We represent different event handlers in the protocol by different processes; they typically perform different kinds of actions and may do so in parallel. As a result there are three kinds of processes in the universe: each router r has an advertising process generating advertisements and a routing process handling packet reception, and at each network net there is a network process performing broadcasts.

The advertising process persistently broadcasts route advertisements on each of its connected networks. Each such advertisement is a tuple (src, hop_count), saying that the broadcasting router src knows of a path of length hop_count to the destination d. Suppose the protocol state is s = (hops, nextN, nextR); then the hop count advertised by src on net has the following value:

- if net = nextN(src), then hop_count = Infinity;
- otherwise, hop_count = hops(src).

When an advertisement is to be broadcast on network net, it is handed over to the network process for net. The network process executes the broadcast by attempting to deliver the incoming advertisement to all routers rcv connected to net. We do not assume that the network is reliable in any way, so it may not deliver the advertisement to any router, or it may deliver it to some of the routers in an arbitrary order. However, we make the following assumptions:

- Fairness: the network cannot ignore a router forever. So in any execution of the network net, if a router src sends advertisements infinitely often and rcv is another router connected to net, the network process must deliver src's advertisements to rcv infinitely often.
- Zero delay: we assume that if the network does deliver an advertisement, it does so instantaneously.

We call the tuple (src, net, rcv, hop_count) corresponding to the delivery of an advertisement an advertisement event. Observe that the unreliability of the networks, in conjunction with the persistent broadcasts of the advertisement processes, allows many possible sequences of advertisement events. In fact, the network and advertising processes can generate every possible sequence of (src, net, rcv) tuples in advertisement events, subject to the fairness assumption and the fact that src and rcv must both be connected to net. The only advertisement field that depends on the network state is the hop_count.

The third process in the system is the routing process at each router. The routing process at router rcv reacts to incoming advertisements and updates the routing table entry (hops(rcv), nextN(rcv), nextR(rcv)) at rcv. Essentially, if an advertisement (src, hop_count) arriving at rcv through net is such that

\[ \mathit{hop\_count} + 1 < \mathit{hops}(\mathit{rcv}) \ \lor\ (\mathit{src} = \mathit{nextR}(\mathit{rcv}) \land \mathit{net} = \mathit{nextN}(\mathit{rcv})), \]

then the routing table at rcv is updated so that hops(rcv) = hop_count + 1, nextN(rcv) = net, and nextR(rcv) = src. In HOL we represent this process by a state update function

\[ \mathit{update} : \mathit{rtable} \to (\mathit{router} \times \mathit{network} \times \mathit{router} \times \mathit{num}) \to \mathit{rtable}, \]

which, given a protocol state (hops, nextN, nextR) and an advertisement event (src, net, rcv, hop_count), computes the new protocol state. The HOL code for the update function is given for illustration in the appendix.

Trace Semantics. The observable behavior of the network and advertising processes is essentially an infinite sequence of advertisement events. Therefore we choose to express the semantics of these processes as an event trace: an infinite sequence of tuples $(\mathit{src}_i, \mathit{net}_i, \mathit{rcv}_i)$ representing advertisement events. Such a trace is considered valid only if

- the trace is fair: for all neighboring routers $r_1, r_2$ and every index $i$ there is a $j > i$ with $\mathit{src}_j = r_1 \land \mathit{rcv}_j = r_2$, and
- the events are possible: $\forall i.\ \mathit{conn}(\mathit{src}_i, \mathit{net}_i) \land \mathit{conn}(\mathit{rcv}_i, \mathit{net}_i)$.

The hop_count field of the advertisement can be filled in as follows. Suppose that at the $i$-th step (event) of the protocol the state of the protocol is $s_i = (\mathit{hops}, \mathit{nextN}, \mathit{nextR})$; then

- if $\mathit{nextN}(\mathit{src}_i) = \mathit{net}_i$, then $\mathit{src}_i$ sends an advertisement $(\mathit{src}_i, \mathit{Infinity})$ to $\mathit{rcv}_i$ instantaneously via $\mathit{net}_i$;
- otherwise, $\mathit{src}_i$ sends an advertisement $(\mathit{src}_i, \mathit{hops}(\mathit{src}_i))$ to $\mathit{rcv}_i$ instantaneously via $\mathit{net}_i$.

Given an event trace, the routing processes react to the events and update the protocol state. This produces an infinite state sequence $(s_i)$ of the protocol, defined as follows:

- $s_0$ is any sound state;
- $s_{i+1} = \mathit{update}\ s_i\ (\mathit{src}_i, \mathit{net}_i, \mathit{rcv}_i, \mathit{hop\_count}_i)$, where the hop_count field is filled in as described above.

Thus the semantics of the update processes is the state sequence they can generate for a given event trace. All properties desired of the protocol are expressed and proved in terms of this state sequence. In particular, the convergence theorem states that, given any valid event trace, the states generated in the sequence must converge to the optimal routing table.
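The trace semantics can be animated on a small example. The following is a sketch under stated assumptions, not the HOL definition: the state is a triple of dictionaries, `INFINITY` is taken to be 16, the receiver is assumed to add one hop to the advertised count, and the names `advertised`, `update`, and `run` are hypothetical. The example universe is a chain d, r1, n1, r2, n2, r3, and the trace is a finite prefix of a round-robin (hence fair) trace.

```python
INFINITY = 16  # assumed value of the "unreachable" metric


def advertised(state, src, net):
    """Hop count src advertises on net (split horizon with poisoned reverse)."""
    hops, next_n, _ = state
    return INFINITY if next_n[src] == net else hops[src]


def update(state, event):
    """One routing-process step; returns a new state and leaves the old one intact."""
    hops, next_n, next_r = state
    src, net, rcv, hop_count = event
    alt = min(hop_count + 1, INFINITY)  # assumed: receiver adds its own hop
    if alt < hops[rcv] or (src == next_r[rcv] and net == next_n[rcv]):
        hops = {**hops, rcv: alt}
        next_n = {**next_n, rcv: net}
        next_r = {**next_r, rcv: src}
    return hops, next_n, next_r


def run(state, trace):
    """Fold a finite prefix of an event trace over an initial state."""
    for src, net, rcv in trace:
        state = update(state, (src, net, rcv, advertised(state, src, net)))
    return state


# Sound but suboptimal initial state for the chain d - r1 - n1 - r2 - n2 - r3.
s0 = ({"r1": 1, "r2": 9, "r3": 4},
      {"r1": "d", "r2": "n1", "r3": "n2"},
      {"r1": "r1", "r2": "r1", "r3": "r2"})
round_robin = [("r3", "n2", "r2"), ("r2", "n2", "r3"),
               ("r2", "n1", "r1"), ("r1", "n1", "r2")]
final_hops, _, final_next_r = run(s0, round_robin * 2)
```

In the first round r3 blindly accepts a worse metric from its next router r2; the second round repairs it, which illustrates why convergence arguments for RIP must account for blind updates.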

RIP in Promela

Promela is a natural specification language for network protocols. In addition to C-like programming constructs, it supports nondeterminism, dynamic processes, and synchronous/asynchronous channel communication between processes. We translate the pseudocode given in the appendices into Promela. A fragment of the resulting Promela code, corresponding to the routing process, is shown in the appendix.

As in the HOL specification, at each router we have a routing process and an advertisement process. The advertisement process is a simple nonterminating while loop that keeps sending advertisements to all its neighboring networks. The routing process waits for input advertisements and processes them as before. Finally, we have a network process for each network, which simply implements the broadcast mechanism by taking advertisements sent to the network and transporting them to the input buffers of all the routers connected to it. Only the network processes know the topology of the network; the routing and advertisement processes only know the networks they are directly connected to.

Once all the above Promela processes are in place, we use SPIN to simulate the protocol for sample topologies to check our model. We can also verify that the protocol works for small topologies. A point worth noting is that, in varying the topologies, all we need to change is the encoding of the network processes. The routing and advertising processes operate above this connection layer. In effect, the network processes can pretend to have an arbitrary topology, and the routing/advertisement processes will not know the difference. We use this property later in our SPIN proofs of convergence, where we fool a solitary update process into believing it is part of a larger network.

Convergence of RIP

In this section we present a proof of convergence for RIP. We prove that, in the absence of topology changes, RIP will find shortest routes to the destination d from every router inside the range of 15 hops.

Proof Results

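On the outermost level, our proof uses induction on distance from the destination. For each router r, distance to d is defined as

\[ D(r) = \begin{cases} 1 & \text{if } \mathit{conn}(r, d) \\ 1 + \min\{ D(s) \mid s \text{ neighbor of } r \} & \text{otherwise.} \end{cases} \]

For $k \ge 1$, the k-circle around d is the set of routers

\[ C_k = \{ r \mid D(r) \le k \}. \]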
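For a finite universe, the distance function and the k-circles can be computed by a breadth-first search outward from the routers attached to d. The sketch below assumes that D(r) is 1 for routers attached to d and 1 plus the minimum over neighbors otherwise, and that the k-circle contains the routers with D(r) at most k; `distances` and `circle` are hypothetical names, and conn is encoded as a set of (router, network) pairs.

```python
from collections import deque


def distances(routers, conn, d):
    """Return D(r) for every router that can reach d."""
    nets = {}
    for r, n in conn:
        nets.setdefault(r, set()).add(n)
    # Routers with an interface to d are at distance 1.
    dist = {r: 1 for r in routers if d in nets.get(r, set())}
    queue = deque(dist)
    while queue:
        s = queue.popleft()
        for r in routers:
            # r is a neighbor of s if the two share a network.
            if r not in dist and nets.get(r, set()) & nets.get(s, set()):
                dist[r] = dist[s] + 1
                queue.append(r)
    return dist


def circle(dist, k):
    """The k-circle around d: routers at distance at most k."""
    return {r for r, dr in dist.items() if dr <= k}
```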

The key notion in our proof is that of k-stability.

Definition (Stability). For $k \ge 1$, we say that the universe is k-stable if both of the following properties hold:

(S1) Every router from the k-circle has its metric set to the actual distance to d. Moreover, if r is not connected to d, it has its nextR set to the first router on some shortest path to d:
\[ \forall r.\ r \in C_k \Rightarrow \mathit{hops}(r) = D(r) \land (\lnot\mathit{conn}(r, d) \Rightarrow D(\mathit{nextR}(r)) = D(r) - 1) \]

(S2) Every router outside the k-circle has its hops strictly greater than k:
\[ \forall r.\ r \notin C_k \Rightarrow \mathit{hops}(r) > k \]
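The two stability conditions translate directly into a predicate over a finite state. This sketch assumes the reading of S1 and S2 used above (metric equal to the true distance and next router one step closer inside the circle; metric strictly greater than k outside it). The name `is_k_stable` and the dictionary encoding are illustrative, and `dist` is assumed to hold the true distance D(r) of every router.

```python
def is_k_stable(k, dist, conn, d, hops, next_r):
    """Check S1 and S2 for a state given the true distances `dist`."""
    for r, dr in dist.items():
        if dr <= k:
            # S1: correct metric ...
            if hops[r] != dr:
                return False
            # ... and, unless r is attached to d, a next router one hop closer.
            if (r, d) not in conn and dist.get(next_r[r]) != dr - 1:
                return False
        elif hops[r] <= k:
            # S2: routers outside the k-circle must have hops > k.
            return False
    return True
```

On the chain d, r1, n1, r2, n2, r3 with true distances 1, 2, 3, the state with hops 1, 9, 4 is 1-stable but not 2-stable, while the converged state with hops 1, 2, 3 is 3-stable.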

Our goal is to prove that, under any fair advertisement trace, the universe is guaranteed to become k-stable for every k. This proof will be carried out by induction on k.

Recall our stipulation that the universe starts in a sound state. It is easy to show that sound states are 1-stable, so this gives us the basis of induction.

Lemma. The universe U is initially 1-stable.

A key property of k-stability is that, once achieved, it is preserved forever. This would not be true if our definition of stability did not contain condition S2. This condition strengthens the induction hypothesis enough that we can induct on k-stability.

Lemma (Preservation of stability). For any k, if the universe U is k-stable at some point, then it remains k-stable after an arbitrary number of advertisements.

The initial-stability lemma is easily proved in HOL using the definition of soundness. We also prove the preservation lemma in HOL; it involves a significantly larger case analysis.

Progress from k-stability to (k+1)-stability happens gradually: more and more routers start to conform to the conditions S1 and S2. This is why we need an additional, more refined definition of stability, which captures individual routers rather than entire circles of routers.

Definition. Given a k-stable universe, we say that a router r at distance k+1 from d is (k+1)-stable if it has an optimal route:
\[ \mathit{hops}(r) = D(r) = k + 1 \land \mathit{nextR}(r) \in C_k \]

To prove that a k-stable universe eventually becomes (k+1)-stable, it suffices to show that every router at distance k+1 eventually becomes (k+1)-stable. This is the statement of the following lemma.

Lemma. For any k and any router r such that D(r) = k+1, if the universe U is k-stable at some point and the advertisement trace is fair, then r will eventually become (k+1)-stable. Moreover, r then remains (k+1)-stable indefinitely.

One of the key facts used in the proof of this lemma is fairness of the advertisement trace. Without fairness, neighbors of r would be allowed to simply stop advertising to r at any point. This would keep r's routing table unchanged and hence prevent it from ever achieving (k+1)-stability.

Observe that this lemma involves only one router r, at a distance k+1 from d. Starting from a k-stable state, we need to show that r converges to the correct value. Moreover, since all future states of the system are guaranteed to be k-stable (by the preservation lemma), r will receive advertisements from only two kinds of neighbors: those within the k-circle and those outside it. This leads us to a finitary abstraction of the system. We can then prove the lemma using SPIN, which performs an exhaustive state search to prove that r will converge to the right value.

However, we need to prove that the finitary abstraction is property-preserving. This proof is done in HOL by induction on the length of fair advertisement traces. The abstraction proof is the crucial link that allows us to join the HOL and SPIN results without loss of rigor. The abstraction proof itself is rather complex, with a large case analysis. However, the effort is justified, since the finitary abstraction can then be used for multiple proofs with minor modifications. This reuse can be seen in the proof statistics table in Section 6. The abstraction proof is represented in the table as the HOL portion of the second proof of the Stability-Preservation Lemma.

Finally, using the fact that there are only finitely many routers, we easily derive the Progress Lemma, which proves our inductive step.

Lemma (Progress). For any k, if the universe U is k-stable at some point, then U will eventually become (k+1)-stable and remain so indefinitely.

The main result about the convergence of RIP is a corollary of the above inductive argument.

Theorem (Convergence of RIP). Starting from an arbitrary sound initial state and evolving under an arbitrary fair advertisement trace, the universe U eventually becomes and remains stable.

Significance of the Results

Our proof, which we call the radius proof, differs from the classic one for the asynchronous Bellman-Ford algorithm. Rather than inducting on estimates for upper and lower bounds for distances, we induct on the radius of the stable region around d. The proof has three attributes of interest:

- It states a property about the RIP protocol rather than the asynchronous distributed Bellman-Ford algorithm.
- The radius proof is more informative. It shows that correctness is achieved quickly close to the destination and more slowly further away. It also implicitly estimates the number of advertisements needed to progress from k-stability to (k+1)-stability. We exploit this in the next section to show a real-time bound on convergence.
- It uses a combination of theorem proving and model checking. HOL is more expressive and serves as the main platform; SPIN is used to treat large case analyses.

Sharp Timing Bounds for RIP Stability

In the previous section we proved convergence for RIP conditioned on the fact that the topology stays unchanged for some period of time. We now calculate how long that period must be. To do this, we need some knowledge about the times at which certain protocol events happen. In the case of RIP, we use a single reliability assumption that describes the frequency of advertisements.

Fundamental Timing Assumption. There is a value $\Delta$ such that during every topology-stable time interval of length $\Delta$, each router gets at least one advertisement from each of its neighbors.

RIP routers normally try to advertise every seconds However b ecause of

congestion or some other condition some packets may not go through This is

why the standard prescrib es that a failure to receivean advertisement within

seconds is treated as a link failure Thus seconds satises the

Fundamental Timing Assumption for RIP Notice that the assumption implies

fairness of the advertisement trace
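As a rough operational reading of the assumption for a single neighbor, the following sketch checks that the neighbor is never silent for longer than a chosen bound. The function names, the sample arrival times, and the default bound of 180 seconds (RIP's timeout value) are illustrative assumptions, not part of the formal development.

```python
def max_silent_gap(arrivals, start, end):
    """Longest stretch of [start, end] during which no advertisement arrives."""
    inside = sorted(t for t in arrivals if start <= t <= end)
    points = [start] + inside + [end]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def no_silence_longer_than(arrivals, start, end, delta=180.0):
    """True if the neighbor is never silent for more than `delta` within [start, end]."""
    return max_silent_gap(arrivals, start, end) <= delta


# Hypothetical traces: advertisements are due every 30 s.
# Here the ones due at 60 s .. 180 s are lost (five in a row).
five_lost = [0, 30, 210, 240]
# Here the one due at 210 s is lost as well (six in a row).
six_lost = [0, 30, 240]
```

With a 30-second advertisement period, five consecutive losses still leave a silence of exactly 180 seconds, whereas a sixth loss exceeds the bound; in RIP that situation is treated as a link failure, that is, as a topology change.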

As before, we concentrate on a particular destination $d$. Our timing analysis is based on the notion of weak $k$-stability.

Definition (Weak stability). For a given $k$, we say that the universe $U$ is weakly $k$-stable if all of the following conditions hold:

(WS1) $U$ is $(k-1)$-stable.

(WS2) $\forall r.\; D(r) = k \Rightarrow (r \text{ is } k\text{-stable} \lor \mathit{hops}(r) > k)$

(WS3) $\forall r.\; D(r) > k \Rightarrow \mathit{hops}(r) > k$

Weak $k$-stability is stronger than $(k-1)$-stability but weaker than $k$-stability. The second disjunct in (WS2) is what distinguishes it from ordinary $k$-stability.
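The definition can be made concrete with a small executable check. This is only a sketch under stated assumptions: the destination $d$ is modeled as a graph node, a router counts as correct when its metric equals its true distance and its next hop is one step closer to $d$, and the helper names (`distances`, `router_ok`, `is_k_stable`, `is_weakly_k_stable`) are hypothetical.

```python
from collections import deque

INFINITY = 16  # RIP's "unreachable" metric


def distances(adj, d):
    """Breadth-first hop distances D(r) from the destination d."""
    dist = {d: 0}
    queue = deque([d])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def router_ok(r, D, hops, next_r):
    """Assumed per-router condition: correct metric, next hop one step closer to d."""
    return hops[r] == D[r] and D.get(next_r[r]) == D[r] - 1


def is_k_stable(k, D, hops, next_r):
    for r in hops:
        if D[r] <= k:
            if not router_ok(r, D, hops, next_r):
                return False
        elif hops[r] <= k:          # routers outside the k-circle need hops > k
            return False
    return True


def is_weakly_k_stable(k, D, hops, next_r):
    if not is_k_stable(k - 1, D, hops, next_r):                    # WS1
        return False
    for r in hops:
        if D[r] == k and not (router_ok(r, D, hops, next_r) or hops[r] > k):
            return False                                           # WS2
        if D[r] > k and not hops[r] > k:                           # WS3
            return False
    return True


# Line topology d - r1 - r2 - r3; r2 has the right metric but points away from d.
adj = {"d": ["r1"], "r1": ["d", "r2"], "r2": ["r1", "r3"], "r3": ["r2"]}
D = distances(adj, "d")
hops = {"r1": 1, "r2": 2, "r3": INFINITY}
next_r = {"r1": "d", "r2": "r3", "r3": None}
poisoned = dict(hops, r2=INFINITY)  # r2 after losing its bogus route
```

In this example the first state is 1-stable but not weakly 2-stable, because `r2` has a metric of 2 along a wrong route. Once `r2` drops to metric 16, the state is weakly 2-stable through the second disjunct of (WS2), although it is still not 2-stable.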

As before, we have a preservation lemma.

Lemma (Preservation of weak stability). For any $k$, if the universe $U$ is weakly $k$-stable at some time $t$, then it is weakly $k$-stable at any time $t' \ge t$.

This lemma and all of the subsequent results in this section are stated using real time. This is possible because of the Fundamental Timing Assumption, which provides a connection between discrete advertisement events and continuous time. Precisely, to show that some property $P$ holds after a time interval $\Delta$, it is enough to prove that $P$ holds after each router receives at least one advertisement from each of its neighbors.

Now we show that the initial state inevitably becomes weakly 2-stable after RIP packets have been exchanged between every pair of neighbors.

Lemma (Initial progress). If the universe $U$ is in a sound state and the topology does not change, $U$ becomes weakly 2-stable after time $\Delta$.

The main progress property says that it takes one $\Delta$-interval to get from a weakly $k$-stable state to a weakly $(k+1)$-stable state. This property is shown in two steps. First we show that condition (WS1) for weak $(k+1)$-stability holds after $\Delta$.

Lemma. For any $k$, if the universe is weakly $k$-stable at some time $t$, then it is $k$-stable at time $t + \Delta$.

Then we show the same for conditions (WS2) and (WS3). The following puts both steps together.

Lemma (Progress). For any $k$, if the universe is weakly $k$-stable at some time $t$, then it is weakly $(k+1)$-stable at time $t + \Delta$.

The Preservation, Initial Progress, and Progress lemmas are proved in SPIN (the intermediate lemma stated before the Progress lemma is contained in it). The technique for doing the proofs in SPIN is the same as in the previous section. We find a finitary abstraction of the system, starting from the time when the universe is weakly $k$-stable. This abstraction allows us to prove the lemmas in SPIN for a single router.
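To give a feel for what a single-router finitary abstraction looks like, here is a toy explicit-state exploration in Python. It is not the authors' Promela model: the neighbor names `s`, `t`, `u`, the shape of the environment, and the property checked (a router at distance $k+1$ that already has a correct route keeps it) are simplifying assumptions chosen for illustration.

```python
from collections import deque

INFINITY = 16


def step(state, sender, hop_cnt):
    """RIP update rule for one destination, timers omitted."""
    hops, nxt = state
    new_metric = min(1 + hop_cnt, INFINITY)
    if sender == nxt or new_metric < hops:
        return (new_metric, sender)
    return state


def environment(k):
    """Messages the router r (at distance k+1) may observe, assuming that
    neighbors s and t lie in the k-circle and advertise exactly k, while
    neighbor u lies outside it and advertises some metric greater than k."""
    messages = [("s", k), ("t", k)]
    messages += [("u", h) for h in range(k + 1, INFINITY + 1)]
    return messages


def preserves_stability(k, extra_messages=()):
    """Explore every reachable state of r and check that r keeps metric k+1
    and keeps pointing into the k-circle."""
    messages = environment(k) + list(extra_messages)
    frontier = deque([(k + 1, "s"), (k + 1, "t")])
    seen = set(frontier)
    while frontier:
        state = frontier.popleft()
        if not (state[0] == k + 1 and state[1] in ("s", "t")):
            return False                      # reached a state that violates the property
        for sender, hop_cnt in messages:
            successor = step(state, sender, hop_cnt)
            if successor not in seen:
                seen.add(successor)
                frontier.append(successor)
    return True
```

The search succeeds for every $k$ from 1 to 14. Passing an extra message that the abstraction rules out, for example `("u", 2)` when $k = 3$, makes the search report a violation, which is how such a check exposes a missing hypothesis.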

The radius of the universe around $d$ is the maximum distance from $d$:

$R = \max\{D(r) \mid r \text{ is a router}\}$

The main theorem describes convergence time for a destination in terms of its radius.

Theorem (RIP convergence time). A sound universe of radius $R$ becomes 15-stable within $\max\{15, R\} \cdot \Delta$ time, assuming that there were no topology changes during that time interval.

The theorem is an easy corollary of the preceding lemmas and is proved in HOL. Consider a universe of radius $R$. To show that it converges in $R \cdot \Delta$ time, observe what happens during each $\Delta$-interval of time:

after $\Delta$: weakly 2-stable (by the Initial Progress lemma)
after $2\Delta$: weakly 3-stable (by the Progress lemma)
after $3\Delta$: weakly 4-stable (by the Progress lemma)
...
after $(R-1)\Delta$: weakly $R$-stable (by the Progress lemma)
after $R\Delta$: $R$-stable (by the lemma taking weak $k$-stability to $k$-stability)

$R$-stability means that all the routers that are not more than $R$ hops away from $d$ will have shortest routes to $d$. Since the radius of the universe is $R$, this includes all routers.
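The schedule above amounts to a simple count. Writing $\Delta$ for the interval of the Fundamental Timing Assumption and assuming $R \ge 2$, one interval is spent reaching weak 2-stability, $R-2$ intervals are spent climbing from weak 2-stability to weak $R$-stability, and one final interval turns weak $R$-stability into $R$-stability:

```latex
\[
  \underbrace{\Delta}_{\text{initial progress}}
  \;+\; \underbrace{(R-2)\,\Delta}_{\text{weak } 2 \,\to\, \text{weak } R}
  \;+\; \underbrace{\Delta}_{\text{weak } R \,\to\, R\text{-stable}}
  \;=\; R\,\Delta .
\]
```

As an illustration, taking $\Delta = 180$ seconds (the RIP timeout value), a universe of radius 5 would have correct routes everywhere within $5 \cdot 180 = 900$ seconds under this count.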

An interesting observation is that progress from (ordinary) $k$-stability to (ordinary) $(k+1)$-stability is not guaranteed to happen in less than $2\Delta$ time (we leave this to the reader). Consequently, had we chosen to calculate convergence time using stability rather than weak stability, we would get a worse upper bound of $2(R-1)\Delta$. In fact, our upper bound is sharp: in a linear topology, update messages can be interleaved in such a way that convergence time becomes as bad as $R \cdot \Delta$.

The figure shows an example that consists of $k$ routers and has radius $k$ with respect to $d$. Router $r_1$ is connected to $d$ and has the correct metric. Router $r_2$ also has the correct metric, but points in the wrong direction. Other routers have no route to $d$. In this state, $r_2$ will ignore a message from $r_1$, because that route is no better than what $r_2$ thinks it already has. However, after receiving a message from $r_3$, to which it points, $r_2$ will update its metric to 16 and lose the route. Suppose that, from this point on, messages are interleaved in such a way that during every update interval all routers first send their update messages and then receive update messages from their neighbors. This will cause exactly one new router to discover the shortest route during every update interval. Router $r_2$ will have the route after the second interval, $r_3$ after the third, and $r_k$ after the $k$-th. This shows that our upper bound of $k \cdot \Delta$ is sharp.

[Figure: the line topology $r_k, \ldots, r_3, r_2, r_1, d$, redrawn for successive update intervals with each router's current metric, starting from metrics $1, 2, 16, \ldots, 16$ and ending with $1, 2, 3, \ldots, k$.]

Fig. Maximum Convergence Time
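The worst-case interleaving can be replayed with a short simulation. This is a sketch under simplifying assumptions that are not taken from the paper: timers are ignored, every router advertises its metric even when it is 16, $r_1$'s directly connected route is never updated, and each router processes the message from its lower-numbered neighbor first. All names are hypothetical.

```python
INFINITY = 16


def receive(state, me, sender, hop_cnt):
    """RIP update rule for a single destination (timers omitted)."""
    hops, nxt = state[me]
    new_metric = min(1 + hop_cnt, INFINITY)
    if sender == nxt or new_metric < hops:
        state[me] = (new_metric, sender)


def advertised(state, sender, receiver):
    """Metric that sender reports to receiver: split horizon with poisoned reverse."""
    hops, nxt = state[sender]
    return INFINITY if nxt == receiver else hops


def intervals_to_converge(k):
    """Routers 1..k on a line, router 1 adjacent to d (node 0).
    Returns the number of update intervals until every r_i has metric i."""
    assert 3 <= k <= 15
    state = {1: (1, 0), 2: (2, 3)}              # r2 points the wrong way
    state.update({i: (INFINITY, None) for i in range(3, k + 1)})
    elapsed = 0
    while any(state[i][0] != i for i in range(1, k + 1)):
        snapshot = dict(state)                  # everyone sends first ...
        for me in range(2, k + 1):              # ... then everyone receives
            neighbours = [me - 1] + ([me + 1] if me < k else [])
            for sender in neighbours:
                receive(state, me, sender, advertised(snapshot, sender, me))
        elapsed += 1
    return elapsed
```

Under these assumptions the first interval only poisons the bogus route at `r2`, and each later interval lets exactly one more router learn its shortest route, so a line of $k$ routers needs $k$ intervals.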

Analysis of Methodology

SPIN is extremely helpful for proving properties such as the weak-stability Progress lemma, which involve tedious case analysis. To illustrate this, assuming weak $k$-stability at time $t$, consider what it takes to show that condition (WS2) for weak $(k+1)$-stability holds after time $\Delta$. (WS1) will hold because of the preceding lemma, but further effort is required for (WS2). To prove (WS2), let $r$ be a router with $D(r) = k+1$. Because of weak $k$-stability at time $t$, there are two possibilities for $r$: (1) $r$ has a $k$-stable neighbor, or (2) all of the neighbors of $r$ have $\mathit{hops} > k$. To show that $r$ will eventually progress into either a $(k+1)$-stable state or a state with $\mathit{hops} > k+1$, we need to further break case (1) into three subcases with respect to the properties of the router that $r$ points to: (a) $r$ points to $s \in C_k$ (the $k$-circle), which is the only neighbor of $r$ from $C_k$; or (b) $r$ points to $s \in C_k$, but $r$ has another neighbor $t \in C_k$ such that $t \ne s$; or (c) $r$ points to $s \notin C_k$. Each of these cases branches into several further subcases based on the relative ordering in which $r$, $s$ and possibly $t$ send and receive update messages.

Doing such proofs by hand is difficult and prone to errors. Essentially, the proof is a deeply nested case analysis in which the final cases are straightforward to prove: an ideal job for a computer to do. Our SPIN verification is divided into four parts accounting for different kinds of topologies. Each part has a distinguished process representing $r$ and other processes modeling the environment for $r$. An environment is an abstraction of the rest of the universe. It generates all message sequences that could possibly be observed by $r$. SPIN considered more cases than a manual proof would have required for this lemma, but it checked them in only seconds of CPU time. Even counting setup time for this verification, this was a significant time-saver. The resulting proof is probably also more reliable than a manual one.

The table summarizes some of our experience with the complexity of the proofs in terms of our automated support tools. The complexity of a HOL verification for the human verifier is described with the following statistics measuring things written by a human: the number of lines of HOL code, the number of lemmas and definitions, and the number of proof steps. Proof steps were measured as the number of instances of the HOL construct THEN. The HOL automated contribution is measured by the number of cases discovered and managed by HOL. This is measured by the number of THENLs, weighted by the number of elements in their argument lists. The complexity of a SPIN verification for the human verifier is measured by the number of lines of Promela code written. The SPIN automated contribution is measured by the number of states examined and the amount of memory used in the verification. In our investigations we have found that SPIN is generally memory bound; that is, it runs out of memory in a relatively short period of time if the state space it must search is too large. For our final RIP proofs, however, each of the verifications took less than a minute, and the time is generally proportional to the memory. Most of the lemmas consumed the SPIN minimum amount of memory; some required more. The figures were collected for runs on a lightly loaded multiprocessor Sun Ultra Enterprise running SunOS. The tools used were HOL and SPIN. We carried out parallel proofs of the Stability Preservation Lemma, one using HOL only and one using HOL together with SPIN.

Table. Protocol Verification Effort on RIP Convergence

Task                              HOL                           SPIN
Modeling RIP                      lines, defs, lemmas           lines
Stability Preservation (Once)     lemmas, cases, steps
Stability Preservation (Again)    lemmas, cases, steps          lines, states
Stability Progress                Reuse Stability Preservation  lines, states
Weak Stability Preservation       Reuse Stability Preservation  lines, states
Initial Weak Stability            Reuse Stability Preservation  lines, states
Weak Stability Progress           Reuse Stability Preservation  lines, states

It is important to observe that the SPIN figures were derived from final runs. The typical process was as follows: attempt to prove a result with SPIN, find that it is too costly, apply an abstraction that was proved in HOL, and try the SPIN proof again on the abstracted problem, which presumably has a smaller set of cases to check. This was repeated until we were happy with the size of the SPIN state space and the clarity of the abstractions. This use of SPIN was worthwhile even if the proof was eventually carried out entirely in HOL, since SPIN provided a quick way to debug our lemmas. We experimented with the question of whether to stop with a mixed HOL/SPIN proof or complete the entire proof in HOL. A proof entirely in HOL arguably provides more confidence, since the relationship between the HOL and SPIN parts of a proof is treated manually in our study. We proved stability preservation twice: once using HOL/SPIN and again using only HOL. The table indicates some associated statistics, showing that the complexity of the HOL proof dropped at the cost of writing additional SPIN code. In future work we may attempt to measure programmer months, since this would provide a more complete indication of scalability.

Related Work

Combining model checking with theorem proving has long been recognized as a very promising direction in effective formal methods. There are primarily two ways in which the methodologies can be combined. Systems like PVS use model checking as a decision procedure to solve finitary subgoals in a deductive proof. On the other hand, model checking can be used to prove a finitary abstraction of a system, where the soundness of the abstraction is proved in a theorem prover. We use the latter methodology for our proofs: we carry out our induction and abstraction proofs in HOL, while the induction step is proved for a finitary abstraction of the system in SPIN.

In recent years a variety of protocol standards have been formally verified. Notable success has been achieved in verifying cache coherence protocols, bus protocols, and endpoint communication protocols. In the domain of routing protocols, there has been work on verifying ATM routing protocols, where the authors use SPIN to verify the absence of deadlock of the routing protocol for a few fixed configurations. An instance verification of an Active Network routing protocol has been carried out in Maude. Formal testing support has been developed for routing protocols. Other work has been in the form of manual proofs of key safety properties.

Conclusion

This paper provides the most extensive automated mathematical analysis of an internet routing protocol to date. Our results show that it is possible to provide formal analysis of correctness for routing protocols from IETF standards and drafts with reasonable effort and speed, thus demonstrating that these techniques can effectively supplement other means of improving assurance, such as manual proof, simulation, and testing. Specific technical contributions include the first proofs of the convergence of the RIP standard and a sharp real-time bound for this convergence. We have also gained insight into strategies for combining a higher-order theorem prover such as HOL with a model checker such as SPIN in a unified methodology that leverages the expressiveness of the theorem prover and the high level of automation of the model checker to provide an efficient but high-confidence analysis.

Acknowledgments

We would like to thank the following people for their assistance and encouragement: Roch Guerin, Elsa L. Gunter, Luke Hornof, Sampath Kannan, and Insup Lee. This research was supported by NSF Contract CCR, DARPA Contract F, and ARO Contract DAAG.

References

Dimitri P. Bertsekas and Robert Gallager. Data Networks. Prentice Hall.

Edmund M. Clarke and Jeannette M. Wing. Formal Methods: State of the Art and Future Directions. ACM Computing Surveys, December. Report by the Working Group on Formal Methods for the ACM Workshop on Strategic Directions in Computing Research.

D. Cypher, D. Lee, M. Martin-Villalba, C. Prins, and D. Su. Formal Specification, Verification, and Automatic Test Generation of ATM Routing Protocol: PNNI. In Formal Description Techniques and Protocol Specification, Testing, and Verification (FORTE/PSTV), IFIP, November.

J.J. Garcia-Luna-Aceves and Shree Murthy. A Loop-Free Path-Finding Algorithm: Specification, Verification and Complexity. In Proceedings of IEEE INFOCOM, April.

M. J. C. Gordon and T. F. Melham, editors. Introduction to HOL: A theorem proving environment for higher order logic. Cambridge University Press.

Timothy G. Griffin and Gordon Wilfong. An analysis of BGP convergence properties. In Guru Parulkar and Jonathan S. Turner, editors, Proceedings of ACM SIGCOMM Conference, Boston, August.

Ahmed Helmy, Deborah Estrin, and Sandep Gupta. Fault-oriented Test Generation for Protocol Design. In Formal Description Techniques and Protocol Specification, Testing, and Verification (FORTE/PSTV), IFIP, November.

C. Hendrick. Routing Information Protocol. RFC, IETF, June.

Home page for the HOL interactive theorem proving system. http://www.cl.cam.ac.uk/Research/HVG/HOL/

Gerard J. Holzmann. Design and Validation of Computer Protocols. Prentice Hall.

Christian Huitema. Routing in the Internet. Prentice Hall.

G. Malkin. RIP Version 2: Carrying Additional Information. IETF RFC, January.

Abdel Mokkedem, Ravi Hosabettu, Michael D. Jones, and Ganesh Gopalakrishnan. Formalization and Analysis of a Solution to the PCI Bus Transaction Ordering Problem. Formal Methods in System Design, January.

Olaf Muller and Tobias Nipkow. Combining model checking and deduction for I/O-automata. In Proceedings of the Workshop on Tools and Algorithms for the Construction and Analysis of Systems, May.

Shree Murthy and J.J. Garcia-Luna-Aceves. An efficient routing protocol for wireless networks. ACM Mobile Networks and Applications Journal, October. Special Issue on Routing in Mobile Communication Networks.

N. Shankar. PVS: Combining specification, proof checking, and model checking. In Mandayam Srivas and Albert Camilleri, editors, Formal Methods in Computer-Aided Design (FMCAD), Lecture Notes in Computer Science, Palo Alto, CA, November. Springer-Verlag.

Home page for the SPIN model checker. http://netlib.bell-labs.com/netlib/spin/whatispin.html

Bow-Yaw Wang, Jose Meseguer, and Carl A. Gunter. Specification and formal verification of a PLAN algorithm in Maude. In Proceedings of the International Workshop on Distributed System Validation and Verification. IEEE Computer Society Press, April.

A Code Samples

A.1 Pseudocode for RIP Declarations

process RIPRouter

state:
    me                         // ID of the router
    interfaces                 // Set of the router's interfaces
    known                      // Set of destinations with known routes
    hops_dest                  // Distance estimate
    nextR_dest                 // Next router on the way to dest
    nextIface_dest             // Interface over which the route advertisement was received
    timer expire_dest          // Expiration timer for the route
    timer garbageCollect_dest  // Garbage collection timer for the route
    timer advertise            // Timer for periodic advertisements

events:
    receive RIP(router, dest, hopCnt) over iface
    timeout(expire_dest)
    timeout(garbageCollect_dest)
    timeout(advertise)

utility functions:
    broadcast(msg, iface)
    { Broadcast message msg to all the routers attached to the network
      on the other side of interface iface. }

A.2 Pseudocode for RIP Event Handlers

event handlers:

receive RIP(router, dest, hopCnt) over iface
{
    newMetric ← min(1 + hopCnt, 16)
    if (dest ∉ known) then
    {
        if (newMetric < 16)
        {
            hops_dest ← newMetric
            nextR_dest ← router
            nextIface_dest ← iface
            set expire_dest to 180 seconds
            known ← known ∪ {dest}
        }
    } else
    {
        if (router = nextR_dest) or (newMetric < hops_dest)
        {
            hops_dest ← newMetric
            nextR_dest ← router
            nextIface_dest ← iface
            set expire_dest to 180 seconds
            if (newMetric = 16) then
            {
                set garbageCollect_dest to 120 seconds
            } else
            {
                deactivate garbageCollect_dest
            }
        }
    }
}

timeout(expire_dest)
{
    hops_dest ← 16
    set garbageCollect_dest to 120 seconds
}

timeout(garbageCollect_dest)
{
    known ← known \ {dest}
}

timeout(advertise)
{
    for each dest ∈ known do
        for each i ∈ interfaces do
        {
            if (i ≠ nextIface_dest) then
            {
                broadcast([RIP(me, dest, hops_dest)], i)
            } else
            {
                broadcast([RIP(me, dest, 16)], i)    // Split horizon with poisoned reverse
            }
        }
    set advertise to 30 seconds
}
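For readers who want to experiment with these handlers, the following is a minimal runnable sketch in Python. It follows the pseudocode but makes several assumptions of its own: timers are modeled as absolute deadlines checked by a hypothetical `tick` method, `advertise` returns the messages instead of broadcasting them, and the class and field names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

INFINITY = 16          # metric meaning "unreachable"
EXPIRE_AFTER = 180     # seconds, route expiration timer
GARBAGE_AFTER = 120    # seconds, garbage collection timer


@dataclass
class Route:
    hops: int
    next_router: str
    next_iface: str
    expire_at: float
    garbage_at: Optional[float] = None   # None means the timer is inactive


class RipRouter:
    def __init__(self, me, interfaces):
        self.me = me
        self.interfaces = list(interfaces)
        self.routes = {}                 # dest -> Route; the keys play the role of `known`

    def receive(self, now, router, dest, hop_cnt, iface):
        new_metric = min(1 + hop_cnt, INFINITY)
        route = self.routes.get(dest)
        if route is None:
            if new_metric < INFINITY:
                self.routes[dest] = Route(new_metric, router, iface, now + EXPIRE_AFTER)
        elif router == route.next_router or new_metric < route.hops:
            route.hops, route.next_router, route.next_iface = new_metric, router, iface
            route.expire_at = now + EXPIRE_AFTER
            route.garbage_at = now + GARBAGE_AFTER if new_metric == INFINITY else None

    def tick(self, now):
        """Fire any expiration or garbage-collection timers that are due."""
        for dest, route in list(self.routes.items()):
            if route.garbage_at is not None and now >= route.garbage_at:
                del self.routes[dest]
            elif route.hops < INFINITY and now >= route.expire_at:
                route.hops = INFINITY
                route.garbage_at = now + GARBAGE_AFTER

    def advertise(self):
        """(iface, dest, metric) triples, with split horizon and poisoned reverse."""
        return [(iface, dest, INFINITY if iface == route.next_iface else route.hops)
                for dest, route in self.routes.items()
                for iface in self.interfaces]
```

A route learned over interface `a` is advertised with metric 16 on `a` and with its real metric elsewhere; an advertisement of 16 from the current next router poisons the route and arms the garbage-collection deadline.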

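A.3 HOL Code for Update Function

val update_DEF = new_definition
  ("update",
   --`!(src:router) (net:network) (rcv:router) (hopcount:num)
       (hops:router->num) (nextN:router->network) (nextR:router->router).
      update (hops,nextN,nextR) (src,net,rcv,hopcount) =
        let (nh,nn,nr) =
          (((nextR(rcv) = src) /\ (nextN(rcv) = net)) =>
              (SUC hopcount, net, src) |
           ((SUC hopcount < hops(rcv)) =>
              (SUC hopcount, net, src) |
              (hops(rcv), nextN(rcv), nextR(rcv))))
        in ((\(r:router). ((r = rcv) => nh | hops(r))),
            (\(r:router). ((r = rcv) => nn | nextN(r))),
            (\(r:router). ((r = rcv) => nr | nextR(r))))`--);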

A.4 Promela Fragment for Routing Process

proctype Update(router ME)
{
    mesg adv;
    chan in = routerinput[ME];
    do
    :: atomic { in?adv ->
         if
         :: (adv.src == rtable[ME].nextR) &&
            (adv.net == rtable[ME].nextN) ->
              if
              :: adv.hopcount >= INFINITY ->
                   rtable[ME].hops = INFINITY
              :: adv.hopcount < INFINITY ->
                   rtable[ME].hops = adv.hopcount
              fi
         :: adv.hopcount < rtable[ME].hops ->
              rtable[ME].nextR = adv.src;
              rtable[ME].nextN = adv.net;
              rtable[ME].hops = adv.hopcount
         :: else -> skip
         fi }
    od
}