Validation of Bosch' Mobile Communication Network

Architecture with Spin

Theo . Ruys and Rom Langerak

Faculty of Computer Science, UniversityofTwente.

P.O. Box 217, 7500 AE Enschede, The Netherlands.

fruys,[email protected]

March 14, 1997

Abstract

This pap er discusses validation pro jects carried out for the Mobile Communication

Division of Rob ert Bosch GmbH. We veri ed parts of their Mobile Communication

Network (MCNet), a communication system which is to be used in infotainment

systems of future cars. The proto cols of the MCNet have b een mo delled in Promela

and validated with Spin. Apart from the validation results, this pap er discusses some

observations and recommendations of the use of Promela and Spin.

Keywords proto col, validation, veri cation, mo del checking, , Spin, Promela

1 Intro duction

This pap er discusses the validation of parts of the Mobile Communication Network (MCNet),

which has b een develop ed at the Mobile Communication Division of Rob ert BoschGmbH. MC-

Net is a communication system whichistobeusedin infotainment systems [14] of future cars.

Infotainment systems encompass entertainment systems like conventional car radios equipp ed with

CD changers and driver information systems likenavigation units, (mobile) telephones, etc.

New technologies like RDS Plus (Radio Data System), TIM (Trac Information Memory),

DAB (Digital Audio Broadcasting), etc. put high requirements on infotainment systems, which

will develop into complex, integrated systems. Communication b etween the various comp onents

of such systems will b ecome increasingly imp ortant.

As an example, supp ose one is driving in such a mo dern car. Assume that while listening to a

CD the RDS system receives information ab out a trac jam on the route that was laid out by the

navigation unit. The CD should b e immediately interrupted to transmit the radio announcement

and furthermore, the navigation unit should nd an alternative route. But what should happ en if

at the same time the mobile phone rings? Should the radio announcementbeinterrupted? And

if so, should the announcement be saved? It should be clear that the communication network

underlying the complete infotainment system should b e highly reliable.

1

In the last twoyears, our group has b een involved in the validation of the MCNet Architec-

ture. Wehave mo delled parts of successiveversions of MCNet in Promela [7]andwehave used

2

Spin to simulate and verify our mo dels.

1

The Formal Metho ds and To ols group of the Department of Computer Science of the UniversityofTwente.

2

In this pap er use the term Spin when discussing the simulation and validation of Promela mo dels. In most cases,

however, we used the graphical interface Xspin, which resides on top of Spin, to manipulate the Promela mo dels. 1

With Spin wewere able to pinp oint several caveats in earlier versions of the MCNet (i.e. [17]

and [18]). Bosch adopted most of our recommendations to correct and improve the reliabilityof

the MCNet proto cols. We did not nd any serious errors in the last version of MCNet [19] that

wevalidated.

The structure of the rest of this pap er is as follows. In section 2, an intro duction to the

MCNet Architecture is given. Our validation e orts are summarized in section 3. In section 4,

some exp eriences with Promela and Spin are discussed. The pap er is concluded in section 5, where

some conclusions and recommendations are presented.

2 The MCNet Architecture

3

The Mobile Communication Network (MCNet) is develop ed at the Mobile Communication Divi-

sion of the Bosch Blaupunkt Group, a sub division of Rob ert BoschGmbH (abbreviated to Bosch).

Bosch is a German company, which is well known as a supplier of comp onents and subsystems

to the automative industry. Other areas of business of Bosch include communication technology,

consumer go o ds and capital go o ds.

[19] sp eci es the goal of MCNet: \The aim of MCNet is to sp ecify a reliable system capable

of transferring driver information and control data. It was not designed for exchanging audio or

video data." MCNet is prop osed as a company-wide standarization within Bosch. Later, the

MCNet sp eci cation will b e o ered to interested third parties to reach a more widespread usage

4

[19]. For this reason, muchwork has b een carried out within the OSEK group. The main result of

this collab oration is that the MCNet's Transp ort proto col is in conformance with the up coming

OSEK standard.

The MCNet will b e built on top of a Controller Area Network (CAN) [3], which is a serial data

communications bus system esp ecially suited for networking `intelligent' I/O devices. Features of

the CAN include a multi-master proto col, real-time capabilities, error correction and high-noise

immunity. The CAN serial bus system, originally develop ed by Bosch to reduce the need for

massive wiring harnesses in automobiles, is increasingly b eing used in industrial automation.

CANisaninternational standard and is do cumented in ISO-11898 [3] (for high-sp eed appli-

cations) and in ISO-11519 [1, 2] (for lower-sp eed applications). Gentle intro ductions to CAN can

b e found on the Internet, for example [8] and [21].

2.1 Layered Architecture

Because of similar demands, the MCNet resembles the layered architecture of ISO/IEC's standard

on Op en System Interconnection (OSI) [4]. Using the standard pro cedures and proto cols of the

OSI architecture as a basis for MCNet has the following b ene ts [19]:

separation of concerns: applications and communications are cleanly separated from each other;

reusability: not only abstract concepts, but also (existing) software and hardware may b e reused;

interop erability: individual layers can b e mo di ed or replaced in keeping with changing require-

ments (as long as internal interfaces need not b e mo di ed);

testability: the development, manufacturing and diagnosis of the network is more easily tested.

Figure 1 (from [19]) shows the corresp ondence b etween MCNet and the OSI Layers.

3

The MCNet was formerly called Mobile Communication Architecture (MCA).

4

OSEK is an abbreviation for the German \O ene Systeme und deren Schnittstellen fur di Elektronik im

Kraftfahrzeug" (Op en systems and the corresp onding interfaces for automotive electronics) and is the international

standardization group of car manufacturers and suppliers. It aims at an industry standard for an op en-ended

architecture for distributed control units in vehicles. The parties within OSEK are: Op el, BMW, Daimler-Benz,

Mercedes-Benz, Rob ert Bosch, Siemens, Volkswagen and the University of Karlsruhe. 2 ] t w for the the trol [18 rate ytes only , Data ] to tly een bit 18 [17], w (TPDUs) , The (a in tities. bet managemen [17 Curren er with con sliding-windo ted y Units conforms a sp eed The er ers. w as w sp eed. terface y y telegrams. lo Data in MCNet er 4) and is resp onsible 1). La y a streams. ertical' la of the een proto col en CAN ell do cumen bit only w w as sp eci ed Figure wledged data transfer (7 b ransfer t, Proto col wing asp ects: T is t. es ersions and kno v er (see and the OSI la y serv receiving receiving , La ers er of the OSI (la ransfer ted ac y y T MCNet la and and Previous of er o ers three data services to the Adaption t)' proto col. MCNet wledged data transfer service for transp orta- y k ed as an additional `v The ransfer are. MCNet k. T w kno een 3 of blo c ts discuss the follo w sending blo c sending ransp ort La the unication a er ytes of application data); for y ted ac ], proto col are de ned length. wledgemen the ransfer La for la error-failure managemen y [19 send wledged broadcast data transfer service. of comm kno an t It can b e view The T kno of can tal' is not a part of the OSI reference mo del, but instead is commonly highest MCNet y t ts extended resp onsible are a subset of those de ned in [2] for CAN lo er, the do cumen unac exp edited connection-orien MCNet tit ork and (lo cal) station managemen y resp onsible the is w e: en e: Corresp ondence b et of , is `horizon and wledgemen connection-orien er corresp onds to the T messages ait (for the ac er 1: y er y e: kno the er y of MCNet La are and the application soft y ac La ersion w do cumen La of v proto col

Management Services ersions of the er also supp orts services to test the connection b et t h proto col la h the y a reduction Figure ev tation Link ork Managemen d Data Servic eac ast Data Servic w ysical cost or eac curren to dite Application Software aiting F ransfer La e where ]: adc Ph Adaption Data LinkLayer Physical Layer Data of o w T Net segmen a ong Data Servic De ning the ransfer La unication soft or less of application data); Br L tion of `long messages' (more than 7 b Exp MCNet consecutiv w 125 Kbit/s) is applied. the er [19 The The The  In  The The  y terfaces Transfer Layer Adaption Layer for reasons Link services used in belo La up coming standard of OSEK v2.0. The T in The used in the area of eldbus systems. comm supp orted a `send and w b efore proto col services encompass b oth net 2.2 and [19]. Device Driver CAN v2.0A Data Link Transport Session Presentation Physical Network Application OSI Layers 3 4 2 5 6 7 1

Service The service de nitions o ered by the particular proto col are presented: the service prim-

itives together with their parameters are listed.

PDU If applicable, the Proto col Data Unit (PDU) structure of the proto col is de ned.

Proto col The proto col itself is formally de ned through state transition tables.

Time sequence diagrams Some of the proto cols are illustrated with time sequence diagrams

describing particular scenarios.

Although the service primitives supp orted by a particular proto col are well de ned, the desiredbe-

haviour of the proto col is unclear from the service primitives alone; the (formal) relations b etween

the primitives is missing. For example, in the Transfer Layer, it is unclear what should happ en if

a connection request is issued at a service access p ointofaTransfer proto col entity. Should this

(always) result in a connection indication at the other side?

Relations b etween the service primitives can b e deduced from the state transition tables. Such

relations, however, are constraints on this particular design of the proto col. They do not de ne

the `desired b ehaviour' of the proto col as a blackbox.

Summarizing, the design of the MCNet proto cols is well de ned and well do cumented, but

the service description of the proto cols is missing.

3 Validation

In our validation e orts, wehave concentrated our attention on the Transfer Layer of MCNet. We

were mostly interested in the connection setup phase, the data transfer phase and the connection

termination phase of this Transfer proto col. No distinction was made between the Long Data

Service and the Exp edited Data Service, b ecause the proto col b ehaves exactly the same for b oth

services. We did not incorp orate the `Testing Services' and the `Broadcast Data Transfer Services'

into our validation. In our last twovalidations, we also lo oked closely at the Network Management

Proto col of the MCNet.

In the mo delling phase, one of our goals was to come up with Promela mo dels of the proto cols,

whichwere as close to the original `co de' of the state transition tables as p ossible. Although this

mayhave resulted in less ecient Promela constructs, we found twoadvantages:

 We got more con dence in the Promela mo del of the particular proto col.

 People familiar with the state transition table (namely the engineers at Bosch)were able to

read and understand the Promela mo dels.

The rsttwovalidations were carried out by Langerak [12,13], while the last validation was

carried out by Ruys [20]. Toget two indep endent views of the proto cols, Ruys did not use the

mo delling results of the rst twovalidations. The reason for this was that the authors wanted to

nd di erences and similarities in the mo delling approaches. But more imp ortantly, the authors

hop ed to nd caveats in the proto cols, which had not b een found in the rst twovalidation e orts.

3.1 Problems encountered

As wesaidabove, although the protocol layers of the MCNet are extensively do cumentedin[17],

[18] and [19], the intended, desired b ehaviour (i.e. the service) of the various layers is missing.

This had the following consequences for our work:

 Because the `rationale' b ehind the proto cols was missing, the mo delling of the proto cols

(esp ecially the Network Management Proto col) was time consuming. Deriving the intended

service by studying transition tables b ecame reverse engineering. 4

 If a proto col do esn't contain any obvious errors (e.g. deadlo cks), the proto col should be

validated against its service description. If such a service description is missing, it is dicult

to come up with prop erties to b e checked with Spin.

Another `do cumentation problem' was caused by the successive, di erentversions of the MCNet

standard. It proved to b e error-prone and time consuming to compare consecutiveversions to nd

the di erences.

We also encountered an obvious problem when mo delling the Network Management Proto col

(NMP) in Promela. The NMP has connections to all horizontal layers (see Figure 1). Tomodel

the proto col, we had to include all relevant parts of all these layers.

3.2 Results

In [12], Langerak validated the Transfer Layer of version 0.4 of MCNet. With the help of Spin,

he found several caveats in the proto col. For example, he found that:

 when two proto col entities wanted to setup a connection at the same time, this could lead

to a deadlo ck; and

 a single acknowledgementTPDU was used for both the connection acknowledgement and

the data acknowledgement. This could also lead to errors.

Bosch adopted the recommendations of [12] and in version 0.7 of MCNet, these errors were

corrected. This new version of the proto col, however, contained some minor new errors, which

again were discovered with Spin [13]:

 With resp ect to version 0.4, some - on rst sight - redundant transitions had b een removed

from the proto col. Spin showed that these transitions were really needed, though.

 The proto col contained unsp eci ed receipts that had to b e dealt with in an implicit way.

Langerak [13] also mo delled the Network Management Proto col (NMP) of MCNet version 0.7.

As said b efore, the main problem with mo delling the NMP is that without a service description

it is very dicult to deduce the (desired) behaviour of the proto col. Apart from checking for

deadlo cks, no signi cant prop erties were checked.

In our last validation e ort, welooked closely at version 0.8 of the OSEK Transp ort Proto col

[5]. In future versions of MCNet, the OSEK Transp ort Proto col will replace the Transfer Layer

in MCNet.

Unlike the `send-and-wait' proto cols of MCNet version 0.4 and MCNet version 0.7, OSEK's

Transp ort Proto col includes a (sliding-window) blo ck transfer mo de. We therefore fo cussed our

attention on prop erties sp eci c to the blo ck transfer mo de of this proto col. Again, we did nd

some errors with Spin. For example:

 The proto col sp eci cation of [5]was incomplete. For example, it was not clear howtheblock

size of a connection is negotiated b etween proto col entities. Therefore it was not certain that

the entities of a connection would all use the same blo ck size.

 The proto col description hints on a sp eci c way to implement(andthus, mo del) `full-duplex'

communication. When adopting this particular scheme in Promela,however, weencountered

several problems. These problems are discussed in detail in section 4.3.

In [20]we also mo delled the NMP proto col of MCNet version 1.0. Again, it proved to o dicult

to really validate this proto col as long as the desired b ehaviour of the proto col is not completely

sp eci ed.

As we said earlier, to get di erent views on the proto cols, we did not use the mo delling results

of the rst twovalidations in our last e ort. It was remarkable to see that the indep endent mo dels

have much in common. The new, indep endent lo ok at the proto cols did not discover any new

errors in the proto cols.

Summarizing, with the help of Spin,wewereableto ndseveral caveats in the MCNet proto cols. 5

4 Exp eriences with Spin and Promela

This section discusses some exp eriences with Spin and some mo delling decisions in greater depth.

4.1 Validation tra jectory

Our validation e ort with Promela and Spin can b e summarized as \with every step, more carefully

dosed misery". To illustrate this, we presentthevalidation tra jectory we used, in the following

pseudo algorithm.

hModel ling and validation in Promela i

hStart with abstract model i

hSimulate and verify i

while not hconvincedbymodel i

do

if

:: hAdd details to model i

:: hIntroduceerrors into the environment i

fi

hSimulate and verify i

od

In hStart with abstract model i we started with a very abstract Promela mo del of the proto col. In

Promela, one also has to sp ecify the environment of the proto col, which consists of the user of the

proto col (i.e. the upp er layer service) and the underlying (lower layer) service of the proto col (e.g.

physical lines). In the b eginning, we mo delled a correctly b ehaving environment.

Wehave b orrowed Promela's if statementtocho ose b etween hAdd details :::i and hIntroduce

errors :::i. In hAdd details to model i more details are added to the mo del of the proto col. This may

intro duce errors, which will b e discovered in the next hSimulate and verify i step. This sometimes

meant \one step forward, two steps back".

In hIntroduce errors into the environment i, the carefully dosed `misery' is intro duced. For the

proto col to b e robust, it should deal correctly with errors in the environment. For example, errors

in the underlying service will result in messages that may b e lost in transit. Erroneous ordering

of (user) service primitives, on the other hand, may lead to unsp eci ed receipts in the proto col,

and hence deadlo ck.

The algorithm that weusedtosimulate and verify is shown b elow.

hSimulate and verify i

SIM: while not hconvinced by simulation i

do

hSimulate the model with Spin i

if herrors found i then

hCorrect errors in the model i

goto SIM

fi

od

while not hconvinced by veri cation i

do

hVerify the model with Spin i

if herrors found i then

hCorrect errors in the model i

goto SIM

fi

od

Most of the hard work of hSimulate and verify i is done by Spin. When errors are found, the errors

should b e corrected, and the simulation and veri cation step should b e started again. 6

When correcting errors (or adding details to the mo del) in Promela, one sometimes has to

depart slightly from the mo del and thus p ossibly intro duce (new) errors. As an illustration of this

phenomenon, see section 4.3.

The predicates hconvinced by :::i are determined by the knowledge, exp erience and intuition

of the validation engineer (i.e. the user of Spin) and dep end on:

 the number of details in the currentmodel;

 the numb er and kind of properties checked so far;

 all previous simulation and veri cation runs for this particular mo del;

 the mo dels of the proto col, whichhave b een validated in earlier steps.

4.2 Mo delling timeouts

The MCNet (Blo ck) Transfer Proto col uses several timers. First of all, when setting up a con-

nection, a timer is used to wait for the connection acknowledgement TPDU. Similarly, a sending

entity uses a timer to wait for a data acknowledgement TPDU. Furthermore, in the blo ck transfer

version of the proto col, when sending consecutive data TPDUs, the sender waits a certain amount

of time b etween the TPDUs to give the receiver time to deal with the incoming TPDUs.

It should b e noted that only the sending party of a data connection uses a timer. This means

that when the sending party decides to terminate the connection (b ecause it didn't receive an

acknowledgement for a long time), the receiving party will not notice the disconnection and keeps

waiting for data TPDUs. Although this may seem to b e a problem for the Transfer Proto col, in

the MCNet Architecture, this is solved by the NMP proto col, whichchecks that connections are

still alive.

In our mo del of the (Blo ck) Transfer Proto col, a deamon pro cess has b een intro duced that

5

may steal TPDUs from the underlying physical channel:

7 hproctypes 7i (11.1)

proctype Deamon()

{

TPDU tpdu ;

do

:: pchan[0]?tpdu

:: pchan[1]?tpdu

od

}

where channel pchan[i] mo dels the incoming line at proto col entity i.

Promela provides the timeout statement to mo del system-wide timeouts: if no other statement

in the global system is executable, the timeout statement b ecomes executable. So in our mo del

of the Transfer Proto col this would b e:

5

Please ignore the hproctypes i de nition; this notation will b e explained in section 4.4. 7

#define AckTimerExpired timeout

The global timeout statement has two disadvantages:

 The timeout statement cannot be used to check for erroneous deadlo cks in the system;

in situations which should b e recognized as deadlo cks, the timeout statement will still b e

executable.

 Another problem is the global nature of timeout. There are situations where a `lo cal'

timeout is demanded which b ecomes executable when only this lo cal pro cess cannot pro ceed

anymore. In this case, other pro cesses in the system would not need to b e stopp ed.

On the other extreme of global timeouts, one maywanttomodel premature timeouts. As Holzmann

[7] suggests, premature timeouts can b e mo delled by skip,whichisalways executable:

#define AckTimerExpired skip

Limiting the premature timeouts

Normally, the general timeout schemes of Promela suce. In our mo del, however, wewanted to

mo del actual timeouts and premature timeouts at the same time. Although skip will catch b oth

timeouts, it was to o coarse for use: wewanted to limit the numb er of premature timeouts. So we

used the following scheme:

#define AckTimerExpired premature_timeout || timeout

where premature_timeout is a b o olean variable that maybetrue if a premature timeout is still

p ossible and false, otherwise. Each time the proto col starts waiting for an acknowledgement,

this variable premature_timeout variable is set:

#define set_premature_timeout \

if \

:: nr_premature_timeouts <= MAX_PREMATURE_TIMEOUTS \

-> if \

:: premature_timeout = TRUE ; \

nr_premature_timeout++ \

:: premature_timeout = FALSE \

fi \

:: else -> premature_timeout = FALSE \

fi

Notifying the waiting party

If wewanttocheck for deadlo cks and assume that premature timeouts are not p ossible, we can

also adopt a di erent timeout mechanism.

When a message is lost in transit, the entitythatwaits for the acknowledgement will eventually

timeout. Instead of letting the entitywait (for its timer to expire), the waiting party is noti ed

that the message has b een lost.

A separate timeout channel timeout_ch is intro duced which is used to notify a waiting party:

chan timeout_ch = [1] of {bit} ;

#define AckTimerExpired timeout_ch?1

In order to makethiswork, a polite Deamon pro cess has to b e used which noti es the waiting party

via the timeout_ch channel. When stealing a TPDU message, the Deamon pro cess will examine

the TPDU message to decide whichentityiswaiting for an acknowledgement: 8

proctype Deamon()

{

TPDU tpdu ;

do

:: pchan[0]?tpdu -> NOTIFY(1,0)

:: pchan[1]?tpdu -> NOTIFY(0,1)

od

}

where the macro NOTIFY(i,j) takes care of the noti cation of the waiting entity when a message

is lost in the line from i to j.

A note on this last scheme of mo delling timeouts is appropriate here. Earlier, we warned

against to o much of a deviation from the (original) mo del. In fact, we would agree with the

reader who says that this `timeout channel hack' is a good example of too much deviation. On

the other hand, the polite Deamon provides more control over the timeouts. This control proved

quite valuable in nding a stubb orn error in one of our mo dels.

4.3 Mo delling the full duplex connection

In the Transfer Layer proto col, after the connection setup phase, the proto col entities pro ceed to

the so-called CONNECTED state. Transfer of data is full duplex: eachentity can send and receive

data at the same time. For simpli cation reasons, in the description of the proto col [5] and [19],

the CONNECTED state is viewed as the parent state of di erent receiver and sender substates: \the

CONNECTED state can b e mo delled as b eing comp osed of four indep endently working state machines

(i.e. sending and receiving machines for b oth the exp edited data transfer and long data transfer)".

Initially, this hint put us on the wrong track. Foralongtimewemodelledthe CONNECTED state

by starting a separate Sender and Receiver pro cess for eachentity:

CONNECTED:

...

atomic {

run Sender(i, dest) ;

run Receiver(i, dest) ;

}

Using separate Sender and Receiver pro cesses in each proto col entity seems a natural and elegant

way to mo del the full duplex data connection, but it turns out to haveseveral disadvantages. For

example, the `parent' pro cess uses variables that are used by b oth the Sender and Receiver.

Therefore, in Promela these variables had to b e de ned globally. Furthermore, instead of having

just a single proctype for each proto col entity, three proctypes are needed for eachentity. This

adds to the complexityofthe Promela mo del.

But the most serious disadvantage is the following. During normal op eration, the three pro-

cesses (i.e. the parent pro cess and Sender and Receiver pro cesses) can work indep endently. But

in the case of a reset or termination of the connection, all three pro cesses have to be synchro-

nized. This requires non-trivial synchronization schemes b etween the three pro cesses which are

not present in the original description of the proto col [5]. For example, we initially mo delled part

of the Receiver pro cess as follows: 9

do

:: Receive(DT) -> /* normal operation */

od

unless (disconnect_now || reset_now) ;

if

:: disconnect_now -> /* handle disconnect */

:: reset_now -> /* handle reset */

fi

where the variables disconnect_now and reset_now are variables that are set by the Sender

pro cess and parent pro cess, when the need for a termination or reset of the connection arises. Using

this scheme, serious deviations from the original mo del had to b e intro duced. As a result, errors

were found whichwhereintro duced by the `new' synchronization scheme. But more imp ortantly,

errors in the original description mayhave b een camou aged by the new synchronization scheme.

Tosolve these problems, we used a single proctype (instead of three) to de ne the b ehaviour of

a proto col entity. Weusedtwo state variables to hold the `lo cal' state of the sending and receiving

`subpro cess': s_state and r_state. We illustrate this b elow:

do

...

:: (r_state == RECEIVING) && Receive(DT) -> /* process DT */

:: (s_state == WAIT_AK) && Receive(AK) -> /* process AK */

:: (s_state == WAIT_AK) && AckTimerExpired -> /* deal with timeout */

...

od ;

Handco ding the substates of the sender and receiver is not very elegant. Furthermore, the resulting

Promela mo del `lo oks' di erent compared with the proto col description as de ned in [5] and [19].

In this case, however, it turned out to b e a b etter and more reliable mo del than using the separate

Sender and Receiver pro cesses.

4.4 Literate mo delling

Literate programming is the act of writing computer programs primarily as do cuments to be

read by human b eings, and only secondarily as instructions to be executed by computers [16].

In general, literate programs combine source co de and do cumentation in a single le. Literate

programming tools then parse the le to pro duce either readable do cumentation or compilable

source co de. The term \literate programming" was coined by D.E. Knuth, when he describ ed

[9, 11] the to ol he created to develop the T Xtyp esetting software. Recent b o ok-length examples

E

of literate programming (using C) include Fraser and Hanson's b o ok on (using noweb) [6] and

Knuth's b o ok on \The Stanford Graphbase" (using CWEB)[10].

Parallel programs quickly b ecome complex and dicult to understand. Therefore, a parallel

program usually needs (extensive) do cumentation to explain the constructions used in the program.

This observation is also valid for sp eci cations written in Promela.

A natural solution is to use the normal /* .. */ comments in Promela sp eci cations. However,

the complexity of the mo dels sometimes require lengthy comments, which obscure the actual

Promela co de.

From our rst exp eriences with Promela we learned that Promela co de with some minor com-

ments is not enough for someone (unfamiliar with the proto col) to understand the mo del and the

reasons b ehind certain mo delling decisions. Moreover, a ` nal' Promela mo del only presents the

last abstraction of the system. The do cumentationofthevalidation of a system, however, should

also contain the complete set of Promela sp eci cations and Spin results that have lead to this ` nal'

Promela mo del. 10

In our most recent validation e ort, Ruys [20] used the literate programming to ol noweb to

tackle this `do cumentation problem'. Using this to ol, we strove to get a single do cumentthat:

 explains the original proto col;

 not only contains the ` nal' Promela mo del, but also contains imp ortant parts of \earlier"

mo dels;

 explains the mo delling decisions and abstractions that lead to this ` nal' Promela mo del;

 contains imp ortant results of validation runs with Spin;

 contains historical notes on the validation tra jectory.

In earlier validation e orts, the information ab ove was spread over many Promela les, several

output les of Spin,andentries in a log b o ok.

noweb [15,16] develop ed by Norman Ramsey is a literate programming to ol likeKnuth's WEB,

only simpler. Unlike WEB, noweb is indep endent of the programming language to b e literated. For

example, noweb is b eing used with Pascal, C, C++, Mo dula-2, Icon, ML, Promela and others.

A noweb le contains program source co de interleaved with do cumentation. A noweb leisa

sequence of chunks, which maycontain co de or do cumentation. A do cumentation chunk starts

with the symbol `@'followed by a space or newline. Do cumentation chunks havenoname. Co de

chunks b egin with:

hchunk name i

A

on a line by itself. Do cumentation chunks contain text fragments in L T X, plain T X or HTML.

E E

A

In our work, we used L T X.

E

In this section, we presentavery small part of a typical Promela mo del. The top level co de

chunk h* i of a mo del, could b e de ned as follows:

11.1 h* 11.1i

hprelude 11.2i

hproctypes 7i

hinit (never de ned)i

where the hprelude i contains the constants, the globals, etc. of the Promela mo del, hproctypes i

contains all Promela proctypes and hinit i contains the Promela init pro cess.

In the left margin, noweb identi es the co de chunk with a tag: page.n, where n indicates the

n-th chunk on page page. This tag is also added as a sux to the co de chunks used in a de nition

of a chunk. The tag for hinit i is (never de ned), b ecause it is not de ned in this pap er.

The hprelude i could b e de ned as follows:

11.2 hprelude 11.2i  (11.1)

hconstants 11.5i

hchannels 12.1i

hmacros 11.3i

In the right margin of the de nition of a chunk, b etween (..) braces, the tag of the co de chunk

is listed, which uses this particular co de chunk. All co de chunks with the same name will be

collected by noweb and put in the place where the chunk is used. To conclude this example, we

de ne some more chunks.

11.3 hmacros 11.3i (11.2) 11.4 .

#define Send(m) pchan[dest]!m

11.4 hmacros 11.3i+ (11.2) / 11.3 12.2 .

#define Receive_DT(sn,ar,eom,nb) pchan[i]?DT(sn,ar,eom,nb)

11.5 hconstants 11.5i (11.2)

#define N 2

#define CHL 2 11

12.1 hchannels 12.1i (11.2)

chan pchan[N] = [CHL] of {tpdu} ;

12.2 hmacros 11.3i+ (11.2) / 11.4

#define AckTimerExpired premature_timeout || timeout

#define DelayTimerExpired timeout

Because wehavede ned three hmacros i chunks, noweb nowprovides some more information in

the right margin; it indicates the previous chunk (/) and the next chunk (.)inthelistofhmacros i

chunks.

Results of noweb

Compared with our rst validation Promela mo dels, the do cumentation on the last Promela mo dels

has grown substantially. The `literate mo del' is now in a state that someone unfamiliar with the

particular MCNet proto col should b e able to understand the working of the proto col from the

literate mo del alone. Imp ortant asp ects of the proto col are now describ ed separately, without

taking into account the complete global Promela mo del. Only the co de chunks needed to explain

a particular asp ect havetobe de ned; noweb will put the co de chunks in the right place. Fur-

thermore, various erroneous mo delling decisions are also incorp orated into the literate mo del to

give insight in the mo delling decisions. Unfortunately,we did not succeed in incorp orating output

results from Spin (e.g. message sequence charts) into our literate mo del.

To describ e a complex concurrent system, the combination noweb and Promela turned out to b e

a good choice. The formal de nition of the system is sp eci ed in Promela (and checked with Spin),

A

while the system itself and its design considerations are describ ed in L T X. On the negative side,

E

there is the problem of howtokeep the noweb le up to date. When analyzing and mo delling the

system, one is tempted to change the Promela le directly, instead of making the changes in the

noweb le, which needs another compilation step to get the new Promela le. Furthermore, one

would like to include veri cation results, message sequence charts in the do cument to illustrate

certain scenarios.

5 Conclusions

The combination Promela/Spin was successful in nding errors in the MCNet proto cols. In a

rst veri cation e ort on preliminary versions of the MCNet (obvious) errors were found. Our

recommendations were adopted by Bosch to construct more reliable proto col descriptions. In

the latest validation e ort, the proto cols proved more reliable and errors were more dicult to

nd. The lackof formal service descriptions of the proto cols b ecame an imp ortantissue: what

prop erties havetobeveri ed?

Mo delling and validation of the design should b e started during the initial design phase. The

fact that the validation team was not directly involved in the design of the system, had several

drawbacks:

 Although the do cumentation on the MCNet proto col is quite extensive, it was not sucient

to understand easily what the proto col was supp osed to b e doing. This was esp ecially true

for the Network Management Proto col. This meant that the mo delling phase to ok quite

some time.

 Furthermore, as the design of MCNet pro ceeded, several do cuments were written. It was

also time consuming (and errorprone) to compare subsequent descriptions of the proto col.

The logging and do cumenting of the mo delling, simulation and validation activities in Spin is very

imp ortant during the validation tra jectory. As long as there is no supp ort for version- and scenario

control in Spin, one has to adopt a rigorous engineering discipline. The easiest wayistologall

activities and validation results in a logb o ok. A more stimulating approachis to use a literate

programming to ol like noweb. 12

Acknowledgements Wewould like to thank the Bosch team: Dietmar Elke, UweZurmuhl and

Hans Pasch, for giving us the opp ortunity to work on this real-world application and for their

endless patience. We would liketo thank Ed Brinksma for managing the validation pro jects in

Twente.

References

[1] ISO-DIS 11519-1. Road vehicles { Low sp eed serial communication, part 1: General and

de nitions. Technical Rep ort 11519-1, International Organization for Standarization (ISO),

June 1994.

[2] ISO-DIS 11519-2. Road vehicles { Low sp eed serial communication, part 2: Low sp eed

Controller Area Network (CAN). Technical Rep ort 11519-2, International Organization for

Standarization (ISO), June 1994.

[3] ISO-DIS 11898. Road vehicles { Interchange of digital information { Controller Area Network

(CAN) for high sp eed communications. Technical Rep ort 11898, International Organization

for Standarization (ISO), Novemb er 1993.

[4] ISO/IEC 7498-1. Information Pro cessing Systems { Op en Systems Interconnection { Basic

Reference Mo del. International Standard 7498-1, International Organization for Standariza-

tion (ISO), Geneva, 1994.

[5] Blaupunkt Bosch Grupp e, Rob ert BoschGmbH. MCNet OSEK Transport Protocol (including

Block Transfer), July 1996. K7/EFK2, MCN TP08.DOC (internal do cument).

[6] Christopher W. Fraser and David R. Hanson. A Retargetable C Compiler: Design and Im-

plementation. Addison-Wesley,1995.

[7] Gerard J. Holzmann. Design and Validation of Computer Protocols. Prentice Hall, Englewood

Cli s, New Jersey,1991.

[8] CAN in Automation (CiA). CAN: A Serial Bus System { Not Just for Vehicles. Internet:

http://www.can-cia.de/introduction.html,February 1997.

[9] DonaldE.Knuth. Literate Programming. Numb er 27 in CSLI Lecture Notes. Center for the

Study of Language and Information (CSLI), Stanford University, California, 1992.

[10] Donald E. Knuth. The StanfordGraphbase: A Platform for Combinatorial Computing.ACM

Press, New York, 1993.

[11] DonaldE.Knuth. Literate Programming. The Computer Journal, 27(2):97{111, 1994.

[12] Rom Langerak. Validation of MCA Transfer Layer Proto col. Technical rep ort,

CTIT/UniversityofTwente, Department of Computer Science, Enschede, The Netherlands,

June 1995. Validation carried out for Rob ert BoschGmbH (con dential rep ort).

[13] Rom Langerak. Validation of MCA (v0.7) Transfer Layer and Network Management Pro-

to cols. Technical rep ort, CTIT/University of Twente, Department of Computer Science,

Enschede, The Netherlands, April 1996. Validation carried out for Rob ert Bosch GmbH

(con dential rep ort).

[14] H.-L. Pasch and D. Ro de. Mobile Communication Architecture { Highlights. Technical

rep ort, Blaupunkt Bosch Grupp e, Rob ert BoschGmbH, Novemb er 1995. K7/EFK2 Pasch,

K7/EAM3 Ro de, MCA HLEN.DOC (internal do cument).

[15] Norman Ramsey. Literate Programming Simpli ed. IEEE Software, 11(5):97{105, September

1994. 13

[16] Norman Ramsey. Noweb. Internet: http://www.cs.virginia.edu/~nr/noweb/,February

1997.

[17] Rob ert BoschGmbH, Mobile Communication Division. Mobile Communication Architecture,

version 0.4 edition, February 1995.

[18] Rob ert BoschGmbH, Mobile Communication Division. Mobile Communication Architecture,

version 0.7 edition, Decemb er 1995.

[19] Rob ert BoschGmbH, Mobile Communication Division. Mobile Communication Architecture

MCNet Layer De nitions, Services and Protocols,version 1.0 edition, August 1996.

[20] Theo C. Ruys. Validation of the (Blo ck) Transfer Proto col and the Network Management

Proto col of MCNet v1.1. Technical rep ort, CTIT/UniversityofTwente, Department of Com-

puter Science, Enschede, The Netherlands, April 1997. Validation carried out for Rob ert

BoschGmbH (con dential rep ort).

[21] Michael J. Scho eld. An Intro duction to CAN - The Serial Data Communications Bus.

Internet: http://www.omegas.co.uk/CAN,February 1997. 14