FLEXIBLE

ENCAPSULATION

Christian S. Collberg

Department of Computer Science

Lund University

CODEN LUNFDNFCS

Department of Computer Science

Lund University

Box

S Lund

Sweden

Email: Christian.Collberg@dna.lth.se

Christian S. Collberg

Cover art by Bud Grace. Used by permission.

© King Features Syndicate / distr. Bulls

Abstract

Most modular programming languages provide an encapsulation concept. Such concepts are used to protect the representational details of the implementation of an abstraction from abuse by its clients. Unfortunately, strict encapsulation is hindered by the separate compilation facilities provided by modern languages. The goal of the work presented here is to introduce techniques which allow modular languages to support both separate compilation and strict encapsulation without undue translation-time or execution-time cost.

The thesis is divided into four parts. The first part surveys existing modular languages and modular language translation techniques. The second part presents the experimental modular, imperative, and object-oriented language Zuse, which supports strict and flexible encapsulation facilities. The third part describes four translation system designs for Zuse based on high-level module binding. The fourth part evaluates these systems.

The Zuse design applies the principle of orthogonality to encapsulation: every aspect of every exported item may be either hidden or revealed. Specifically, Zuse supports three types of exported items: abstract, semi-abstract, and concrete. These differ in the amount of implementation information revealed to client modules: concrete items reveal all representational details, semi-abstract items some, and abstract items none. Exported items can also be paired with a protection clause that restricts the ways in which they may be manipulated by particular clients.

The four translation system designs presented in the third part of the thesis assure, through the use of intermediate code module binding, that the cost (in terms of execution-time and storage) of using an abstract or semi-abstract item will be no greater than if the same item had been concrete. In addition to performing the tasks of traditional system link editors, the sequential binder checks deferred context conditions, performs inter-modular optimizations, and generates code for deferred procedures. A deferred procedure is one for which the compiler is unable to generate code because of references to imported abstract items. Similarly, a deferred context condition is a static semantic check which could not be performed at compile-time. The other translation systems discussed in the thesis perform the same actions as the sequential binder, but apply different techniques to improve performance: the distributed binder distributes its actions over the sites of a distributed system, such as a network of workstations; the incremental binder inserts the code of modified modules in place in the executable program; and the hierarchical binder binds collections of modules into libraries, which themselves can take part in later binds.

The sequential and distributed binders have been implemented, and their performance is evaluated in the last part of the thesis. To examine the behavior of the binders, a model of modular programs has been developed which allows programs with widely differing qualities to be generated. Results indicate that the distributed binder runs between … and … times as fast as the sequential binder; for some programs it is even faster than traditional link editors.

Acknowledgments

Many people have been instrumental in bringing this work to a completion. I thank all of you collectively, but there are some who I wish to mention specially. First of all, I want to thank Magnus Krampell, with whom I started out as a graduate student, and who should have shared credit for some of the results in this thesis, particularly those of Sections ?? and ….¹ I am also indebted to my advisor Anders Edenbrandt, who should be awarded a special medal of honor (a purple heart) for enduring my endless questions, rantings, and ravings. I owe both of you a lot.

I have also been fortunate to have two supportive people serving on my committee: Ferenc Belik and Svante Carlsson. By allowing me to participate in his project, Ferenc is indirectly responsible for the most interesting results in this thesis, namely those presented in Chapter …. Svante has always found the time to talk and listen to my problems, and has the gift of offering the right kind of encouragement at those moments when nothing is going right. Svante also put me in touch with Prof. Dr. Thomas Strothotte, who read and commented on an early draft of the thesis. It was his go-ahead that finally made me believe that I might actually graduate some day. Rolf Karlsson has over the last few months helped me cut through the administrative red tape necessary to graduate. I thank you all.

I furthermore wish to thank my one-time fellow graduate students Arne Andersson, Mats Bengtsson, Anders Gustavsson, Christer Mattsson, and Ola Petersson, who have made the last few years bearable.

A special "Thank you, Bud!" to Bud Grace for the excellent cover art.

I owe everything and then some to my wife Sheila, for our partnership in love, child-rearing, and late-night scientific, theological, and amorous jaw sessions. Sheila's belligerent use of red ink has furthermore relieved this manuscript of extraneous commas. It is a comforting thought that out of the four gestation periods and eventual births that we have endured over the last six years (two kids and two PhD theses), the kids were the shortest and turned out the best. I am certainly indebted to Louise and Andrew for keeping me sane during the conception of this thesis by being living proofs of the tenet that no scientific discovery can match a good cuddle.

¹ This work was also presented in Magnus' licentiate thesis.

Finally, I share with millions of other sons the belief that I was born to the greatest person on earth. The difference between me and everyone else is that I am right. This work is therefore lovingly dedicated to you, my mother.

Contents

1 Introduction
  1.1 Background
  1.2 Programming by Abstractions
  1.3 Contributions and Their Relation to Previous Work

2 Modular Languages and Their Processors
  2.1 Introduction
  2.2 Abstraction and Encapsulation
  2.3 Basic Definitions
  2.4 Modules and Separate Compilation
  2.5 Approaches to Encapsulation
  2.6 The Pragmatics of Modules
  2.7 Translating and Executing Modular Programs
  2.8 A Survey of Module Concepts
  2.9 Summary

3 A Language with Flexible Encapsulation
  3.1 Introduction
  3.2 The Zuse Module System and Concrete Export
  3.3 Semi-Abstract Export
  3.4 Automatic Initialization
  3.5 Type Protection
  3.6 Evaluation and Examples
  3.7 Summary

4 The Semantics of Zuse
  4.1 Introduction
  4.2 Static Error Conditions
  4.3 Static Semantics
  4.4 Dynamic Semantics
  4.5 Summary

5 Sequential High-Level Module Binding
  5.1 Introduction
  5.2 Design Overview
  5.3 Intermediate File Formats
  5.4 The Constant Expression Table
  5.5 The Intermediate Form
  5.6 Inline Expansion
  5.7 The Compiler
  5.8 The Sequential Paster
  5.9 Future Work
  5.10 Other Methods of Implementation
  5.11 Hierarchical High-Level Module Binding
  5.12 Summary

6 Distributed High-Level Module Binding
  6.1 Introduction
  6.2 Distributed Processing Systems
  6.3 Models of Network Communication
  6.4 Distributed Translation
  6.5 The Distributed Paster
  6.6 Phases … and …
  6.7 Phase …
  6.8 Phase …
  6.9 Distributed Relocation
  6.10 Future Work
  6.11 Evaluation
  6.12 Summary

7 Incremental High-Level Module Binding
  7.1 Introduction
  7.2 Incremental Translation Techniques
  7.3 The Incremental Paster
  7.4 Summary

8 Evaluation
  8.1 Introduction
  8.2 A Model of Modular Programs
  8.3 Experimental Approach
  8.4 Empirical Results and Analysis
  8.5 Summary

9 Conclusions and Future Research
  9.1 Language Design
  9.2 Code Optimization
  9.3 Translation Efficiency
  9.4 Contributions

Bibliography

Index

Chapter 1

Introduction

It is well known that ninety-nine percent of the
world's problems are not susceptible to solution by
scientific research. It is widely believed that
ninety-nine percent of scientific research is not
relevant to the problems of the real world. Yet the
whole achievement and promise of modern
technological society rests on the minute fraction of
those scientific discoveries which are both useful
and true.

C. A. R. Hoare, in Jones

1.1 Background

Computer programming is set apart from other engineering disciplines by the complexity of the artifacts produced: not only may a computer program contain literally millions of components, each the result of a distinct design decision, but it is also the case that the correctness and robustness of the program may depend on the behavior of each and every one of these individual components. In contrast to disciplines such as civil engineering, where an artifact is made robust by duplicating components likely to fail or by increasing the strength of each component, a program to which more components are added, or in which the complexity of a component is increased, will have greater potential for failure.

A wide variety of research directions have been pursued with the goal of better managing this inherent complexity in computer programs. Developers of languages, tools, and methodologies for the formal specification and verification of programs hold the view that programs should be considered pieces of mathematical text, and as such should be subjected to formal analysis and proof of correctness. Complexity, it is argued, arises because insufficient attention is given to the separation of what a program should do (its specification, which is a statement best expressed mathematically), how it should be done (the program's implementation), and the way a specification is best transformed into a correct implementation (a process best carried out using mathematical argumentation).

Programming language researchers, on the other hand, maintain that the best way of furthering the science of computer programming is to build on the concepts and notations already familiar to programmers, i.e. to further develop programming languages to allow the clear and precise expression of designs and ideas. Programs are complex, it is argued, because the programming languages in which they are written do not have the appropriate facilities for organizing the program and expressing the necessary concepts in a natural way.

Programming environment designers, finally, argue that complex computer programs become manageable when maintained by other programs. A programming environment can help a programmer to keep track of the interrelationships of the many components of his program, can increase his productivity by making it easy to add, change, and test components, and can help him towards a full understanding of the program by allowing him to view it from different perspectives.

These approaches are of course not mutually exclusive, and we feel all of them to be worthwhile fields of study. However, in this thesis we will mostly side with the programming language design community. We will present refinements to the abstraction primitives found in many systems programming languages, and argue that these refinements are essential to harnessing the complexity of programming. We will furthermore argue that our language design and its accompanying translating system designs have secondary benefits, such as the production of efficient code and reductions in overall translation time.

1.2 Programming by Abstractions

The paradigm that pervades most of the field of systems programming is programming by abstractions. The idea that distinguishes this paradigm from others is the following three-step approach to problem-solving:

1. Identify basic concepts inherent in the problem.

2. Isolate these concepts from details regarding their representation.

3. Determine the relationships between the concepts.

This thesis will be concerned with programming language designs for the programming by abstractions paradigm, particularly with linguistic support for the second item on this list, a principle known as information hiding or encapsulation.

The principle of information hiding may also be paraphrased such that modules¹ should be manipulated only through well-defined interfaces. Such interfaces should be complete and minimal, i.e. contain enough information for programmers to use and implement the modules, but no more information than that. The main advantages of this are that a user of a module need only consult its interface in order to determine the services it provides and how these services should be used, and that an implementer of a module can determine from the interface exactly which services need to be implemented. Furthermore, since the particulars of a module's implementation are not evident from the interface, there is no risk of a user inadvertently making use of such information (a dangerous situation in the event of an implementation change).
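As a concrete illustration (ours, in C++ rather than the thesis's own language Zuse), a complete-and-minimal interface for a hypothetical Counter abstraction can hide its representation entirely behind an incomplete type; clients learn the operations and nothing else:

```cpp
#include <memory>

// ---- interface (what a client would see, e.g. in counter.h) ----
// The representation of Counter is not revealed: the client sees an
// incomplete type and the operations on it, nothing more.
class CounterImpl;                       // representation hidden
class Counter {
public:
    Counter();
    ~Counter();
    void increment();
    int  value() const;
private:
    std::unique_ptr<CounterImpl> impl_;  // indirection to the hidden state
};

// ---- implementation (e.g. in counter.cpp, invisible to clients) ----
class CounterImpl {
public:
    int count = 0;                       // may change without affecting clients
};

Counter::Counter() : impl_(new CounterImpl) {}
Counter::~Counter() = default;           // defined where CounterImpl is complete
void Counter::increment() { impl_->count++; }
int  Counter::value() const { return impl_->count; }
```

Because clients compile against the interface alone, the representation (here a single `int`) can be replaced without any client change, exactly the benefit the paragraph above describes.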

Many languages with a module concept also support separate compilation of module interfaces and implementations. In addition to reducing the cost of compilation after small changes, separate compilation is also essential for team programming, since it allows many programmers to work independently on different parts of the same program. Unfortunately, however, encapsulation and separate compilation are in one sense conflicting concepts, encapsulation aiming to minimize, and separate compilation needing to maximize, the information present in interfaces. This inherent conflict, and the way it affects programming language and translation system design, lies at the heart of this thesis.
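The conflict can be made concrete in C++ terms (our illustration, not the thesis's notation): for a separately compiled client to allocate an object directly or have its operations inlined, the compiler must see the representation in the interface; hiding the representation forces indirection and out-of-line calls:

```cpp
// Revealed representation: the interface gives the client's compiler
// everything it needs -- Points can be allocated directly and calls
// inlined -- but the representational details (two doubles) leak out.
class RevealedPoint {
public:
    RevealedPoint(double x, double y) : x_(x), y_(y) {}
    double x() const { return x_; }
private:
    double x_, y_;      // visible in the interface text
};

// Hidden representation: nothing leaks, but the client's compiler no
// longer knows sizeof(HiddenPoint), so every use must go through a
// pointer and a separately compiled call.
class HiddenPoint;                    // incomplete in the interface
HiddenPoint *make_point(double x, double y);
double       point_x(const HiddenPoint *p);

// (implementation side, invisible to clients)
class HiddenPoint {
public:
    double x_, y_;
};
HiddenPoint *make_point(double x, double y) { return new HiddenPoint{x, y}; }
double       point_x(const HiddenPoint *p)  { return p->x_; }
```

A client declaration such as `HiddenPoint p;` would be rejected, since the client's translation unit never learns the type's size; this is the sense in which separate compilation pushes interfaces toward revealing more than encapsulation would like.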

1.3 Contributions and Their Relation to Previous Work

This thesis is, at its very basic level, concerned with modular language translation techniques which increase the flow of information between separately compiled modules. We will show how such techniques allow the design of programming languages with strict encapsulation, and support the implementation of inter-modular program optimizers more efficient and effective than the ones currently available. More concretely, we will propose a new modular object-oriented programming language with more flexible and orthogonal encapsulation facilities than other similar languages, discuss the problems this language poses for its translating system, and, based on this discussion, propose and evaluate four concrete translating system designs. All four designs, classified as sequential, hierarchical, distributed, and incremental, are based on intermediate code module binding.

¹ The module (also known as class, package, or cluster) is the prime programming language construct for representing the implementation of abstractions.

No research is performed in a vacuum, and the contributions of this thesis naturally extend the work of others. As is customary, we will indicate our more important sources of inspiration,² and at the same time give a brief exposé of the structure and contributions of the thesis. Numbers in parentheses refer to sections where a particular work receives more thorough attention.

Chapter 2: We present a comparative study of module concepts as they appear in a variety of programming languages, and survey some existing translating system designs for modular languages.

Chapter 3: We present a new modular object-oriented language with flexible encapsulation facilities. The design of the language is based on the principle that any aspect of a program which can be revealed should also be able to be hidden. Briefly, the language supports fully and partially hidden (abstract and semi-abstract) types, constants, and inline procedures.

Precursory forms of flexible encapsulation have been incorporated into a small number of languages: Milano-Pascal supports hierarchies of modules and a form of abstract types and constants; Mesa and Oberon use compiler hints to support abstract types and semi-abstract records, respectively; and Modula-3 employs runtime processing in order to allow partial revelation of object types. Sale also proposes a language similar to ours, with abstract types, constants, and inline procedures.

Chapter 4: We discuss how the language of Chapter 3 differs semantically from other similar languages.

Chapter 5: We present a novel translating system specifically designed to support the flexible encapsulation facilities of Chapter 3. The translating system, which is based on the binding of intermediate code, achieves execution-time efficiency at the expense of translation-time overhead. We also present a hierarchical version of this translating system, which is efficient when some parts of a program are rarely changed.

² Strictly speaking, the work of Sale and the MIPS group discussed below appeared in print around the same time as the first published account of the work reported in this thesis, and should hence be considered independent work.

Sale proposes a translation scheme similar to ours, with the intent to support abstract types and inter-modular inlining; it was never implemented. The MIPS compiler suite supports inter-modular inlining by concatenating the intermediate code files of all modules prior to optimization and code generation. Horowitz discusses the flexible use of binding-time in programming language translators.

Chapter 6: We present a distributed version of the translating system described in Chapter 5. The system achieves execution-time efficiency without any translation-time penalty, and furthermore shows promise as a base for efficient inter-modular optimizations.

Chapter 7: We present an incremental version of the translating system in Chapter 5 which is likely to achieve translation-time efficiency when small, local changes are made.

Incremental translating techniques have been studied extensively in the literature, but the system which has influenced the work in this chapter the most is Linton and Quong's incremental linker. The Rⁿ incremental scientific programming environment, which supports inter-modular optimizations, has also been an influence.

Chapter 8: We present a model of modular programs with flexible encapsulation facilities, and examine how some of the previously presented translating systems behave when subjected to various types of programs generated by the model.

Chapter 9: We conclude the thesis with a discussion of the results achieved and an indication of directions we intend to pursue in the future.

Chapter 2

Modular Languages and Their Processors

It would take quite a University to accept a
doctoral thesis in which the candidate refused
to define his central term.

Robert M. Pirsig

Lambda-man: Forget about these unprincipled Modula opaque types. What we need are real abstract types.
Pluckless: What are real abstract types?
Lambda-man: It's an abstract type whose corresponding concrete type is guaranteed to be hidden at runtime as well as at compile time.

Anonymous, in Nelson

2.1 Introduction

The central theme of this thesis is the module, the cornerstone on which many programming languages build their concepts of abstraction and encapsulation. This chapter seeks to review the various module concepts that have been proposed, examine their strengths and weaknesses, especially in regard to their facilities for encapsulation and separate compilation, and further to survey the processors (translators and executors) which have been designed with modular languages in mind. We will use the definitions and examples presented here to motivate our programming language design in Chapter 3, and to compare our translating system designs (Chapters 5 through 7) to the previous designs of others.

In Section 2.2 we will start by reviewing the concepts of abstraction and encapsulation. Section 2.3 will introduce some basic definitions related to programming language concepts for abstraction and modularity. Section 2.4 will introduce concepts which may serve as a basis for the classification of different forms of modularity; here we will also discuss how separate compilation facilities blend with the module systems found in different languages. Section 2.5 is of utmost significance to the work presented in this thesis: it describes how different modular language designs have approached encapsulation, and how the design choices affect the execution efficiency and translation cost of these languages. Section 2.6 presents the pragmatics of modules, i.e. how modules may be used. Section 2.7 surveys translation system designs for modular languages, with an emphasis on binding-time operations. In an attempt to visualize the classification concepts introduced in the chapter, Section 2.8 gives programming examples from a total of ten languages, and discusses how each language design has approached the problems inherent in abstraction, encapsulation, and separate compilation. Finally, Section 2.9 summarizes our findings.

The content of this chapter is not original, but rather summarizes original work by others, and furthermore draws heavily on previous surveys. The more important such influences include Ambler (protection), Calliss (modular languages), Blaschek (object-oriented languages), Conradi (separate compilation), Cardelli and Meyer (polymorphism), and Wulf (abstraction).

2.2 Abstraction and Encapsulation

For several decades, the most important programming methodology has been based on the principle of problem decomposition based on the recognition of useful abstractions (Liskov). Here, to abstract means to ignore certain details in an effort to convert the original problem to a simpler one (ibid.). The rationale of this tenet is that programmers can only keep a limited amount of information active at any one time, and dividing a large and complex program into smaller and simpler units will make the programming process more manageable. Programming using this methodology becomes the process of identifying the abstractions in the problem, turning each abstraction into a self-contained unit, and then finally combining the units into a functioning program.

An important concept which often goes hand-in-hand with the abstraction methodology is information hiding, or encapsulation. The essence of this principle is that each abstraction should only be manipulated through a well-defined interface. The client of such an abstraction will have no need to know just how the abstraction is realized, only how it should be used. There are many compelling reasons for hiding the implementation details of an abstraction from the scrutiny of clients. First of all, a client that has no knowledge of the implementation details of an abstraction will never be tempted to use such knowledge. Hence, if the implementation should ever change, it will have no effect on the client, provided the interface remains the same. A second reason is that an implementer of an abstraction will, when given its well-defined interface, know exactly what to implement. The implementer may furthermore rest assured that the integrity of the internal data structures used to implement the abstraction cannot be compromised by the client.

Parnas originally formulated these principles in the following two rules of information hiding:

1. The specification must provide to the intended user all the information that he will need to use the program, and nothing more.

2. The specification must provide to the implementer all the information about the intended use that he needs to complete the program, and no additional information.

Abstraction and encapsulation have many applications outside the area of computer programming. In fact, information hiding has been a central part of most engineering disciplines for a long time. For example, while to the materials scientist a girder is essentially a particular alloy molded into a particular shape, to the civil engineer it is a building block (a module) with certain properties, such as dimension and durability. The civil engineer (the client of the girder abstraction) is not required to know the details of the manufacture of the particular alloy or of its behavior at the molecular level, only the properties which are of interest in his design of bridges and houses. Similarly, the materials scientist (the implementer or supplier of the girder abstraction) will not want to keep in mind all the potential uses of the alloy he is developing, only the required properties of the new material. The essence of encapsulation and abstraction is thus to look at problems from the right perspective: too much detail only gets in the way, and not enough detail hinders us from finding the right solution to the problem at hand.

Programming Language Abstraction and Encapsulation Concepts

Many different concepts have been introduced in programming languages in order to support the programming-by-abstractions methodology. The most important such concepts are modules and classes,¹ used for data abstraction, and iterators and processes, used for control abstraction. Data abstractions describe the data objects which will be useful in a program and the operations needed to operate on these objects. Control abstractions are used for the ordering of the events in the program. Often there is a one-to-one correspondence between the modules in a program and the data abstractions that the program embodies, but, as we shall see later, there are other uses for modules as well.

The module may be the principal abstraction primitive in many languages, but it often also serves to enforce encapsulation. A popular design, and the one which will be of most interest to us in this thesis, is to explicitly divide a module into two parts: an interface (or specification) part and an implementation part. In the spirit of the Parnas quote above, the interface describes the module so that clients can find out exactly what abstraction the module implements, and so that implementers can know exactly what needs to be implemented. Ideally, and in keeping with Parnas' intentions, the specification should contain only this information and nothing more. As we shall see in the next few sections, however, module interfaces in most languages leak information which really ought to be confined to the implementation. To counter this, some languages have introduced an alternative to the aforementioned concept (encapsulation by hiding), namely encapsulation by protection. Language designs which embrace a protection approach to encapsulation allow implementation details to be revealed in the module interface, but protect these revelations so that clients cannot easily make use of them.
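The protection approach can be sketched in C++ terms (our analogy; the thesis develops the distinction for Zuse and its relatives). The representation appears in the interface text, so the compiler can exploit it, while an access specifier stops clients from touching it directly:

```cpp
// Encapsulation by protection: the balance field is revealed in the
// interface -- the client's compiler sees the layout, so Accounts can
// be allocated directly and deposit() can be inlined -- but `private:`
// protects the field from direct client use.
class Account {
public:
    explicit Account(long opening) : balance_(opening) {}
    void deposit(long amount) { balance_ += amount; }
    long balance() const { return balance_; }
private:
    long balance_;   // revealed textually, protected from clients
};
```

Encapsulation by hiding would instead keep `balance_` out of the interface altogether, for example behind an incomplete type; a revelation that is merely protected can still be read, and in some languages worked around, by a determined client.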

In this thesis we will argue that encapsulation is a fundamental and important concept, and that programming languages should be designed so as to provide maximum control over encapsulation. In particular, programming languages should have facilities for strict encapsulation, i.e. it should be possible to specify interfaces to abstractions such that no information regarding the implementation of the abstraction is leaked to clients. It should be clear, however, that not everyone shares this view. Many designers of object-oriented languages (which have traditionally relied on protection rather than hiding concepts) in particular seem to hold the view that encapsulation is more of a hindrance than a boon. Stroustrup, for example, notes that "Hiding is for the prevention of accidents, not the prevention of fraud." In the same vein, Meyer says: "Client designers may well be permitted to read all the details they want, but they should be unable to write client modules whose correct functioning depends on private information." Our experience is that even accidental breaches of encapsulation may have consequences resembling a fraudulent attack, and that it is therefore important for modules to hide as much implementation information as possible from the scrutiny of clients. The conclusion must be that programming languages should provide as secure encapsulation primitives as possible, and that these should first and foremost be based on hiding rather than protection.

¹ Modules and classes are very similar concepts, distinguished by the fact that modules are static entities while classes are dynamic, and that classes are types while modules may contain types. Thus a program can contain only one copy of each module, but several instances of each class. In the following we shall mostly limit our discussion to modules, although it should be clear that much of what we have to say applies to classes as well.
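The footnote's module/class distinction can be sketched in C++ (our analogy, not the thesis's notation): a namespace with state behaves like a module, of which a program contains exactly one copy, while a class is a type with arbitrarily many instances:

```cpp
// Module-like: exactly one copy of this state exists per program.
namespace IdModule {
    inline int fresh() {
        static int next = 0;   // the module's single, shared state
        return next++;
    }
}

// Class: a type; each instance carries its own copy of the state.
class IdGenerator {
public:
    int fresh() { return next_++; }
private:
    int next_ = 0;
};
```

Two calls to `IdModule::fresh()` share one counter, whereas two `IdGenerator` instances each start from zero.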

The Structure of a Specification

So far all we have said ab out the sp ecication of a mo dule is that it should

contain a minimum of information only enough for a client to use it and an

implementer to implement it But what kind of information is actually needed

Homan divides a sp ecication into four parts The INTRODUCTION sec

tion gives a short natural language description of the mo dule and the services

it provides the SYNTAX section describ es how a client should syntactically

use the op erations provided by the mo dule the EFFECTS section describ es the

mo dules b ehavior or semantics ie what happ ens when the op erations are

used and the EXCEPTIONS section describ es the conditions under which the

op erations listed in the SYNTAX section will fail
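Hoffman's four sections can be approximated even in a language with no linguistic support for them, simply by structuring a module's documentation. A minimal Python sketch (the stack abstraction and all of its wording are invented for illustration):

```python
# A module specification organized into Hoffman's four sections.
# The stack abstraction described here is invented for illustration.
STACK_SPEC = {
    "INTRODUCTION": "A last-in, first-out stack of integers.",
    "SYNTAX": ["push(s, x)", "pop(s) -> int"],
    "EFFECTS": "push adds x on top of s; pop removes and returns the top.",
    "EXCEPTIONS": "pop fails when s is empty.",
}

def is_complete(spec):
    """A specification is complete when all four sections are present."""
    required = {"INTRODUCTION", "SYNTAX", "EFFECTS", "EXCEPTIONS"}
    return required <= spec.keys()
```

A tool could reject any module whose documented specification omits one of the four sections.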

Few programming languages allow all four sections to appear in a module specification; the EFFECTS section, in particular, is often missing. However, almost all languages we will consider allow or require specifications to contain the declarations of data types and procedure headings, i.e. enough information to allow cross-module type checking. We will discuss this more thoroughly in a later section.

A specification as we have described it here is related to Meyer's software contracting. The specification of a module should, in Meyer's opinion, be a contract between a client and a supplier of an abstraction. The contract states that, provided the client uses the abstraction in accordance with the rules stated in the SYNTAX and EXCEPTIONS sections, the behavior of the abstraction will be as described in the EFFECTS section.

Relationships Between Abstractions

The abstractions (and hence the modules) in a program can be related to one another in different ways. Two modules may, for example, be involved in an import/export relationship, meaning that one of the modules (the importer or client) directly uses some of the features defined by the other (the exporter or supplier). If we are dealing with data abstractions, the relationship will often be such that the importer uses the exporter's data objects as a part of its own data, in which case the relationship is one of aggregation. All languages with a module concept provide these kinds of import/export mechanisms.

A fundamentally different kind of relationship is inheritance, which organizes abstractions hierarchically, much like animals and plants are organized in families, genera, and species. We say that if an abstraction B inherits from an abstraction A, then B has all of A's features and possibly some additional features. B is then said to be a specialization of A, much like a lion is a special kind of cat, which is a special kind of mammal, etc. We also say that B is an extension of A, since it may have extra features in addition to the ones it has inherited from A.

Wegner provides a convenient classification of languages based on whether they provide modules, classes, and inheritance. A language where classes can be related through the inheritance relation is said to be object-oriented; a language with classes but not inheritance is class-based; and a language which has some linguistic concept (such as modules) for data abstraction, but not classes or inheritance, is object-based. Thus the three terms form a hierarchy, such that the object-oriented languages form a proper subset of the class-based languages, which in turn form a proper subset of the object-based languages.

Basic Definitions

In the previous section we introduced, in general terms, the notions of abstraction, module, and encapsulation. Now that the scene of discourse has been set, it is appropriate to stop and regroup, and to provide some solid terminology to help us talk about how these terms relate to programming language design. There is a proliferation of terminology associated with programming languages and programming paradigms, and much of it is confusing, redundant, and arbitrary. For the benefit of the discussions in this and the following chapters, we will adopt our own terminology, adapted from a variety of sources. This section will only give definitions of the most fundamental terms; more specialized terms will be defined as needed.

Items and Modules

A declarative item (or item for short) in a language is any syntactic category which may be declared, i.e. given a name, and manipulated within the realms of the language. Typical items are types, variables, constants, and procedures, but may also include macros, exceptions, modules, processes, classes, etc.

A module encapsulates a collection of items and restricts the ways in which they may be manipulated by other modules. In order to be able to distinguish modules from procedures, we require that a module be able to export the items which it wants to make available to other modules, and selectively import the items needed from other modules. We furthermore require that a module should not be callable.

A class is a typed module. Modules and classes are therefore distinguished by the fact that the former are static (there can be only one instance per program) and the latter dynamic (a program can create many instances of a class at run-time).

A module may be made up of several parts, or units. Often these come in two flavors: specification and implementation units. The specification units define the capabilities of the module and list the items the module exports to its clients. The implementation units contain the realizations of the items listed in the specification units. Most languages place some restrictions on the maximum number of specification and implementation units per module, and these restrictions vary from language to language. Restrictions are also often imposed on the kinds of items which a module may export, and on how these items may be divided between the different units.

Definitions and Realizations

The definition and realization of an item may mean different things for different kinds of items. As a general rule, however, the definition of an item contains all the information necessary in order for it to be used in a correct way according to the static semantic rules of the language. The realization of an item, on the other hand, contains the information necessary for the item to be correctly represented in the memory of an executing program. As a concrete example, we will consider named (or manifest) constants. In a statically typed language, the definition of a constant must at least include its name (so that the constant may be referred to) and its type (so that the static semantic correctness of a use of the constant may be determined):

   C : INTEGER

The realization of a named constant, on the other hand, must at least contain its value:

   C : INTEGER = …

(Providing an all-encompassing definition of the term module has apparently been a problem for many authors, and many fail to define the term at all. Even David Parnas, the early champion of modularity and information hiding, decided to use the term without a precise definition.)

Similarly, the definition of a subprogram must contain its name; the types, modes, and order of its formal parameters; and the return type in the case of a function. This information, sometimes referred to as the subprogram header, is sufficient and necessary in order to perform strict static semantic analysis of a call to the subprogram. The subprogram realization is usually its code, together with the realizations of any items defined locally.
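The split between header and realization can be made visible in Python with the standard inspect module: the signature alone is enough to check the well-formedness of a call, without ever looking at the body. A small sketch (the area function is invented for illustration):

```python
import inspect

def area(width: float, height: float) -> float:
    """Realization: the code of the subprogram."""
    return width * height

# The 'definition' (subprogram header): name, parameters, their order,
# and the return annotation -- what a caller needs for checking a call.
header = inspect.signature(area)

def call_is_well_formed(sig, *args):
    """Check a call against the header alone, without seeing the body."""
    try:
        sig.bind(*args)
        return True
    except TypeError:
        return False
```

A compiler performs exactly this kind of check against an imported specification unit, before the realization has even been written.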

Abstract and Concrete Items

In the same vein, we will consider abstract and concrete items and modules. Intuitively, an item is concrete if its realization is visible to a user, and abstract if the realization is invisible. More formally, we say that an item is abstract if its specification unit definition contains only the information necessary to perform strict static semantic analysis of any use of the item, and no more information. Similarly, an item is concrete if its definition as well as its realization is given in the specification unit. Some languages allow part of an item's realization to be given in a specification unit and the remainder in the implementation. We will call such items semi-abstract, and denote the specification and implementation unit parts the visible and hidden parts, respectively. Further, an abstract (concrete) module is one which contains only abstract (concrete) items.
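The three degrees of revelation can be sketched in Python, where exposure is a matter of documented convention (the underscore prefix) rather than language enforcement; the stack representation below, a plain list, is invented for illustration:

```python
class ConcreteStack:
    """Concrete: the representation (self.items, a plain list) is part
    of the published definition, and clients may rely on it."""
    def __init__(self):
        self.items = []            # revealed realization

class SemiAbstractStack:
    """Semi-abstract: depth is the visible part of the realization;
    the item storage (_items) is the hidden part."""
    def __init__(self):
        self.depth = 0             # visible part
        self._items = []           # hidden part
    def push(self, x):
        self._items.append(x)
        self.depth += 1

class AbstractStack:
    """Abstract: only the operations are published; the realization is
    entirely hidden behind the underscore convention."""
    def __init__(self):
        self._items = []           # hidden realization
    def push(self, x):
        self._items.append(x)
    def pop(self):
        return self._items.pop()
```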

Other Terminology

To allow for some variation of expression in this very restricted domain of discourse, the terms interface and specification will be treated as synonyms, as will representation and realization, module and package, and module part and module unit. By abuse of terminology, we may also refer to the realization of a module, thereby meaning its implementation unit.

Modules and Separate Compilation

In this section we will examine and classify some of the module systems which have been proposed and realized in concrete language designs. We will pay particular attention to facilities for separate compilation and the separation of specification and implementation units, since this is of special interest to us in this thesis. The classification schemes proposed here will be put to use in a later section, where some real language designs will be surveyed.

Specifications and Implementations

As has already been mentioned, different languages impose different constraints on the way modules may be divided into specification and implementation units. The following list categorizes languages according to the correspondence between the number of specifications and implementations:

zero-to-one: Several languages, among them Euclid, Oberon, and Eiffel, do not provide separate module specifications. Instead, the text of a module's implementation unit contains an indication of the items which the module exports.

one-to-one: A Modula-2 module has one specification and one implementation unit.

one-to-at-most-one: An Ada module has one specification and zero or one implementation units. A module without an implementation unit is restricted to exporting types, variables, and constants.

one-to-at-least-one: Collberg suggests a module system with linguistic support for multiple implementations of the same specification.

many-to-many: The relationship between Modula-3's interfaces and modules, and Mesa's definitions and program modules (specification and implementation units in our terminology), is more general than the n-to-m module systems discussed above. A procedure defined by a particular specification unit can, for example, be realized by any one of the implementation units in the program, and a particular implementation unit can realize procedures defined by several different specifications.

It should be kept in mind that the list above is concerned only with linguistic mechanisms for modularity. A programming environment for a particular programming language may provide facilities which make it possible to simulate the existence of separate specification units, or of several alternative implementations of the same specification, even though the language proper does not provide linguistic mechanisms for these kinds of module concepts. The MIT CLU programming environment, for example, handles module specifications as well as implementations, although the language does not provide any explicit syntax for specification units. The MIT environment can also handle several implementations of the same specification, even though the language definition does not state how an application may choose between the various implementations. Similarly, the Eiffel programming environment provides a special operation which extracts a module's specification from its implementation. This specification, however, is only for programmer convenience and does not take part in the processing of the module.

(In classifying Modula-2 as one-to-one above, we ignore the fact that the main program module does not have a separate specification.)

Although their module systems are much less restrictive than those of other languages, it is common for actual Mesa and Modula-3 modules to exhibit a one-to-one relationship between specifications and implementations. A two-to-one correspondence is also often used in Modula-3, to give a module one abstract specification for use by client modules which do not need to access the representation of the module, and one concrete specification for client modules which modify the module's behavior in some way. A one-to-many correspondence is common in Mesa, as a means of dividing a large module into several smaller, separately compilable pieces.

Import-by-Name or Import-by-Structure

In most languages, a module which needs to refer to an item exported by another module imports that module and then simply uses the name of the desired item. Several varieties of this scheme have been devised:

qualified import: The construction IMPORT M in Modula-2 is known as qualified import. The importer can refer to M's items by use of the dot notation: M.i means item i in module M.

unqualified import: Modula-2 also permits the construction FROM M IMPORT i, which allows importers to refer to i directly, without qualification.

qualified and unqualified import: The two previous forms of importation can be combined in Modula-3 (IMPORT M; FROM M IMPORT i). This allows both qualified (M.i) and unqualified (i) references.

overloaded import: Ada allows qualified import (with M) but also permits a client to import all the exported items of a module (with M; use M). Ada's rules for overloading resolution are used to disambiguate references to names defined by several modules imported this way.
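Python's import forms mirror the qualified and unqualified varieties directly; a small sketch using the standard math module:

```python
# Qualified import: items are referred to with dot notation, math.pi.
import math

# Unqualified import: the item's own name becomes directly visible.
from math import pi

# Both forms can coexist, giving qualified and unqualified references
# to the same item, much as in the combined Modula-3 style.
qualified = math.pi
unqualified = pi
```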

Another interesting, though rarely used, approach to referencing external items is import-by-structure. The basic idea is that rather than specifying the name of the module from which a particular item should be imported, a client specifies the item's structure, and the translating system is responsible for finding a module which provides a matching item. One can imagine a language in which a client in need of a stack abstraction gives the behavior specification (using pre- and postconditions, for example) of Push and Pop, and the translating system searches all modules for one which exports these operations with the matching behavior. To the best of our knowledge, no such language is available today; the one closest in spirit is MilanoPascal, although this language does not support specification of behavior, only syntactic specification (see the language survey later in this chapter).
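The idea can be sketched in Python: instead of naming an exporter, the client states the structure it needs (operation names and parameter counts), and a resolver searches candidate "modules" for a syntactic match, much as in MilanoPascal. The candidate names below are invented:

```python
import inspect

class ListStack:                      # exports push and pop
    def __init__(self): self._items = []
    def push(self, x): self._items.append(x)
    def pop(self): return self._items.pop()

class Counter:                        # exports incr only
    def __init__(self): self.n = 0
    def incr(self): self.n += 1

def matching_exporters(required, candidates):
    """Return every candidate that exports all required operations with
    the required parameter counts. Matching is syntactic only --
    behavior, as in MilanoPascal, is not checked."""
    found = []
    for module in candidates:
        if all(
            hasattr(module, name)
            and len(inspect.signature(getattr(module, name)).parameters) == arity
            for name, arity in required.items()
        ):
            found.append(module)
    return found
```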

Modules and Polymorphism

A further distinction between different module concepts is the extent to which a module may be parameterized. A parametric module is a template representing a family of modules with similar behavior. Generic modules represent the most common form of module parameterization, and are present in one form or another in languages such as Ada, CLU, Eiffel, and Modula-3. Usually a generic module is allowed to take one or more type parameters, but some languages (most notably Ada) also permit constant and procedure parameters. Generic modules are most often used to implement container abstractions (lists, sets, bags, and maps) which have identical behavior regardless of the type of data stored.

Meyer distinguishes between two kinds of genericity: constrained and unconstrained. An unconstrained generic module M with a formal type parameter T may pose no restrictions on the kinds of types a client may use in an instantiation of M. A constrained generic module, on the other hand, may require the instantiating type to fulfill certain requirements, such as having an equality operator defined, being enumerable, etc.
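Python's typing module can express both kinds of genericity. In the sketch below, Box stands in for an unconstrained generic module and SortedPair for a constrained one; the bound name is an invented placeholder, checked by a static type checker rather than at run-time:

```python
from typing import Generic, TypeVar

T = TypeVar("T")                            # unconstrained type parameter
C = TypeVar("C", bound="SupportsLessThan")  # constrained: '<' is required
                                            # (bound checked statically only)

class Box(Generic[T]):
    """Unconstrained generic: any type may instantiate T."""
    def __init__(self, item: T):
        self.item = item

class SortedPair(Generic[C]):
    """Constrained generic: relies on the instantiating type
    providing an ordering operator."""
    def __init__(self, a: C, b: C):
        self.low, self.high = (a, b) if a < b else (b, a)
```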

Parameterization is one form of polymorphism, but there are many others. Languages such as Eiffel, whose module systems are based on classes with inheritance, support a form of polymorphism known as inclusion polymorphism. The basic idea is that a subclass can appear wherever the class itself can appear. Some object-oriented languages (Object Pascal, Modula-3) define classes and inheritance at the type system level rather than at the module system level. Naturally, these languages too support inclusion polymorphism.
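A minimal Python sketch of inclusion polymorphism, reusing the lion-and-cat example from earlier in this chapter (the sounds are, of course, invented):

```python
class Cat:
    def sound(self):
        return "meow"

class Lion(Cat):            # Lion is a specialization of Cat
    def sound(self):
        return "roar"

def greet(animal: Cat) -> str:
    """Written against Cat; by inclusion polymorphism a Lion -- or any
    other subclass -- may appear wherever a Cat is expected."""
    return animal.sound()
```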

Specification of Behavior

Earlier in this chapter we noted that the interface to an abstraction should contain four kinds of information, one of which should describe the behavior or semantics of the abstraction. A few modular languages have direct linguistic support for semantic specification (Euclid, Alphard, Eiffel), while others have defined ancillary specification languages (Anna for Ada, traces for C, Larch for Modula-3). This is still an important area of research, and no consensus has been reached on whether specifications should be language-independent or defined within the language proper, or on what type of specification paradigm should be supported. A more detailed analysis of the problems involved is outside the scope of this thesis.

Separate Compilation

Compilation units are language constructs which may be submitted for compilation in source files textually separate from the rest of the program. There is some variety among languages as to what constitutes a compilation unit, but most modular languages at least allow modules to be separately compiled. In fact, in many languages the module system is so tightly coupled with the separate compilation facilities that programmers often think of them as one and the same. An Oberon module, for example, is the only construct in that language which may (indeed, must) be compiled separately. At the other end of the spectrum is Alphard, which allows any kind of declarative item to be separately compiled.

A language definition must settle two issues in regard to the language's separate compilation facilities. First of all, the definition needs to state which language constructs may serve as compilation units. Secondly, it must state to what extent a translating system may impose an order of compilation between different types of units. Languages with explicit module interfaces usually impose an order of compilation between the exporting module's interface and the importing unit, in order for the translating system to be able to enforce strict inter-modular static semantic correctness. Thus, if a unit B imports from a unit A, such languages require the interface of A to be compiled prior to B. Languages which do not require strict type checking between separately compiled modules (FORTRAN and C, for example) do not need to impose an order of compilation between modules; separate compilation in these languages is known as independent compilation.
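The interface-before-importer rule induces a partial order on compilation, which a translating system can realize with a topological sort. A Python sketch over an invented three-unit program (no cycle detection, since legal import graphs are acyclic):

```python
def compilation_order(imports):
    """imports maps each unit to the interfaces it imports from.
    Returns an order in which every interface is compiled before
    every unit that imports it."""
    order, seen = [], set()

    def visit(unit):
        if unit in seen:
            return
        seen.add(unit)
        for interface in imports.get(unit, []):
            visit(interface)        # compile imported interfaces first
        order.append(unit)

    for unit in imports:
        visit(unit)
    return order

# Invented program: B imports A; C imports both A and B.
program = {"B": ["A"], "C": ["A", "B"], "A": []}
```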

To exemplify, we will consider Ada and Modula-2, which have slightly different compilation dependency rules. Modula-2 allows compilation dependencies to exist between two specification units, and between a specification unit and an implementation unit, but not between two implementation units. This is the preferred compilation dependency rule, since it guarantees that once a program's interface units have been compiled, the implementation units may be compiled in an arbitrary order. Consequently, future changes to implementation units will only require the recompilation of the affected units themselves. (It should be noted that Alphard, mentioned in the previous section, has never been completely implemented.) Ada, on the other hand, has more complex rules, which permit dependencies between the implementation units of two modules involved in an import/export relationship if the exporting module is generic or exports inline procedures. In both these cases, a compiler needs access to the code in the exporter's implementation unit when compiling the importer's implementation, in order to be able to perform generic instantiations or procedure integrations.

Languages with a separate compilation facility have two important advantages over monolithic languages: compilation time is reduced, since only affected modules need to be recompiled after a change, and several programmers can work concurrently and independently on different parts of the same program. However, very complex and restrictive compilation dependency rules may nullify all of the advantages of separate compilation. One may imagine, for example, an Ada program in which all modules depend on a particular module which exports an inline procedure. According to Ada's compilation rules, the entire program will have to be recompiled whenever the body of the inline procedure is changed. If this is a frequently occurring phenomenon, separate compilation may turn out to be slower than monolithic compilation. Furthermore, these kinds of restrictive compilation rules also make it more difficult for programmers to work independently: a programmer who makes a seemingly innocuous change to one of his own implementation units may force all other programmers to recompile all of their modules.

Trickle-down recompilation, the transitive propagation of possible interface changes, is a related phenomenon which also negatively affects the usefulness of separate compilation. Since the compilation rules of languages with explicit interfaces require the recompilation of all units which transitively depend on an altered interface, a small change (such as the addition of a new operation, or even the correction of a spelling mistake in a comment) to the interface of a basic module may in the worst case cause the recompilation of the entire program. The more information contained in the interface units of a program, the more acute the risk that the program will suffer from many trickle-down recompilations during its lifetime. As we shall see, certain languages (most notably Ada) require interfaces to contain extraneous information; programs written in these languages are thus more prone to suffer from trickle-down recompilations than programs written in other languages. Adams proves this point in his analysis of the change history of a particular Ada program, in which a large proportion of all changed units were specifications.
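Trickle-down recompilation is just reverse reachability in the import graph: every unit from which the altered interface can be reached must be recompiled. A Python sketch over an invented four-unit program in which every unit ultimately depends on an interface Base:

```python
def must_recompile(changed_interface, imports):
    """imports maps each unit to the interfaces it imports.
    Returns every unit that transitively depends on the changed
    interface -- the units a trickle-down recompilation touches."""
    affected = {changed_interface}
    grew = True
    while grew:
        grew = False
        for unit, deps in imports.items():
            if unit not in affected and affected & set(deps):
                affected.add(unit)
                grew = True
    return affected - {changed_interface}

# Invented program: every unit ultimately depends on interface "Base".
program = {"Base": [], "IO": ["Base"], "Parser": ["IO"], "Main": ["Parser"]}
```

Changing Base forces every other unit to be recompiled, while changing Parser touches only Main.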

Approaches to Encapsulation

Many of the problems regarding abstraction and encapsulation that designers of recent imperative and object-oriented modular languages have been wrestling with originate in a desire to satisfy four requirements:

1. to allow fully abstract module specification units;
2. to disallow compilation dependencies between module implementation units;
3. to disallow language features which would entail undue execution-time overhead;
4. to disallow language features which would entail undue translation-time overhead, or which would require a non-standard translation system.

It has turned out to be a difficult task to accommodate all four of these goals in the same design. The reason is that a fully abstract, separately compiled module interface, while containing enough information for a client to use the module and the implementer to implement it, does not contain enough information for a standard module compiler to generate efficient code for the module's clients. Later we will examine a number of concrete language designs which have selected slightly different compromises in the attempt to accommodate as many as possible of the four conflicting goals. In this section we will discuss the seven basic approaches employed by these languages, each striking a different balance between level of interface abstraction, complexity of compilation rules, and translation-time or execution-time overhead. We label them as follows:

realization restriction: An exported item may be fully abstract, but the language restricts the range of possible realizations.

realization revelation: The realization of an abstract item is revealed in the module specification.

realization protection: Similar to realization revelation, but the realization is coupled with a restriction on the kinds of operations an importer may perform on the exported item.

partial revelation: The realization of an abstract item is deferred to the module implementation, but in order to facilitate translation, some aspects of the representation of the item are revealed in the module specification.

implementation ordering: There is an order of compilation imposed on implementation units.

binding-time support: Inter-modular information needed in order to implement abstract items is assumed to be exchanged when modules are joined together to form an executable program.

run-time support: The use of an abstract item is more expensive, in terms of the memory or time required at run-time, than the use of the corresponding concrete item.

Modula-2's and Modula-3's abstract types, known as opaque types, are the prime example of realization restriction, since they may only be realized as pointer types. This rule ensures that a Modula-2 or Modula-3 compiler which only has access to the information given in specification units will always know the size of an imported abstract type.
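The effect of the pointer-only rule can be mimicked in Python: a module hands its clients only opaque handles of one fixed, known "size" (here, plain integers), while every representation lives in a table private to the module. The Table module and its operations are invented for illustration:

```python
# Sketch of realization restriction: clients see only an opaque handle
# of uniform, known size, never the representation behind it.

_representations = {}      # hidden realizations, private to the module
_next_handle = 0

def new_table():
    """Create a table and hand the client an opaque handle."""
    global _next_handle
    _next_handle += 1
    _representations[_next_handle] = {}   # the secret representation
    return _next_handle                   # the client holds only this

def insert(handle, key, value):
    _representations[handle][key] = value

def lookup(handle, key):
    return _representations[handle].get(key)
```

Because every handle has the same shape, a client can be "compiled" against this module knowing nothing about the representation, just as a Modula-2 compiler relies on every opaque type being pointer-sized.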

Inline procedures in Mesa and C++ fall in the realization revelation category. The bodies of inline procedures are revealed in the specification unit, in order to give the compiler access to the procedure's code when processing importing modules.

Ada's abstract types (private and limited private types) fall in the realization protection category. An Ada private type reveals its realization in the module specification unit, but the compiler refuses a client access to the internals of the realization. The reason for revealing the realization is to make the size of the type available to the compiler: Ada favors static allocation over dynamic allocation, and without compile-time knowledge of the sizes of all types, static allocation would not be possible.

Mesa's opaque type, which in regard to degree of encapsulation falls in between Ada's private type and Modula-2's opaque type, is an example of partial revelation. Much like in Modula-2, the realization of a Mesa opaque type is deferred to the module implementation, but unlike Modula-2 there is no restriction imposed on the realization. Instead, the module specification must reveal the maximum size of the realization. A similar approach is used for the implementation of Oberon public projection types: record types which may reveal some fields in the specification unit and defer some to the implementation unit.

One advantage that Mesa's opaque types and Ada's private types have over Modula-2's and Modula-3's opaque types is that they permit static allocation. Modula-2's and Modula-3's opaque types, on the other hand, must be allocated on the heap. (The pointer-only realization rule is the one imposed by the Modula-2 standard; actual implementations often relax the rule to include any type with the same size as a pointer.) Thus in Ada and Mesa there is never any need for explicit deallocation of variables of abstract types, since these are automatically removed from the execution stack when the flow of control reaches the end of the scope in which they were declared. In the case of Modula-2, variables of opaque types have to be explicitly deallocated, while in Modula-3 a garbage collector removes unreferenced dynamically allocated variables. In one respect, however, Modula-2's and Modula-3's opaque types have the edge on Mesa's opaque and Ada's private types: changes to the realizations of the latter may trigger trickle-down recompilations, while changes to the former only affect the module in which they are declared.

Ada generic modules, and Ada modules that export inline procedures, may fall in the implementation ordering category, since it is legal for the compiler to require that the implementation units of such modules be compiled prior to any client implementation units. Languages without explicit module specification units, such as Eiffel or Oberon, also fall in this category.

The only programming language which falls in the binding-time support category is MilanoPascal, which requires that inter-modular information regarding its abstract types and constants be exchanged at module binding-time. Inline procedures in the MIPS compiler suite also fall in this category. The problem with this category is, of course, that extensive binding-time processing can cause excessively long turnaround times.

The Albericht (an Oberon descendant) public projection type, and Modula-3's opaque object type (at least as implemented in the currently available translating system), require extra work at run-time, and hence fall in the run-time support category. This extra work, as we shall see, involves the calculation of offsets of fields and methods and the construction of object type templates, work that other translating systems perform at compile-time. Languages which fall in this category may thus have problems with execution efficiency.

The Pragmatics of Modules

Abstract data types may be the most frequently occurring abstractions, but they are certainly not the only ones. There have been several attempts at classifying modules according to the way they are used and the abstractions they implement; we will call this the pragmatics of modules. Booch gives four uses of Ada modules (see Ross for a slightly different classification):

1. Named collections of declarations
2. Groups of related program units
3. Abstract data types
4. Abstract state machines

A typical example from the first category is a module which exports only manifest constants. These may be related physical or mathematical constants, or constants which describe the hard-wired limits of a particular program. The module in the figure below is an example of such a module, taken from the translating systems described in later chapters. The programming language notation used is that which will be introduced in a later chapter.

SPECIFICATION Configuration;
TYPE
   String = ARRAY RANGE CARDINAL OF CHAR;
CONSTANT
   MaxNrOfModulesPerProgram : CARDINAL;
   MaxNumberOfSites : CARDINAL;
   ImplementationExtension : String;
   SpecificationExtension : String;
END Configuration;

IMPLEMENTATION Configuration;
CONSTANT
   MaxNrOfModulesPerProgram : CARDINAL = …;
   MaxNumberOfSites : CARDINAL = …;
   ImplementationExtension : String = "imp";
   SpecificationExtension : String = "spe";
END Configuration;

Figure: Example of a module from Booch's category Named collections of declarations.

A module in the second category (Groups of related program units) typically exports no types or constants, only procedures. Examples include modules for trigonometric or other mathematical functions, modules containing procedures for converting between data formats, etc. Again, we provide an example from the implementation of the translating systems of later chapters. The module Bits in the figure below provides operations for the low-level manipulation of machine words, operations not provided in the language in which the translating systems were written. Again we use the language to be described in a later chapter.

SPECIFICATION Bits;
   PROCEDURE ShiftLeft (REF B : BITSET; N : CARDINAL);
   PROCEDURE ShiftRight (REF B : BITSET; N : CARDINAL);
   PROCEDURE Insert (Source : BITSET; Low, High : CARDINAL; REF Dest : BITSET);
   PROCEDURE Extract (Source : BITSET; Low, High : CARDINAL; REF Dest : BITSET);
END Bits;

IMPLEMENTATION Bits;
   INLINE PROCEDURE ShiftLeft (REF B : BITSET; N : CARDINAL);
   BEGIN … END ShiftLeft;
   INLINE PROCEDURE ShiftRight (REF B : BITSET; N : CARDINAL);
   BEGIN … END ShiftRight;
END Bits;

Figure: Example of a module from Booch's category Groups of related program units.

A module implementing an Abstract data type will at least export a type and a number of procedures which operate on the type. Often the module will also export some auxiliary type, and maybe a constant providing a default or null value for the type. In class-based and object-oriented languages, the class will not explicitly have to name and export a type, since the class itself is the type. We will not provide any abstract data type examples here, since there will be plenty of examples later in this chapter and in the next.

An abstract state machine module, finally, differs from an abstract data type module by having an internal state which is operated on by clients through the exported procedures. The module may, of course, also export types and constants, but the procedures are the main objects. An abstract state machine is essentially a black box: a box with dials (operations which change the state of the box) and read-out indicators (operations which examine the current state). The difference between an abstract state machine module and an abstract data type module is that while there can exist only one copy of the former, the latter can create many instances of its exported data type, such that each instance is itself an abstract state machine. The figure below gives a simple example of an abstract state machine module.


   SPECIFICATION Elevators;
      PROCEDURE Init (NrOfElevators : CARDINAL);
      PROCEDURE GoTo (Floor : CARDINAL);
      PROCEDURE CallFrom (Floor : CARDINAL);
      PROCEDURE AtFloor RETURN CARDINAL;
   END Elevators.

   IMPLEMENTATION Elevators;
      IMPORT Threads, CardToCardMap;
      VARIABLE NowAt : CardToCardMap.T;
      ...
   END Elevators.

Figure: Example of a module from Booch's category Abstract state machines. The example is adapted from Ross. Init, GoTo, and CallFrom change the state of the machine, whereas AtFloor queries the current state of the machine.

Translating and Executing Modular Programs

A separately compiled modular program goes through several processing stages on its way from infancy (its source code form) to adulthood (its executing image in the memory of a computer). We identify four such stages:

Compilation  Check the static semantic correctness of, and generate code for, each module.

Binding  Combine the code generated for each module into one program. Resolve inter-modular references.

Loading  Load the program generated during binding into the memory of the computer.

Execution  Run the loaded program.

Depending on the type of translating system used, any one of these stages except the execution stage may be missing, or may be designed to do more or less processing than indicated here. In this section we shall examine some of the translating systems which have been proposed for modular programming languages and compare their strengths and weaknesses. Of special interest to us in this thesis is module binding, and we will therefore pay special attention to this stage of translation.


Compiling Modular Programs

Compiling monolithic modular programming languages (languages with a module concept but without a separate compilation facility) is not much different from compiling programming languages without a module concept; Euclid (see the survey later in this chapter) falls in this category. Languages with separately compiled modules are more challenging, particularly if modules may be divided into textually separate specification and implementation units. The main challenge is to provide compile-time type checking across module boundaries. This means that there has to be some way for the compiler to determine the names and types of the items exported by the imported modules. A common design is to associate with each module a symbol file which describes the module's exported items. This symbol file is read by the compiler when processing the module's clients. Gutknecht identifies the following classes of symbol files:

class A  The symbol file for a module M contains only the items found in M's specification unit.

class B  The symbol file for a module M contains the items found in M's specification unit, as well as any imported items on which these transitively depend.

class 1  The symbol file data is in a source code (or a slightly processed source code) format.

class 2  The symbol file data is essentially an encoding of the compiler's symbol table.

The difference between class A and class B is one of compilation efficiency: a class B compiler only needs to load the symbol files of directly imported modules, whereas a class A compiler must also load all the symbol files on which the directly imported modules transitively depend. Class 1 and class 2 also differ in regard to efficiency: compiled declarations can be loaded faster than declarations still in their source code format. The A1 approach used by Foster is somewhat similar to the include-file mechanism used for separate compilation in the C and C++ languages, except that each imported module opens up a new scope for its items. Most often, class 1 approaches are implemented by letting the symbol files be the specification units themselves. This has the advantage that specification units do not need to be compiled and that there need not exist any explicit symbol files. The disadvantage is that errors in a specification unit are not caught until a client is compiled. See Gutknecht for a detailed description of the B2 method. (Units are textually separate if each unit resides in its own file.)
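The practical difference between the two content classes can be sketched as a small dependency traversal. This is a minimal Python sketch; the module graph and the function name are invented for illustration:

```python
# Sketch: which symbol files a compiler must read to compile a module.
# With class A files it must chase the closure of specification imports;
# a class B file already folds the transitive information in.
def symbol_files_to_load(spec_imports, module, class_a):
    """Symbol files read when compiling `module`."""
    direct = set(spec_imports.get(module, ()))
    if not class_a:                      # class B: direct imports suffice
        return direct
    to_load, work = set(), list(direct)  # class A: follow the closure
    while work:
        m = work.pop()
        if m not in to_load:
            to_load.add(m)
            work.extend(spec_imports.get(m, ()))
    return to_load

# M imports A and B, whose specifications both import C.
spec_imports = {"M": ["A", "B"], "A": ["C"], "B": ["C"], "C": []}
print(sorted(symbol_files_to_load(spec_imports, "M", class_a=True)))   # A, B, C
print(sorted(symbol_files_to_load(spec_imports, "M", class_a=False)))  # A, B
```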

Once a modular language compiler has loaded the symbol files of imported modules, compilation can proceed in much the same way as in a compiler for a monolithic language. One complication is name lookup for imported items, and the methods employed must depend largely on the language compiled: Ada uses overloading to resolve name clashes, Modula-2 allows both qualified and unqualified references to imported items, Eiffel allows only qualified references, etc. See Calliss for a more detailed classification. In languages which take a protection approach to encapsulation, the compiler must of course also check that client modules do not make illegal references to imported protected items. A final problem for compilers using the class B approach described above involves consistency checking between imported modules. Consider, for example, the case where module M's implementation unit imports modules A and B, both of whose specification units import module C. Assume that, after a change to C's specification unit, C and A are recompiled, but not B. When M is recompiled it will read two incompatible versions of C's symbol file: one from A's symbol file and one from B's. Most compilers counter this problem by including in each symbol file the time when the corresponding specification unit was compiled. The compiler can then easily check these timestamps to assure that the imported modules have been compiled in the correct order.
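The timestamp check can be sketched as follows. This is a hedged Python sketch of the idea only; the stamp values, data layout, and function name are all invented:

```python
# Sketch: each symbol file carries its own specification timestamp plus
# the timestamps it saw for the symbol files it read when compiled.
def check_import_consistency(symbol_files):
    """Report modules compiled against a stale version of an import."""
    errors = []
    for client, info in symbol_files.items():
        for imported, seen_stamp in info["imports"].items():
            actual = symbol_files[imported]["stamp"]
            if seen_stamp != actual:
                errors.append(
                    f"{client} was compiled against {imported} "
                    f"(stamp {seen_stamp}), but {imported}'s current "
                    f"specification has stamp {actual}")
    return errors

# M's imports A and B were both compiled against C; only A was rebuilt
# after C's specification changed.
symbol_files = {
    "C": {"stamp": 200, "imports": {}},
    "A": {"stamp": 210, "imports": {"C": 200}},  # consistent with C
    "B": {"stamp": 150, "imports": {"C": 100}},  # stale: old C
}
print(check_import_consistency(symbol_files))
```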

Maintaining the Correct Order of Compilation

In the discussion above, and in our earlier description of the order-of-compilation rules imposed by languages with a separate compilation concept, we assumed that the rules were enforced simply by the detection of the modules which have changed since the last compilation. This is indeed often the case, and in most systems it is accomplished by checking the file modification date of the files in which the module source and object code are stored. The process is most often automated through the use of programs such as the Unix make facility, which recompiles affected modules after editing changes. This is a very coarse-grained approach since, as we have already noted, simple formatting changes can lead to massive recompilations. To counter this, more fine-grained approaches have been developed. Known as smart recompilation, these methods take into account not only which modules have been changed, but also which items in the modules have been changed. Unfortunately, smart recompilation methods have met with limited success, since they require compiler cooperation, and since the overhead of checking whether a particular module needs to be recompiled sometimes outweighs the cost of actual recompilation.
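The make-style modification-date test can be sketched in a few lines. This is a minimal Python sketch, not any particular make implementation; the file suffixes and module names are invented:

```python
# Sketch of make-style recompilation: a module is rebuilt when its object
# file is missing or older than its source, or when any of its imports
# had to be rebuilt.
import os, tempfile, time

def needs_recompilation(srcdir, module, imports, recompiled):
    src = os.path.join(srcdir, module + ".mod")
    obj = os.path.join(srcdir, module + ".obj")
    if not os.path.exists(obj):
        return True
    if os.path.getmtime(src) > os.path.getmtime(obj):
        return True
    return any(m in recompiled for m in imports.get(module, ()))

# Tiny demonstration: B imports A; touching A's source forces both rebuilds.
with tempfile.TemporaryDirectory() as d:
    for name in ("A.mod", "A.obj", "B.mod", "B.obj"):
        open(os.path.join(d, name), "w").close()
    # make A's source strictly newer than its object code
    os.utime(os.path.join(d, "A.mod"),
             (time.time() + 10, time.time() + 10))
    imports = {"B": ["A"]}
    recompiled = set()
    for module in ("A", "B"):          # compilation order: imports first
        if needs_recompilation(d, module, imports, recompiled):
            recompiled.add(module)
    print(sorted(recompiled))          # both modules are rebuilt
```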

Binding Modular Programs

Most compilers for separately compiled modular languages produce one object code file per source code module. This file contains the code generated for the subprograms exported by the module, space for the static variables declared in the module, and the values of the module's structured manifest constants. In addition to this information, the object code file often contains an encoding of the module's symbol table, to be used by debuggers and module binders.

A module binder is a program which combines the object code files produced by compilers (or assemblers) into a file which can be directly loaded and executed. Most operating systems contain a simple module binder, often called a linkage editor or linker, which handles the binding requirements of assemblers and of languages with rudimentary module systems, such as C and FORTRAN. The main task of these system linkers is to gather together the code and data produced by the compiler for the modules which are to take part in the resulting program, to resolve references between these modules, and to include whatever other system-specific routines the program will need at run-time. The link editor often also serves as the main instrument for inter-language programming, i.e. the linker allows modules written in one language to be bound together with modules produced by compilers for other languages.

Special Linking Techniques

Languages with module concepts more advanced than C and FORTRAN often require more complex link-time processing than what is offered by ordinary system link editors. Often this problem is handled through the use of a prelinker, a program which is designed to take care of the special link-time requirements of a particular language, and which, as a part of its processing, calls the system linker to handle the relocation and concatenation of code and data. Typical tasks performed by prelinkers include checking timestamps (similar to the compile-time checks discussed previously), finding the modules that should be part of the final program, and determining the order of execution between module bodies.

In operating systems where several processes can execute concurrently, it is often the case that certain modules (or groups of modules) are used by several programs simultaneously. Examples include modules for buffered I/O, mathematical routines, and graphics primitives. To avoid having several copies of the same modules in core at the same time, some operating systems employ a technique known as dynamic linking. A module is linked dynamically if several concurrently executing programs can share the same copy of the module. If a particular module is to be linked dynamically (rather than statically) into a certain program, the linker refrains from including its code and data in the resulting executable file. Rather, when the program is loaded, or when a routine in the dynamic module is called for the first time, the module's object code file is loaded into main memory and cross-module references are relocated. If, however, at this point the object code file is already present in main memory, due to the fact that it is being used by another concurrently executing program, the loading can be dispensed with.
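The load-once behavior of dynamic linking can be sketched with a shared table of in-core modules. This is an illustrative Python sketch only; the program names, module names, and functions are invented:

```python
# Sketch: the first program that needs a shared module loads it; later
# programs reuse the copy that is already in core.
loaded_modules = {}   # in-core copies, shared across all running programs

def link_dynamic(program, module):
    if module not in loaded_modules:
        loaded_modules[module] = f"<code for {module}>"  # load + relocate
        action = "loaded"
    else:
        action = "shared"                                # already in core
    return f"{program}: {action} {module}"

print(link_dynamic("editor", "BufferedIO"))    # first use: loaded
print(link_dynamic("compiler", "BufferedIO"))  # second use: shared
```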

Another linking technique, sometimes called smart linking, is supported by some linkers to reduce the size of the executable file. The idea is to avoid including code and data in the resulting executable file which could never be referred to at run-time. The linker accomplishes this by constructing the call-graph of the procedures in the program and traversing the graph, starting at the main module, to determine the procedures which might be called. The result is, of course, never more than a conservative estimate of the procedures which are actually called at run-time.
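The reachability pass at the heart of smart linking can be sketched as a graph traversal. This is a minimal Python sketch; the call-graph contents are invented:

```python
# Sketch of smart linking: keep only procedures reachable from the main
# module's entry point; everything else is dropped from the executable.
def reachable(call_graph, entry):
    keep, stack = set(), [entry]
    while stack:
        proc = stack.pop()
        if proc not in keep:
            keep.add(proc)
            stack.extend(call_graph.get(proc, ()))
    return keep

call_graph = {
    "Main":        ["Stack.Push", "Stack.Pop"],
    "Stack.Push":  ["Stack.Grow"],
    "Stack.Pop":   [],
    "Stack.Grow":  [],
    "Stack.Print": ["IO.Write"],   # exported but never called
    "IO.Write":    [],
}
# Stack.Print and IO.Write are excluded from the executable.
print(sorted(reachable(call_graph, "Main")))
```

Note that this is exactly the conservative estimate described above: a procedure called only through a pointer the linker cannot see would need extra care.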

Two techniques have been employed in attempts to speed up the linking process: incremental and hierarchical linking. An incremental linker saves the resulting executable file and the data structures used during linking between sessions, and uses this information to incrementally update the executable file in response to changes in the object code files; we return to incremental linking in a later section. A hierarchical linker speeds up linking by binding together the object code files of groups of modules to form new object code files, or libraries. Since all cross-module references within the modules in such a library are resolved once and for all when the library is created, future links which use the library will be faster.

We refer to Presser and to Beck for early and modern views, respectively, of linking and loading.

Code Generation and Optimization at Module Binding Time

It has often been observed that, due to a lack of inter-modular information, many code optimization techniques cannot be successfully applied at compile-time. Such inter-modular (or inter-procedural) optimization techniques include inter-procedural register allocation, inter-procedural constant propagation, and inter-modular inlining. It is natural to investigate whether these techniques can instead be applied at module binding-time, when all the necessary inter-modular information is available. Indeed, this is exactly what we will do in later chapters.

Previous work in this area includes the MIPS compiler suite, which performs procedure integration and other inter-procedural optimizations at link-time, Wall's and Santhanam's inter-procedural register allocators, and Benitez's optimizing linker, which performs procedure call optimizations at link-time. In the MIPS system, the linker simply concatenates the intermediate code files produced by various compilers into one large file. Other tools then perform optimization and code generation on this file, and the resulting code is run through the assembler and system link editor to form the executable file. Wall's linker works on object code files where the relocation information has been extended to include information regarding variable usage. Based on this information, the program's call-graph, and collected profiling information, the linker allocates frequently used variables to global registers and restructures the code generated by the compiler accordingly. Benitez's optimizing linker works on machine code in a register transfer list (RTL) representation. The linker, which is designed to restructure procedure calls to pass arguments in registers and to remove unnecessary test instructions after procedure calls, emits assembly code which is run through the system assembler and linker to produce the executable file. Santhanam also performs optimizations at link-time, although his translating system does not contain an optimizing linker. Rather, each module is run through the compiler twice: the first time to gather inter-modular information, and the second time to perform the actual optimizations. The object code files are then linked as usual with the system link editor.

Naturally, linkers which have been extended with functions for optimization are likely to be much slower than traditional link editors. Benitez reports that his optimizing linker is three to six times slower than the system linker; most of the time is spent in the assembly phase. Wall's link-time optimizations also slow down the linker by a significant factor. In later chapters we will propose techniques for link-time optimization which have the potential for making optimizing linkers as fast as, and sometimes faster than, traditional link editors.

Language-specific Module Binders

So far we have only discussed language-independent module binders, i.e. binders which are designed to handle modules produced by compilers for many different languages. In this section, however, we will give a brief overview of some translating systems where the module binder has been designed to work with only one particular language, or to handle certain language-specific problems.

The module binder for the Mesa language can combine modules in many interesting ways. However, Sweet reports that most binding requests are simply of the kind "bind the modules together in the only way possible". For more complicated situations, Mesa provides a special configuration language, C/Mesa. C/Mesa, for example, allows one to describe how groups of modules should be joined into libraries (hierarchical linking).

Ancona describes a system whereby Pascal modules are bound at the source code level. After binding, the output of the binder is submitted to a compiler for monolithic Pascal. The binder is able to support a simple form of generic modules.

Some special binders are also designed to perform inter-modular semantic checking. Kieburtz and Celentano both describe binders which perform inter-modular type checking for modular Pascal dialects. For a discussion of the latter system, see the Milano-Pascal example later in this chapter.

Thorelli describes a binder for modules in the language PL. PL modules fall under the import-by-structure and zero-to-one headings, and are, unlike modules in most languages other than Milano-Pascal, completely autonomous. The binder is designed to combine separately compiled modules according to instructions given to it in the linking language LL, which can describe module interconnections in a variety of ways. Since no inter-modular type checking can be performed at compile-time, this task has to be deferred to the linker.

Loading and Executing Modular Programs

Loading and execution are the last translation phases of any modular program, and one might expect these to be uncomplicated. For the most part this is true, but we will touch on some of the more salient issues.

An important question is what happens when a program starts and finishes executing. Many languages allow the programmer to specify actions to be taken at one or both of these points. Each Modula-2 module, for example, has a (possibly empty) module body which is executed after the program has been loaded. The actions performed in module bodies are usually limited to the initialization of static variables, and, if this is the case, the order in which the module bodies are executed is most often immaterial (the main module body must, of course, be executed last). However, if a module body makes use of operations defined in imported modules, it is essential that these modules be initialized first. The Modula-2 standard therefore states that the body of a client module should be executed after the bodies of the modules it imports. In case of circular dependencies, the order is left unspecified.
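This initialization rule amounts to a depth-first post-order over the import graph. A minimal Python sketch, with an invented import graph and without handling circular dependencies:

```python
# Sketch: compute a module-body execution order in which every imported
# module's body runs before its client's.
def body_order(imports, main):
    order, visited = [], set()
    def visit(module):
        if module in visited:
            return
        visited.add(module)
        for imported in imports.get(module, ()):
            visit(imported)          # initialize imports first
        order.append(module)         # then the client's own body
    visit(main)
    return order

imports = {"Main": ["Stack", "IO"], "Stack": ["Storage"], "IO": ["Storage"]}
print(body_order(imports, "Main"))   # main module body comes last
```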


Similarly, Euclid (whose modules are typed) allows initial and final actions to be associated with each module. The initial action is executed when a module variable is created, and the final action is executed when the variable is destroyed.

Object-oriented languages (class-based languages with inheritance) have a somewhat more complicated run-time behavior than object-based languages. There are two main sources of complexity: dynamic binding and run-time type checking. As we have already explained, object-oriented languages allow a certain degree of polymorphism: descendants of a class can be used wherever the class itself can be used (they essentially are the class, since they fulfill the is-a relationship), and in some cases it is not possible to determine at compile-time whether the assignment of one class instance to another is legal. This check therefore has to be deferred till run-time, where it is performed based on a type-tag stored with each instance. Object-oriented languages also support dynamically bound procedures, also known as virtual procedures or virtual methods. A virtual procedure declared in a class can be overridden by descendant classes, i.e. replaced by another procedure whose behavior is more suited to the descendants' needs. Dynamic binding is usually implemented (see, for example, Harbison) by storing with each class instance a pointer to a method template, a list of addresses, constructed at compile-time, of the class' virtual procedures. All method calls are done indirectly through this template.
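Method-template dispatch can be sketched in a few lines. This is an illustrative Python sketch only; the class names, slot layout, and helper functions are invented:

```python
# Sketch: each instance carries a pointer to a per-class table of
# procedure addresses, and every method call is an indirect call
# through that table.
def base_draw(self):    return "Base.draw"
def base_area(self):    return "Base.area"
def circle_draw(self):  return "Circle.draw"   # overrides slot 0

BASE_TEMPLATE   = [base_draw, base_area]       # built at "compile-time"
CIRCLE_TEMPLATE = [circle_draw, base_area]     # inherits area, overrides draw

class Instance:
    def __init__(self, template):
        self.template = template               # the hidden template pointer

def call(instance, slot):
    return instance.template[slot](instance)   # indirect call

shape = Instance(CIRCLE_TEMPLATE)
print(call(shape, 0))   # dispatches to the overriding Circle.draw
print(call(shape, 1))   # falls back to the inherited Base.area
```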

The final point we will consider is run-time code generation and optimization. We have already seen that optimization can be done at compile-time as well as at link-time, and it is therefore natural to ask whether there are some types of optimizations which are best applied during execution. The answer to this question, at least for some types of languages, seems to be yes, but unfortunately there has yet to be much work done in this area. Chambers describes a translating system for the SELF language which performs run-time code generation in order to produce customized code for different combinations of run-time types. See also Keppel for a thorough discussion of run-time code generation.

Integrating Translation Phases

An obvious method for achieving faster turnaround times in program development environments is to integrate two or more of the translation phases. Load-and-go assemblers and compilers, for example, dispense with the linking phase and load the generated code directly into main memory for execution. The Oberon system also dispenses with linking altogether. Instead, modules are loaded and bound together as needed during execution. This is somewhat similar to dynamic linking, as described earlier in this section. The advantage is, of course, faster turnaround time for the programmer: not only is one translation phase completely eliminated, but it may also be possible to change a running program simply by reloading a new version of a changed module. The disadvantage, at least for the Oberon system, is a noticeable procedure call overhead: an inter-modular procedure call must be preceded by a table lookup to find the procedure's address.

Integrated and incremental programming environments, such as the Rational Ada environment and Gandalf, integrate editing and all or most of the different stages of translation. The result is a translation system which continuously updates the executable code in response to editing changes. We return to such systems in a later section.

A Survey of Programming Language Module Concepts

We will next attempt to illustrate the concepts discussed previously in the chapter by reproducing the ubiquitous stack abstraction in ten model languages: Ada, Beta, Eiffel, Euclid, Mesa, Milano-Pascal, Standard ML, Modula-2, Modula-3, and Zuse. The languages have been chosen so as to illustrate the trade-offs between abstraction, encapsulation, modularity, and separate compilation in programming language design, and each example has been coded in a way that we believe best illuminates the language's strengths and weaknesses.

Ada

Ada's module system is characterized by three things: the use of realization protection as the prime encapsulation mechanism (although Modula-2 style pointer hiding, realization restriction, is also supported), the availability of constrained as well as unconstrained generics, and implementation ordering for generic modules and for packages exporting inline routines.

The generic stack definition in the figure below illustrates these points. The realization of the stack abstraction is given in the module specification unit, under the private heading. The only operations available to clients on variables of type GENERIC_STACK.STACK are PUSH and POP; the declaration limited private explicitly outlaws assignments and equality tests, which would have been legal if STACK had just been declared private.

   generic
      type ITEM is private;
   package GENERIC_STACK is
      type STACK (SIZE : POSITIVE) is limited private;
      procedure PUSH (S : in out STACK; E : in ITEM);
      procedure POP  (S : in out STACK; E : out ITEM);
      pragma INLINE (PUSH, POP);
   private
      type VECTOR is array (POSITIVE range <>) of ITEM;
      type STACK (SIZE : POSITIVE) is record
         SPACE : VECTOR (1 .. SIZE);
         INDEX : NATURAL;
      end record;
   end GENERIC_STACK;

   package body GENERIC_STACK is
      -- Implementations of PUSH and POP.
   end GENERIC_STACK;

   with GENERIC_STACK;
   procedure MAIN is
      package STACK_INT is new GENERIC_STACK (INTEGER);
      S : STACK_INT.STACK (...);
   begin
      STACK_INT.PUSH (S, ...);
   end MAIN;

Figure: Ada generic stack example, adapted from Rogers.

In general, Ada specification and implementation units are compiled separately, and changes to implementation units do not affect importing modules. The Ada reference manual, however, allows programming environments to require implementation units of modules containing exported inline routines to be compiled prior to the modules' clients. Similarly, programming environments may disallow the separate compilation of the specification and implementation units of generic modules.

Beta

The language Beta generalizes programming language concepts such as procedure, function, class, and process into one concept: the pattern. It differs significantly from the other languages described in this section, particularly when it comes to separate compilation and modularization, which is not defined within the language proper. Instead, a language-independent mechanism called a fragment system forms the basis of the separate compilation system, and is used to split programs into modules, modules into specification and implementation units, and implementation units into variants. A fragment is any legal sequence of terminal and nonterminal symbols which can be generated from a nonterminal in the language's context-free grammar. Fragments can be textually separate or part of fragment groups, and can contain links to other fragments.

   ORIGIN 'betaenv';
   BODY 'stackbody'
   --- LIB: Attributes ---
   Stack:
     (# Private: @<<SLOT Private: descriptor>>;
        Push: (# e: @Integer enter e do <<SLOT PushBody: descriptor>> #);
        Pop:  (# e: @Integer do <<SLOT PopBody: descriptor>> exit e #);
        New:  (# do <<SLOT NewBody: descriptor>> #)
     #)

   ORIGIN 'stack'
   --- Private: descriptor ---
     (# A: @Integer; Top: @Integer #)
   --- PushBody: descriptor ---
     (# do (* Code for Push *) #)
   --- PopBody: descriptor ---
     (# do (* Code for Pop *) #)
   --- NewBody: descriptor ---
     (# do 0 -> Private.Top #)

   ORIGIN 'betaenv';
   INCLUDE 'Stack'
   --- Program: descriptor ---
     (# S: @Stack do S.New; ... -> S.Push #)

Figure: Beta stack example, adapted from Knudsen.

The Beta program in the figure above is split into three textually separate fragments, each starting with the keyword ORIGIN. The first fragment corresponds to the specification unit of the Stack module, the second is a fragment group corresponding to the Stack's implementation unit, and the third fragment is the main program itself. The syntax <<SLOT PushBody: descriptor>> in the first fragment serves as a placeholder for a fragment with the name PushBody of the syntactic category descriptor; PushBody is defined in the implementation fragment.


Eiffel

Eiffel's module system is built on the class concept. An Eiffel class is separately compiled, dynamically or statically allocated, and may inherit (as well as import) from other classes. Constrained as well as unconstrained generic classes are supported. Behavior is specified in the form of pre- and postconditions and invariants, and is tightly coupled with the exception mechanism.

   deferred class STACK [T] export
      depth, empty, full, push, pop
   feature
      depth : INTEGER is deferred end;
      empty : BOOLEAN is do Result := (depth = 0) ensure Result = (depth = 0) end;
      full : BOOLEAN is deferred end;
      pop : T is require not empty deferred
         ensure not full; depth = old depth - 1 end;
      push (x : T) is require not full deferred
         ensure not empty; depth = old depth + 1 end;
   invariant
      depth >= 0
   end -- class STACK

   class FIXED_STACK [T] export
      depth, empty, full, push, pop
   inherit
      STACK [T]; ARRAY [T]
   feature
      Create (n : INTEGER) is do ... end;
      -- redefinitions of depth, full, pop, and push using ARRAY
   end -- class FIXED_STACK

   class ROOT feature
      Create is
         local S : FIXED_STACK [INTEGER]
         do S.Create (...); S.Push (...) end
   end -- class ROOT

Figure: Eiffel generic stack example, adapted from Meyer.

Eiffel classes are of the zero-to-one kind, i.e. there are no explicit specification units. The Eiffel programming environment, however, provides a special operation, short, which extracts a class' specification from its implementation. Furthermore, deferred classes (classes which contain some features which are not fully realized) can be used to simulate specification units. This is an attractive approach for two reasons: a deferred class can be the ancestor of several different realizations of the same abstraction, and the behavioral specification in the deferred class applies to descendant classes also.

The figure above gives a deferred generic stack definition, STACK, followed by an implementation, FIXED_STACK. An alternative implementation (LINKED_STACK, for example) would also inherit from STACK. FIXED_STACK uses an array realization and therefore inherits from the ARRAY class as well as from STACK. The preconditions (the require clauses), postconditions (ensure clauses), and invariants of the features of STACK apply to the corresponding features in FIXED_STACK as well.

Euclid

Euclid was designed with verification in mind, and the language includes assertions, pre- and postconditions, and invariants along the same lines as Eiffel. Euclid is class-based (modules are typed), and modules can be parameterized with compile-time expressions.

   type IntStack (size : unsignedInt) =
      module exports (Push, Pop)
         type range = 0 .. size;
         var store : array range of signedInt;
         var index : range;
         inline procedure Push (E : signedInt) pre index < size =
            begin ... end Push;
         inline procedure Pop (var E : signedInt) pre index > 0 =
            begin ... end Pop;
         initially index := 0
         invariant index >= 0 and index <= size
      end IntStack

   include IntStack
   type Main =
      module
         var S : IntStack (...)
         initially begin S.Push (...) end
      end Main

Figure: Euclid parameterized stack example. Postconditions have been omitted.

Euclid modules are of the zero-to-one kind, and may (but are not required to by the Euclid report) be separately compiled. The figure above gives a Euclid stack module of client-defined size. As in the Modula-2 example, generic modules may be constructed using low-level facilities.

Mesa

The Mesa module system is of the many-to-many kind, i.e. the items defined by a particular specification unit can be realized by any one of the implementation units in the program. A special module binding language, C/Mesa, is used to describe the connections between the modules in a program.


   Stack: DEFINITIONS =
   BEGIN
      T: TYPE [...];
      PT: TYPE = LONG POINTER TO T;
      Init: PROC [S: PT];
      Push: PROC [S: PT, E: INTEGER];
      Pop:  PROC [S: PT] RETURNS [INTEGER];
   END.

   StackImpl: PROGRAM EXPORTS Stack =
   BEGIN
      T: PUBLIC TYPE = RECORD [
         space: ARRAY [...] OF INTEGER,
         index: INTEGER];
      -- Implementations of Init, Push, and Pop.
   END.

   Main: PROGRAM IMPORTS Stack =
   BEGIN
      S: Stack.T;
      Stack.Init[@S];
      Stack.Push[@S, ...];
   END.

Figure: Mesa stack example. The bracketed constant in the definition of T refers to T's size. Stack.Init[@S] passes the address of S to Init; this construction must be used since Mesa only has pass-by-value parameters.

Two kinds of hidden (opaque) types are provided: a Modula-2 style hidden pointer type, and a hidden type whose size is revealed in the specification unit. In addition to this, types (or parts of structured types) can be protected by a PRIVATE clause, which prevents clients from accessing the type's internal structure. The SHARED clause can be used by clients to override such access restrictions. Inline procedures are also available in Mesa, but the code of an exported inline procedure must be revealed in the specification unit. Mesa thus supports four different types of encapsulation: realization restriction, realization protection, and partial revelation for abstract types, and realization revelation for inline procedures.

The stack definition in the figure above shows the use of opaque types with revealed size. The main advantage of revealing the size is that variables of the type can be statically allocated, and hence do not tax the dynamic allocation system.


Milano-Pascal

The Pascal dialect developed at the Politecnico di Milano (henceforth Milano-Pascal) is interesting for several reasons: it is one of a few modular languages which explicitly requires a non-standard translation system, it supports fully abstract types and constants, it provides completely autonomous modules, and it allows nesting of separately compiled modules. The main aspect that separates Milano-Pascal from the other languages in this survey, however, is that clients specify the structure of the items they need to import, rather than (as is more common) the name of the module which should provide these items.

module Stack;
  const stack_size = ...;
  type int_stack = record
    space : array [stack_size] of integer;
    index : integer
  end;
  procedure init(var S : int_stack); begin ... end;
  procedure push(var S : int_stack; E : integer); begin ... end;
  procedure pop(var S : int_stack; var E : integer); begin ... end;
endmodule

module Main;
  interface spec
    const stack_size;
    type int_stack;
    procedure init(var S : int_stack);
    procedure push(var S : int_stack; E : integer);
    procedure pop(var S : int_stack; var E : integer);
  end interface spec;
  var S : int_stack;
begin
  init(S); push(S, ...);
endmodule

Figure: Milano-Pascal stack example, adapted from Celentano.

Consider the program in Figure. The module Main needs a stack, and therefore indicates in its interface spec that it must be bound in an environment which contains a constant stack_size, a type int_stack, and three procedures init, push, and pop with formal parameters of the specified types and modes. In other words, Main does not import a Stack module, but rather indicates that there must exist in its environment items which satisfy the properties of a stack.

Modular Languages and Their Processors

It follows that modules Stack and Main are completely independent of each other and can be compiled in an arbitrary order. Inter-modular type checking is done at link-time rather than compile-time, by a language-specific linker. The input to the linker is a parenthesized expression of module names describing the nesting of the modules. In addition to static checking, the linker has to allocate variables whose size cannot be determined at compile-time, for example S in module Main (Figure).
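A rough C analogue of this import style (all names below are illustrative): the client declares the constants, types, and procedures it requires, without naming any providing module, and any translation unit that defines matching items satisfies it at link time. Note the contrast, though: a standard C linker matches by name only, whereas Milano-Pascal's language-specific linker additionally type-checks the match.

```c
#include <assert.h>
#include <stdlib.h>

/* "Client" side: declares the items it needs -- a constant, a type
   name, and procedures -- without naming a providing module. */
extern const int stack_size;
typedef struct int_stack int_stack;   /* only the name is required here */
extern int_stack *stack_new(void);
extern void push(int_stack *s, int e);
extern int  pop(int_stack *s);

/* "Provider" side: any translation unit defining matching items
   satisfies the client when the program is linked. */
const int stack_size = 100;
struct int_stack { int space[100]; int index; };

int_stack *stack_new(void) {
    int_stack *s = malloc(sizeof *s);
    s->index = 0;
    return s;
}
void push(int_stack *s, int e) { s->space[s->index++] = e; }
int  pop(int_stack *s)        { return s->space[--s->index]; }
```

In a real build the two halves would live in separate files; the client's extern declarations play the role of Milano-Pascal's interface spec.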

Standard ML

ML is, in contrast to the other languages studied here, a functional language. ML is strongly typed and has a polymorphic type system. Most ML implementations are interactive in nature, i.e. the system accepts an expression as input, evaluates it, and prints the result. In the ML version of our stack example (Figure), the signature serves as the stack's specification, introducing the stack type T, an exception E (which is raised by pop and top when the stack is empty), and functions operating on T. The 'a T idiom indicates that we are dealing with a polymorphic stack, one which can operate on any type of element. The implementation of the stack is given in the abstraction, where each of the items introduced in the signature receives a representation. The use of abstraction indicates that a client of Stack has no access to its representation. Exchanging structure for abstraction would make the representation available for clients to manipulate. Other aspects of the ML module system will have to be omitted for the sake of brevity.

Modula-2

Modula-2 modules are of the one-to-one kind, and type abstraction is through realization restriction: opaque types are restricted to pointers. There is no linguistic support for generic modules, but untyped generics can be programmed by use of low-level facilities. An example of this is shown in Figure. It is also interesting to note the difference between this example and the previous one in Mesa, given in Section: the Mesa stack was an opaque type with revealed size, and could therefore be allocated statically, whereas the Modula-2 stack is an opaque pointer type, and therefore needs explicit routines for dynamic creation and destruction.
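The Modula-2 arrangement corresponds closely to the familiar C idiom of an incomplete struct type in a header (a sketch; the names are ours): clients hold only pointers, so the representation stays hidden, but variables of the type cannot be statically allocated, and explicit Create/Destroy routines are needed. The untyped genericity via SYSTEM.ADDRESS maps onto void *.

```c
#include <assert.h>
#include <stdlib.h>

/* "Definition module": Stack is an incomplete (opaque) pointer type,
   so clients can never see, nor statically allocate, the record. */
typedef struct StackRec *Stack;
Stack GS_Create(void);
void  GS_Destroy(Stack *s);
void  GS_Push(Stack s, void *e);      /* void* plays SYSTEM.ADDRESS */
void *GS_Pop(Stack s);

/* "Implementation module": the record is completed only here. */
struct StackRec {
    void *space[64];
    unsigned index;
};

Stack GS_Create(void) {
    Stack s = malloc(sizeof *s);
    s->index = 0;
    return s;
}
void GS_Destroy(Stack *s) { free(*s); *s = NULL; }
void GS_Push(Stack s, void *e) { s->space[s->index++] = e; }
void *GS_Pop(Stack s) { return s->space[--s->index]; }
```

The price of the hidden representation is exactly the one the text notes for Modula-2: every stack lives on the heap and must be created and destroyed explicitly.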

Modula-3

Modula-3 extends Modula-2 and Mesa with features for object-oriented programming. It is an interesting language, since it breaks with many of the


signature STACK = sig
  type 'a T
  exception E
  val empty : 'a T
  val push  : 'a T * 'a -> 'a T
  val pop   : 'a T -> 'a T
  val top   : 'a T -> 'a
end

abstraction Stack : STACK = struct
  datatype 'a T = Stack of 'a list
  exception E
  val empty = Stack []
  fun push(Stack S, x) = Stack(x::S)
  fun pop(Stack(x::S)) = Stack S | pop(Stack nil) = raise E
  fun top(Stack(x::S)) = x       | top(Stack nil) = raise E
end

structure Main = struct
  local open Stack in
    val S = push(empty, ...)
    val r = top S
    val S = pop S
  end
end

Figure: Standard ML stack example.

traditions kept by the other languages in the family of Pascal descendants: type compatibility is by structure rather than by name, dynamic storage is reclaimed automatically, the module system is of the many-to-many variety, and the encapsulation concept is such that it requires extensive post-compilation processing. The last point is of particular interest to us in this thesis, since the language which we will describe in Chapter has similar problems, only more so. We will therefore study in some detail the requirements which Modula-3 imposes on its translating systems.

Consider the Modula-3 stack abstraction in Figure. It consists of the two interfaces Stack and StackRep, an implementation unit StackImpl, and a client module StackUse. The Stack interface states that the stack type Stack.T is a subtype of ROOT, which reveals that Stack.T must be realized as an object type. Object types form Modula-3's basis for object-oriented programming; they can be related by aggregation as well as by inheritance, and are usually implemented as a pointer to a record containing the object type's data fields as well as a pointer to its template. The template is essentially a constant record of pointers to the object type's procedures. The StackRep interface reveals further information about Stack.T, namely that it is a subtype of StackRep.Link. In other words, the Stack.T realization must at least have two fields, next


DEFINITION MODULE GenericStack;
  IMPORT SYSTEM;
  TYPE Stack;
  PROCEDURE Create() : Stack;
  PROCEDURE Destroy(VAR S : Stack);
  PROCEDURE Push(S : Stack; E : SYSTEM.ADDRESS);
  PROCEDURE Pop(S : Stack; VAR E : SYSTEM.ADDRESS);
END GenericStack.

IMPLEMENTATION MODULE GenericStack;
  IMPORT SYSTEM, Storage;
  TYPE Stack = POINTER TO RECORD
    space : ARRAY OF SYSTEM.ADDRESS;
    index : CARDINAL;
  END;
  (* Implementations of Create, Destroy, Push, and Pop. *)
END GenericStack.

MODULE Main;
  IMPORT GenericStack, Storage;
  VAR S : GenericStack.Stack;
      E : POINTER TO INTEGER;
BEGIN
  S := GenericStack.Create();
  NEW(E); E^ := ...; GenericStack.Push(S, E);
  GenericStack.Destroy(S);
END Main.

Figure: Modula-2 generic stack example.

and item. Finally, StackImpl reveals that Stack.T is a direct subtype of StackRep.Link, and that in addition to the fields Stack.T receives from StackRep.Link, the complete realization also includes a field depth.

We see from the example in Figure that Modula-3 supports partial revelation and multiple views of object types. A client of the Stack abstraction only needs to see the abstract Stack interface, whereas an implementer also needs the information in StackRep.

What is particularly interesting in the light of this thesis is that partial revelation causes problems for translating systems for Modula-3. Consider, for example, a module M which declares a subtype M.T of Stack.T. Furthermore, assume that M.T also declares a data field f. Since the most efficient implementation of single inheritance requires that an object type's local data immediately follow the data of its supertypes, the Modula-3 compiler needs


INTERFACE Stack;
  TYPE T <: ROOT;
  PROCEDURE Create() : T;
  PROCEDURE Push(VAR s : T; e : INTEGER);
  PROCEDURE Pop(VAR s : T) : INTEGER;
  PROCEDURE Depth(s : T) : INTEGER;
END Stack.

INTERFACE StackRep; IMPORT Stack;
  TYPE Link = OBJECT item : INTEGER; next : Stack.T END;
  REVEAL Stack.T <: Link;
END StackRep.

MODULE StackImpl EXPORTS Stack; IMPORT StackRep;
  REVEAL T = StackRep.Link BRANDED OBJECT depth : INTEGER END;
  (* Implementations of Create, Push, Pop, and Depth. *)
BEGIN END StackImpl.

MODULE StackUse EXPORTS Main; IMPORT Stack;
  VAR S : Stack.T; D : INTEGER;
BEGIN
  S := Stack.Create(); Stack.Push(S, ...); D := Stack.Depth(S);
END StackUse.

Figure: Modula-3 stack example, adapted from Nelson. Since Modula-3 supports garbage collection, no explicit deallocation routine needs to be called.

to know the size of Stack.T's data record in order to compute the offset of f in M.T's data record. However, even if the compiler could determine by inspection of StackRep that Stack.T contains the fields next and item, it could not know that it also contains the field depth, since this fact is hidden within StackImpl. Hence, there are cases when a Modula-3 translating system cannot at compile-time compute the offsets of record fields. SRC Modula-3 (currently the only available Modula-3 implementation) solves this by performing these computations at logical link-time, i.e. at run-time, prior to the execution of any user code. This pass, which also includes checks for illegal revelations and the construction of object type templates, can be expensive in itself, but the scheme also in general slows down run-time field accesses, since these have to be preceded by a table lookup to determine actual field offsets.
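The problem, and the shape of SRC Modula-3's remedy, can be sketched in C (all names are illustrative; the real implementation patches offsets into type templates rather than into a single variable). Since the size of the supertype's hidden completion is unknown when module M is compiled, the offset of M.T's own field f must be fetched from storage that a logical link-time pass fills in before any user code runs, so every field access pays for an extra lookup:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What M's compiler can see from StackRep: Stack.T has at least
   the fields next and item... */
struct Revealed { void *next; int item; };

/* ...but the full completion, with depth, is known only inside
   StackImpl (another translation unit in reality): */
struct Full { void *next; int item; int depth; };

/* Offset of M.T's own field f, unknown at M's compile time; a
   "logical link-time" pass computes it before user code runs. */
static size_t f_offset;

void logical_link(void) { f_offset = sizeof(struct Full); }

/* M.T's data record: supertype data immediately followed by f.
   Each access goes through the patched offset. */
void set_f(char *obj, int value) {
    memcpy(obj + f_offset, &value, sizeof value);
}
int get_f(const char *obj) {
    int v;
    memcpy(&v, obj + f_offset, sizeof v);
    return v;
}
```

Had the completion been visible, f_offset would be a compile-time constant folded into the access; that is exactly the overhead the Zuse translation systems in later chapters are designed to avoid.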

In addition to what we have seen from Figure, Modula-3 supports a simple form of unconstrained generic modules, threads (lightweight processes), and exceptions.

Zuse

We will conclude this survey with a sneak preview of the language Zuse, which will be introduced in detail in the next chapter. In short, Zuse, like Modula-2 and Modula-3, has fully abstract types, but unlike Modula-2 and Modula-3, Zuse does not pose any restrictions on the realizations of such types. Furthermore, Zuse, like Ada, supports abstract inline procedures, but unlike Ada, does not allow compilation dependencies between implementation units. The effect of these features on a real example can be seen from the Zuse stack example in Figure. The stack specification unit does not leak any information regarding the realization of the abstract type, nor the inline-status of the exported procedures, and once the Stack specification unit has been compiled, the corresponding implementation unit and the client module StackUse can be compiled in an arbitrary order. Furthermore, since Stack.T is realized as a statically allocated type, its use will not tax the dynamic allocation system at all.

Zuse also includes features for object-oriented programming, similar to those found in Modula-3. However, unlike the SRC Modula-3 implementation discussed in Section, the Zuse translating systems presented in Chapters through assure that these features do not degrade the run-time performance of object-oriented programs.

Summary

In this chapter we have examined some of the module and abstraction concepts which have been proposed for imperative and object-oriented languages. We have paid particular attention to how different languages have approached the problems inherent in the combination of encapsulation and separate compilation. We have shown that most language designs compromise orthogonality of encapsulation in favor of ease of implementation, and sometimes at the expense of reduced run-time efficiency. We summarize our findings in the table on page. The abbreviations used in the table are explained at the bottom of page.


SPECIFICATION Stack
  TYPE T
  CONSTANT MaxDepth : CARDINAL
  PROCEDURE Create RETURN T
  PROCEDURE Push (REF S : T; e : CARDINAL)
  PROCEDURE Pop (REF S : T) RETURN CARDINAL
  PROCEDURE Depth (S : T) RETURN T
END Stack

IMPLEMENTATION Stack
  CONSTANT MaxDepth : CARDINAL ...
  TYPE Vector ARRAY RANGE CARDINAL MaxDepth CARDINAL
  TYPE T RECORD S : Vector; D : CARDINAL
  PROCEDURE Create RETURN T
  INLINE PROCEDURE Push (REF S : T; e : CARDINAL)
  INLINE PROCEDURE Pop (REF S : T) RETURN CARDINAL
  PROCEDURE Depth (S : T) RETURN T
END Stack

PROGRAM StackUse
  IMPORT Stack
  VARIABLE S : Stack.T
BEGIN
  S := Stack.Create; Stack.Push(S, ...)
END StackUse

Figure: Zuse stack example. Since Stack.T is a static type, no explicit deallocation routine needs to be called.

rres = realization restriction       rrev = realization revelation
rpro = realization protection        prev = partial revelation
iord = implementation ordering       bsup = binding-time support
rsup = run-time support
ob = object-based                    cb = class-based
oo = object-oriented
ugen = unconstrained generic         cgen = constrained generic
ipol = inclusion polymorphism        param = parametric polymorphism
one-to-one; zero-to-one
mm = many-to-many; one-to-at-most-one
spec = specification                 impl = implementation
iproc = inline procedure             mod = module
stat = static allocation of abstract types
dyn = untraced dynamic allocation of abstract types
GC = traced dynamic allocation of abstract types

[Summary table comparing the module systems, type systems, encapsulation mechanisms, polymorphism, and allocation of abstract types for the surveyed languages, including Ada, Alphard, Beta, CHILL, CLU, C++, Eiffel, Euclid, LOGLAN, Mesa, Milano-Pascal, Standard ML, the Modula and Oberon families, Object-Oberon, Turing, Turing Plus, and Zuse.]

Chapter

A Language with Flexible Encapsulation

"bondage-and-discipline language: A language ... that, though ostensibly general-purpose, is designed so as to enforce an author's theory of right programming ..."
The New Hacker's Dictionary

Introduction

Modern programming languages display an amazing array of features in support of abstraction and encapsulation. The previous chapter examined some of the mechanisms which have been proposed, and discussed the penalties they introduce with respect to translation and execution time. This chapter will introduce an experimental language design, a close descendant of earlier languages such as Ada, Mesa, Modula-2, and Modula-3, with orthogonal[1] and flexible facilities for hiding representational details of exported items. Subsequent chapters will discuss necessary and desirable properties of translators for this language, and will closely examine four concrete translating system designs.

This section will give some background information and state the rationale behind the language design. Sections through will present an informal account of the facilities provided by the language with respect to the flexible way in which representational details of exported items may be hidden or revealed.

[1] Principle of orthogonality: language features can be composed in a free and uniform manner, with predictable effects and without limitations.


Section will examine several Zuse programs as examples of how the different kinds of encapsulation provided by the language can be put to use.

Language Design Rationale

The main objective of this treatise is to examine the trade-offs between language facilities for abstraction and encapsulation and the penalties in translation- and execution-time they may incur. Mainly, we will be interested in the way information about items exported by a module may be divided between the module's specification and implementation parts, and how this might affect the design of a translating system for the language. To this end, the remainder of this chapter will examine an experimental modular language with flexible encapsulation facilities, which is tentatively named Zuse.[2] The language design has been guided by these four general principles:

1. The language should include facilities for making any exported item fully abstract, semi-abstract, or concrete. In other words, it must be possible to defer all, none, or some of the representational details of any kind of exported item to the implementation unit. Apart from being basic to sound software engineering practices, full encapsulation is also essential in order to avoid trickle-down recompilations.

2. The cost (in terms of translation-time, execution-time, and storage) of using an abstract or semi-abstract item should be no greater than it would have been if the same item had been concrete.

3. The language design should not allow compilation dependencies to exist between module implementation units, and should furthermore require the bulk of intra- and inter-modular static semantic checking to be performed at compile-time. These rules minimize the risk of very long compilation and execution times, and are also essential in order to allow programmers to work independently in team programming situations.

4. The language should allow a high degree of freedom in the way exported items are realized, including the use of low-level facilities. It should always be possible, however, to ensure the integrity of exported items by protecting them from abuse by importers.

The first three items on this list may be paraphrased: full and flexible encapsulation should not incur any translation-time or execution-time overhead.

[2] Named for Konrad Zuse, the first modern programming language designer; see Zuse and Bauer.


We believe this to be an important principle, and it will permeate this entire thesis. Our reason for being very firm on this point is that full encapsulation does not make a language more expressive (i.e. it does not allow one to express designs that could not have been expressed in a language without encapsulation), only safer. Translation-time or execution-time overhead induced solely by a language's facilities for encapsulation is therefore unacceptable.

Providing Binding-Time Information

Subsequent chapters will examine in detail possible translation systems for Zuse, but we will offer some preliminary information here, in order to provide a framework in which to discuss the kinds of constructs that the language design may allow.

The language design which will be presented in this chapter implicitly rests on the assumption that the standard tools requirement (Section) has been relaxed, i.e. that non-standard translation systems are deemed acceptable. Specifically, we will assume that the translation of a set of source code modules into an executable program goes through a stage, temporally located after the compilation of all module implementations but prior to program execution, during which inter-modular information may be exchanged. This stage, which we will call the pasting stage, may either co-exist with existing system link-editors and linking-loaders, or completely or in part take over the tasks traditionally performed by these systems programs (see Figure). In other words, a translating system for Zuse is assumed to fall under the binding-time support heading in the list of approaches to encapsulation given in Section.

The exact division of labor between compilers, pasters (the program active during the pasting stage), linkers, and loaders will not be defined here, but will be the subject of later discussion. For now, it suffices to restate the third language design principle above, namely that a compiler for the Zuse language is assumed to perform the bulk of syntactic and semantic analysis. In other words, the static semantic correctness of a Zuse program should be decided prior to the program being submitted for pasting.

The Zuse Module System and Concrete Export

Zuse contains the array of basic concepts found in most modular imperative and object-oriented languages, and shares with these languages a common view of data and execution. Our exposition in this and the following sections will therefore focus on the area in which Zuse differs from similar languages, namely the facilities available for the protection of the integrity of exported items. We


[Diagram: source modules are fed to the compiler; the resulting compiled code goes to the paster, whose output the link editor turns into the executable program.]

Figure: Proposed translation system.

will have little to say about other aspects of the language, such as the control structures, primitive types, and standard interfaces available, and will simply assume that the constructs necessary for systems programming are available in some form or another. We will furthermore not be concerned with the dynamic semantics of the language, unless affected by the proposed encapsulation schemes.

Nevertheless, in this section we will provide some preliminary information about the Zuse module and type systems, to serve as background material to the following sections, which will discuss the encapsulation system.

Modules and Separate Compilation

A Zuse module is untyped and made up of two units: the specification and the implementation. There is one program module, consisting only of one implementation unit, and whose module body is the last to be executed. Zuse's module system is hence of the one-to-one kind, and identical to that found in Modula-2.

A module may import items from other modules by listing the names of the imported modules in import-clauses at the top of a module unit. Individual items in the imported modules are referred to using the notation M.v, where M is a module and v is an item exported by M. Figure below shows a program module P importing a separately compiled module M.

The two module units are compiled separately, with the specification compiled before the implementation. Furthermore, a unit importing a particular


module will have to be compiled after the specification part of that module has been compiled. These are the only rules governing the order in which module units are compiled. Consequently, once all the specification units of the modules making up a program have been compiled, the corresponding implementation units may be compiled in an arbitrary order. Further changes which do not modify any specification unit only result in the recompilation of the affected implementation units.

PROGRAM P
IMPORT M
BEGIN
  ... M.v ...
END P

SPECIFICATION M
VARIABLE v : INTEGER
END M

IMPLEMENTATION M
BEGIN
  v := ...
END M

Figure: A simple Zuse program, consisting of the program module P and the separately compiled module M.

Concrete Types

Concrete items are characterized by the fact that their definition as well as their realization are contained in the specification unit of a module. We will now introduce the Zuse type system and the syntax and informal semantics of concrete export of types, constants, and procedures.

Zuse shares most of its basic type system with Modula-2: there are equivalence types, enumeration types, subrange types, records, arrays, sets, pointers, and procedure types. The object type is inherited from Modula-3. The concrete syntax of the different types is given in Figure. The concreteness of the types is evident from the use of the special immutable definition symbol, which signifies that the realization, as it is given in the specification unit, is complete. Note that an object type comes in three parts: an optional supertype preceded by the keyword WITH, a sequence of fields (the first set of brackets), and a sequence of methods (the second set of brackets).

Type compatibility is by name, except for subrange types, which are compatible if their parent types are compatible, procedure types, which are compatible if the type, order, and mode of their formal parameters are compatible, and object types, which are compatible along their chain of inheritance.

In addition to the types inherited from Modula-2 and Modula-3, Zuse includes a special UNIQUE type, similar to Ada's new type but with a somewhat different semantics. Consider UniqueT, exported by ConcreteT in Figure.


SPECIFICATION ConcreteT
TYPE
  EquivT    CHAR
  EnumT     ENUM Red, Blue, Green
  RangeT    RANGE EnumT EnumT.Red EnumT.Blue
  RecordT   RECORD x, y : BOOLEAN
  ArrayT    ARRAY RangeT CHAR
  SetT      SET EnumT
  PointerT  REF RecordT
  ProcT     (a : EnumT; REF b : ARRAY OF SetT)
  FunT      (REF x : RangeT) RETURN CARDINAL
  ObjectT   OBJECT
              a : INTEGER; c : CHAR
              P : (x : EnumT)
  SubT      WITH ObjectT OBJECT
              b : BOOLEAN
              Q : (z : CHAR)
  UniqueT   UNIQUE CARDINAL
END ConcreteT

Figure: Zuse concrete types.

Inside the module ConcreteT, UniqueT behaves exactly like a CARDINAL, in other words as if it had been an equivalence type, whereas to an importer, ConcreteT.UniqueT is a type compatible only with itself. In other words, an importer may add two variables of type ConcreteT.UniqueT, but may not add a variable of type ConcreteT.UniqueT to a variable or literal of type CARDINAL. Unique types are mostly used in conjunction with type protection, which will be described in Section.
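A C approximation of the importer's view of a UNIQUE type (our names, not Zuse syntax): wrapping the representation in a distinct struct type makes the type compatible only with itself, so mixing it with plain cardinal values is rejected at compile time, while operations exported by the defining module can still reach the underlying representation.

```c
#include <assert.h>

/* Inside its defining module the type behaves like an unsigned
   cardinal; importers see only the incompatible wrapper type. */
typedef struct { unsigned rep; } UniqueT;

/* The module can export whichever operations it chooses; here,
   addition of two UniqueT values (an invented example). */
UniqueT unique_add(UniqueT a, UniqueT b) {
    return (UniqueT){ a.rep + b.rep };
}
```

An importer writing `u + 1` with u of the wrapper type fails to compile, which mirrors Zuse's rule that ConcreteT.UniqueT may not be mixed with CARDINAL.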

The syntax and semantics of basic operations on expressions and designators parallel those in comparable languages. In addition to arithmetic operations on variables of arithmetic types, field selection on records and objects (RecordName.FieldName), indexing on arrays (ArrayName[ExpressionList]), etc., basic Zuse operations include bitwise assignment, comparison, (in)equality, and type-coercion (TypeName(Expression)).[3] These operations are allowed on variables which are of an imported concrete type, as well as on variables of a locally declared type.

[3] T(x) is a designator when x is a designator and an expression when x is an expression. T(x) is legal only when the representations of T and x have the same size.
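The equal-size restriction on coercion can be mimicked in C (a sketch with hypothetical types): the value's representation is reinterpreted with memcpy, and the size requirement is checked statically, so an ill-sized coercion fails to compile rather than misbehave at run-time.

```c
#include <assert.h>
#include <string.h>

/* Two same-sized representations standing in for T and the type
   of x (invented names). */
typedef unsigned int Bits;
typedef float        Real;

/* Re-interpret a Real as its bit pattern; the static assertion
   plays the role of Zuse's equal-size rule for TypeName(Expression). */
Bits coerce_to_bits(Real x) {
    _Static_assert(sizeof(Bits) == sizeof(Real),
                   "coercion requires equal-sized representations");
    Bits b;
    memcpy(&b, &x, sizeof b);
    return b;
}
```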


Concrete Constants

Zuse supports manifest (named) constants in a style similar to Modula-3 and Ada, i.e. scalar as well as structured constants are allowed. Unlike these languages, however, Zuse also supports reference constants, i.e. pointers to constant values. These are mainly used in conjunction with abstract types. All named constants must be given an explicit type, and concrete constants are signified by the immutable definition symbol. Figure gives a few examples.

SPECIFICATION ConcreteC
IMPORT ConcreteT
CONSTANT
  ScalarC  : CARDINAL
  RecordC  : ConcreteT.RecordT  RECORD
               x TRUE
               y FALSE
  PointerC : ConcreteT.PointerT  REF RECORD
               x FALSE
               y TRUE
  ArrayC   : ConcreteT.ArrayT  ARRAY
               ConcreteT.EnumT.Red  'R'
               ConcreteT.EnumT.Blue 'B'
  SetC     : ConcreteT.SetT  SET ConcreteT.EnumT.Red
END ConcreteC

Figure: Zuse concrete constants.

Concrete Procedures

A Zuse procedure declaration is a simple variant of a named constant declaration, where the type (which may be named or anonymous) is the procedure's signature, the immutable definition symbol marks concreteness, and the value of the procedure is its code and local data. The presence or absence of the keyword INLINE indicates the kind of linkage which should be employed for calls to the procedure (see Figure). The transfer modes available for actual parameters are pass-by-value and pass-by-reference (REF).

Zuse procedures may be nested, but a procedure may only reference its own formal parameters, variables declared locally, and global variables. Recursion is disallowed for inline procedures, both directly as well as indirectly through other inline procedures.
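A C sketch of the two kinds of linkage (the names are ours): a static inline function must have its body visible wherever calls are expanded, which also suggests why recursion, direct or indirect, is ruled out for inline procedures, since a self-referential body cannot be fully expanded at its call sites; an ordinary procedure is instead reached through the linker.

```c
#include <assert.h>

/* Inline linkage: the body is expanded at each call site. */
static inline int is_red(int a) {
    return a == 0;   /* 0 standing in for an enumeration's first value */
}

/* Ordinary linkage: calls go through the linker to this definition. */
int depth_linked(int d);
int depth_linked(int d) { return d; }
```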


SPECIFICATION ConcreteP
IMPORT ConcreteT
PROCEDURE P ConcreteT.FunT
BEGIN
  x := ConcreteT.EnumT.Red
  RETURN ORD(x)
END P

INLINE PROCEDURE P
  (a : ConcreteT.EnumT;
   REF b : BOOLEAN)
BEGIN
  b := a = ConcreteT.EnumT.Red
END P
END ConcreteP

Figure: Zuse concrete procedures.

Abstract Export

In accordance with the design rationale on page, Zuse provides full encapsulation of all items: types, constants, and (inline) procedures. Unlike other languages, Zuse does not allow the integrity of encapsulated items to be compromised in any way. In other words, none of the approaches to encapsulation (see Section) employed by previous language designs are allowed to restrict the full encapsulation capabilities of Zuse: partial revelation (Mesa and Oberon abstract types), realization restriction (Modula-2 abstract types), run-time support (Modula-3 abstract object types), and implementation ordering (Ada inline procedures) are hence all ruled out.

Abstract Types

The statically allocated hidden type is the most important concept introduced in Zuse. Figure shows the declaration of a hidden type with a few different possible realizations. The mutable definition symbol is used to signify that the declaration of T in the specification is incomplete; the extension symbol indicates that a declaration in an implementation unit extends a previous one in the specification.

The operations allowed on a variable of an imported hidden type are the same as those for concrete types, with the exception of operations which require knowledge of the structure of the realization of the type, such as field selection


SPECIFICATION AbstractT
TYPE
  T
VARIABLE
  v : T
END AbstractT

IMPLEMENTATION AbstractT
TYPE
  T INTEGER
(* or *)
  T RECORD a, b : INTEGER
(* or *)
  R RANGE CHAR 'A' 'Z'
  T ARRAY R CHAR
(* or *)
  T REF INTEGER
END AbstractT

Figure: A Zuse module exporting a hidden type, with four possible realizations.

and array indexing. Consequently, it is legal for an importer of a hidden type to test two variables of the type for equality (assignment, comparison, etc. are also allowed), as well as to coerce a variable of the type into a variable of another type, subject to the restriction that the two variables are of equal size. The section on Protected Types below will discuss these matters further.

Abstract Constants

A hidden constant (Figure) is declared much in the same way as a hidden type: the specification unit defines the constant's name and type, whereas the realization (value) is deferred to the implementation unit. Note that the type of a hidden constant can itself be either abstract or concrete. Figure also explains why Zuse, in contrast to most other languages, allows manifest pointer constants: without them, an abstract type would be restricted to non-pointer realizations if it was accompanied by an abstract constant.

Abstract Procedures

There are two aspects of a Zuse procedure which may be hidden: the procedure's body and its linkage (inline-status). Figure is a variant of the module ConcreteP where both procedures' code and inline-status have been hidden.

Semi-Abstract Export

For reasons of program logic or translation efficiency, it may sometimes be desirable to make some parts of the realization of an exported item known to


SPECIFICATION AbstractC
CONSTANT
  Pure : REAL
TYPE
  T
CONSTANT
  S : T
END AbstractC

IMPLEMENTATION AbstractC
CONSTANT Pure : REAL ...
TYPE
  T RECORD a, b : INTEGER
CONSTANT
  S : T RECORD a ..., b ...
(* or *)
TYPE T REF INTEGER
CONSTANT S : T REF ...
END AbstractC

Figure: Zuse hidden constants.

SPECIFICATION AbstractP
IMPORT ConcreteT
PROCEDURE P ConcreteT.FunT
PROCEDURE P
  (a : ConcreteT.EnumT;
   REF b : BOOLEAN)
END AbstractP

IMPLEMENTATION AbstractP
IMPORT ConcreteT
PROCEDURE P ConcreteT.FunT
BEGIN (* as before *) END P
INLINE PROCEDURE P
  (a : ConcreteT.EnumT;
   REF b : BOOLEAN)
BEGIN (* as before *) END P
END AbstractP

Figure: Zuse abstract procedures.

users of the item, while keeping other parts private to the implementation. For this reason, Zuse supports semi-abstract constants and procedures, and two kinds of semi-abstract types. The realization of a semi-abstract item comes in two parts: the visible part, which is given in the specification unit, and the hidden part, which is deferred to the implementation unit.

Semi-Abstract Types

Semi-abstract types come in two flavors: semi-hidden and extensible. A semi-hidden type is a hidden type whose definition consists of an indication of the type-class to which a realization of the type must belong, effectively restricting the range of possible realizations. A semi-abstract extensible type, on the other hand, is a type where the realization, as it is given in the module specification,


may be extended in the implementation unit.

There are four Zuse semi-hidden types: hidden pointer, subrange, object, and enumerable types. A hidden pointer type, which is equivalent to Modula-2's and Mesa's opaque type and Ada's deferred access type, may only be realized by a pointer or address type. A hidden object type is equivalent to Modula-3's opaque object type. A hidden subrange type reveals its type in the specification unit, but defers its range to the implementation. A hidden enumerable type may be realized only by an ordinal type, i.e. by any of the built-in enumerable types (integer, character, boolean, etc.), by enumeration types, or by subranges of these types. Figure gives examples of semi-hidden types.

There are four extensible types: record, enumeration, object, and subrange types. An extensible record, which is similar to Oberon's public projection type, may reveal some fields in the specification unit and may extend these with additional fields in the implementation unit. Similarly, an implementation unit may add identifiers to an extensible enumeration type, or widen the range of an extensible subrange type. The fields, methods, and the supertype of an extensible object type may all either be revealed or hidden. Note, however, that an object may be given a supertype either in the specification or the implementation part of a module, but not both.

An importer of an extensible type is restricted to use only the components of the type which were given in the specification unit, although any variable of an extensible type will in actuality have all components of the complete type. Figure  gives examples of extensible types.

One should note an important difference between the two kinds of semi-abstract types: a semi-hidden type is required to receive a complete realization in the implementation unit, whereas an extensible type may be extended in the implementation.

Semi-Abstract Constants

The value of Zuse structured constants may be arbitrarily divided between the specification and implementation units. Note that the type of an extensible constant may itself be extensible or concrete. Figure  gives examples of extensible constants.

Semi-Abstract Procedures

In Zuse the code of a procedure cannot be divided arbitrarily between the specification and implementation unit the way, for example, the value of an extensible structured constant may. Rather, the body of an exported procedure


    SPECIFICATION SemiHidden
      TYPE
        EnumT
        RangeT RANGE INTEGER
        PointerT REF
        ObjectT OBJECT
      VARIABLE
        v EnumT
    END SemiHidden

    IMPLEMENTATION SemiHidden
      IMPORT SYSTEM
      TYPE
        EnumT INTEGER
        RangeT RANGE INTEGER
        PointerT REF INTEGER
        ObjectT OBJECT a INTEGER
      or
        EnumT RANGE CHAR A Z
        RangeT RANGE INTEGER
        PointerT SYSTEM.ADDRESS
      or
        EnumT ENUM Red Blue Green
        RangeT RANGE INTEGER
        PointerT REF RECORD
          a b EnumT
          x PointerT
    END SemiHidden

    PROGRAM SemiHiddenImport
      IMPORT SemiHidden
      VARIABLE
        S SET SemiHidden.EnumT
        R ARRAY SemiHidden.RangeT BOOLEAN
    BEGIN
      INCL S SemiHidden.v
      R TRUE
    END SemiHiddenImport

Figure : Zuse semi-hidden types.


    SPECIFICATION ExtendedT
      TYPE
        EnumT ENUM Red Blue Green
        RangeT RANGE CARDINAL
        RecordT RECORD x y EnumT
        ObjectT OBJECT
          v INTEGER
          R x CHAR
    END ExtendedT

    IMPLEMENTATION ExtendedT
      IMPORT ConcreteT
      TYPE
        EnumT ENUM Yellow Orange
        RangeT RANGE CARDINAL
        RecordT RECORD z REAL
        ObjectT WITH ConcreteT.SubObjectT OBJECT
          r RangeT
          U z CHAR
      VARIABLE X SET EnumT
               Y ARRAY RangeT CARDINAL
               Z RecordT
    BEGIN
      INCL X EnumT Orange
      Y
      Z.z
    END ExtendedT

    PROGRAM P
      IMPORT ExtendedT
      VARIABLE A SET ExtendedT.EnumT
               B ARRAY ExtendedT.RangeT CARDINAL
               C ExtendedT.RecordT
    BEGIN
      legal
        INCL A ExtendedT.EnumT.Green
        B
        C.x ExtendedT.EnumT.Red
      illegal
        INCL A ExtendedT.EnumT.Orange
        B
        C.z
    END P

Figure : Type extension examples. A client of module ExtendedT has no knowledge of the extended parts of the types, and consequently cannot reference them.


    SPECIFICATION ExtendedC
      TYPE
        RecordT RECORD x CHAR
        SetT SET CHAR
      CONSTANT
        R RecordT RECORD x R
        S SetT SET A Z
    END ExtendedC

    IMPLEMENTATION ExtendedC
      TYPE
        RecordT RECORD y INTEGER
      CONSTANT
        R RecordT RECORD y
        S SetT SET a z
    END ExtendedC

Figure : Zuse extensible constants.

must be given in its entirety in one of the units. However, Zuse allows the linkage of an otherwise hidden procedure to be either hidden or revealed (see Figure ). An exported procedure with a hidden body but revealed linkage will be known as a semi-hidden procedure.

    SPECIFICATION SemiHiddenP
      IMPORT ConcreteT
      NOT INLINE
      PROCEDURE P ConcreteT.FunT
      INLINE PROCEDURE P
        a EnumT
        REF b BOOLEAN
    END SemiHiddenP

    IMPLEMENTATION SemiHiddenP
      IMPORT ConcreteT
      PROCEDURE P ConcreteT.FunT
      BEGIN as before END P
      INLINE PROCEDURE P
        a EnumT
        REF b BOOLEAN
      BEGIN as before END P
    END SemiHiddenP

Figure : Zuse semi-abstract procedures. The first procedure may not be realized as an inline procedure, whereas the second must be.

Automatic Initialization

A Zuse type may be coupled with a default value, a constant expression which will be the initial value of any variable of that type. Automatic variable initialization of this kind is a convenient concept, supported with some variety in languages such as Ada, Mesa, and Modula-3, which relieves programmers of the burden of manual initialization. The examples in Section  will make it evident that, without automatic initialization, Zuse's extensible types would be rendered virtually useless.

Zuse's initialization concept is built on the one found in Mesa, with the extension that initialization clauses may be hidden or partially revealed. An exported concrete or semi-abstract type can be given a concrete default value, using the immutable default symbol, or an abstract or semi-abstract default value, using the extensible default symbol. A default value is extended in the implementation unit using the extension default symbol. Three additional rules determine the legality of an initialization clause:

1. A type, or a component of a type, which has been given a default value in the specification unit may not be given a different default value in the implementation.

2. In the absence of an overriding initialization clause, equivalence and subrange types inherit the default values of their base types.

3. The default value of a structured type is the combination of the default values of its components.

Figure  gives some examples of legal initialization clauses.

Type Protection

Further discussions in subsequent chapters will make it clear that the indiscriminate use of Zuse abstract items is likely to incur a translation-time penalty in the form of excessive module binding-times. The reason is that a compiler translating a module which imports abstract items will not have available the necessary information about these items in order to complete the translation. Hence, some of the work will have to be deferred to module binding-time. At the expense of sacrificing encapsulation, it is, as has already been mentioned, possible to reveal some or all of the realization of an item by using semi-abstract or concrete export. This can reduce the amount of work necessary at module binding-time.

However, the use of concrete or semi-abstract export will lead to less safe programs, since clients of such items are free to manipulate the items' visible parts. One way to overcome this problem is to impose restrictions on the kinds of operations a client may perform on the visible part of an imported item. This is, as we have seen in Sections  and , the chosen method in many languages. Zuse supports a type protection construct, based on a concept proposed by Liskov and Jones, which allows modules


    SPECIFICATION Initialization
      TYPE
        T CHAR A
        T T
        T T
        T RECORD a T b T
        T RECORD a T
        T OBJECT
          a INTEGER
          b INTEGER
          P x CHAR
        Idx RANGE CARDINAL
        T ARRAY Idx T
    END Initialization

    IMPLEMENTATION Initialization
      TYPE
        T B
        T RECORD
          a T
          b REAL
        T RECORD b REAL
          RECORD a R b
        T OBJECT b P
      PROCEDURE P
        self T
        x CHAR
      BEGIN END P
    END Initialization

Figure : Examples of Zuse initialization clauses. Variables of the first type will receive the initial value A, as will variables of the second type, since it does not have its own initialization clause. The third type overrides that default value, and consequently its variables will receive the initial value B. The final array type will be initialized to an array of A's.

considerable freedom in expressing how different clients may manipulate their exported items.

Zuse type protection clauses generalize and extend many of the access control mechanisms found in popular languages: Ada's private and limited private types, C++'s friend concept, Alphard's access control, Gypsy's access lists, and the provide and request clauses of PIC/Ada. Their main novelty is the way in which they may be combined with Zuse's abstract and semi-abstract types to accurately control the manner in which different clients may access and manipulate different parts of an exported item.

The Syntax of Protection Clauses

Every exported type in Zuse may be coupled with a protection clause listing the operations a particular importer may perform on items of that type. Examples of operations which may be part of a protection clause include assignment, component selection, and type coercion. All kinds of Zuse types may be protected, including abstract and semi-abstract types, allowing for very fine-grained control over the way a module's clients may manipulate the module's exported


items. Although types are the only items which may receive an explicit protection clause, other exported items (variables, constants, and procedures) may be implicitly protected by protecting their type.

A Zuse type protection clause is a list of protection items, where each item consists of a friend list and an operator list. The friend list describes the modules to which the protection item applies: M N means modules M and N, a wildcard means all modules, and an empty friend list means the exporting module itself. The operator list describes the privileges granted to these modules. The general form of a Zuse type protection clause is given in Figure .

    <prot clause>  ::= { <prot item> { <prot item> } }
    <prot item>    ::= <friend list> <op list>
    <friend list>  ::= <module list> | ...
    <module list>  ::= <ident> { <ident> }
    <op list>      ::= { <op> { <op> } }
    <op>           ::= ... | WITH | ...

Figure : Extended BNF syntax of Zuse's type protection clauses.

The protection item M N { := } applied to a type T bars modules M and N from applying any operations on variables of type T other than assignment. In particular, if T is a record, M and N will not be able to access any of T's fields, although they will be able to make copies of variables of type T. Similarly, a protection item may allow all modules to perform additions and type coercions on variables of the protected type, or allow the exporting module alone to perform procedure calls, and M { WITH } grants M permission to inherit and access the methods and instance variables of the protected object type.

The Applications of Type Protection

The concrete examples in Figure  and in Section  show that type protection in Zuse has several important applications, in addition to restricting clients' access to concrete items. The most common application of protection in Zuse is to restrict the operations clients may perform on variables of an imported abstract type. This is an essential application of protection, since Zuse grants clients full access to imported items by default, regardless of whether the item is concrete, semi-abstract, or abstract. In most cases, a module which


declares an abstract type will restrict all clients' access to variables of the type to assignment. In exceptional cases, however, special clients (friends) may be allowed certain extra privileges, such as type coercion and bitwise comparison. Such practices should, of course, be strongly discouraged, but when, on occasion, their need does arise, Zuse requires the friendships to be made explicit.

    SPECIFICATION Protection
      TYPE
        Hidden  { M N { } }
        Address { { } } UNIQUE CARDINAL
        Open    { { } } RECORD c d CHAR
        Extend  { { := } } RECORD
          a { { } } CHAR
          b { { } { } } CHAR
        Wrong   { M Q { } } RECORD x Hidden
      VARIABLE
        OK { { } { } } BOOLEAN
      PROCEDURE P { M { } } a CHAR
    END Protection

    IMPLEMENTATION Protection
      TYPE
        Hidden { { := } } RECORD x y CARDINAL
        Open   { { := } }
        Extend { { } } RECORD z REAL
      PROCEDURE P a CHAR
    END Protection

Figure : Zuse type protection examples. OK behaves like a Mesa read-only variable. Procedure P may only be called from module M. All importers may read and update Extend a, and read Extend b, but only the exporting module may assign to Extend b. Note that the exporting module may extend its own privileges to an exported item in the implementation unit.

The Semantics of Type Protection

The static rules that govern the use of type protection in Zuse are considerably more complex than those for abstract and semi-abstract types, and we will therefore examine these rules using a slightly more formal notation than previously used in this chapter.

To ensure that a program is access-correct, i.e. that it does not illegally access protected data, one has to make sure that a type derived from a protected type does not obtain privileges not held by the base type. The following set of rules determines the legality of a declaration of a type M.T with the protection clause T_prot (writing T_prot for the protection clause of T):

1. Every operator in T_prot must be applicable to T. For example, if T_prot grants some module N an arithmetic operator, then T must be an arithmetic type.

2. If T is a composite type with a component of type C, and if N { := } is in T_prot, then it is required that N { := } is in C_prot. I.e., if a composite type is to have assignment rights, then so must all of its components. The same rule applies to other operators which affect the entire type. The declaration of Protection.Wrong in Figure , for example, is illegal, since it gives the module Q assignment rights although Wrong x does not have Q { := }.

3. If T is an equivalence, subrange, or unique type with base type B, then T_prot must be a subset of B_prot; formally, N { op } in T_prot implies N { op } in B_prot.

4. If M.T is an object type with supertype S, then it is required that M { WITH } is in S_prot.

It is furthermore essential to establish rules to determine the legality, with respect to protection, of every expression and statement in a program. We list the most essential such rules:

1. The assignment d := e is legal with respect to protection in module M if
   (a) M { := } is in TYPE(d)_prot, and
   (b) M { := } is in TYPE(e)_prot, and
   (c) TYPE(d)_prot and TYPE(e)_prot are compatible.

2. The expression a op b is legal in module M if
   (a) M { op } is in TYPE(a)_prot, and
   (b) M { op } is in TYPE(b)_prot.
   The protection clause of the expression a op b is the intersection of the protection clauses of the subexpressions, i.e. TYPE(a op b)_prot is the intersection of TYPE(a)_prot and TYPE(b)_prot.

3. The procedure call N.P(e_1, ..., e_n) is legal in module M if
   (a) N.P is declared in module N as P(m_1 f_1, ..., m_n f_n), where f_i is the type of the i-th formal parameter and m_i is its transfer mode (REF or VAL), and
   (b) M is granted the call operation in TYPE(N.P)_prot, and
   (c) m_i = REF implies that the assignment e_i := f_i is legal in N with respect to protection, and
   (d) the assignment f_i := e_i is legal in M with respect to protection.

The last two rules guarantee that N.P does not perform operations on data objects passed to it that the calling module would not have been allowed to perform.

The last rule states that a module should have write access to variables which it passes by reference. This requirement is lifted if the called procedure is declared in the same module as the variable's type. This allows one, for example, to declare types for which the predefined assignment operator is not defined, so that assignment has to be done through a procedure call. Similar rules guide the use of Ada's private and limited private types.

Evaluation and Examples

Clarity, orthogonality, and expressiveness are some of the criteria which are often used to evaluate a programming language. These evaluations are often based on an analysis of the ease and elegance with which the language can express common data types and data structures (such as complex numbers, stacks, and trees) and algorithms (such as sorting and searching). Important though they may be, programming artifacts such as these represent a very limited sample of the wide spectrum of programming idioms found in typical complex programs. The real strengths and weaknesses of a language can be revealed only through long-term use, by a large user base, in a wide variety of applications.

None of the few languages with a flexible encapsulation scheme similar to Zuse's have been in widespread use for any length of time. Oberon, which has extensible records similar to Zuse's, has been superseded by Oberon-2, which does not separate module specifications and implementations; the SRC Modula-3 compiler, which supports extensible object types, has at the time of writing just been released. At this point, therefore, there exists little actual empirical evidence of the utility of flexible encapsulation. Hence, to evaluate the Zuse type system we are left with the option of giving small and often contrived examples. This section will present a number of such examples and discuss the pros and cons of the Zuse solution in the light of solutions given in other languages.

Complex Numbers: Abstract Structured Types

Figure  shows three Zuse variants of a complex number abstraction, a favorite data abstraction example. The first variety represents complex numbers with a Zuse abstract type, the second uses a hidden pointer type, and the third a protected concrete type. The Create, Add, and Incr operations are implemented as abstract inline procedures, since they can be expected to be simple and frequently called. Procedures which do input or output, on the other hand, are unlikely to benefit from being expanded inline, and we can therefore declare the Print operation NOT INLINE.

Semantics of Assignment

The complex number abstraction may appear trivial, but it does reveal some subtle semantic differences between the three approaches to data abstraction. The first complication we will consider is the difference between reference and value semantics. In the cases when Complex.T is realized as a record (as in the first and last realizations of T in Figure ), an assignment a := b, where a and b are of type Complex.T, will have the effect that the value of a is overwritten with the value of b. In other words, complex variables will behave similarly to variables of the built-in numeric types REAL and INTEGER, which also display value semantics. If, on the other hand, Complex.T is realized as a pointer to a record (the second realization of T in Figure ), the effect of a := b will be to make a and b refer to the same location. This sharing of values has the side effect that any future change to the variable a (using the operation Incr, for example) will affect b, and vice versa.

Distinguishing between the cases when a type exhibits reference or value semantics is never a problem in Ada or Modula-2. A Modula-2 programmer

4. As we have seen in Chapter , Oberon's extensible records are more restricted than Zuse's, since they require that the maximum size of the type be revealed in the specification unit. The SRC Modula-3 compiler resorts to run-time lookup of the size of parent types in order to support extensible object types. The Zuse translation systems to be presented have no extra run-time cost associated with extensible types.


    SPECIFICATION Complex
      Zuse abstract style:
        TYPE T { { } }
        CONSTANT Zero T
      Modula-2 style:
        TYPE T REF { { } }
        CONSTANT Zero T
      Ada style:
        TYPE T { { } } RECORD Re Im REAL
        CONSTANT Zero T RECORD Re Im
      Operations:
        PROCEDURE Create Re Im REAL RETURN T
        PROCEDURE Add a b T RETURN T
        PROCEDURE Incr REF a T b T
        NOT INLINE PROCEDURE Print a T
    END Complex

    IMPLEMENTATION Complex
      Zuse abstract style:
        TYPE T { { := } } RECORD Re Im REAL
        CONSTANT Zero T RECORD Re Im
      Modula-2 style:
        TYPE T { { } } REF RECORD Re Im REAL
        CONSTANT Zero T REF RECORD Re Im
      Ada style:
        TYPE T { { := } }
      Operations:
        INLINE PROCEDURE Create Re Im REAL RETURN T
        INLINE PROCEDURE Add a b T RETURN T
        INLINE PROCEDURE Incr REF a T b T
        PROCEDURE Print a b T RETURN T
    END Complex

Figure : Zuse complex arithmetic example.


knows that opaque types always have reference semantics, and an Ada programmer can determine from the private section of a package specification whether a private type is realized as a pointer or not. A Zuse programmer, on the other hand, cannot determine whether an abstract type will behave as a reference or a value merely by inspecting the exporting module's specification (see Figure ). One way to handle this problem is to disallow assignment on the abstract type and to provide clients with a special assignment procedure with well-specified semantics (Figure ). Another possibility is to overload the meaning of the assignment operator; however, overloading is currently not a part of the Zuse language.

    SPECIFICATION Complex
      TYPE T
    END Complex

    IMPLEMENTATION Complex
      Value semantics:
        TYPE T RECORD Re Im REAL
      Reference semantics:
        TYPE T REF RECORD Re Im REAL
    END Complex

Figure : Value or reference semantics in Zuse.

    SPECIFICATION Complex
      TYPE T { { } }
      PROCEDURE ValueAssign
        REF a T
        b T
    END Complex

    IMPLEMENTATION Complex
      TYPE T { { := } } RECORD Re Im REAL
      INLINE PROCEDURE ValueAssign
        REF a T b T
      BEGIN a b END ValueAssign
      or
      TYPE T { { } } REF RECORD Re Im REAL
      INLINE PROCEDURE ValueAssign
        REF a T b T
      BEGIN a b END ValueAssign
    END Complex

Figure : A complex number package with value semantics.

Reclaiming Temporary Storage

All modern imperative languages support static as well as dynamic allocation, but the language design often displays a preference for one or the other. Ada, for example, was designed primarily with static allocation in mind (this is evident from the way the representation of private types must be revealed in the package specification) and is undetermined on whether inaccessible storage should be reclaimed automatically or manually (implementations are allowed, but not required, to support garbage collection). Modula-3, on the other hand, relies primarily on dynamic allocation (object types and opaque types are always pointers) and assumes automatic storage management. Modula-2 also favors dynamic allocation (opaque types are always pointers) but relies on manual disposal of unused dynamic storage.

These different approaches to storage management lead to different run-time behavior for programs which use variables of an abstract type. Consider the following program fragment:

    VARIABLE a, b, c, d : Complex.T
    BEGIN
      d := Complex.Add(a, Complex.Add(b, c))

In Ada, Complex.T would be a statically allocated private type (a protected concrete type in Zuse parlance; see the third variant in Figure ), and temporary storage for the result of the addition of b and c would be stored on the run-time stack and automatically reclaimed when the scope is exited. In Modula-3, Complex.T would be a traced opaque reference type, and the temporary result of Complex.Add(b, c) would be reclaimed by the garbage collector. In Modula-2, Complex.T would be an opaque type, and since Modula-2 does not have garbage collection, the memory allocated for the temporary complex variable would never be reclaimed.

Abstraction vs. Allocation

We have seen from the above examples how the Zuse abstract type makes it possible to combine the abstractness of Modula-2's and Modula-3's opaque types with the static allocation of Ada's private type. In the light of the previous discussion, we can now summarize the advantages of static allocation, type abstraction, and the combination of the two:

enhanced readability: Abstract types do not clutter the specification unit with information irrelevant to a user of the module.

efficient compilation: Changes to the realization of an abstract type will not require any client of the exporting module to be recompiled.

implementation freedom: Abstract types allow the implementer complete freedom in choosing the most appropriate data type for their realization.

client independence: Since abstract types do not reveal their realization, clients are unlikely to depend on it for their correctness. Changes to the realization are therefore unlikely to affect clients.

value semantics: An abstract type with a static realization will automatically display value semantics.

simple storage management: Static allocation does not require any complex run-time storage management. Even in languages which support automatic reclamation of dynamic memory, it is prudent to use static allocation whenever possible, in order not to unduly tax the garbage collector. Furthermore, in languages (such as Modula-2) which only support manual reclamation of storage, there is no risk of some storage remaining unreclaimed.

efficient allocation: Static storage needed by a subprogram is created automatically on the stack when the subprogram is invoked, and reclaimed automatically when the call returns. This is in contrast to the dynamic storage needed by the subprogram, which has to be created through explicit calls to the run-time system. Initializing a complex matrix (see Figure ), for example, will require one call to the run-time system allocation routine per element if Complex.T is a dynamic type.

efficient code: Every reference to a variable of a dynamic abstract type will have to go through one more level of indirection than if the type had been static.

5. See Feldman, pp. , for a thorough discussion of this problem in Modula-2.

From the above list one might construe that value semantics and static allocation ought to be the preferred choice for all abstract types. This is, of course, not the case. Frequently, the desirable behavior of assignment on variables of an abstract type is to copy the identity of the source to the destination, rather than to make a copy of the value of the source. In these cases, the natural realization of the abstract type is as a pointer to some structure.

Search Trees: Extensible Object Types

Snyder, in discussing encapsulation in object-oriented languages, makes two observations:


    PROGRAM P
      IMPORT Complex
      TYPE Index RANGE INTEGER
      VARIABLE
        a ARRAY Index Index Complex.T
        i, j INTEGER
    BEGIN
      FOR i IN .. DO
        FOR j IN .. DO
          a i j Complex.Create
        ENDFOR
      ENDFOR
    END P

Figure : Initializing a large complex matrix. If Complex.T is realized as a pointer to a record, each call to Complex.Create will imply a call to the dynamic memory allocation routine.

1. A class may have two kinds of clients: those which create and operate on objects of the class (instantiating clients), and those which inherit from the class (inheriting clients). The two kinds have different visibility requirements: the former usually only manipulate the state of an object through method calls, whereas the latter often need to access the private fields of their superclass directly.

2. There are two applications of inheritance: one class may inherit from another for the sole purpose of reusing its code (implementation inheritance), or in order to declare that its behavior is a refinement of the behavior of its parent (specification inheritance, or specialization). Since specification inheritance is a public announcement about the behavior of a class, any change to a class's specification inheritance is likely to affect its descendants. Implementation inheritance, on the other hand, is private to the class, and neither instantiating nor inheriting clients should be affected by a change.

In most object-oriented languages it is possible to make a distinction between instantiating and inheriting clients. In C++, for example, protected class members can only be accessed by the methods of the defining class and its descendants, and in Modula-3 different module interfaces can reveal different amounts of information to instantiating and inheriting clients. As a rule, however, languages do not distinguish between implementation and specification inheritance, and specifically do not provide a way to hide the parent-child relationship. Modula-3 is an exception to this rule, since it is possible to hide that ST is the parent of T by letting T inherit from ROOT in the module interface (see Nelson, pp. ):

    INTERFACE M;
      TYPE
        Private <: ROOT;
        T = Private OBJECT END;
    END M.

    MODULE M;
      REVEAL
        Private = ST OBJECT END;
    END M.

    SPECIFICATION BST
      TYPE
        Result ENUM Less Equal Greater
        Node OBJECT
          left right Node NIL
          compare n Node RETURN Result
        T OBJECT
          root Node
          insert n Node Insert
          member n Node RETURN BOOLEAN Member
      PROCEDURE Insert t T n Node
      PROCEDURE Member t T n Node
    END BST

Figure : Base class for binary search trees. Derived classes will normally use the original member method but override the insert method. The implementation unit, which implements Member and Insert, has been omitted.

Zuse currently does not have a way to reveal different amounts of information to inheriting and instantiating clients. The Zuse extensible object type, however, makes it possible to distinguish between specification and implementation inheritance, since the supertype of an exported object type can either be revealed in the specification unit or hidden in the implementation. Figure  shows two different balanced binary search tree implementations inheriting from the base class BST (Figure ). The definitions of RedBlack.T and AVL.T in Figure  are examples of specification inheritance, since the object types reveal their position in the type hierarchy. Figure , on the

6. We are, however, considering extending the current one-to-one module system with a more general one, which would allow different views of the same object to different types of clients.


other hand, is an example of implementation inheritance, since the supertype of IntSet.T is not visible to its clients. Figure  also illustrates how Zuse allows instance variables and methods to be hidden or revealed.

    SPECIFICATION RedBlack
      IMPORT BST
      TYPE Node WITH BST.Node OBJECT Red BOOLEAN
           T WITH BST.T OBJECT OBJECT insert Insert
      PROCEDURE Insert t T n Node
    END RedBlack

    IMPLEMENTATION RedBlack
      PROCEDURE Insert t T n Node BEGIN END Insert
    END RedBlack

    SPECIFICATION AVL
      IMPORT BST
      TYPE Node WITH BST.Node OBJECT
           T WITH BST.T OBJECT
    END AVL

    IMPLEMENTATION AVL
      IMPORT BST
      TYPE Balance RANGE INTEGER
           Node WITH BST.Node OBJECT Height Balance
           T WITH BST.T OBJECT OBJECT insert Insert
      PROCEDURE Insert t T n Node BEGIN END Insert
    END AVL

Figure : Two balanced search tree implementations.

Graphs: Semi-Hidden Types

Although Zuse makes it possible to hide all representational details of every exported item, there are sometimes compelling arguments for making some aspects of some items visible. We have already seen how an object type often needs access to the instance variables of its supertype, but there can be other reasons as well:

translation-time efficiency: The Zuse translation systems proposed in this thesis (see Chapters  through ) are based on the idea that inter-modular


    SPECIFICATION IntSet
      TYPE T OBJECT
        put i INTEGER
        in i INTEGER RETURN BOOLEAN
    END IntSet

    IMPLEMENTATION IntSet
      IMPORT AVL
      TYPE Node WITH AVL.Node OBJECT value INTEGER
           OBJECT compare CompareInt
           T WITH AVL.T OBJECT
           OBJECT put PutInt in InInt
    END IntSet

Figure : Example of implementation inheritance.

information (i.e. information pertaining to the realization of abstract items) should be exchanged at module binding-time. A problem with this solution is that the cost of binding can be prohibitively high for programs which contain many abstract items. Revealing some information about the realization of exported items may reduce this cost.

run-time efficiency: Permitting clients of a module to access the representation of an exported type may sometimes allow the application of more efficient algorithms.

The latter of these points is illustrated in the graph package of Figure ,
where the specification reveals either that nodes are enumerable or that they
are a subrange of the type SHORTCARD. This revelation makes it possible for
the depth-first-search routine in Figure  to allocate a vector of booleans,
one per node, to mark the nodes which have been visited.

Lines in the Plane: Extensible Types and Abstract Initialization

We have already seen examples of how extensible object types can reveal some
of their fields and methods and keep some hidden. We will next give a further
example of the usefulness of extensible types, this time statically allocated
extensible record types. At the same time we will show how extensible types
can benefit from abstract initializations.


SPECIFICATION Graph
TYPE
  T REF
  Edge
  Node
or
  Node RANGE SHORTCARD
PROCEDURE Create RETURN T
PROCEDURE NewNode (G : T) RETURN Node
PROCEDURE NewEdge (G : T; f, t : Node) RETURN Edge
END Graph

IMPLEMENTATION Graph
CONSTANT Max SHORTCARD
TYPE Node RANGE SHORTCARD Max
TYPE
  Edge RECORD f, t : Node
  T REF ARRAY Node, Node OF BOOLEAN
or
TYPE
  Edge REF RECORD to : Node; next : Edge
  T REF ARRAY Node OF Edge
END Graph

PROGRAM UseGraph
IMPORT Graph
PROCEDURE Dfs (G : Graph.T; N : Graph.Node)
BEGIN ... Visited[N] := TRUE ... END Dfs
VARIABLE
  G : Graph.T
  N : Graph.Node
  Visited : ARRAY Graph.Node OF BOOLEAN
BEGIN
  FOR N IN Graph.Node DO Visited[N] := FALSE ENDFOR
  FOR N IN Graph.Node DO
    IF NOT Visited[N] THEN Dfs(G, N) ENDIF
  ENDFOR
END UseGraph

Figure  A graph package and a depth-first-search routine using the package.
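The benefit of revealing that nodes form a dense subrange can be sketched in Python (an illustrative rendering of the figure's idea, not Zuse itself): because node handles are small integers drawn from a known range, the visited set can be a flat boolean vector indexed by node rather than a hash table.

```python
# Sketch of the idea in the figure: when a graph package reveals that its
# Node type is a dense integer subrange, a client DFS can mark visited
# nodes in a flat boolean vector, one slot per node.  All names are
# invented for illustration, not part of the Zuse system.

class Graph:
    def __init__(self, max_nodes):
        self.adj = [[] for _ in range(max_nodes)]   # adjacency lists
        self.n = 0                                  # nodes allocated so far

    def new_node(self):
        node = self.n                               # nodes are 0 .. n-1
        self.n += 1
        return node

    def new_edge(self, f, t):
        self.adj[f].append(t)

def dfs(g):
    visited = [False] * g.n    # possible only because nodes are a subrange
    order = []
    def visit(n):
        visited[n] = True
        order.append(n)
        for m in g.adj[n]:
            if not visited[m]:
                visit(m)
    for n in range(g.n):       # FOR N IN Graph.Node DO ...
        if not visited[n]:
            visit(n)
    return order
```

With an opaque node type the client would have to fall back on a dictionary keyed by node handles; the revelation trades some encapsulation for a cheaper data structure.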


Figure  shows the module Line, which implements operations on lines
in the plane. The only operation shown is Length, which computes the distance
between two points. If Length is called frequently for the same line, it may
be efficient to compute the distance only once and to store the result between
calls. For this purpose, Line.T has an extra hidden field len, which is given its
value the first time Line.Length is called. T.len is given a sentinel initial value
so that a line for which the length has been computed can be distinguished from one
for which it has not. This scheme, of course, only works under the assumption
that the From and To fields are not changed between calls to Line.Length.

SPECIFICATION Line
IMPORT Point
TYPE T RECORD
       From : Point.T
       To   : Point.T
PROCEDURE Length (REF l : T) RETURN REAL
END Line

IMPLEMENTATION Line
IMPORT Point
TYPE T RECORD len : REAL
PROCEDURE Length (REF l : T) RETURN REAL
BEGIN
  IF l.len ... THEN l.len := ... ENDIF
  RETURN l.len
END Length
END Line

Figure  A module implementing lines in the plane.
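The caching scheme can be sketched in Python (illustrative only, not the Zuse module itself): a hidden field holds a sentinel until the first call to the length operation, which computes the distance once and stores it. The cache is valid only while the endpoints are not mutated, matching the assumption stated above.

```python
import math

# Sketch of the Line module's caching scheme: `_len` is the hidden field,
# initialized to a sentinel (None) to distinguish "not yet computed".

class Line:
    def __init__(self, x1, y1, x2, y2):
        self.frm = (x1, y1)
        self.to = (x2, y2)
        self._len = None            # sentinel: length not yet computed

    def length(self):
        if self._len is None:       # first call: compute and remember
            dx = self.to[0] - self.frm[0]
            dy = self.to[1] - self.frm[1]
            self._len = math.hypot(dx, dy)
        return self._len            # later calls return the cached value
```

Because `_len` is hidden from clients, the module can add or drop this optimization without affecting any importer.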

Protection as a Structuring Tool

Most common modular languages treat programs as structureless clusters of
separately compiled modules. Although modules are often logically organized
in a hierarchical fashion, there usually does not exist a way to reflect this
organization in the program text. The result is that it is possible for any module
to invoke the operations exported by any other module. The PIC Ada
provide and request clauses represent one way of restricting the visibility among
the modules in a program. We will show how the Zuse protection clause can be
used in a similar fashion. Figure (a) shows the import/export relationships
in a program consisting of four separate modules and a main program. Modules
M and M are logically submodules of M, and their exported items should
not be accessible from the main program module P or from M. Figure (b)
shows the corresponding Zuse program, where each exported procedure has
been coupled with a protection clause to allow each module to access only the
appropriate modules.


(a)  An import graph over the main program P and the modules M, M, M, and M.

(b)  SPECIFICATION M
     PROCEDURE Q { M, M { } }
     END M

     SPECIFICATION M
     PROCEDURE Q { M { } }
     END M

     SPECIFICATION M
     PROCEDURE Q { P { } }
     END M

     SPECIFICATION M
     PROCEDURE Q { M, P { } }
     END M

Figure  Use of protection clauses to indicate the logical structure of a set
of modules.
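A binder's enforcement of such clauses can be sketched in Python (hypothetical names throughout; this is not the zts implementation): each exported procedure may carry a set of permitted client modules, and every cross-module call is checked against it.

```python
# Sketch of protection-clause checking: a clause maps an exported
# procedure to the set of modules allowed to call it.  Procedures with
# no clause are visible to every importer.  M2, M3, M4, Q2, Q3 are
# invented names standing in for the figure's numbered modules.

clauses = {
    ("M2", "Q2"): {"M3", "M4"},    # only the submodules may call M2.Q2
    ("M3", "Q3"): {"M2"},
}

def check_call(client, module, proc):
    """Return True if `client` may call `module`.`proc`."""
    allowed = clauses.get((module, proc))
    if allowed is None:
        return True                # no clause: visible to all importers
    return client in allowed
```

A violation detected this way is a static error and can be reported at binding-time, like the other deferred context conditions discussed later.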

When the visibility of all exported items in a module is affected in the same
way, one might consider assigning a protection clause to the module itself, as
exemplified in Figure . The clause attached to module M, for example,
would say that only modules M and M may access any of M's exported
items. This is currently not part of the Zuse language, since Zuse only allows
types to carry protection clauses, and Zuse modules are typeless.

Summary

We have informally presented the flexible encapsulation primitives of the
language Zuse and given some examples of their use. Zuse supports statically
allocated abstract types; semi-hidden pointer, subrange, object, and enumerable
types; extensible record, enumeration, object, and subrange types; abstract
constants and extensible structured constants; as well as abstract and
semi-abstract inline procedures. Furthermore, any concrete, abstract, or
semi-abstract item may be coupled with a protection clause restricting the operations
which a client module may perform on the item. Protection clauses can also
be used to provide a hierarchical structure for an otherwise flat collection of
modules.

We have seen how the use of abstract items instead of concrete ones has
several advantages. The most important advantage is that a client of an
abstract module is rid of extraneous information which can obscure the intent
of the module. In the same vein, an implementer of an abstract module is
given complete freedom to use whichever data structures and algorithms he
wishes, without being restricted by realization particulars already fixed by the
module's specification. A more pragmatic benefit is that changes to abstract
modules are less likely to incur trickle-down recompilations than changes to the
corresponding concrete ones, since a greater part of the module is confined to
the implementation unit.

SPECIFICATION M { M, M { } }
PROCEDURE Q
END M

SPECIFICATION M { M { } }
PROCEDURE Q
END M

SPECIFICATION M { P { } }
PROCEDURE Q
END M

SPECIFICATION M { M, P { } }
PROCEDURE Q
END M

Figure  Assigning protection clauses to modules.

Chapter

Sequential High-Level
Module Binding

As is well known, a great deal of effort has been
devoted to the development of compilers.
Perhaps, under these circumstances, it would
after all have been more correct to exploit fully
the possibilities of the PK in this direction.
                                  Konrad Zuse

paste /peist/ v.t.  stick with paste.
pasting /'peistin/ n. (colloq.)  severe beating:
  Get/Give sb a pasting.
              Oxford Advanced Learner's
              Dictionary of Current English

Introduction

The previous chapters have been devoted mainly to three tasks: tracing
the history of abstraction and modularity, motivating the need for
flexible encapsulation, and designing a modular language with a flexible
encapsulation scheme. The remaining chapters will deal exclusively with the
implementation aspects of this language. In the present chapter we will describe
the implementation of a sequential paster (high-level module binder) which
supports Zuse's encapsulation primitives, while the two subsequent chapters
(Chapters  and ) will describe more sophisticated implementations. Chapter
 will evaluate the different designs with regard to their efficiency.


This chapter is organized in four parts. The first part (Section ) outlines
the overall design of our translation system for the Zuse language; the second
part, comprising Sections  through , examines low-level design decisions;
the third part (Sections  through ) describes the two programs which make
up our translation system, and possible enhancements to these programs; and
the fourth part (Sections  and ), finally, describes alternative approaches
to the design of translating systems for Zuse-like languages.

Design Overview

The Zuse translating system (zts) which we are about to describe consists
of two programs: a module compiler (zc) and a module binder (zp). In this
respect zts does not differ from most other modular language processors. The
main differences instead lie in the way modules are bound together: rather than
binding modules together exclusively at the machine code level, as is usually
the case, zp is designed to bind modules together at the intermediate code as
well as at the machine code level. Furthermore, zp is able to detect and report
inter-modular static error conditions, such as those discussed in Section , and
perform various inter-modular optimizations.

Figure  shows the actions of zts for a small program consisting of a main
program module P and two modules M and N, after a change to M's specification
unit. Note that zp, unlike the prelinkers employed by many translating
systems, performs the entire module binding process itself, without a call to
the system link-editor.

The Compiler

The Zuse compiler is not very different from compilers for other similar
modular languages. As can be seen from Figure , zc compiles module specification
units into symbol files and implementation units into object code files. An
essential point is that Zuse does not allow dependencies between implementation
units, and consequently the Zuse compiler has access only to the information
present in the specification units of imported modules. This is evident from
Figure , where it is shown that the input to the compiler when processing
a particular module unit, in addition to the source code of the unit itself,
consists solely of the symbol files of imported modules.

The most significant difference between zc and other compilers may be the
way zc handles translation-time expressions. There are many compile-time

1. The particular instance of zp discussed in this chapter is called the sPaster.
2. Translation-time expressions will also be called compile-time or constant expressions.


[Diagram: three invocations of zc followed by one of zp.  zc first compiles
M.spe into M.sbl; it then compiles M.imp (reading M.sbl and N.sbl) into
M.cod, and P.imp (reading M.sbl and N.sbl) into P.cod; finally zp reads
P.cod, M.cod, and N.cod and produces the executable P.]

Figure  Translating actions for the program P after a change to M's specification
unit. The double arrows indicate the files zc and zp read and produce
at each stage. The suffixes .spe and .imp are used for the source code files of
specification units and implementation units, respectively. Similarly, the .sbl
suffix is used for the symbol files (the result of compiling specification units)
and .cod is used for the resulting object code files. The executable program
file has no suffix.

sources of constant expressions: sizes of types are constant expressions, as are
the values of manifest constants, offsets of record fields, offsets of formal
parameters in procedure activation records, addresses of variables and procedures,
method templates of object types, etc. While most modular language compilers
are always able to evaluate a constant expression, for a Zuse compiler this
often will not be the case. Rather, since the compiler has access only to the
information available in the symbol files of imported modules, the value of a
translation-time expression associated with an imported abstract item will be
unknown to the compiler. For example, when compiling a unit which declares
a variable of an imported abstract type, the size of the variable (a constant
expression) will be unknown, and the compiler will be unable to perform the
allocation of the variable.

Another consequence of the Zuse encapsulation scheme is that it is
sometimes impossible for the Zuse compiler to generate optimal machine code for
sections of source code which contain references to imported abstract items.
Consider, as a simple example, a procedure which performs an assignment of
two variables of an imported abstract type. The compiler, not knowing the

(They are characterized by the fact that they are evaluated prior to the execution
of the program, and that their values are retained until the program is changed.)


size of the variables, will be unable to select the best instruction for the
assignment: if the variables are of one of the standard sizes of the target architecture
(a byte or a word, say) then the appropriate simple move instruction will be
sufficient. If, on the other hand, the variables are large, a more expensive block
move instruction will be necessary. Similarly, when processing a block of code
which calls an imported abstract procedure, the compiler will not know whether
the called procedure is inline or not, and if it is, the code of the procedure will
not be available for expansion.

The Zuse compiler handles these and all similar cases in the same manner:
if the compiler cannot perform a particular action because of inadequate
inter-modular information, the action is deferred to module binding-time, when all
the necessary information is available. In the example above, where a procedure
performs an assignment on variables of an abstract type, the compiler would
leave the procedure in its intermediate code form and pass it on to the module
binder for final code generation.
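The deferral policy above can be sketched as follows (Python, with an assumed four-byte word size; `gen_assign` and the instruction names are invented for illustration, not part of zc):

```python
# Sketch of deferred instruction selection: the compiler emits a final
# instruction for an assignment only when the operand size is a known
# integer; a symbolic size (a CET reference) forces deferral, and the
# binder repeats the selection once the size has been evaluated.

WORD = 4   # assumed word size of the target architecture

def gen_assign(size):
    """Pick an instruction for 'a := b' given the operand size in bytes,
       or defer when the size is still a symbolic CET reference."""
    if not isinstance(size, int):
        return ("DEFERRED", size)      # binder retries with a real value
    if size <= WORD:
        return ("MOVE", size)          # a simple move suffices
    return ("BLOCKMOVE", size)         # large operands need a block move
```

The same shape of decision applies to inline calls: if the callee's inline status is unknown at compile-time, the whole block stays in intermediate form.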

The central data structure employed by zts is the Constant Expression
Table (CET), which stores all inter-modular information. The CET for a module
contains entries for every abstract and semi-abstract exported item, as well as
symbolic constant expressions generated during the compilation process. An
expression in a module's CET can refer to the values of other expressions in
the same CET or in the CETs of imported modules. CET expressions are often
arithmetic (computing, for example, the size of a type or the offset of a record
field) but can be of a variety of other types (boolean, string, list, etc.) as well.
The CET furthermore stores deferred context conditions, such as those discussed
in Section , as well as call-graphs of procedures.

The Binder

The paster zp is zts's module binder. Starting with the program module, its
job is to find all the modules which should be part of the final program, combine
them, and then produce the resulting executable program. It should be evident
from the previous discussion that, in addition to the actions usually performed
by module binders, zp also has to perform some tasks traditionally reserved for
compilers. In particular, zp has to complete all the actions left undone by zc:
deferred constant expressions have to be evaluated, code has to be generated for
deferred procedures, deferred context conditions have to be checked, and error
messages issued if violations are detected. In addition to this, zp is designed
to perform various inter-modular optimizations, such as inline expansion and
dead-code elimination.


Intermediate File Formats

The zts symbol and object code files share the same format. An intermediate
file for a module M consists of a preamble followed by seven major and two
supplementary sections: the Module Table is a list of the names of the
modules imported by M; the Symbol Section holds a symbolic representation of
the items declared by M; the Expression Section is a set of unevaluated
constant expressions; the Binary Code and Data Sections contain code generated
during compilation; the Relocation Section contains relocation information
for the binary code; the Inline Section contains the intermediate code of
inline procedures exported by M; and the Deferred Code Section contains the
intermediate code of deferred procedures.

Preamble

Module Table

Symbol Section

Expression Section

Binary Code Section

Binary Data Section

Relocation Section

Inline Section

Deferred Code Section

Name Section

Statistics Section

Two supplementary sections, the Name and Statistics Sections, do not
take any active part in the translation process. The Name Section, which
contains the source code names of the items defined in M, is only used during
debugging and to produce binding-time error messages. The Statistics Section
contains information used to produce source statistics (see Section ). The
object code file Preamble contains file pointers to the start of each section,
as well as each section's logical size. Logical size is interpreted differently for
different sections: for the Expression Section it is the number of expressions
in the module; for the Inline and Deferred Code Sections it is the number of
intermediate code instructions; for the Binary Code Section it is the number
of bytes of code; etc.

3. This section is generally empty for object code files.


The Constant Expression Table

The Expression Section holds the CET, which is the central zts data
structure. The expressions organize all the inter-modular information needed during
pasting; they express, among other things, sizes of types, static error conditions,
the allocation of global variables, and the inline status and call-graphs of
procedures. The general structure of an expression is

    e_{m,i}.a = f(e_{m_1,i_1}.a_1, ..., e_{m_n,i_n}.a_n)

where e_{m,i}.a is the attribute a of the i-th expression in module m, f is a function
with n parameters, and the e_{m_k,i_k}.a_k's are references to attributes of other
expressions. The function f may be an arithmetic operation, an operation to
construct structured constants or method templates from their subparts, etc. In
most cases an expression will be associated with a particular exported abstract
or semi-abstract item, and each attribute will express one of the item's hidden
aspects. Seen this way, the CET serves as the global symbol table for abstract
and semi-abstract items.

The simple example in Figure  shows how the CET generated from a
module's specification unit contains a slot (an expression with an unknown
right-hand side) for every hidden aspect of each exported item. These slots are
filled in when the corresponding implementation unit is compiled.
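A toy model of this table can be given in Python (all names invented; this is a sketch of the idea, not the zts encoding): entries map a (module, index, attribute) key to either a known value or a deferred expression over other entries, and the binder evaluates entries on demand.

```python
# Toy Constant Expression Table: an entry is either a literal value or a
# pair (f, [keys...]) denoting a deferred expression over other entries.
# Slots created from a specification unit start out unknown and are
# filled in when the implementation unit is compiled.

cet = {}

def define(key, value):
    cet[key] = value                     # a literal, or (f, [keys...])

def evaluate(key):
    val = cet[key]
    if isinstance(val, tuple):           # deferred expression
        f, args = val
        val = f(*[evaluate(a) for a in args])
        cet[key] = val                   # memoize the evaluated value
    return val

# Example: N exports an abstract type N.T of size 8; M.T wraps a
# one-byte field plus an N.T, so its size refers to N's CET entry.
define(("N", 1, "size"), 8)
define(("M", 1, "size"), (lambda s: 1 + s, [("N", 1, "size")]))
```

Evaluating `("M", 1, "size")` pulls in the imported module's entry first, mirroring how zp resolves inter-modular constants at binding-time.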

CET Operators

The CET operators and operand types reflect the needs of the translating
system as well as the basic type system of the target language. Some of the
operators are given in Table  below. In addition to the operand types
shown (integer I, unsigned integer I+, real R, boolean B, set S, error E,
source position P, and address X), the CET also supports the character (C),
string (G), struct (U, record or array literals), and list (L) types. Polymorphic
operators take operands of type any (A).

CET Attributes

Figure  on the next page shows that different items produce different kinds
of entries in the CET. Every aspect of an abstract item produces a CET entry,
as do the hidden aspects of semi-abstract items. Concrete items, however, do
not as a rule produce any entries at all, since everything about a concrete
item is known to an importer at compile-time. Procedures are an exception: all
procedures, whether abstract, semi-abstract, or concrete, are represented in the

SPECIFICATION M
TYPE T
CONSTANT C CARDINAL
PROCEDURE P
PROCEDURE Q
END M

e_{M,0}.calls = ?                -- M
e_{M,1}.size = ?                 -- M.T
e_{M,2}.value = ?                -- M.C
e_{M,3}.inline = ?               -- M.P
e_{M,3}.calls = ?
e_{M,3}.addr = ?
e_{M,4}.inline = ?               -- M.Q
e_{M,4}.calls = ?
e_{M,4}.addr = ?

IMPLEMENTATION M
IMPORT N
TYPE T RECORD a : CHAR; b : N.T          -- T
CONSTANT C CARDINAL N.C                  -- C
INLINE PROCEDURE P BEGIN ... END P
PROCEDURE Q BEGIN ... N.P ... END Q
END M

e_{M,0}.calls = [...]
e_{M,1}.size = e_{M,5}.expr + e_{N,1}.size
e_{M,2}.value = e_{N,2}.value + e_{M,6}.expr
e_{M,3}.inline = T
e_{M,3}.calls = []
e_{M,3}.addr = null
e_{M,4}.inline = F
e_{M,4}.addr = x
e_{M,4}.calls = [e_{N,4}]
e_{M,5}.expr = ...
e_{M,6}.expr = ...

Figure  The CET for a module M. Any text following "--" is a comment.
Inline procedure addresses are given the empty value (null). T and F signify
TRUE and FALSE. Square brackets hold lists of expressions.

CET with entries for their address (addr) and their call-graphs (calls). This data
must be present at module binding-time in order to facilitate relocation and
dead-code elimination. Table  summarizes the more important attributes.

In addition to the above-mentioned attributes, each expression has a boolean
attribute global, defined such that e_{m,i}.global = T if the value of any of e_{m,i}'s
attributes may be used in a module other than m. Intuitively, e_{m,i}.global = T
if e_{m,i} represents an exported item, but the definition is complicated by the
fact that the body of an exported inline procedure may refer to an item which
itself is not exported. The global attribute is given its value at compile-time
according to the following rules:

1. e_{m,i}.global = T if e_{m,i} is associated with an exported item and does not
   represent an intermediate quantity. For example, if the size of a type is
   represented by a complicated expression with several subexpressions, only


operator                  signature             comments
a+b  a-b  a*b  max(a,b)   R x R -> R            Real arithmetic
a+b  a-b  a*b  max(a,b)   I x I -> I            Integer arithmetic
a=b  a#b                  A x A -> B            Bitwise comparisons
a AND b  a OR b  NOT a    B x B -> B            Logical operations
a+b  a-b  a*b             S x S -> S            Set operations
||a, b||                  I x I -> I            b - a + 1
if(e, v1, v2)             B x A x A -> A        if e then v1 else v2
insert(a, b, o, s)        A x A x I x I -> A    Replace bytes o .. o+s of a with
                                                bytes 0 .. s of b
extract(a, s, o)          A x I x I -> A        Extract bytes o .. o+s from a
allocate(a)               I+ -> A               Allocate constant value of size a
variable(a)               I+ -> X               Allocate global variable of size a
check(e, m, p)            B x E x P -> B        if NOT e then issue error message m
                                                at source position p

Table  A selection of CET operators.

attribute      type  kind  items
addr           X     syn   modules, procedures, variables
calls          L     syn   modules, procedures
ref            B     inh   modules, procedures
size           I+    syn   abstract and semi-abstract types
low bound      I     syn   hidden enumerable types, extensible subrange
                           types
high bound     I     syn   hidden enumerable types, extensible subrange and
                           enumeration types
width          I+    syn   hidden enumerable types, extensible subrange and
                           enumeration types
template size  I+    syn   extensible object types
template       A     syn   extensible object types
default        A     syn   types with an abstract or semi-abstract default
                           value
has default    B     syn   types with an abstract or semi-abstract default
                           value
inline         B     syn   abstract procedures
value          A     syn   abstract and semi-abstract constants
expr           A     syn   literal values and intermediate expressions

Table  A selection of CET attributes, their type, kind (synthesized or
inherited), and the items to which they apply.


   the top-level expression will be global.

2. e_{m,i}.global = T if there exists an inline procedure P exported by m such
   that the body of P, or the body of any inline procedure declared in m
   and called by P, refers to e_{m,i}.

3. Otherwise, e_{m,i}.global = F.

We will leave any further discussion of the global attribute to Section  of
the next chapter, where its use will become evident.

Borrowing from the terminology of the theory of attribute grammars (to
which zts expressions bear an obvious resemblance), we notice that all the
attributes discussed so far have been of the synthesized kind. An expression
attribute is said to be synthesized if its value is defined solely in terms of the
values of its subexpressions. For example, since the size of a structured type
is a function of the sizes of its component types, the attribute size must be
synthesized. One attribute is an exception: the boolean ref attribute, which is
defined to be T if the procedure with which it is associated might be called at
run-time. Since a procedure can be called only if any of the procedures which
call it may be called, ref is an inherited attribute. At compile-time, only the
expressions e_{m,i} associated with module bodies have e_{m,i}.ref = T.
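The propagation of the inherited ref attribute amounts to graph reachability over the call-graphs, which can be sketched in Python (hypothetical names; the roots are the module bodies, whose ref is T at compile-time):

```python
# Sketch of computing the inherited ref attribute: starting from the
# module bodies, mark every procedure reachable along the call-graph.
# Procedures never marked are dead code and need no machine code.

def compute_ref(calls, roots):
    """calls: procedure -> list of procedures it calls.
       roots: procedures whose ref is T to begin with."""
    ref = {p: False for p in calls}
    work = list(roots)
    while work:
        p = work.pop()
        if ref[p]:
            continue
        ref[p] = True
        work.extend(calls[p])
    return ref
```

This is the information zp would use for the dead-code elimination mentioned earlier: any procedure with ref = F can simply be dropped from the final program.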

CET Examples

It is sometimes convenient to think of the CET as a graph, where each e_{m,i}.a
is a node labeled with the operation f, with n outgoing edges to the nodes
e_{m_1,i_1}.a_1, ..., e_{m_n,i_n}.a_n. In most cases the graph will be a forest of expression
DAGs, but the following example shows that the graph may sometimes contain
cycles:

e_{M,0}.calls = [...]                           -- M
e_{M,1}.size = e_{N,1}.size                     -- M.T
e_{M,2}.value = e_{N,2}.value + e_{M,4}.expr    -- M.C
e_{M,3}.inline = T                              -- M.P
e_{M,3}.calls = [e_{N,3}]
e_{M,4}.expr = ...

e_{N,0}.calls = [...]                           -- N
e_{N,1}.size = e_{M,1}.size + e_{M,1}.size      -- N.T
e_{N,2}.value = e_{M,2}.value + e_{N,4}.expr    -- N.C
e_{N,3}.inline = T                              -- N.P
e_{N,3}.calls = [e_{M,3}]
e_{N,4}.expr = ...

Sequential HighLevel Module Binding

The example corresponds to the CETs resulting from modules M and N in
Figure  (page ), in which the declarations of T, T, and C have been
removed. The illegal declarations of M and N are evident from the concatenation
of the two CETs: the recursive definitions of M.T and N.T correspond to a
cycle involving the types' size attributes, and the recursive inline calls can be
discerned from the call-graphs of the procedures M.P and N.P.
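Detecting such illegal cycles can be sketched in Python (illustrative only, not the zts algorithm): evaluating a CET entry while it is already on the evaluation path means the definitions are mutually recursive, which must be reported as a static error.

```python
# Sketch of cycle detection over CET dependencies: a depth-first walk
# that keeps the current path; revisiting a node on the path yields the
# offending cycle, which the binder can turn into an error message.

def find_cycle(deps, start):
    """deps: entry -> entries its definition refers to.
       Returns a list forming a cycle reachable from start, or None."""
    path, done = [], set()
    def walk(node):
        if node in path:
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for d in deps.get(node, ()):
            cycle = walk(d)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None
    return walk(start)
```

Applied to the example above, the mutually recursive size attributes of M.T and N.T would be found and reported before any code is generated.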

Other types of deferred static error checks can also be expressed in the CET,
as may be seen from the following CET for the program in Section  (page ):

e_{M,0}.calls = [...]                                     -- M
e_{M,1}.size = ...                                        -- M.T
e_{P,0}.calls = [...]                                     -- P
e_{P,1}.addr = variable(e_{M,1}.size)                     -- V
e_{P,2}.expr = (e_{M,1}.size = e_{P,6}.expr)
e_{P,3}.expr = "Illegal type coercion"
e_{P,4}.expr = "Module P, line ..."
e_{P,5}.expr = check(e_{P,2}.expr, e_{P,3}.expr, e_{P,4}.expr)
e_{P,6}.expr = ...

Since the two types involved in the coercion do not have the same size, the
test in expression e_{P,2} will fail (e_{P,2}.expr = F), and a side effect of evaluating
e_{P,5} will be to issue the error message.

Our final example will be the construction of method templates for object
types. Zts implements an object variable as a pointer to a record containing the
type's instance variables, together with a pointer to the type's method template
(see also Sections  and ). This template is a constant record containing,
among other things, the addresses of the type's methods. The method
template for an object type T with a supertype S is constructed by concatenating
S's template with the addresses of any additional methods introduced
by T, and replacing those of S's methods which were overridden by T. In the
case when S is imported and not concrete, T's template cannot be built at
compile-time. An example of this situation is given in Figure .
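The construction rule just described can be sketched in Python (an illustration of the rule, not the zts insert-based encoding): the subtype's template is the supertype's template with overridden slots replaced, followed by the newly introduced methods.

```python
# Sketch of method template construction: templates are lists of
# (method name, address) pairs.  Overridden methods keep their slot in
# the inherited prefix; new methods are appended in declaration order.

def build_template(super_template, new_methods):
    """new_methods: list of (name, address) pairs, in declaration order,
       covering both overriding and newly introduced methods."""
    overrides = dict(new_methods)
    inherited = {name for name, _ in super_template}
    template = [(n, overrides.get(n, a)) for n, a in super_template]
    template += [(n, a) for n, a in new_methods if n not in inherited]
    return template
```

Because the inherited prefix keeps its slot positions, a call through a supertype view still indexes the correct method, which is exactly why the template cannot be laid out until the supertype's template (and its size) are known.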

SPECIFICATION M
TYPE T OBJECT
END M

e_{M,1}.size = ?                  -- M.T
e_{M,1}.template size = ?
e_{M,1}.template = ?

IMPLEMENTATION M
TYPE T OBJECT f : CHAR; P (x : CHAR)
PROCEDURE P (x : CHAR) BEGIN ... END P
END M

e_{M,1}.size = ...                -- M.T
e_{M,1}.template size = ...
e_{M,1}.template = {x}
e_{M,2}.addr = x                  -- M.P

PROGRAM P
IMPORT M
TYPE T WITH M.T OBJECT x : CHAR; Q (x : CHAR)
PROCEDURE Q (x : CHAR) BEGIN ... END Q
END P

e_{P,1}.size = e_{M,1}.size + e_{P,3}.expr                    -- P.T
e_{P,1}.template size = e_{M,1}.template size + e_{P,4}.expr
e_{P,1}.template = insert(e_{M,1}.template, e_{P,2}.addr,
                          e_{P,4}.expr, e_{P,4}.expr)
e_{P,2}.addr = x                  -- P.Q
e_{P,3}.expr = ...
e_{P,4}.expr = ...

Figure  Construction of object type templates. Irrelevant expressions have
been omitted. The size of P.T's instance variable record (e_{P,1}.size) will
evaluate to ..., and its template (e_{P,1}.template) will be {x, x}.

The Intermediate Form

There are several potentially time-consuming operations that have to be
performed during module binding, all involving the program in its intermediate
form. Therefore, great care has to be taken in the design of the intermediate

representation. The code must be compact, so that it may be quickly loaded from
the object code file; low-level, in order for the code generation algorithm to be
simple and efficient; and yet high-level, in the sense of enough program structure
being retained for the effective application of inline expansion of subprograms
and other code optimization techniques. It is obvious that the resulting
representation must be a compromise between these conflicting criteria.

Zts uses an attributed linear code, which somewhat resembles the intermediate
codes described by Ganapathi and Kornerup. A Zcode block
contains the intermediate code for a procedure, function, or module body, and

4. Henceforth, Zcode.


consists of a header, a footer, and five major sections:

    Header
    Formal Section
    Constant Section
    Global Section
    Local Section
    Code Section
    Footer

The Header contains a reference to the CET entry for the block. The Formal
Section lists the block's formal parameters. The Constant and Global
Sections are lists of the constant values and global variables referenced in the
block. The Local Section declares local variables. The Code Section contains
the actual intermediate code instructions which represent the body of the
corresponding procedure, and the Footer ends the block.

The Formal, Constant, Global, and Local Sections will be called declarative,
since they declare each object used by the block. These declarations are
essential to the application of many code generation and optimization
techniques which require the unique identification of all referenced objects, such as
formal parameters, global and local variables, constants, called procedures, and
compiler-generated temporaries. Including these declarative sections within
Zcode blocks furthermore makes each block a self-contained unit whose only
non-local references are to CET expressions. This will be convenient, as we will
see in the next chapter, since Zcode blocks will sometimes be transferred
between the sites of a distributed system.

Each section within a block contains zero or more numbered Zcode instructions:
tuples made up of an attributed operator followed by three attributed
arguments. Arguments are typically labels (references to other instructions),
literal values, addresses, or references to CET expressions. Attributes typically
convey type and size information, or information which aids in the code
generation and optimization phases.
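One plausible in-memory model of such a tuple can be written in Python (illustrative only; this is not the actual zts encoding, and all field names are invented):

```python
from dataclasses import dataclass, field
from typing import Any

# Model of a Zcode tuple: a numbered instruction with an attributed
# operator and up to three attributed arguments.  An argument may be a
# label, a literal value, an address, or a reference into the CET.

@dataclass
class Arg:
    kind: str                     # "label" | "value" | "address" | "cet"
    payload: Any                  # tuple number, literal, address, CET key
    attrs: dict = field(default_factory=dict)   # e.g. {"Live": True}

@dataclass
class Instr:
    number: int                   # position within its section
    op: str                       # e.g. "CONST", "LOCAL", "ASSIGN"
    attrs: dict                   # e.g. {"Type": "I", "Size": 4}
    args: tuple                   # the attributed arguments
```

A Size attribute whose payload is a CET key rather than an integer is precisely what marks an instruction as deferred until binding-time.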

Zcode Attributes

The table at the top of the next page lists the attributes applicable to Zcode
instructions. The Mode attribute indicates the transfer mode of formal
and actual parameters. The Modified attribute is T when a block modifies a
particular formal parameter.

attribute  definition
Block      One of Proc, Func, Module, Prog
Type       One of the standard types I, I+, B, R, C, G, X, S, U of Section 
Size       A literal integer value, a standard size (Byte, Short, Long), or a CET
           reference of type I+
Mode       One of VAL, REF
Modified   A literal boolean value (T or F)

The current implementation only defines three argument attributes:

attribute  definition
Live       A literal boolean value (T or F)
Use        A literal boolean value (T or F)
Ind        A literal boolean value (T or F)

The Use and Live attributes indicate whether the argument has a next use
(will be referenced later on in the same basic block) or is alive (the current
value will be used later on). The Ind attribute (shown with a special mark if T)
indicates that the argument is an indirect reference to the actual value.

Zcode Instructions

Instruction arguments can be of four types:

argument  definition
Label     A Zcode tuple number
Value     A literal constant (such as  or "Hello") or a CET reference
Address   A relocatable address or a CET reference of type X
CET       A CET expression

Some of the Zcode instructions can be found in Tables  and . Table
 contains the declarative instructions of Zcode, i.e. the instructions
which introduce new data items. Table  shows the imperative instructions
found in the Code Section.

Zcode Examples and Discussion

We give one short Zcode example in Figure , which mainly shows how
unknown quantities are referenced through the CET.

It is natural at this point to question the extent to which the chosen
representation fulfils the criteria set up at the beginning of the section, or whether
some other intermediate form would have served the purpose better. It should
be clear that Zcode sacrifices compactness for ease of code generation and
optimizations such as inline expansion. Languages such as postfix notation,


instruction  attributes                  arguments           section
BEGIN        Block                       a:CET c,g,l:Label   Header
  Beginning of block a. The arguments c, g, and l point to the first instructions in the
  Constant, Global, and Local Sections, respectively.
FORMAL       Mode,Size,Type,Modified     u:I+                Formal
  A formal parameter; u is the formal's usage count.
OPEN         Mode,Size,Type,Modified     h:Label u:I+        Formal
  An open array formal parameter. Size and Type refer to the array element; h
  references the formal parameter holding the high bound of the array.
CONST        Size,Type                   v:Value u:I+        Constant
  A constant value.
GLOBAL       Size,Type                   a:Address u:I+      Global
  Reference to a global variable.
LOCAL        Size,Type                   u:I+                Local
  A local variable.
TEMP         Size,Type                   u:I+                Code
  Compiler-generated temporary.
END          Block                                           Footer
  End of a block.

Figure  Zcode declarative instructions and the sections in which they occur.
Most declarative instructions have a usage-count argument u, computed in line
with the Dragon book (pp. ).

in which the relationships b etween intermediate co de instructions are implicit

rather than explicit are evidently more compact than tuple notations Fur

thermore Zco de wastes space by including type and size information in declar

ative as well as imp erative instructions and by including explicit declarations

for each compiler generated temp orary variable This redundant information

while slowing down the reading and writing of intermediate co de blo cks sim

plies co de generation

In regard to the level of representation, the next section will show that Zcode includes features which make it well suited for inline expansion. While Zcode's small instruction set and severely restricted addressing modes (only the ASSIGN- and INIT-operators support indirect referencing) make it suitable for RISC-like target architectures, code generators which want to take advantage of complex addressing modes will have to reconstruct lost context. The next-use and usage-count information aid code generation, as does the Modified-attribute of formal parameters. The reason that this type of information is included directly in the intermediate code is to make each Zcode block self-contained.

Footnote: The current implementation passes large (larger than a machine word) formal value parameters by reference, and it is the callee's responsibility to make a local copy upon entry. If the callee can determine that it does not modify the formal (Modified = F), no copy need be made.

instruction    attributes    arguments

ASSIGN    Size, Type    a:Ind,Live,Use  b:Ind,Live,Use
   a := b (↑a := b means assignment through a, an address).

INIT    Size, Type    a:Ind,Live,Use  b:Live,Use  h:Value
   If h = T then a := b; b will always be a constant value.

LOAD        a:Live,Use  b:Live,Use
   a is assigned the address of the variable referenced by b.

ACTUAL    Mode, Size, Type    e:Live,Use  l
CALL        p:CET  f
CALLIND        r:Live,Use  f
   A call to procedure p, or an indirect call to the procedure referenced by variable r. f points to the first ACTUAL of the call, l to the previous one. e is the value or address of the actual argument.

ADD    Size, Type    a:Live,Use  b:Live,Use  c:Live,Use
   a := b + c. Type is I, I+ or R.

JUMPLT    Size, Type    a:Live,Use  b:Live,Use  L
JUMP        L
   Conditional jump (JUMPLT, to L if a < b) and unconditional jump (JUMP, to L).

RANGE        v1, v2:Value  L
CASE    Size, Type    e:Live,Use  f  d
   The RANGE-instruction is a jump-table entry for the values v1..v2, causing a jump to L. The CASE-instruction jumps to the label associated with the value of the expression e; f points to the first RANGE-instruction and d to the default (ELSE) instruction.

Figure: A subset of Zcode's imperative instructions. Arguments are of type Label unless otherwise specified.

There are certainly a host of other intermediate code forms which could be employed instead of Zcode, and which might facilitate either optimization or code generation, or which would improve portability. Holt's Data Descriptors, for example, are likely to be good building blocks for simple and portable code generators, but are less likely to be suitable for extensive optimization. The Program Dependence Graph and the Static Single Assignment Form are two interesting recent developments designed to support aggressive optimization techniques. It is not clear at this point, however, whether code generation and optimization from these representations can be made efficient enough for our purposes. On the other hand, while representations which are closer to the target architecture, such as Register Transfer Lists (RTL), might improve the speed of code generation, the lack of type and size information for instructions which manipulate abstract items will make their application difficult. It would be difficult, for example, to give an RTL translation of procedure P's first assignment in the figure below, since it is not known at compile-time whether the arguments are small enough to fit in registers or whether they are larger structures which need to be manipulated through address registers. Furthermore, low-level representations such as RTL are less suited to high-level optimizations such as inline expansion: RTL inlining in the GNU C compiler, for example, requires far more lines of code than the zts inline expansion module.

Footnote: At compile-time, that is, when the intermediate code for deferred procedures is generated.

[Figure: A Zcode translation of the procedure P. The original two-column figure shows the specification of a module M (a type T, a constant C: CARDINAL, variables V: T and X: CARDINAL, and a procedure P(f: T)), a client program N whose procedure P(f: M.T) executes f := M.V; r := M.C + M.X; M.P(x), and the corresponding Zcode, in which the unknown aspects of M's items (the size and default value of M.T, the value of M.C, the address of M.X, and the inline-, calls- and addr-attributes of M.P) are referenced through CET expressions. The relocatable address of M.X is computed by adding an offset to the base address of M's data area. Usage-counts are prefixed by a + sign.]

Inline Expansion

The merits of procedure inlining have for a long time been the subject of much debate. Several issues have been at stake:

- Should inlining be user defined, or should it be the responsibility of the translating system to determine the routine calls which should be inlined?
- How can a translating system determine when inlining will be profitable?
- Which optimizations are likely to benefit from the larger scope opened up by procedure inlining?
- Should optimizations be performed before or after inlining?
- How does the target architecture affect the benefits of inlining?
- How can inlining be performed efficiently?

Mesa, Ada and C, as well as dialects of many other languages (e.g. Maclaren's Pascal dialect), require the programmer to specify the routines which should be subjected to integration. This is sometimes referred to as user-directed inlining. Zuse goes one step further by promoting inlining to an abstraction primitive, as described earlier.

Translating systems for languages without explicit inlining directives have sometimes chosen the other route, which is to heuristically determine the routines which will profit from integration. The advantage of this method is that it treats inlining as just another optimization technique, whose application is transparent to the programmer. Determining the routines and call-sites which will profit from integration has turned out to be a difficult problem, and several heuristic methods have been devised. Ball used data flow analysis to estimate how inlining calls with constant actual arguments affects optimization. Scheifler used a heuristic which inlined small routines, routines with just one call-site, and routines which, according to collected profiling information, were called often. McFarling considered the effect of inlining on instruction cache performance, also in the light of profiling data.

Footnote: Inline procedures are also known as open procedures, and inline expansion is sometimes referred to as procedure integration, substitution or merging.

Footnote: The MIPS compiler suite supports both inlining directives and an internal heuristic.

Several studies have investigated the benefits of inlining, and it has sometimes been questioned whether the improvements in execution-time are outweighed by increased translation time. Scheifler reports execution-time savings for unoptimized vs. unoptimized integrated code, and Richardson for optimized vs. optimized integrated code, whereas Himelstein notes that the degradation in cache performance outweighs the benefits of inlining, to the point that the MIPS compiler has turned off inlining altogether. Furthermore, Cooper's study of FORTRAN compilers' ability to capitalize on inlining showed no consistent improvements. Holler reports that the Sun C compiler ran slower, on average, on programs subjected to her source-to-source C inliner, while Richardson reports that their inliner, integrated in an optimizing compiler, slowed down compilation.

Additional questions are where in the translation process inlining should be performed, and how inlining across module boundaries should be handled. Holler inlines at the source code level, and considers all source modules referenced by a program in order to extract the necessary function definitions. Himelstein concatenates the intermediate code files of each module prior to integration. Sale suggests link-time intermediate code integration, much as is done in this thesis. Languages such as C and Mesa require the body of inline routines to appear in module interfaces in order to be considered for cross-module inlining.

Inlining in zts

Both the zts compiler and module binder are equipped to perform inline expansions: zc performs all intra-modular expansions and all inter-modular expansions of concrete inline routines, but defers inter-modular expansions of abstract and semi-abstract inline routines to zp. We distinguish between two kinds of expansions: inline-in-inline expansions (integration of calls to inline routines in inline routines), and expansions of inline in non-inline routines. Both zc and zp perform the two kinds in separate phases; our reasons for separating the two kinds of inlining will be made evident in the next chapter. In order to minimize the number of expansions, the inline-in-inline integrations are performed in reverse topological order (i.e. bottom-up in the inline call-graph), and the remaining expansions are performed in a subsequent phase. The zts inliner rejects directly as well as indirectly recursive inline calls, i.e. the inline call-graph is assumed to be acyclic.

Footnote: This technique seems to have been independently discovered by several authors; see, for example, Hall, Holler and Hwu. Allen shows, however, that the quality of the resulting code can suffer from the use of this technique.
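The bottom-up (reverse topological) ordering of inline-in-inline expansions, together with the rejection of recursive inline calls, can be sketched as follows. This Python fragment is illustrative only; the graph representation and names are assumptions, not the zts data structures:

```python
# Depth-first postorder over the inline call-graph: callees are emitted
# before their callers (bottom-up order), and a back edge signals a direct
# or indirect recursive inline call, which the inliner must reject.

def bottom_up_order(inline_calls):
    """inline_calls: dict mapping each inline routine to the inline
    routines it calls. Returns routines in bottom-up expansion order,
    or raises ValueError on a recursive inline call."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on stack / done
    colour = {r: WHITE for r in inline_calls}
    order = []

    def visit(r):
        if colour[r] == GREY:             # back edge: recursive inline call
            raise ValueError("recursive inline call involving " + r)
        if colour[r] == BLACK:
            return
        colour[r] = GREY
        for callee in inline_calls[r]:
            visit(callee)
        colour[r] = BLACK
        order.append(r)                   # postorder: callees come first
    for r in inline_calls:
        visit(r)
    return order
```

Expanding in this order guarantees that, when a routine is processed, its inline callees have already been fully expanded, so each call-site is integrated only once.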

We will give a brief description of the zts inliner, to illustrate how the Zcode design makes for very simple and efficient inline expansion. In Algorithm below, caller and callee are the Zcode tuple numbers of the BEGIN-instructions of the caller and callee, respectively, and callsite is the tuple number of the CALL-instruction which is to be substituted.

Algorithm (Expand Inline Call)

PROCEDURE Expand (caller, callee, callsite)
   M := MergeSection (caller.c, callee.c)
   M := M ∪ MergeSection (caller.g, callee.g)
   M := M ∪ MapFormals (callsite.f, callee)
   M := M ∪ CopyCode (callsite, callee.l)
   Update (callsite.f, M)
END Expand

Expand starts by merging the Constant and Global Sections of the callee into the corresponding sections in the caller. If the sections are sorted, this can be done in time linear in the total number of tuples in the sections. As part of the merging process, a map M from the callee's old Zcode tuple numbers to its new numbers in the caller is constructed. MapFormals maps the callee's formal parameters to the actuals used in the call. In some cases this will involve the copying of an actual into a fresh local variable, and this is also performed by MapFormals. The copying is necessary for pass-by-value actuals whose corresponding formal parameter is modified by the callee; this is discernible from the formal's Modified-attribute. CopyCode copies the callee's Local and Code Sections into the call-site and removes the ACTUAL- and CALL-instructions. Finally, the inserted instructions are updated from the tuple-map M. The entire process runs in time proportional to the number of tuples in the callee plus the number of tuples in the caller's Constant and Global Sections.
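Two of the mechanisms used by Expand, merging a callee section into the caller while building the tuple-map M, and updating copied instructions through M, can be sketched as follows. The representation (sections as lists of numbered entries, per-section numbering) is an assumption made for illustration and is simpler than actual Zcode:

```python
# Illustrative sketch of section merging with tuple-map construction, and of
# rewriting copied instructions through the resulting map.

def merge_section(caller_sec, callee_sec, m):
    """Both sections are lists of (tuple_no, payload). Callee entries whose
    payload already exists in the caller are shared; new entries are
    appended. Each callee tuple number's new home is recorded in m."""
    payload_to_no = {payload: no for no, payload in caller_sec}
    next_no = max((no for no, _ in caller_sec), default=0) + 1
    for no, payload in callee_sec:
        if payload in payload_to_no:          # shared constant/global: reuse
            m[no] = payload_to_no[payload]
        else:                                 # new entry in the caller
            caller_sec.append((next_no, payload))
            payload_to_no[payload] = next_no
            m[no] = next_no
            next_no += 1
    return caller_sec

def update_code(code, m):
    """Rewrite the argument fields of copied instructions via the map m."""
    return [(op, tuple(m.get(a, a) for a in args)) for op, args in code]
```

The merge is linear in the total number of entries, in line with the complexity argument above; shared constants and globals are not duplicated, so repeated expansion of the same callee does not make the caller's sections grow without bound.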

The Compiler

Zuse-type encapsulation puts a heavy burden on the module binder, since the lack of compile-time inter-modular information forces the compiler to defer many of its tasks to binding-time. The design of zc has therefore been guided by the general principle that, even when encapsulation restricts the compiler's access to inter-modular information, as much translation work as possible should be performed at compile-time.

With some exceptions, zc resembles compilers for traditional modular languages, and traditional techniques can be employed for most of zc's translation tasks. The present implementation runs in the six phases shown below.

Algorithm (Compilation)

1. Syntactic Phase
2. Evaluation Phase
3. Semantic Phase
4. Intermediate Code Phase
5. Machine Code Phase
6. Output Phase

The phases are much the same whether a specification or implementation unit is compiled, except that the compilation of an implementation includes an initial phase which loads all sections of the corresponding symbol file, and that the Output Phase generates object code files from implementation units and symbol files from specification units. We next give a brief description of each of the major phases.

Syntactic Phase

Since Zuse, like Modula-2, allows procedure and variable identifiers to be used prior to their declaration, syntactic and semantic analysis are performed in distinct phases. The Syntactic Phase builds the symbol table and the abstract syntax tree, and loads the symbol files of imported modules.

Algorithm (Syntactic Phase)

1. Load the Symbol and Inline Sections from the symbol files of imported modules. In the current implementation we use the B approach to symbol file processing, described earlier.
2. Build symbol table entries for the declared items.
3. If we are compiling a specification unit, then create CET entries for every hidden aspect of every declared abstract or semi-abstract item.
4. Build the abstract syntax tree for declared procedures.

Footnote: The present implementation does not support concrete procedures, and hence the Inline Section will always be empty.

The processing of declarations and the building of the symbol table will call for the evaluation of many constant expressions. These stem from such things as manifest constants, the calculation of sizes of types, offsets of record fields, etc. Because of Zuse's flexible encapsulation scheme, virtually every constant expression which arises during the compilation of a Zuse module can involve terms whose values are unknown at compile-time. When an expression cannot be evaluated due to insufficient inter-modular information, a corresponding CET entry is built instead. Similarly, context conditions which cannot be checked at compile-time generate the appropriate CET check-expressions.

To reduce the amount of work that has to be done at binding-time, we employ three techniques to minimize the number of CET expressions:

1. common subexpression elimination
2. algebraic simplifications
3. declaration reordering
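The first of these techniques can be illustrated by a small sketch of hash-consing: every expression node is looked up before it is built, so that structurally identical subexpressions share a single entry and the expression tree becomes a DAG. The representation below is assumed for illustration and is not the actual zc data structure:

```python
# Illustrative hash-consed builder for CET-like expressions: structurally
# identical subexpressions are constructed only once and shared thereafter.

class CetBuilder:
    def __init__(self):
        self.nodes = []        # the DAG: one (op, operands) entry per index
        self.index = {}        # structural key -> node index

    def node(self, op, *operands):
        key = (op, operands)
        if key in self.index:             # identical subexpression exists:
            return self.index[key]        # reuse it instead of duplicating
        self.nodes.append(key)
        self.index[key] = len(self.nodes) - 1
        return self.index[key]
```

For example, building size(M.T) + size(N.T) and later size(M.T) * 2 yields only five nodes in total, because the second request for size(M.T) returns the node built for the first.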

The elimination of common subexpressions turns the CET into a graph (usually a DAG) rather than a tree. There are mainly two cases when the reordering of declarations pays off: record fields and global variables. Consider, for example, the following declaration, which gives rise to CET expressions for the size of F.T and for the offsets of its fields, each built from the unknown sizes of the imported types M.T and N.T:

F

e expr , PROGRAM F

F 1

IMPORT MN

e expr , e size e expr

F 2 M 1 F 1

TYPE T UNSAFE RECORD

e expr , e expr e size

F 3 F 2 N 1

a MT

e size , e expr e expr

F 4 F 3 F 1

b INTEGER

c NT

d INTEGER

END F

If instead the record is reorganized so that fields of known size precede those of unknown size, we are able to save one expression:

PROGRAM F
   IMPORT M, N
   TYPE T = RECORD
      b : INTEGER
      d : INTEGER
      a : M.T
      c : N.T
   END
END F

Now only the offset of c and the size of T depend on the unknown imported sizes.

Footnote: The Zuse UNSAFE keyword is used to prevent reorderings of the kind presented here. This facility is useful mainly when interfacing with other languages.

Furthermore, in contrast to the first case, when only the offset of a is known at compile-time, the reordering allows us to calculate the offsets of b and d. This may save work at module binding-time, since procedures which only reference the a, b and d fields of T need not be deferred.
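The reordering idea can be sketched as follows. The fragment is illustrative only: field sizes are modelled as known integers or None for sizes known only at binding-time, and the deferred-entry counts it reports are a simplification of the CET expression counts in the example above (shared subexpressions are not modelled):

```python
# Illustrative sketch of declaration reordering: placing fields of known
# size first maximizes the number of offsets computable at compile-time.

def layout(fields):
    """fields: list of (name, size_or_None). Returns (known offsets,
    number of deferred offset/size entries)."""
    known_offsets, deferred = {}, 0
    offset, still_known = 0, True
    for name, size in fields:
        if still_known:
            known_offsets[name] = offset
        else:
            deferred += 1                   # offset needs a CET expression
        if size is None:                    # unknown size: every later
            still_known = False             # offset becomes unknown too
        else:
            offset += size
    # the total record size is deferred too if any size was unknown
    return known_offsets, deferred + (0 if still_known else 1)

def reorder(fields):
    """Known-size fields first (stable), unknown-size fields last."""
    return ([f for f in fields if f[1] is not None] +
            [f for f in fields if f[1] is None])
```

With the record of the example (a and c of unknown size, b and d of size 4), the original order leaves only a's offset known, while the reordered layout fixes the offsets of b, d and a at compile-time.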

The Evaluation and Semantic Phases

The Evaluation Phase, which need only be performed during the compilation of implementation units, is necessary since some of the constant expressions generated during the compilation of the specification may become evaluable in the light of the information in the implementation unit.

Algorithm (Evaluation Phase)

1. Evaluate as many CET expressions as possible.
2. Update the symbol table with the newly computed values.

In addition to traversing and decorating the abstract syntax tree produced during the Syntactic Phase, the Semantic Phase performs the following actions:

Algorithm (Semantic Phase)

1. Check for static semantic correctness.
2. Build the CET call-graph.
3. Create the appropriate CET check-expressions for the static checks which cannot be performed due to insufficient inter-modular information, such as those described earlier.
4. Set the global-attribute of all CET expressions according to the rules given earlier.

We should note here that, in spite of what has been said, not all deferred constant expressions and deferred static tests need to generate explicit CET entries. For example, we do not generate CET expressions that calculate the activation record offsets of local variables and formal parameters, and we do not generate CET checks for the uniqueness of case statement labels (discussed further below). The reason is that the machine code generator handles both these cases more efficiently on its own, without the explicit help of the CET.

Intermediate and Machine Code Phases

The Intermediate Code Phase produces and optimizes Zcode blocks.

Algorithm (Intermediate Code Phase)

1. Generate a Zcode block for each procedure, function and module body in the abstract syntax tree.
2. Expand inline calls in available inline procedures, according to a bottom-up traversal of the inline call-graph.
3. Expand inline calls in non-inline procedures.
4. Optimize each Zcode block.

Before the Machine Code Phase can commence code generation, it must determine which procedures need to be deferred. Intuitively, it is necessary to defer any Zcode block containing references to the CET, but we will show that in some special cases this rule can be relaxed. We say that a Zcode block B needs to be deferred iff it contains an instruction I such that

1. I is a CALL-instruction and the callee is an imported abstract procedure or an imported inline semi-abstract procedure. Note that this excludes calls to local procedures, imported concrete procedures, and imported semi-abstract non-inline procedures.

2. I is a GLOBAL-instruction with an unknown size-attribute, and B performs some operation on the global variable (such as assignment or bitwise comparison) which requires knowledge of its size. This excludes operations which involve only the global's address, such as passing it as a reference parameter, accessing one of its components (if it is a record, array or set), and taking its address.

3. I is a FORMALREF- or OPENREF-instruction with an unknown size-attribute, and the conditions in the previous point hold.

4. I is any other kind of instruction (other than BEGIN and END), and one of I's attributes or arguments contains a CET reference.

In other words, Zcode blocks which only manipulate the address of abstract or semi-abstract items, or the addresses of objects which depend on such items, need not be deferred. The reason is, of course, that the size of an address is always known, and that it is possible to generate code for a procedure even if it contains references to addresses which are not yet known. Filling in the correct addresses once they are known is handled by the relocation system described in the next section. We can now conclude this section by listing the tasks performed by the Machine Code Phase.
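The deferral test can be condensed into a sketch like the following. The instruction encoding (dictionaries with illustrative flags such as callee_kind and size_unknown) is assumed for the example; it is not the actual Zcode representation:

```python
# Illustrative deferral predicate: a block is deferred iff some instruction
# in it depends on inter-modular information beyond a mere address.

ADDRESS_OPS = ("GLOBAL", "FORMALREF", "OPENREF")

def needs_deferral(block):
    """block: list of instruction dicts with 'op', 'attrs', 'args' keys,
    plus the illustrative per-instruction flags used below."""
    for i in block:
        # rule 1: calls to imported abstract / semi-abstract inline routines
        if i["op"] == "CALL" and i.get("callee_kind") in (
                "imported-abstract", "imported-semi-abstract-inline"):
            return True
        # rules 2 and 3: size-dependent operation on an item whose size is
        # unknown (address-only uses of such items do not force deferral)
        if i["op"] in ADDRESS_OPS and i.get("size_unknown") \
                and i.get("sized_operation_used"):
            return True
        # rule 4: any other instruction carrying a CET reference
        # (CALL is excluded here: every CALL names its callee via the CET)
        if i["op"] not in ("BEGIN", "END", "CALL") + ADDRESS_OPS \
                and any(a == "CET" for a in i["attrs"] + i["args"]):
            return True
    return False
```

Note that CALL- and address-introducing instructions are excluded from the last rule, since they always carry CET or address references that do not by themselves force deferral; this mirrors the relaxation described above.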

Algorithm (Machine Code Phase)

1. Compute basic block and next-use information.
2. Determine which Zcode blocks need to be deferred.
3. Generate machine code for the non-deferred Zcode blocks.

The Sequential Paster

Having examined the design of the Zuse compiler and the data structures shared by the compiler and the sPaster (the sequential paster), we can now turn to the design of the sPaster itself. In some ways the sPaster resembles ordinary systems' link editors; in other ways it is completely different. Both the sPaster and ordinary link editors combine the code produced by a compiler for separately compiled modules into an executable file. The sPaster, as well as most link editors, runs in several phases, the first phase loading definitions and the last phase performing relocations. However, unlike link editors, the sPaster is equipped to perform code generation, inter-modular optimization, and some static semantic checking.

Our present sPaster design runs in four major phases:

Phase 1 determines which modules will make up the resulting program, loads the expressions and inline procedures of all modules, and copies the binary code and data sections to the resulting executable file.

Phase 2 evaluates the expressions, allocates global variables, performs deferred semantic checks, determines unreferenced procedures, and expands inline calls in inline procedures.

Phase 3 reads the intermediate code of every referenced procedure, updates the code with the expression values calculated during Phase 2, expands inline calls, optimizes the intermediate code, generates machine code, and writes the generated code to the executable file.

Phase 4 performs final relocation.

In addition to providing a nice conceptual framework for discussing the necessary tasks of high-level binders, we have several good reasons for dividing the sPaster's processing along these lines. First of all, the sPaster can be expected to have to deal with very large amounts of data, and it is essential to try to minimize the amount that has to be kept in primary memory at any one time. It would, for any but the smallest programs, be quite unacceptable to try to store all the information in all the object code files in primary memory. Instead, we open each object code file twice, each time reading only the necessary information: the first time during Phase 1, when the necessary global information is loaded (essentially the CET and the inline procedures), and the second time during Phase 3, when the deferred procedures are read. Once Phase 3 has completed code generation and optimization for the deferred procedures of a particular module, all the data which has been gathered for that module can be safely discarded. Only the information read during Phase 1 and computed during Phase 2 need remain in primary memory during the entire execution of the sPaster.

Separating expression evaluation and inline-in-inline procedure integration from the rest of the sPaster's processing has several advantages. First of all, since Phase 1 has completed the reading of the Expression Sections when Phase 2 commences processing, all CET expressions are available to it for evaluation. If instead evaluation ran concurrently with Phase 1, then the evaluation algorithm would have to be able to handle the case when a certain expression was not yet available. Another reason for separating the evaluation of expressions from the rest of the sPaster processing is that it enables static errors to be found early. If instead Phase 2 were incorporated into Phase 3, with expression evaluation and procedure integration proceeding on an as-needed basis, a static error might go undetected until the very end of Phase 3. Furthermore, since our current design evaluates all expressions prior to the start of Phase 3, the code generator and optimizer do not need to know about unevaluated expressions; in fact, the sPaster's code generator is identical to the one used in the Zuse compiler. Finally, as we shall see in the next chapter, a separate evaluation phase enables efficient distributed evaluation and replication of CET expressions.

Footnote: The total size of the object code files for the largest program in the test-suite used later in the thesis amounts to many megabytes.

While the division of labor among Phases 1, 2 and 3 may seem quite natural, it is not unreasonable to question the need for a separate relocation phase (Phase 4). One might instead consider performing relocation concurrently with Phases 1 and 3, updating the executable file as new addresses of procedures and other data items become available. We will examine

this approach further in the next chapter. A later section will also discuss whether a less strict separation between the four phases would have a computational advantage.

We will now consider each phase in more detail.

Phase 1

Finding the modules which are going to take part in the final program means finding the closure of the modules' Module Tables. This is accomplished by Algorithm below.

Algorithm (Phase 1)

1. Let M be the set of modules found so far, and let M initially contain the main module. Let U be a queue (with the operations Insert, Delete and Empty) of hitherto unprocessed modules; U is also initialized to contain the main module. Let PC be the program counter.

2. Execute the following code:

   PC := 0
   M := {main module}
   U := {main module}
   Create the executable file
   REPEAT
      n := Delete (U)
      Locate and open n's object file and load the preamble
      Read n's Module Table L
      FOR ALL V SUCH THAT V ∈ L ∧ V ∉ M DO
         M := M ∪ {V}
         Insert (U, V)
      END
      Load n's Expression and Inline Sections
      FOR EACH addr-attribute e[n,i].addr DO
         Let e[n,i].addr := e[n,i].addr + PC
      END
      Copy n's Data Section to the executable file
      Copy n's Binary Code Section to the executable file
      INC (PC, the size of the Data and Binary Code Sections)
      Load n's Relocation Section
      Close n's object file
   UNTIL Empty (U)

The only part of this algorithm which may require some explanation is the updating of the addr-attributes. As explained earlier, the Binary Code Section holds the machine code generated at compile-time for procedures which do not reference any imported abstract items. The compiler assigns procedures addresses relative to the start of the module and stores them in the CET's addr-attributes. When the Binary Code Section is copied to the executable file, the procedures' addresses of course change, and this is reflected in the updating of the addr-attributes.
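The closure computation at the heart of the algorithm above is a standard worklist traversal of the import graph. A minimal sketch, with a plain dictionary standing in (purely for illustration) for the Module Tables read from the object files:

```python
# Worklist computation of the module closure: M is the set of modules found
# so far, U the queue of modules whose Module Tables are yet to be read.
from collections import deque

def module_closure(main, module_table):
    """module_table: dict mapping a module name to the list of modules it
    imports. Returns the set of modules making up the program."""
    found = {main}                    # M: modules found so far
    unprocessed = deque([main])       # U: queue of unprocessed modules
    while unprocessed:                # UNTIL Empty(U)
        n = unprocessed.popleft()     # n := Delete(U)
        for v in module_table[n]:     # every module listed in n's table...
            if v not in found:        # ...not seen before is added to both
                found.add(v)
                unprocessed.append(v)
    return found
```

Each module is enqueued at most once, so the traversal is linear in the total size of the Module Tables; in the sPaster, the body of the loop additionally loads sections and copies code, as shown in the algorithm.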

Phase 2

Phase 2 performs two independent tasks: expression evaluation and inline-in-inline expansion. Algorithm below describes these actions.

Algorithm (Phase 2)

1. Evaluate all CET expressions in a depth-first traversal. Write any generated structured constant data objects (such as object type templates and literal records, strings, sets or arrays) or allocated variables to the executable file, and update the PC accordingly. If any of the deferred static semantic checks have failed, or if the inline call-graph is not acyclic, issue the appropriate error messages and terminate processing.

2. Store generated relocation information.

3. Fill in calculated expression values from the CET in the Zcode of all inline procedures.

4. Expand inline calls in referenced (e[m,i].ref = T) inline procedures, according to a bottom-up traversal of the inline call-graph.

The relocation information generated during Phase 2 stems from the construction of method templates. A method template contains method addresses, but if a method is deferred, its address will not be known until Phase 3. Hence the addresses in the method templates will in some cases have to be filled in during relocation (Phase 4).
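The depth-first evaluation of CET expressions can be sketched as follows; the node representation is assumed for illustration, and only two operators are shown. Because the CET is a DAG, already-evaluated shared subexpressions are reused rather than recomputed:

```python
# Illustrative depth-first evaluation over a CET-like DAG with memoization.

def evaluate(cet, root, values=None):
    """cet: dict mapping node id -> ('const', value) or (op, operand ids).
    Returns the value of root, memoizing shared subexpressions in values."""
    if values is None:
        values = {}
    if root in values:                     # DAG: already evaluated, reuse
        return values[root]
    op, args = cet[root]
    if op == "const":
        result = args
    else:
        operands = [evaluate(cet, a, values) for a in args]  # depth-first
        if op == "+":
            result = sum(operands)
        elif op == "*":
            result = operands[0] * operands[1]
        else:
            raise ValueError("unknown operator " + op)
    values[root] = result
    return result
```

In the sPaster this traversal additionally writes generated constant data to the executable file and reports failed deferred checks; the sketch shows only the evaluation order.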

Phase 3

Once Phase 2 has been completed, all inter-modular information is available, and Phase 3 can proceed to generate code for deferred procedures. Since Phase 2 has determined the procedures which may be referenced at run-time, we can save some translation-time and code space by ignoring all other procedures.

Algorithm (Phase 3)

PROCEDURE Phase3
   FOR EACH module n DO
      Reopen n's object file
      Read n's Deferred Code Section
      (* Process each referenced deferred procedure e[n,i]: *)
      FOR ALL i SUCH THAT e[n,i].addr is undefined ∧ e[n,i].ref = T DO
         Let e[n,i].addr := PC
         Fill in calculated expression values from the CET
         Expand calls to inline procedures
         IF some calls were expanded THEN
            Recompute basic blocks and next-use information
         END
         Optimize
         Generate machine code and increment PC accordingly
         Store the generated relocation information
         Write the code to the executable file
      END
      Close n's object file
   END
END Phase3

In addition to ignoring unreferenced procedures, there are a few other opportunities for improving the performance of the sPaster's Phase 3. One possible optimization is to use efficient machine- and operating-system-dependent idioms for reading the object code files. In the current Unix implementation, for example, the mmap system call is used to map the object code files directly into the sPaster's address space. Furthermore, inline and deferred procedures are stored in the object code files in the form of the compiler's memory image of the relevant Zcode. Thus, loading the deferred code segment becomes a simple matter of setting a pointer to the beginning of the segment and letting the virtual memory system do the actual reading.

Footnote: Similar techniques are commonly used by interpreted systems, such as ML and Smalltalk, to allow quick loading of saved execution environments.

We noted earlier that the CET does not contain explicit checks for the uniqueness of case statement labels. These checks are instead performed during the sPaster's code generation phase. While this may be more efficient (detecting multiple identical labels is a trivial extension of the building of the case statement jump table), it does postpone static semantic error detection from Phase 2 to Phase 3.

Phase 4

The last binding-time action is to update the generated code in the executable file with the procedure and data addresses which were not available when the code was written. This corresponds to the relocation task commonly performed by link editors. The relocation information loaded during Phase 1 and generated during Phases 2 and 3 is in the form of an ordered set of relocation items (PC, e[m,i]), where PC is a position in the executable file where e[m,i].addr should eventually be written. Phase 4 becomes the simple task of considering each (PC, e[m,i]) item and updating the address at position PC in the executable file with the newly computed address e[m,i].addr.
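The relocation pass itself amounts to patching addresses into the executable image. A compact sketch, assuming (purely for illustration) 4-byte big-endian address slots and an in-memory image:

```python
# Illustrative relocation pass: each relocation item names a position in the
# executable and the entry whose final address belongs at that position.
import struct

def relocate(image, items, addr_of):
    """image: bytearray holding the executable; items: list of (pc, entry)
    relocation items; addr_of: dict mapping entry -> final address."""
    for pc, entry in items:
        # overwrite the placeholder at position pc with the real address
        struct.pack_into(">I", image, pc, addr_of[entry])
    return image
```

In the sPaster the image lives in the executable file rather than in memory, and the items are processed in order of position, so the updates amount to a single sequential pass over the file.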

Discussion

It should be evident (and this conjecture will be corroborated by the tests reported later) that Phase 3 is likely to be the most expensive of the four sPaster phases. One can imagine a worst-case scenario in which every item in a program directly or indirectly depends on imported abstract items, where each exported procedure is declared inline, and where each deferred procedure is referenced. Not only would the sPaster have to perform a large number of procedure integrations, but the post-integration deferred procedures would most likely be very large, resulting in very expensive optimization and code generation. While we will argue that the distributed and incremental versions of the sPaster (described in later chapters) will be able to handle this kind of situation efficiently, it should be clear that the sPaster itself would achieve very poor response-time.

It is of course also imperative that the optimization and code generation algorithms used in Phase are both effective and efficient. The current implementation does not perform any of the traditional intraprocedural optimizations (except for some trivial local optimizations such as constant folding and reduction in strength), and the code generation algorithm used (the Dragon book, section) is far from optimal. Future versions of zts will explore the possibility of using other fast but still potent code generators, such as Wendt's or Fraser's, and will implement traditional intra- as well as interprocedural optimizations (Wall, Callahan, Chow). It may also be possible to speed up optimization by incrementally updating next-use and flow analysis information after the integration of a procedure (see Pollock) rather than recomputing it from scratch, as is done in Algorithm.

Sequential High-Level Module Binding

In contrast to most Unix compilers, the sPaster generates machine code directly, without going through the system assembler, and also produces the executable file directly, without (as in Figure) going through the system link editor. While this improves translation efficiency, it hampers portability and (since the link editor is often the instrument which combines modules written in different languages) makes interfacing to other languages difficult. Future versions of zts will either include a conversion program from the system object code format (the a.out format in the case of Unix) to the object code format used by zts, or instrument zp to directly handle object code files in different formats.

Future Work

In this section we will consider a few other translation tasks which would benefit from precise intermodular information, but which are hampered by separate compilation. The algorithms sketched in this section are not part of the current zts implementation, but are being considered for future versions.

Managing Module Time-Stamps

Most modular languages' binders perform some sort of time-stamp checks to determine if the modules taking part in a bind have been compiled in the correct order. These checks can be easily incorporated into the CET evaluation phase. We give each CET module entry a string-valued G-attribute timestamp, which holds the time of compilation of the module's specification unit. We furthermore create an entry containing the time-stamp of each imported module's specification unit (as found during compilation) and a deferred check that these time-stamps correspond to the ones found during module binding. Consider, for example, a module M importing a module N, and whose CET is given below. e_{M,0}.timestamp holds the time of compilation of M's specification unit, and e_{M,1}.expr holds the time when the version of N's specification unit imported by M was compiled. When M and N are bound together this time is compared to the time-stamp found in N, and if they are not the same an error message is issued.

    e_{M,0}.timestamp ⇐ M's ts
    e_{M,1}.expr      ⇐ N's ts
    e_{M,2}.expr      ⇐ (e_{M,1}.expr = e_{N,0}.timestamp)
    e_{M,3}.expr      ⇐ "Wrong compilation order"
    e_{M,4}.expr      ⇐ "Module M, line"
    e_{M,5}.expr      ⇐ check(e_{M,2}.expr, e_{M,3}.expr, e_{M,4}.expr)

14 This would occur, for instance, if the module units were compiled in the order N.spe, M.spe, M.imp, N.spe, N.imp.
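A minimal sketch of this deferred check (hypothetical field names, not the actual CET encoding): at bind time, the time-stamp each importer recorded for a specification unit is compared with that unit's actual time-stamp.

```python
def check_compilation_order(modules):
    """For every module M and every module N it imports, verify that the
    time-stamp M recorded for N's specification unit at compile time
    equals N's actual specification-unit time-stamp."""
    errors = []
    for m in modules.values():
        for n_name, recorded_ts in m["imported_ts"].items():
            if modules[n_name]["timestamp"] != recorded_ts:
                errors.append(f"Wrong compilation order: module {m['name']} "
                              f"imports a stale {n_name}")
    return errors

modules = {
    # N's specification was recompiled (ts 150) after M recorded ts 100.
    "M": {"name": "M", "timestamp": 200, "imported_ts": {"N": 100}},
    "N": {"name": "N", "timestamp": 150, "imported_ts": {}},
}
errs = check_compilation_order(modules)
```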


Modified Formal Parameters

In Sections and we saw that the boolean Z-code attribute Modified is used during inline expansion and code generation to avoid unnecessary copying of pass-by-value formals. The procedure used today for calculating the attribute is conservative: a routine is said to modify a formal if it either assigns directly to it, takes its address, or passes it by reference to another routine. A more accurate method would give each formal parameter of each exported routine a CET modified entry:

    e_{M,1}.addr     ⇐ Procedure P
    e_{M,2}.modified ⇐ e_{Q,5}.modified ∨ e_{N,7}.modified    (x)
    e_{M,3}.modified ⇐ T                                      (y)

Here procedure P has two formal parameters, x and y, which are modified if e_{M,2}.modified and e_{M,3}.modified, respectively, evaluate to T. The example indicates that y is modified directly by P (through an assignment, for example), and that x is passed as a reference parameter to two routines in modules Q and N, and is hence modified if these routines in turn modify their corresponding formals.
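This cross-module propagation amounts to a small fixpoint computation; a sketch, with hypothetical (routine, formal) names:

```python
def solve_modified(direct, passed_to):
    """Fixpoint for the Modified attribute.  `direct` maps (routine, formal)
    to True if the formal is assigned or has its address taken locally;
    `passed_to` maps it to the (routine, formal) pairs it is passed to by
    reference.  A formal is modified if modified directly, or if any formal
    it is forwarded to turns out to be modified."""
    modified = {f for f, d in direct.items() if d}
    changed = True
    while changed:
        changed = False
        for f, targets in passed_to.items():
            if f not in modified and any(t in modified for t in targets):
                modified.add(f)
                changed = True
    return modified

# Mirrors the example: M.P's y is assigned directly; x is passed by
# reference to Q.R (which leaves its formal alone) and N.S (which doesn't).
direct = {("M.P", "x"): False, ("M.P", "y"): True,
          ("Q.R", "a"): False, ("N.S", "b"): True}
passed_to = {("M.P", "x"): [("Q.R", "a"), ("N.S", "b")],
             ("Q.R", "a"): [], ("N.S", "b"): []}
mods = solve_modified(direct, passed_to)
```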

Efficient Subtype Tests

In object-oriented programs it is frequently necessary to determine where in the subtype hierarchy a certain object belongs. Such runtime type tests are sometimes inserted implicitly by the compiler, when it cannot determine an object's type at compile-time. Zuse and Modula furthermore support a standard function ISTYPE(a, T), which returns TRUE if the object a's type is a subtype of the object type T (see also Section). In addition to the obvious algorithm for performing such tests, which runs in time proportional to the height of the inheritance tree, there exist two O(1) algorithms (Dietz and Cohen). Dietz' algorithm requires storing, for each object type, its pre- and postorder number in the inheritance tree. The SRC Modula compiler assigns these type-codes at logical link-time (at runtime, but before the execution of user code), which can be expensive for programs with a large number of object types. Future versions of zts will employ the CET to do type-code assignment at module binding-time.

The expressions to the left in Figure show the CET prior to evaluation for three modules M, N and P. Each module exports an object type T; N.T and

15 Holler (p.) uses the same method.

16 The SRC Modula runtime system (v) requires Sun CPU seconds to process a linear inheritance tree of object types.


[Figure: Calculating type-codes. The left column shows the CET prior to evaluation, with e_{M,1}.supertype ⇐ null, e_{N,1}.supertype ⇐ e_{M,1} and e_{P,1}.supertype ⇐ e_{M,1}, and the pre, post and subtypes attributes still unevaluated. The right column shows the CET after evaluation, with e_{M,1}.subtypes ⇐ {e_{N,1}, e_{P,1}} and the pre- and postorder type-codes filled in for each of the three types.]

P.T both inherit from M.T, which itself does not have a supertype. The CET evaluation runs in two steps: first the supertype attributes are used to build the inheritance tree (the subtypes attributes), and then the type-codes are assigned in a depth-first traversal of the tree. The resulting CET is shown to the right in Figure. Once the type-codes have been assigned, they can be stored in the method templates. To test if the type of object S is a subtype of type T, we simply look up the type-numbers and check that pre(T) ≤ pre(S) ∧ post(T) ≥ post(S).
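A sketch of the assignment and the resulting constant-time test, using the M, N, P tree from the example (plain Python, not the method-template encoding):

```python
def assign_typecodes(subtypes, root):
    """Depth-first traversal assigning pre- and postorder numbers, so that
    S is a subtype of T  iff  pre(T) <= pre(S) and post(S) <= post(T)."""
    pre, post, counter = {}, {}, [0, 0]
    def dfs(t):
        counter[0] += 1; pre[t] = counter[0]
        for s in subtypes.get(t, []):
            dfs(s)
        counter[1] += 1; post[t] = counter[1]
    dfs(root)
    return pre, post

def is_subtype(s, t, pre, post):
    # Constant-time test: just two comparisons of stored type-codes.
    return pre[t] <= pre[s] and post[s] <= post[t]

# The inheritance tree from the example: N.T and P.T inherit from M.T.
subtypes = {"M.T": ["N.T", "P.T"]}
pre, post = assign_typecodes(subtypes, "M.T")
```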

Inline Iterators

To date, only a few languages have incorporated explicit support for control abstraction, i.e. iteration over the elements of data abstractions representing collections, such as sets or lists. CLU's iterators and Alphard's generator forms are two early exceptions. We find the absence of such constructs in most modern programming languages regrettable (iteration is, after all, one of the most common programming activities) and intend to include CLU-style iterators in future versions of Zuse. It will then be natural, and almost trivial, to extend zts to handle cross-module inlining of non-recursive iterators. Such a facility is not available in any translating system we are aware of, but would be a great contribution to the efficiency of many programs (particularly numerical) that rely heavily on non-recursive iteration.
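CLU-style iterators correspond closely to what are today called generators; a minimal illustration of the control abstraction (the list representation shown is hypothetical, not a Zuse construct):

```python
def elements(collection):
    """A CLU-style iterator: clients traverse the abstraction's elements
    without ever seeing its representation (here, simply a list)."""
    for item in collection:
        yield item

# The client writes an ordinary for-loop over the abstraction.
total = 0
for x in elements([1, 2, 3, 4]):
    total += x
```

For a non-recursive iterator like this one, an inlining binder could substitute the iterator body into the loop, turning each yield into a jump into the loop body.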


Sharing Code Among Generic Modules

Future versions of Zuse will sport a generic module facility similar to the one found in Ada. It will then be natural to extend zts to support code sharing among instances of generic modules, as described, for example, by Bray and Rosenberg. We will not describe in detail all the considerations which have to be taken into account in order for code sharing to be effective, but rather give a simple example and examine one possible implementation strategy. Consider the generic module Stack below, which has a formal type parameter E (the element type) and a formal constant parameter Depth (the maximum number of stack elements):

SPECIFICATION Stack (TYPE E; CONSTANT Depth : CARDINAL);
   TYPE T;
   PROCEDURE Push (REF s : T; x : E);
END Stack;

In addition to expressions for its exported items, Stack also has CET expressions for its formal generic parameters:

    e_{Stack,1}.instance.size  ⇐ ⊥    (E)
    e_{Stack,2}.instance.value ⇐ ⊥    (Depth)
    e_{Stack,3}.size           ⇐ ⊥    (T)
    e_{Stack,4}.addr           ⇐ ⊥    (Push)

Stack's implementation unit may use the formal generic parameters as if they had been imported abstract items:

IMPLEMENTATION Stack;
   TYPE Idx = RANGE [.. Depth] OF CARDINAL;
   TYPE T = RECORD s : ARRAY Idx OF E; d : Idx END;
   PROCEDURE Push (REF s : T; x : E); ...
END Stack;

The CET resulting from Stack's implementation unit does not differ significantly from similar ones seen in Section, except for the fact that there are still undefined expressions:

    e_{Stack,1}.instance.size  ⇐ ⊥                                    (E)
    e_{Stack,2}.instance.value ⇐ ⊥                                    (Depth)
    e_{Stack,3}.size           ⇐ e_{Stack,5}.expr + e_{Stack,6}.expr  (T)
    e_{Stack,4}.addr           ⇐ Push
    e_{Stack,5}.expr           ⇐ e_{Stack,1}.instance.size × e_{Stack,2}.instance.value
    e_{Stack,6}.expr           ⇐ ⟨size of Idx⟩


One difference becomes evident when we examine a client of Stack and the corresponding CET:

PROGRAM P;
   IMPORT Stack;
   INSTANCE CharStack = Stack (CHAR,);
   VARIABLE S : CharStack.T;
BEGIN CharStack.Push (S, 'A') END P;

As seen from P's CET below, Stack defines, in addition to its own CET, a function instance which takes two parameters corresponding to its generic formals and returns an offset into Stack's own CET:

    e_{P,1}.instance.size  ⇐ ⟨size of CHAR⟩
    e_{P,2}.instance.value ⇐ ⟨the Depth actual⟩
    e_{P,3}.size           ⇐ e_{Stack, e_{P,5}.expr}.size
    e_{P,4}.addr           ⇐ e_{Stack, e_{P,5}.expr}.addr
    e_{P,5}.expr           ⇐ Stack.instance(e_{P,1}.instance.size, e_{P,2}.instance.value)

The intention is that each time Stack.instance is called during CET evaluation (which is once for every source-level instantiation of Stack), it is decided whether the instance should share code with one of the previous instances, or if a new one should be created. In the latter case, instance extends Stack's CET by making a copy of all its expressions, substituting the generic actuals for the formals, and returning the number of the first of the new expressions. In our example, Stack.instance would first create the expressions below and then return. During Phase, one Push routine would be generated for each Push entry in Stack's CET.

    e_{Stack,7}.instance.size  ⇐ ⟨size of CHAR⟩
    e_{Stack,8}.instance.value ⇐ ⟨the Depth actual⟩
    e_{Stack,9}.size           ⇐ e_{Stack,12}.expr + e_{Stack,13}.expr
    e_{Stack,10}.addr          ⇐ Push
    e_{Stack,12}.expr          ⇐ e_{Stack,7}.instance.size × e_{Stack,8}.instance.value
    e_{Stack,13}.expr          ⇐ ⟨size of Idx⟩

The simplest definition of instance creates a unique instance for each distinct set of generic actuals, and allows code sharing only between instances whose actuals match exactly. A more elaborate definition might, for example, attempt to treat one or more of the generic formals as runtime parameters, which would allow more instantiations to fall in the same instance class.
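This simplest definition amounts to memoizing instance on the tuple of generic actuals; a sketch (illustrative names; the real instance returns an offset into Stack's CET, here just an instance id):

```python
def make_instantiator():
    """Share code among generic instances: instantiations whose actuals
    match exactly reuse one instance; any new actual-tuple gets a fresh
    copy of the generic's expressions."""
    instances = {}                 # tuple of actuals -> instance id
    def instance(*actuals):
        if actuals not in instances:
            instances[actuals] = len(instances)   # "copy the expressions"
        return instances[actuals]
    return instance, instances

instance, table = make_instantiator()
a = instance("CHAR", 100)      # CharStack
b = instance("CHAR", 100)      # identical actuals: shares CharStack's code
c = instance("INTEGER", 50)    # new instance
```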

Other Methods of Implementation

It should be obvious that the implementation strategy advocated in this chapter may lead to substantial link-time overhead, and one may ask whether there exist any viable alternatives. The possibilities fall naturally into four categories:

Sophisticated pasters: Construct faster pasters. These would be similar to the one described in this chapter, but would use more sophisticated translation techniques.

Restricted separate compilation: Since the translation problems associated with the Zuse language stem from the use of separate compilation, a translating system which does away with (or restricts) the separate compilation facilities may achieve faster turnaround times.

Reference semantics: Use traditional compilation techniques, but have the compiler allocate variables of unknown size on the heap rather than on the stack. Also, whenever necessary, manipulate variables by address rather than by value.

General code: Use traditional compilation techniques and allocate variables on the stack, but generate size-independent code for variable references. In other words, for an assignment a := b where the size of the arguments is unknown at compile-time, generate a block-move instruction.

The first category will be thoroughly explored in Section and in the next two chapters. The second category can be divided into systems which retain a restricted separate compilation facility (allowing, for example, compilation dependencies between implementation units) and those which abandon separate compilation altogether; these include some of the integrated programming environments examined in Section. The third category, which is similar to the strategies used by the CLU, Alberich, and SRC Modula implementations, has several shortcomings: memory allocation has to be deferred till runtime; the sizes of types, the values of constants, and the templates of object types will have to be calculated at runtime; variable references have to go through an extra level of indirection; and the scheme is likely to heavily tax the required garbage collection system. Implementations which fall in the fourth category will have less need for garbage collection, but will still have to employ a more complicated runtime memory allocation scheme, since activation record offsets cannot be computed at compile-time.


Allowing compilation dependencies between implementation units, as suggested in the second category, might at first seem to be a viable solution. Unfortunately, however, this method renders some perfectly legal Zuse programs untranslatable. Consider the following two modules:

SPECIFICATION M;                IMPLEMENTATION M;
   TYPE                            IMPORT N;
      T;                           TYPE
      U;                              T = INTEGER;
END M;                                U = N.T;
                                END M;

SPECIFICATION N;                IMPLEMENTATION N;
   TYPE                            IMPORT M;
      T;                           TYPE
      U;                              T = INTEGER;
END N;                                U = M.T;
                                END N;

In this example it is not possible to find an order of compilation between the implementation units of M and N such that the compiler always has all the relevant information available in order to determine the size of M.U and N.U. In other words, to calculate M.U's size, N's implementation unit has to be compiled prior to M's implementation unit, and vice versa. Note that, unlike the program in Figure, this example does not involve any illegal recursion. One way around the problem of mutually dependent implementation units is to allow several units to take part in the same compilation. Still remaining, however, is the question of how to determine when this is necessary. Situations similar to the one in the example above also occur for abstract constants and abstract inline procedures.

The last two categories will generally produce inferior code, but there may be ways to overcome this problem. The runtime code generation and optimization techniques examined by Keppel and Chambers might, for example, prove valuable. Such techniques would allow the compiler to generate inferior code, and then let the runtime code generator regenerate optimal code for those sections which turn out to be heavily executed. Unfortunately, runtime code generation is at this time still a largely uncharted area.

Hierarchical High-Level Module Binding

We will end this chapter with a discussion of hierarchical pasting, a technique which might prove useful for programs which contain parts which are rarely changed.



Figure: Hierarchical binding of a tree of modules. In the first step, modules {M, M, M} are joined to form the library A, and {M, M, M} are joined to form the library A. The second step joins modules {M, M, M, M} and library A, and the last step joins the main module P with libraries A and A to form the executable program.

Libraries of Modules

Many link-editors allow a set of object files to be bound together, not to form an executable program, but rather to make a "super-object file" which may then take part in later links. Such collections of object files are often termed libraries. The profits of such a scheme are twofold: collections of modules are easily distributed if joined into one library file, and it is cheaper for the linker to process a set of modules joined into a library than to process the individual object files by themselves. The gains in linking speed stem partly from the fact that fewer files have to be read, and partly from the fact that dependencies between modules within a library are resolved once and for all by the linker when the library is created.

Figure shows how a tree of modules may be joined in successive steps, first to form libraries and finally to make the executable program. We will term this process hierarchical binding. It should be clear that hierarchical binding is most profitable when a program contains collections of closely knit modules which have reached a stable point in their development cycle. Joining such a collection of modules into a library not only increases linker performance, but also serves as a means of abstraction: the details of the functionality of the individual modules may be disregarded and replaced by the functionality of the library.
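The gain from resolving intra-library dependencies once can be sketched as follows (hypothetical symbol-table structures, not the zts object format):

```python
def make_library(modules):
    """Join modules into a library: references to symbols defined inside
    the library are resolved once, at library-creation time, so later
    links only see the library's remaining external references."""
    defined = {s for m in modules for s in m["defines"]}
    unresolved = {r for m in modules for r in m["references"]} - defined
    return {"defines": defined, "references": unresolved}

# Two closely knit modules: their mutual references resolve internally;
# only the reference to the outside symbol C.h survives.
m1 = {"defines": {"A.f"}, "references": {"A.g", "C.h"}}
m2 = {"defines": {"A.g"}, "references": {"A.f"}}
lib = make_library([m1, m2])
```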


The hPaster

Because of the complex dependencies that exist between Zuse modules, the potential profit of hierarchical binding applied to Zuse is greater than for other, similar languages. All the intermodular exchange of information performed by the sPaster when creating an executable program can be performed by a hierarchical Paster (hPaster) to form a library out of a collection of modules: calls to inline procedures exported by any of the modules in the collection can be expanded, expressions can be partially or completely evaluated, and code can be generated for procedures which make use of no more intermodular information than that which may be gleaned from the modules in the collection. With a few exceptions, the algorithms of Section will remain intact:

Phase: Instead of traversing the import graph to determine the modules to take part in a program, Phase is given an explicit list of modules to be bound into a library. The hPaster's Phase creates a new object file (rather than an executable file) and gives it the library name selected by the user. The data and code read from the modules of the library is written to the Data and Binary Code Sections of the object file.

Phase: The hPaster cannot assume, as in Algorithm, that all necessary expressions and inline procedures are available. Instead, only those expressions whose arguments are available should be evaluated, and only calls to those inline procedures whose code is available should be expanded. All evaluated or unevaluated expressions are written to the new object file's Expression Section, and the inline procedures are written to the Inline Section.

Phase: The hPaster's Phase fills in the newly calculated expression values, expands calls to available inline procedures, and thereafter generates code for those deferred procedures which contain no further references to unknown entities. The generated binary code is written to the Binary Code Section of the object file, while the remaining deferred procedures are written to the Deferred Code Section.

Phase: Phase will in general not be able to perform any relocations, since the final position of the generated code is not known. Instead, the relocation information read during Phase and generated during Phase and is written to the Relocation Section of the new object file.

We should note that the hPaster's Phase will have to process all deferred procedures since, unlike in Algorithm, the hPaster cannot know which procedures may be referenced at runtime.
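The hPaster's partial evaluation can be sketched as a worklist that evaluates only expressions whose arguments are all available, carrying the rest forward unevaluated (illustrative names; the Depth, E.size, and T.size entries are hypothetical):

```python
def partial_evaluate(exprs, known):
    """Evaluate CET expressions whose arguments are all available; the
    rest remain pending and would be written, unevaluated, to the new
    object file's Expression Section."""
    values = dict(known)
    pending = dict(exprs)          # name -> (function, argument names)
    progress = True
    while progress:
        progress = False
        for name, (fn, args) in list(pending.items()):
            if all(a in values for a in args):
                values[name] = fn(*[values[a] for a in args])
                del pending[name]
                progress = True
    return values, pending

# Depth is known inside the library, but the element size E.size is not,
# so both dependent sizes stay unevaluated at library-creation time.
exprs = {"T.size": (lambda e, n: e * n, ["E.size", "Depth"]),
         "U.size": (lambda t: t + 4,    ["T.size"])}
values, pending = partial_evaluate(exprs, {"Depth": 100})
```

At a later bind, when E.size finally becomes known, the same routine evaluates the carried-over expressions to completion.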

Summary

The purpose of this chapter has been to investigate the requirements imposed on translating systems by languages such as the one presented in Chapter, and to present one concrete translating system design. Since our main targets are system implementation languages, our primary design goal has been the production of highly optimized code, even if that must be at the expense of prolonged translation times. This goal has led us to favor our present design over the alternatives in Section.

It would, however, be wrong to assume that translating systems such as zts will always lead to excessively long translation times. On the contrary, if we consider the total time spent compiling and binding a program over the program's entire lifetime, it may well be that zts will compare quite favorably to other, more traditional systems. The reason is, of course, that zts allows modules with completely abstract specification units. If the information stored in a specification unit can be kept at a bare minimum, the risk of having to change that information is minimized, and so is the risk of trickle-down recompilations.

While the Zuse language was from its conception designed to take advantage of high-level binding, other languages that were designed with traditional translation techniques in mind may also benefit from zts-style translation. This is particularly true of Modula, which, in contrast to most other languages, has features that require post-compilation exchange of intermodular information. In addition to being able to handle Modula's opaque and partially revealed object types, zts would also allow Modula to be extended with statically allocated object types, today missing from the language.

Others have presented solutions somewhat similar to ours. Particularly relevant are the works of Sale, Chow, Horowitz, and Celentano. However, the works of Sale and Chow appeared in print around the same time as the first published account of the work reported in this chapter (Collberg), and should be considered independent work. It is furthermore not exactly clear how the Milano-Pascal linker was actually implemented, since both the report describing the work (Paganini, in Italian) and the implementation have apparently been lost (Ghezzi). The translating system proposed by Sale was never implemented.

Our design is unique in many respects. It is the first to use link-time processing to handle problems, such as the construction of object type templates and the implementation of efficient runtime type tests, that arise when object-oriented programming meets separate compilation and type encapsulation. Previous solutions, such as the ones used in the SRC Modula implementation, use expensive runtime processing instead. Also, while others have proposed the use of link-time processing either to improve encapsulation (Celentano and Sale) or cross-module optimization (Chow and Wall), zts is the first functional system which is designed to handle both. We will see from the following chapters that zts is also the first system to use distributed and incremental translation techniques for improved control over encapsulation, and the first to use distributed techniques to facilitate intermodular optimization.

Chapter

Distributed High-Level Module Binding

Law of Probable Dispersal:
Whatever it is that hits the fan
will not be evenly distributed.
(Anonymous)

Introduction

It should come as no surprise that pasting is a considerably more expensive operation than conventional link-editing, especially for programs which rely heavily on abstract types and abstract inline procedures. Indeed, if the code generated during pasting is going to be subjected to aggressive optimization, pasting may prove to be prohibitively slow. Fortunately, however, pasting is a process which is relatively amenable to parallelization over a distributed processing system. This chapter will present a distributed version of the sequential Paster, called a dPaster, which runs on a network of workstations. Chapter will present empirical results indicating that distributed pasting of large programs can be made as fast as, or faster than, conventional link-editing.

The chapter is organized as follows. Section gives an overview of parallel and distributed architectures, in particular the workstation-server model. We next discuss various interprocess communication techniques (Section), especially the remote procedure call protocol. Earlier attempts at parallel translation are covered in Section, after which we describe the design of the distributed Paster (Sections through), followed by a section on possible future developments (Section). Finally, in Section, we discuss implementation particulars and evaluate our design.

Distributed Processing Systems

A distributed system consists of a number of sites connected by a communication network. Each site has a processing unit, a local memory, and a unique name, and can communicate with other sites on the network by means of message passing. In a loosely-coupled distributed system, message passing is the only means of communication; in a tightly-coupled system, processors may also communicate through shared global storage.

There are several variations on the tightly-coupled model. The multiprocessor is, at the user level, analogous to a conventional single-processor multiprocess system, the difference being the existence of a large pool of processors with a shared memory. The operating system dynamically allocates processors and memory segments to users' tasks. Array processors consist of a large number of simple, identical processing elements connected in a regular topology, such as a vector or matrix. Array processors fall into the so-called SIMD model (single-instruction, multiple-data), where at each stage of the computation each processing element executes the same instruction on its local data.

The workstation-server model is a popular example of a loosely-coupled distributed system. In this model, a site is either a single-user computer providing considerable computing resources and powerful interaction facilities, or a server providing access to shared resources such as file stores and printers. Workstations and servers are linked by a high-speed Local Area Network (LAN), such as the Mb/s Ethernet. Files are organized in a network file system, allowing workstations access to files stored on remote fileservers. Examples of such distributed filing systems are the Xerox Distributed File System (XDFS) and the Sun Network File System (NFS). Figure shows the logical buildup of the workstation network on which the test results in Chapter were obtained.

One important characteristic of workstation networks is the fact that at any one point in time there is likely to exist unused computing power. This has led to the design of distributed application programs intended to solve computationally intensive problems by taking advantage of unused computational resources. Examples include distributed compilers, distributed animation programs, and distributed programs intended to solve large numerical or number-theoretical problems.


Figure: The workstation network on which the test results in Chapter were obtained. The workstations (Sun) have MB of internal memory and an internal MB disk. They are connected by a Mb/s Ethernet. The fileserver (a Sun) has one MB disk.

Models of Network Communication

Many distributed applications are built using a simple message passing model of communication. That is, the application assumes the existence of two primitive operations: send, to send a packet of data from one site to another, and receive, to wait for an incoming packet. This model has the advantage of being simple to implement (the two operations are often provided as operating system primitives), but is often cumbersome to use and prone to programmer error. Other, higher level models have therefore been invented, often designed as extensions to a simple message passing protocol. Examples include Communicating Sequential Processes (CSP), Linda, and Remote Procedure Call (RPC) protocols. A Linda program, which can either be a concurrent single-processor program or an application distributed over a network, is centered around a global tuple-space. Processes may put tuples (named data items) in the tuple-space or fetch tuples from the tuple-space, effectively realizing process synchronization and data interchange. CSP is a programming notation rather than a complete language, but CSP-style process communication, which



Figure: A server simultaneously serving two client calls (left), and a client making a multicast call to two servers (right). Each call executes within its own lightweight process.

is based on synchronous message passing and selective communication, has been embodied in several languages, most notably Occam.

Remote Procedure Call

The semantics of a Remote Procedure Call is similar to that of an ordinary procedure call. A client program on one site calls a procedure executing on another site (a server) by sending it a request message including a set of in parameters, and then waiting for a reply. When the remote procedure has finished executing, a reply message including any out parameters is sent back to the client, which resumes execution at the point after the call. There are, however, some semantic differences. Firstly, clients, servers, or message transfer may fail. Secondly, the caller and the callee execute in different address spaces (indeed, on different sites) and therefore do not have access to each other's global variables. Lastly, the cost of an RPC is substantially higher than for a local procedure call; a remote call may take perhaps to times longer to complete than a local call.

The distributed Paster, which will be described later in this chapter, has been implemented using an RPC protocol similar to the one described in a seminal paper by Birrell and Nelson. It has the following characteristics:

A server may service any number of calls concurrently.

Each call on the server executes in its own lightweight process. This means that all calls share access to the same set of global variables.

A call may take an arbitrarily long time to complete.

Clients use timeouts to detect a failed server. Processes are assumed to be fail-stop.

In and out parameters may be any arbitrarily complex data type.

The protocol includes broadcast and multicast facilities.

Figure (left) shows two clients simultaneously making a remote procedure call to the same server. The two procedures P and Q execute concurrently on the server, each within its own lightweight process. Figure (right) is an example of a multicast remote procedure call. It is realized as one outgoing request message to the servers involved, and one reply message from each server back to the client. All servers receive the same in parameters, but the client gets one set of out parameters from each server. A broadcast is similar to a multicast, except that all servers receive the call and may elect to reply or not. In other words, a multicast call to n servers receives n results, whereas a broadcast call may yield zero-to-one result from each reachable server.

We extend our algorithmic notation from the previous chapter with remote procedure call primitives. In the first example below, a client C makes a call to the function P on server S with in-parameters x, y, z. When a call to P arrives at S, the actual parameters are copied to the formal parameters and the body of the function is executed. The REPLY statement returns results from the server back to the client. The second example shows C making a multicast remote procedure call to procedure P on sites S_1, ..., S_s, receiving s replies in return. Lastly, we see an example of a broadcast remote procedure call receiving an indeterminate number of replies, followed by a call discarding all except the first reply received.

S:  ENTRY PROCEDURE P(a, b, c : int) : int;
       REPLY a + b + c;
    END P;

C:  v := CALL S.P(x, y, z);
C:  v_1, ..., v_s := CALL S_1, ..., S_s . P(...);
C:  v_1, ..., v_k := CALL P(...);
C:  v := CALL P(...);
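The multicast and broadcast call forms can be emulated with a thread pool that issues one logical request and gathers one reply per server. The sketch below is a stand-in for the notation above, not the Paster's protocol; the server table and the summing body of P are made up:

```python
from concurrent.futures import ThreadPoolExecutor

def P(a, b, c):
    # body of the entry procedure; REPLY is modeled by the return value
    return a + b + c

servers = {"S1": P, "S2": P, "S3": P}  # illustrative server table

def multicast_call(sites, *args):
    """One logical request to every named site; exactly one reply per site."""
    with ThreadPoolExecutor() as pool:
        futures = {s: pool.submit(servers[s], *args) for s in sites}
        return [f.result() for f in futures.values()]

def broadcast_call(*args, first_only=False):
    """All reachable servers receive the call; each may elect to reply.
    Here every server happens to reply; a real broadcast may yield fewer."""
    replies = multicast_call(list(servers), *args)
    return replies[0] if first_only else replies

vs = multicast_call(["S1", "S2"], 1, 2, 3)     # two replies: [6, 6]
v = broadcast_call(1, 2, 3, first_only=True)   # keep only the first reply
```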

Distributed High-Level Module Binding

An RPC protocol like the one just described is a very attractive tool in the design of distributed applications. The similarity to ordinary local calls makes it easy for an application programmer to understand and use, and the high-level operations hide some of the details of network programming, such as retransmissions, timeouts, and the marshaling of parameters. The overhead of RPC protocols in general, and the current implementation in particular, will be discussed later in this chapter.

Distributed Translation

Much work has gone into adapting language processors for use with parallel and distributed systems. Several kinds of problems have been studied. Parallelizing compilers analyze sequential programs in order to detect implicitly parallel code which may be transformed to an explicit parallel form; most of the work in this area has been restricted to vectorizing compilers for numerical FORTRAN programs. Parallel compilers try to speed up the compilation process by exploiting the inherent parallelism in a compiler and making it run on parallel hardware. Researchers working on distributed compilers focus on compilation speedup as well as the creation of distributed multi-user programming environments, in some cases under the assumption that the source code is distributed over the sites in the system. Distributed building studies the related problem of configuration management in a distributed environment.

Models of Distributed Translation

Three conceptually different schemes for parallel and distributed compiling have been proposed (see Khanna). Compilers using functional decomposition allocate each compilation subtask (lexical, syntactic, and semantic analysis, and code generation) on different communicating processors. The lexical analysis processor reads the source program and transforms it into a stream of tokens, which is passed on to the syntactic analysis processor. It, in turn, will analyze the syntactic structure and send it on along the pipeline for subsequent semantic analysis and code generation. In the data decomposition method, an initial pass partitions the source code into roughly equal-sized chunks, either arbitrarily or at the token or block level. The chunks are then processed concurrently by sub-compilers residing on each processor. A final pass merges the results produced by the sub-compilers. [Footnote: Henceforth we will use the term parallel for tightly-coupled systems, such as array processors. The term distributed will be reserved for loosely-coupled systems, such as the workstation/server model.] Finally, grammatical decomposition lets each


processor handle one particular linguistic construct. The source code is partitioned at the statement level, and each chunk of code is passed to the processor handling the particular type of statement which the chunk contains.

Much of the work that has been done on parallel and distributed compilation has assumed that the program to be compiled is delivered to the compiler as a monolithic text. This is not always the case. Large programs will often be modularized, and in a distributed system the modules may be scattered among the sites of the system. Kaplan and Kaiser describe a distributed language-based environment, designed to run on a network of workstations, where a module resides on the workstation of the programmer who is responsible for the development of that module. Each module is stored as an attributed abstract syntax tree, which is always kept up-to-date with respect to editing changes. A collection of tools, such as structure-oriented editors, incremental parsers, semantic checkers, and code generators, work on this common representation of the program, and exchange information about changes with tools operating on other sites. In contrast to the other systems described here, distributed language-based environments are usually designed not with efficiency of translation, but rather efficiency of team programming, in mind.

Modularization has also spawned an interest in simple variants of the data decomposition method, where ordinary sequential compilers are made to run concurrently on different sites in a distributed system, each compiling one or several source modules. Such systems, usually termed distributed MAKE programs, distributed configuration managers, or distributed builders, have to consider several problems:

- It must be decided which sites should compile which modules. This may depend on the current load on each site, the projected compilation time of each module, and from which sites the source code of the various modules is accessible.
- Where there exist compilation dependencies between modules, compilers running on different sites must be synchronized.
- Many configuration managers rely on the modification time of source and object modules in order to determine which modules have to be compiled. This may be a problem if the system time varies between sites.

Gains of Distributed and Parallel Translation

The advantages of a distributed translator lie, as we have seen, either in the distribution of the source, which may allow the cooperation of programmers


working on different machines, or in improved translator performance. The computational gain incurred by a parallel algorithm is measured in terms of its speedup: its performance relative to a sequential algorithm solving the same problem. More formally:

Definition (Speedup and Efficiency). If a parallel algorithm using p processors terminates in T_p(n) time, and if the best possible serial algorithm terminates in T*(n) time, then the speedup of the algorithm is given by the ratio

    S_p(n) = T*(n) / T_p(n)

The efficiency of the algorithm measures the fraction of time that a typical processor is usefully employed, and is defined as

    E_p(n) = S_p(n) / p = T*(n) / (p T_p(n))

When the optimal serial time T*(n) of a certain problem is unknown, T*(n) has to be given a different definition. One possibility is to let S_p(n) be the self-relative speedup of the algorithm, in which case T*(n) is defined to be the time required by the parallel algorithm running on one processor.
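The definitions translate directly into code. A small sketch with illustrative timings (T*(n) = 80 s serial, T_p(n) = 16 s on p = 8 processors; the numbers are made up):

```python
def speedup(t_serial, t_parallel):
    """S_p(n) = T*(n) / T_p(n)."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    """E_p(n) = S_p(n) / p = T*(n) / (p * T_p(n))."""
    return speedup(t_serial, t_parallel) / p

# Self-relative speedup: when T*(n) is unknown, use the parallel
# algorithm's own one-processor running time as the baseline.
t1, t8 = 80.0, 16.0           # illustrative timings, in seconds
print(speedup(t1, t8))        # 5.0
print(efficiency(t1, t8, 8))  # 0.625
```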

If we have a certain parallel algorithm with a speedup of S_p(n), it is natural to try to decrease the total solution time of a problem of a given size by increasing the number of processors. However, we will often experience that, due to the communication penalty, there is an upper bound on the number of processors which can be gainfully employed in solving the problem. That is, as we increase the number of processors, the amount of communication necessary to coordinate the actions of the processors also increases. This is evident from some of the experimental parallel and distributed translation systems which have been reported in the literature. The speedups reported are often fairly modest, and are achieved with a small number of processors, where the gain from adding more processors is often negligible or negative. We give some examples of parallel and distributed translation systems from the literature:

- Baalbergen reports that Amake, a distributed MAKE program for the Amoeba distributed operating system, achieves only a modest self-relative speedup when compiling a program made up of several tens of C source files. Doubling the number of processors increases the speedup only slightly.

[Footnote: The terminology and definitions used in this section are taken from Bertsekas and Tsitsiklis.]


- Boehm and Zwaenepoel report on a distributed Pascal compiler running on Sun workstations connected by an Ethernet network. A sequential parser builds a syntax tree, which is split up into subtrees and sent out to attribute evaluators executing on remote sites. A network of workstations is reported to achieve a limited speedup.

- Wortman, whose concurrent Modula compiler runs on a shared-memory multiprocessor, reports that the speedup achieved is influenced by the size of the modules compiled: large modules yielded markedly better self-relative speedups than small ones. The compiler creates a separate compilation process for each imported interface and for each function or procedure. Each compilation process is made up of several pipelined tasks, starting with a lexical analyzer and ending with a task which merges the produced code with the code produced by other processes.

- Katseff presents an assembler, running on a message-passing multiprocessor, whose design is based on the data decomposition technique. The inherent simplicity of assemblers compared to compilers is evident from the speedups reported by Katseff, which grew with the number of processors only up to a point; adding more processors did not increase the speedup.

- That the simplicity of the compiled language has positive effects on the speedup achieved by a parallel compiler is confirmed in Gross. Their cross-compiler, which runs on an Ethernet-based network of Sun workstations and produces code for a systolic array computer, achieves good speedups using a small number of processors. This speedup can also be explained by the aggressive optimization techniques used, which create ample opportunity for parallelization.

Distributed Binding

Most of the parallel and distributed compilers reported on here suffer from the same problem: the distributed compilation phase is followed by a sequential link-edit phase. As link-editors become more sophisticated, and as programs grow larger and more modularized, the linking phase becomes something of a bottleneck in the edit-compile-link

[Footnote: The authors report that sequential compilation of a single function can take minutes, and that compilation times measured in hours are not unusual.]


cycle of program development. Considering that all other aspects of language translation have been subject to attempts at parallelization, it is remarkable that, apart from a remark by Katseff, there has yet to be any work on parallelizing the linking process. Although the rest of this chapter will focus on distributed high-level binding, it is conceivable that some of the results will carry over to conventional link-editing as well.

The Distributed Paster

In the previous chapter we discussed in detail the design of the sPaster, a high-level module binder. It will be evident from the empirical tests in a later chapter that pasting is a considerably more expensive operation than conventional link-editing. This should come as no surprise, since the Paster takes over some of the tasks usually performed during compilation, such as storage allocation, inline expansion, optimization, and code generation. The rest of this chapter will be devoted to the exploration of one possible avenue of Paster speedup, namely the design of a distributed Paster, the dPaster.

The distributed Paster can be said to be based on the data decomposition model of parallelization, in that each site taking part in the build process is responsible for a subset of the modules which make up the program. The distributed Paster is made up of three kinds of processes:

Master  The Master is invoked by a user in the same way he would a sequential Paster. It is the main conductor of processing, deciding which site should process which modules, initiating the various processing phases, and relaying possible errors back to the user.

Slave  There is a maximum of one Slave running on each site. All Slaves are dormant until contacted by the Master. They then wake up, take part in a bind orchestrated by the Master, after which they again return to their dormant state.

Servant  There is one Servant process running on the local network, its job being to maintain the result of the bind: the executable file. Like the Slaves, the Servant process is dormant until awakened at the beginning of a bind. Static data, machine code, and relocation information produced by the Slaves is sent to the Servant as it becomes available. The Servant is responsible for writing the code and data to the executable file, and for performing the updates according to the relocation information received.


Figure: A network consisting of one file-server and five workstations. There are seven active dPaster processes: one Master, one Servant, and five Slaves. There is no process active on the file-server S.

The figure shows a network consisting of workstations A-E and a file server S. There are five Slaves (one per site), one Servant process (on site B), and one Master (on site A).

In the following we will be making certain assumptions about the underlying distributed system:

- The system is an instance of the workstation/server model.
- In addition to simple message-passing, the network also supports broadcasting and multicasting.
- All sites have access to the object code files of all modules.

The algorithms presented later in this chapter will make heavy use of multicasting, and it is therefore imperative that the underlying communications system implements this efficiently.

An Overview of the Distributed Paster

The design goal of any distributed application is to keep processes as independent of each other as possible. The more data that has to be shared among processes, and the more processing tasks which require processes to synchronize their activities, the more communication overhead will be generated. On the other hand, programming distributed applications is a notoriously difficult task with many possible pitfalls, and a clean division of labor between processes, even at the price of extra communication, may pay off in a more stable program. Hence, we have kept the first three phases of the sequential Paster in our design of the distributed Paster. The beginning of each phase serves as the primary point of synchronization: when a Slave has finished the processing of


one phase, it must wait for the other Slaves to finish before proceeding. Organizing the processing in this regular fashion has the advantage that every Slave, at the onset of a phase, knows exactly which information is available to it and which is not. The disadvantage is a certain loss of Slave independence, which may lead to reduced efficiency of the Paster.

In addition to the three main phases of the distributed Paster, whose missions are similar to those of the corresponding phases in the sPaster, we have included an introductory phase, Phase 0, responsible for initiating communication between the processes. Phase 1 finds the modules which are to take part in the build, and builds the different tables used during subsequent phases. It also decides on the distribution of the modules among the Slave processes, attempting to balance the load on the different sites. Phase 2 evaluates expressions, expands inline procedures, and distributes the results to the Slaves who will need them during Phase 3. Phase 3, which is virtually identical to its sPaster namesake, processes deferred procedures. The remainder of this section will examine the design of each of the three phases in turn.

The following definitions will be in effect for the rest of this chapter:

    s             The number of Slaves.
    S_1, ..., S_s  The Slaves.
    m             The number of modules being bound.
    n             The number of expressions, and the number of nodes in the expression graph.
    d             The average out-degree of the nodes in the expression graph.
    e             The maximum number of expressions per packet.
    v             The maximum number of values (evaluated expressions) per packet.

Phases 0 and 1

The design of Phase 1 is interesting mainly because of its significance to the efficiency of subsequent phases. It is during Phase 1 that the distribution of the modules over the Slaves is decided on, and this essentially determines the amount of work each Slave has to perform. A skewed work distribution will make Slaves arrive at the synchronization points at widely differing times, which in turn will result in suboptimal performance. In addition to trying to anticipate the Slaves' workload, the distributed Phase 1 performs many of the same tasks as its sequential counterpart: the modules which are to take part in the bind have to be found, as must their corresponding object files; the constant expressions and the inline procedures have to be read; the Binary


Code, Data, and Relocation Sections have to be processed. Furthermore, before the pasting can commence, the Master must determine which Slaves are free to help out with the processing, and inform them of any special conditions which apply to the particular session. This may include search paths for object files, and flags to indicate whether code for debugging or profiling purposes should be included, load maps should be generated, etc.

Organizing the Work

As we have already mentioned, the distributed Paster is based on the data decomposition model of parallelization: each Slave is responsible for a subset of the modules which make up the program. This means that once a Slave has accepted the task of processing a given module, no other process (neither Master, Slave, nor Servant) has direct access to any data pertaining to that module. The owner Slave is responsible for finding the module's object file, reading and processing pertinent data from the file, and distributing the information to any needy process. The intention is to try to make Slaves work as independently of each other, and of the Master, as possible. A high degree of independence means a high degree of parallelization, which hopefully will lead to high factors of speedup. During Phase 1 the Slaves do not communicate with each other at all. They receive requests from the Master to process certain modules, and send information about the modules back to the Master. Slaves also communicate with the Servant during Phase 1, sending it the Binary Code and Data Sections read from the modules' object files.

Finding the modules which are going to be part of the final program entails finding the closure of the module import graph with respect to the main module. To this end, the Slaves send the Module Table of each module to the Master, which keeps a module map M, mapping the names of modules to unique integers. When the Master discovers that a module in a received Module Table is not yet in M, it is inserted and sent to a Slave for processing. The Slave reads the Module Table of the new module and passes it back to the Master. This process continues until all modules directly or indirectly referenced by the main module have been found, and have been assigned to and processed by a Slave. The Master furthermore distributes relevant parts of the module map among the Slaves. When communicating among each other and with the Master, the Slaves refer to individual modules only by their system-wide unique module numbers, which ensures that all Slaves share the same view of the program being built.
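The closure computation is an ordinary work-list traversal of the import graph. The sketch below is a sequential model with a made-up import table; in the real Paster the Module Tables arrive from the Slaves in ModuleData calls rather than from a local dictionary:

```python
from collections import deque

imports = {                      # module name -> names in its Module Table
    "main": ["io", "math"],
    "io":   ["os"],
    "math": [],
    "os":   [],
}

def close_module_map(main="main"):
    """Find the closure of the import graph with respect to the main module,
    assigning each module a unique number as it is discovered."""
    module_map = {main: 1}       # name -> system-wide unique module number
    unprocessed = deque([main])  # the queue of unprocessed modules
    while unprocessed:
        n = unprocessed.popleft()          # hand n to some Slave...
        for name in imports[n]:            # ...which returns n's Module Table
            if name not in module_map:     # new module: insert and enqueue
                module_map[name] = len(module_map) + 1
                unprocessed.append(name)
    return module_map

print(close_module_map())  # {'main': 1, 'io': 2, 'math': 3, 'os': 4}
```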


Working in Step

Load balancing (attempting to make Slaves work approximately in step, by assigning to each of them the appropriate amount of work) is currently achieved by estimating the time needed to process each module. A Slave makes this estimate, for a module assigned to it, on the basis of the information gleaned from the module's object file preamble (described in an earlier section), the current load on the site on which the Slave runs, and possibly other relevant information. The calculated load for a particular module is passed back to the Master, which keeps a record of the total amount of work currently assigned to each Slave. When the Master finds a new module which has yet to be processed, it is assigned to the Slave which currently has the lightest workload. More sophisticated load-balancing algorithms are discussed in a later section.

To determine the binding-time cost of processing a module, we use an estimating function cost of the form

    cost(m) = a_1 s.l + a_2 m.d + a_3 m.e + a_4 m.i + a_5 m.c

where m is a module; m.d, m.e, m.i, and m.c represent the logical sizes of the module's Binary Code/Data, Expression, Inline, and Deferred Code Sections, respectively; and the a_i are empirically determined constants. The parameter s is the Slave who owns m, and s.l the current load of the site on which the Slave executes. Setting a_1 = 1 and a_2 = ... = a_5 = 0 will assign to all Slaves approximately the same number of modules, regardless of the complexity of each module, whereas a_5 = 1 and a_1 = ... = a_4 = 0 will only balance Phase 3 processing.
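A sketch of such a cost estimate follows; the section sizes, the site load, and the weights are all made up, and the formula follows the reconstruction above:

```python
def cost(m, sl, alphas=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Binding-time cost estimate for module m.
    m:  dict with the logical sizes of the module's sections
    sl: load of the owning Slave's site (illustratively, 1 = fully loaded)
    alphas: the empirically determined weights a_1 .. a_5"""
    a1, a2, a3, a4, a5 = alphas
    return (a1 * sl + a2 * m["data"] + a3 * m["expr"]
            + a4 * m["inline"] + a5 * m["deferred"])

big = {"data": 40, "expr": 10, "inline": 5, "deferred": 100}
small = {"data": 4, "expr": 1, "inline": 0, "deferred": 10}

print(cost(big, 1.0))    # 156.0
print(cost(small, 1.0))  # 16.0
# Balancing only Phase 3 work: weight nothing but the Deferred Code Section.
print(cost(big, 1.0, alphas=(0, 0, 0, 0, 1)))  # 100
```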

Phase 0 and 1 Algorithms

We are now ready to present the algorithms run by the Master and the Slaves during the two initial phases. Before any actual binding operations can take place, the Master must make contact with the Servant and the Slaves which are to take part in the processing. Phase 0 is responsible for finding available processes, and distributing to them information relevant during the entire pasting session.

Algorithm (Phase 0: The Master)

1. Locate, using a network-wide broadcast, one free Slave, and retrieve from it the list of all available free Slaves and the Servant:

[Footnote: s.l might, for example, be 0 for an unloaded site and 1 for a site currently utilizing all of its resources. The current implementation sets s.l = 1 for all s.]


   S_1, ..., S_s, Servant := CALL FindSlaves();

2. Send, using a multicast call, to all Slaves global data pertinent to the session. This includes the network address of the Servant, command line switches, environment variables, object file search paths, etc. Also inform the Servant of the name of the program to be bound, so that it may create the resulting executable file:

   CALL S_1, ..., S_s . GlobalData(Environment);
   CALL Servant.Create(Name of program);

During Phase 1, the Master repeatedly selects a Slave with a minimum workload and sends it an unprocessed module in a ProcessModule call. It simultaneously listens for incoming ModuleData calls, which contain the Module Table and other data pertaining to a particular module. The modules in the Module Table which are not already in the module map are inserted, and are made available for subsequent ProcessModule calls. In reply to the ModuleData call, the Master sends, among other things, the module map identifiers of the modules in the Module Table. This means that the module map is partially replicated at each Slave, with each Slave holding only the part of the map corresponding to the modules which it itself owns, or which are directly imported by these modules. The processing terminates when the Master has no more unprocessed modules, and has received a ModuleData call for each of the modules in the module map.

Algorithm (Phase 1: The Master)

1. Let M be the module map, mapping module names to the integers 1, ..., m. M^-1 is the reverse mapping. M initially contains only the main module.

2. Let U be a queue (with the operations Insert, Delete, and Empty) of hitherto unprocessed modules. U, too, is initialized to contain the main module.

3. Let Q be a priority queue, initially containing the Slaves S_1, ..., S_s, each with the priority 0. Q supports the operations Promote(Q, S, P), to increase Slave S's priority by P, and Min(Q), which returns (without deleting) a Slave with minimum priority.

4. Let PC be the program counter of the resulting program. The Master runs the following Phase 1 main program (the termination condition of the loop has been omitted for brevity):

Distributed HighLevel Module Binding

   PC := 0;
   m := 1;
   M := {<mainmodule, m>};
   U := {m};
   LOOP
      s := Min(Q);
      n := Delete(U);
      CALL s.ProcessModule(n, M^-1(n));
      WAIT NOT Empty(U)
   END

5. On receipt of a call ModuleData(S, N, F, Z, I) from a Slave S, the Master executes the code below. N is the module processed by S, F is S's estimate of the cost of processing N, I is N's Module Table, and Z is the size of N's Binary Code and Data Sections.

   ENTRY PROCEDURE ModuleData(S, N, F, Z : int; I : map);
      FOR ALL V SUCH THAT V IN I AND V NOT IN M DO
         INC(m);
         M := M + {<V, m>};
         Insert(U, m)
      END;
      L := the part of M corresponding to the modules in I;
      REPLY <PC, L>;
      INC(PC, Z);
      Promote(Q, S, F)
   END ModuleData;

The program segments above can be seen as two processes in a producer-consumer relationship. The consumer process takes modules from the queue U and sends them off to a Slave for processing, while the producer extracts new modules from the Module Tables of incoming calls and inserts them into U.
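The Master's selection policy (Min and Promote on the priority queue Q) amounts to greedy assignment to the least-loaded Slave. A small sequential sketch with made-up Slave names and module costs:

```python
# The priority queue Q, modeled as a map from Slave to its assigned work.
load = {"S1": 0.0, "S2": 0.0, "S3": 0.0}

def min_slave():
    """Min(Q): a Slave with minimum priority (lightest workload)."""
    return min(load, key=load.get)

def promote(slave, work):
    """Promote(Q, S, F): charge the Slave for the work assigned to it."""
    load[slave] += work

assignment = {}
for module, work in [("A", 5), ("B", 3), ("C", 2), ("D", 4), ("E", 1)]:
    s = min_slave()          # pick the currently least-loaded Slave
    assignment[module] = s   # ProcessModule goes to that Slave
    promote(s, work)         # its estimated cost raises the Slave's priority

print(assignment)  # {'A': 'S1', 'B': 'S2', 'C': 'S3', 'D': 'S3', 'E': 'S2'}
```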

We see from the above algorithm that the Master is in control of two global data items: the module map M, which is partially replicated on each Slave, and the program counter PC, which resides solely with the Master. Centralized control of the program counter is necessary, since the Slaves will produce code and data independently of each other, and each code or data segment produced must receive a unique location. Therefore, for each new code segment generated, the Slaves have to inform the Master of the segment's size, and in return receive its start address, which is the current value of the Master's PC.


During Phase 1, the Slaves respond to the ProcessModule call from the Master, and themselves make ModuleData calls to inform the Master of possible new modules. Binary Code and Data Sections read from the object file during Phase 1 are sent to the Servant in BinaryCode calls. Apart from this, the Slaves perform the same tasks as does the sequential Paster during Phase 1.

Algorithm (Phase 1: The Slave)

1. Let M_s be the Slave S's (initially empty) module map.

2. On receipt of a ProcessModule(n, N) call from the Master (n is the number, and N the name, of the module assigned to S), the Slave S executes the code below. First, the data from the declaration part of the object file is read, and the Master is sent any potentially new modules from the Module Table, together with the size of the Binary Code and Data Sections and the calculated cost of processing the module. In reply, the Slave receives the part of the module map which pertains to the new modules in the Module Table, and the program location where the binary code and data should reside. Last, the binary code and data is sent to the Servant, together with the pertinent relocation information.

   ENTRY PROCEDURE ProcessModule(n, N);
      REPLY;
      M_s := M_s + {<N, n>};
      Open n's object file and read the preamble;
      Read n's Module Table, Expression, and Inline Sections;
      Read n's data and code C, of size Z, with relocation R;
      Close n's object file;
      F := cost(n, S);
      I := the modules in n's Module Table, but not in M_s;
      <PC, L> := CALL Master.ModuleData(S, N, F, Z, I);
      M_s := M_s + L;
      A := addresses of code and data objects read;
      CALL Servant.BinaryCode(PC, C, R, A)
   END ProcessModule;

Summary: Phase 1

Two important decisions are made during Phase 1: which modules are to take part in the build, and which Slaves should process which modules. These decisions uniquely determine the state of the distributed Paster at the completion of the phase, and the amount of work each Slave will have to perform during subsequent phases. Each Slave holds the constant expressions and the inline


procedures of the modules assigned to it, and will have to share these objects with the other Slaves during Phase 2. Furthermore, each Slave will have to generate code for all the modules it owns. This is a consequence of the load-balancing principle currently employed, which is based on an initial estimate of the amount of work incurred by each module.

Phase 2

Phase 2 is in many ways the part of the Paster which is the most challenging to parallelize. It is during Phase 2 that the Slaves exchange information gathered during Phase 1, thereby paving the way for Phase 3. In fact, all Slave-to-Slave communication is carried out during Phase 2, ridding the computationally expensive Phase 3 of any disrupting communication.

As we saw from the previous section, Phase 1 partitions the node set of the CET graph into s parts, one part for each Slave. The primary objective of Phase 2 is to evaluate each of the CET expressions, and to replicate at least those values which each Slave may need during Phase 3. The partitioning of the nodes of the graph implies that, from the viewpoint of a particular Slave, some nodes will be internal while others will be external. Internal nodes will in general not present any problems, while some method will have to be devised for Slaves to access the values of external nodes.

The next few sections will examine several alternative evaluation methods, and in a later section we will compare them with respect to their message complexity.

Distributed Depth-First Evaluation

We will start by examining three unsophisticated algorithms for expression evaluation, and then, in the next section, proceed with the algorithm actually employed in the current implementation. First, it should be obvious that simplicity can be achieved at the expense of a reduction in parallelism:

Algorithm (Sequential). Each Slave sends its part of the graph to an agreed-upon special Slave, say S_1, which evaluates the entire graph and multicasts the results to the rest of the Slaves.

Algorithm (Redundant). Each Slave multicasts its part of the graph to all other Slaves. Once all Slaves have received every subgraph, they evaluate the complete graph in parallel.

[Footnote: We will also talk about edges being internal or external, depending on whether they connect nodes residing on the same or different sites.]


Figure: The distributed expression graph of four modules, distributed over three Slaves (sites A, B, and C). The graph contains a cycle comprising expressions e_11, e_21, and e_31.

Another straightforward solution is to mimic the sequential evaluation method:

Algorithm (Distributed Depth-First). Let each Slave simultaneously evaluate its local subgraph depth-first. When a Slave comes upon an external edge, it sends a message to the Slave which owns the corresponding node, with a request for that node's value.

This last method requires every Slave to know the owner of every node in the graph. Furthermore, extra care has to be taken in those cases when the graph contains a cycle. Consider, e.g., the figure above, where four modules M_1, ..., M_4 reside on the three Slaves A, B, and C. Circular definitions of manifest constants in M_1, M_2, and M_3 have resulted in a cycle comprising the nodes e_11, e_21, and e_31. A distributed depth-first evaluation of this graph will result in a deadlock situation, where Slaves A, B, and C will be waiting for values from each other.

A number of techniques have been devised in order to handle distributed deadlocks; see, for example, Bracha, Helary, and Singhal. [Footnote: In this and the following examples, the actual attributes involved in a given expression will be left unspecified.] Deadlock avoidance algorithms make sure that deadlocks can never occur, by maintaining a global state (a so-called wait-for graph) which is consulted and updated before a possible deadlock-causing operation is attempted. Deadlock detection algorithms are run at regular intervals, or when processing seems to have come to a standstill, to check for the presence of a deadlock situation. If it is determined that a deadlock situation is at hand, the offending dependencies are located and broken. Since it would seem that cyclic expression graphs would be quite unusual (they are, after all, the result of rather obscure programmer errors), the appropriate way to handle deadlocks would seem to be by employing a detection algorithm: under normal circumstances it would not interfere with the evaluation process. We can, however, get away without a separate deadlock-breaking algorithm, by extending the depth-first evaluation algorithm with a site-migration facility.

Algorithm (site migration): We allow several lightweight evaluation processes to execute concurrently at each site, possibly evaluating the same value simultaneously. When a process needs the value of an external node, instead of asking the owner for its value, the process itself migrates to the appropriate site and continues the depth-first evaluation there. In order to avoid getting itself into a deadlock situation, it does so regardless of whether another process is currently evaluating the same node. A process evaluating a node on a cycle will detect the cycle after the cycle has been completely traversed and the process has returned to the node where the evaluation commenced.

Consider again the cycle e_11, e_21, and e_31 in the figure above. Slave A needs the value of node e_11 in order to evaluate node e_31. It therefore migrates to site C, and from there to site B to evaluate e_21. From site B it returns to node e_31 on site A, and can therefore conclude that e_31 is on a cycle.
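The migrating depth-first walk can be sketched as follows; this is a minimal, single-process sketch (not the thesis implementation), with the dependency table taken from the cycle above:

```python
def find_cycle(node, deps, path=()):
    """Follow dependency edges the way a migrating evaluation process does.

    The process never waits for another site: it simply continues the
    depth-first evaluation at the owner of each external node. Arriving
    back at a node already on its own path means the cycle has been
    completely traversed."""
    if node in path:
        return path[path.index(node):]   # back where evaluation commenced
    for arg in deps[node]:
        cycle = find_cycle(arg, deps, path + (node,))
        if cycle:
            return cycle
    return None

# e_31 needs e_11, e_11 needs e_21, and e_21 needs e_31
deps = {"e31": ["e11"], "e11": ["e21"], "e21": ["e31"]}
print(find_cycle("e31", deps))   # ('e31', 'e11', 'e21')
```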

Multicast Evaluation

The last method we will study, and the one employed in the actual implementation, makes heavy use of the multicast facility. It is a conceptually simple iterative algorithm where each Slave evaluates as much as it can, distributes the new values to the other Slaves, and waits for the arrival of new values. The exact method of evaluation depends on the nature of the attributes being computed, but the general structure of the evaluation algorithm run by Slave S_k is given below:

ENTRY PROCEDURE Evaluate ()
   REPLY
   I_x: Initialize
   LOOP
      E_x: Compute new data X_new
      T_x: Determine the data X_out which should be transmitted
      IF X_out ≠ {} THEN CALL S_j|j≠k .Values (k, X_out) END
      WAIT Incoming Values-calls
   END
END Evaluate

ENTRY PROCEDURE Values (k, X_in)
   REPLY
   U_x: Store the new data X_in
END Values

Based on this sketch, the algorithm below contains the plug-in statements for the evaluation of synthesized attributes. The set E_left holds the expressions which have yet to be evaluated, E_new the ones which were evaluated during the present round, and E_out the global expression values which are to be distributed to the other Slaves.

Algorithm (evaluate synthesized attributes):
   I_e: E_left := {All e_mi.a such that S_k owns m, a is synthesized}
   E_e: E_new := {All e_mi.a ∈ E_left whose arguments have known values}
        E_left := E_left \ E_new
   T_e: E_out := {All tuples <m, i, a, V> such that
                  e_mi.a ∈ E_new, e_mi.a = V, e_mi global}
   U_e: FOR EACH <m, i, a, V> ∈ E_in DO Let e_mi.a := V END

Computing the inherited attribute ref is slightly more complicated, as can be seen from the algorithm below. On Slave S_k, R_ref starts out containing the set of e_mi's corresponding to the bodies of the modules owned by the Slave, and during each subsequent round it will hold the new e_mi's received from the other Slaves. R_new will hold the e_nj's reachable through the calls-attributes of the expressions in R_ref. R_out, finally, holds the set of procedure identifiers which are to be distributed to the other Slaves, while R_sent holds those e_mi's which will not need to be transmitted again.


Algorithm (evaluate inherited attributes):
   I_r: R_ref := {All e_mi such that S_k owns m, e_mi.ref}
        R_sent := {}
   E_r: R_new := {All e_nj reachable from some e_mi ∈ R_ref}
   T_r: R_out := {All tuples <m, i> such that
                  e_mi ∈ R_new, e_mi ∉ R_sent, S_k does not own m}
        R_sent := R_sent ∪ R_new
   U_r: R_ref := {All e_mi such that <m, i> ∈ R_in, S_k owns m, NOT e_mi.ref}
        FOR EACH <m, i> ∈ R_in DO
           Let e_mi.ref := TRUE
           R_sent := R_sent ∪ {e_mi}
        END

Although superficially uncomplicated, careful study of the algorithms above reveals several subtleties. How, for example, are cycles among the synthesized attributes discovered? And when does the algorithm terminate? It turns out that these two questions are interrelated. Applying the synthesized-attribute algorithm to the simple example of the figure above spawns the following communication between the Slaves:

   Slave   Pass 1    Pass 2
   A       {e_32}    {}
   B       {}        {e_22}
   C       {e_41}    {}

During the first round, Slave A will multicast the value of e_32 and Slave C will send e_41. Once these values have arrived at Slave B, e_22 may be evaluated and its value distributed to the other Slaves. This, however, will not further any new computations, and the processing will come to a standstill, even though there are still three unevaluated expressions: e_11, e_21, and e_31. The fact that there remain expressions which have yet to receive a value, and still no more evaluations are possible, indicates the presence of one or more cycles. Indeed, e_11, e_21, and e_31 form such a cycle.
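The rounds of the table can be replayed with a small sketch (again not the thesis implementation; the dependency structure is that of the example, and the evaluation itself is a stand-in):

```python
# Each round, every Slave evaluates the owned expressions whose arguments
# are all known, and "multicasts" the new values into the replicated store.
# A standstill with expressions still unevaluated reveals one or more cycles.
deps = {
    "e31": ["e11"], "e11": ["e21"], "e21": ["e31"],   # the cycle
    "e32": [], "e41": [],                             # manifest constants
    "e22": ["e32", "e41"],
}

values = {}           # the globally replicated expression values
left = set(deps)      # expressions which have yet to be evaluated

while True:
    new = {e for e in left if all(a in values for a in deps[e])}
    if not new:
        break         # standstill: no Slave can further the computation
    for e in new:
        values[e] = "value(%s)" % e     # stand-in for actual evaluation
    left -= new

print(sorted(values))   # ['e22', 'e32', 'e41']
print(sorted(left))     # ['e11', 'e21', 'e31'] -- these form a cycle
```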

The inherited-attribute algorithm displays a similar behavior. Consider the example in the figure below, which shows a configuration of three Slaves, three modules, and the call-graph of thirteen procedures. The roots of the call-graph, corresponding to module bodies, are boxed; ordinary procedures are circled. The calculation of referenced procedures, which corresponds to the calculation of the nodes of the closure of the graph with respect to the boxed root nodes, in this case runs in three passes:

[Figure: The distributed call-graph of three modules, distributed over three Slaves. The roots e_11, e_21, and e_31 are boxed; the call edges, grouped by site, are:

   Site C:   e_31 -> {},  e_32 -> {},  e_33 -> {e_22},  e_34 -> {e_35},  e_35 -> {e_12}
   Site B:   e_11 -> {e_12},  e_12 -> {e_34},  e_13 -> {e_12, e_14},  e_14 -> {e_33}
   Site A:   e_21 -> {e_13, e_32},  e_22 -> {},  e_23 -> {e_22},  e_24 -> {e_23}]

   Slave   Pass 1          Pass 2    Pass 3
   A       {e_13, e_32}    {}        {}
   B       {e_34}          {e_33}    {}
   C       {}              {e_12}    {e_22}

In the first pass, each Slave multicasts the nodes reachable from the root nodes and residing on other Slaves. Subsequent passes multicast the nodes deemed reachable from the nodes which were received in the previous pass. After the third pass the closure has been computed, and processing comes to a standstill.
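The pass-wise closure computation can be sketched as follows. This simplified sketch tracks a single global set of referenced nodes, so a node is never multicast twice; the thesis algorithm keeps a per-Slave R_sent set instead, which is why e_12 appears in the table above even though Slave B already knew it to be referenced. The call edges and site assignment are those of the figure:

```python
calls = {"e11": ["e12"], "e12": ["e34"], "e13": ["e12", "e14"],
         "e14": ["e33"], "e21": ["e13", "e32"], "e22": [], "e23": ["e22"],
         "e24": ["e23"], "e31": [], "e32": [], "e33": ["e22"],
         "e34": ["e35"], "e35": ["e12"]}
owner = {"e11": "B", "e12": "B", "e13": "B", "e14": "B",
         "e21": "A", "e22": "A", "e23": "A", "e24": "A",
         "e31": "C", "e32": "C", "e33": "C", "e34": "C", "e35": "C"}

referenced = {"e11", "e21", "e31"}   # the boxed roots (module bodies)
rounds = []
incoming = set(referenced)
while True:
    # each Slave closes over its local call edges, collecting the
    # external targets which must be multicast to the other Slaves
    outgoing = set()
    while incoming:
        n = incoming.pop()
        for c in calls[n]:
            if c in referenced:
                continue
            referenced.add(c)
            if owner[c] == owner[n]:
                incoming.add(c)      # follow local edges immediately
            else:
                outgoing.add(c)      # external: multicast next pass
    if not outgoing:
        break                        # the closure has been computed
    rounds.append(sorted(outgoing))
    incoming = set(outgoing)         # a pass starts from received nodes

print(rounds)   # [['e13', 'e32', 'e34'], ['e33'], ['e22']]
```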

Detecting Termination

We see that, both in the case of evaluating arithmetic expressions and in computing the closure of the call-graph, we arrive at a point where the distributed computation ceases, because no Slave has anything more to compute, and no Slave receives more input from other Slaves to further its computation. Yet it is impossible for any individual Slave in a passive state to determine whether the computation as a whole has ceased, or whether some other Slave is active and may eventually distribute data which may make new computations possible. This problem is usually termed the Distributed Termination Detection Problem (DTDP), and has been studied in vast detail.

The following definitions are adapted from Huang:

Definition (Termination): A process taking part in a distributed computation may be either active or passive. Only active processes may send messages. An active process may become passive at any time, and a passive process may be reactivated by receiving a message. A distributed computation is said to have terminated iff all processes are in a passive state and there are no messages in transfer.

A Distributed Termination Detection Algorithm (DTDA) is a protocol which runs concurrently with a distributed application and reports when said application has terminated. There is a multitude of such algorithms, intended for use with various models of distributed systems: algorithms may assume a certain network topology, synchronous or asynchronous communication, the existence of an external leader site, etc. The evaluation algorithm presented earlier in this chapter has some very special properties which make it possible to employ a new and simple DTDA. First of all, the evaluation algorithm only makes use of multicast messages; in other words, every message sent by one of the Slaves is received by all other Slaves. Secondly, all messages are in the form of remote procedure calls. This means that message-passing is synchronous: a Slave knows, after having sent a message and having received the replies, that the message has arrived at its destinations. Lastly, our RPC protocol allows calls to take an arbitrarily long time to complete. Our termination algorithm makes use of all these facts.

Algorithm (termination detection):

   Let there be s sites S_1 ... S_s executing a distributed application. Sites communicate exclusively by multicast remote procedure calls.

   A site is either in an active state (performing some computation on local data, sending messages to other sites, or itself responding to such messages) or in a passive state, which is characterized by the absence of local computation and message-passing. A site may enter a passive state only after having determined that all remote calls it has initiated have returned, that all incoming calls have been returned, and that no further local processing is possible.

   Let each site S_i maintain a count C_i of all multicast messages sent or received during the execution of the underlying algorithm, excluding any control messages introduced by the termination detection protocol.

   Choose one site, say S_1, to be the detector of termination. On entering a passive state, S_1 sends a multicast remote procedure call DETECT to S_2 ... S_s. On receipt of a DETECT call, site S_i continues processing until a passive state is reached. At this point a reply is sent to S_1, including the current value of C_i.

   Having received replies from all other sites, S_1 compares the values of C_1 ... C_s. If all values are equal, S_1 concludes that the underlying algorithm has terminated. Otherwise the process is repeated, with a new DETECT call, once S_1 again becomes passive.

We will prove the correctness of the algorithm by initially considering a simpler variant, identical to the algorithm above except for the following points:

Algorithm (termination detection A):

   Associate with each message a unique identifier, and let each site maintain a set M_i of the identifiers of the messages it has either sent or received.

   On reaching a passive state, let S_i return from the DETECT call with the value of M_i.

   When S_1 has received replies from all other sites, the M_i's are compared for equality. If M_1 = M_2 = ... = M_s, S_1 may conclude that the underlying algorithm has terminated.

Theorem (termination detection A): Algorithm (termination detection A) detects termination iff the underlying computation has terminated.

Proof: (⇒) When the underlying computation terminates, it has done so because all sites have determined that there is no more basic computation to be carried out. Hence all sites have entered a passive state and returned their M_i. M_i contains all messages S_i has seen, i.e., those S_i has sent itself and those it has received from {S_1, ..., S_{i-1}, S_{i+1}, ..., S_s}. If S_i has sent a message m, m must be in M_i, since it was sent by S_i, and it must be in the M_j's of all other sites, since it was received by them. (Remember that all messages are in the form of multicast remote procedure calls, and that a site does not enter a passive state until all of its outgoing calls have been completed.) Therefore M_1 = M_2 = ... = M_s, and the algorithm will report termination.

(⇐) Assume that the algorithm has reported termination, that the S_i's have entered their passive state in the order {S_1, S_2, ..., S_{s-1}, S_s} (after possible renumbering; if two sites enter their passive state at the same time, we assign an arbitrary ordering between them), and that they have returned the value M = M_1 = M_2 = ... = M_s, but that there are still some active processes. Then some site, say S_i, after having returned M_i and entered a passive state, will have been reactivated by a message m ∉ M_i, originating from site S_j. Then j > i, since {S_1, S_2, ..., S_{j-1}} were passive at this time. But then m ∈ M_j, which is a contradiction, since M_j = M_i. Therefore we can conclude that the detection algorithm will not report false termination. □

Theorem (termination detection): The termination detection algorithm detects termination iff the underlying computation has terminated.

Proof: (⇒) By the first part of the proof of the previous theorem, if all basic computation has ended, then M_1 = M_2 = ... = M_s. But since C_i = |M_i|, C_1 = C_2 = ... = C_s, and the algorithm will report termination.

(⇐) We use an argument similar to the one in the second part of the proof of the previous theorem. If some passive site S_i was reactivated by a message m originating from site S_j, then j > i. But since S_j has seen at least the same messages as S_i at the time it goes passive, and since m arrived at S_i after it went passive (and hence is not included in its count C_i), C_j > C_i, which is a contradiction. Therefore there will be no detection of false termination. □

It seems reasonable at this point to question whether it is actually necessary for the sites to maintain even a count of the messages they have seen. Let us for a moment consider an algorithm where one site, exactly as in the two algorithms just presented, repeatedly asks the other sites if they have terminated. When a site becomes passive, it replies, not with the set of messages it has seen, or even with the message count, but with a simple yes. A simple three-site example is enough to show that this method is not sufficient. Sites S_2 and S_3 are both active; site S_1 is passive and sends a DETECT message to the other two sites. Site S_2 goes passive and returns yes to S_1. Site S_3 sends a message which reactivates S_2, goes passive, and returns yes to S_1. S_1 has now received an affirmative reply from the two other sites, and therefore erroneously concludes that the processing has terminated.
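Under the counting scheme, the same three-site scenario is caught, because the reactivating multicast bumps the counters of its sender and its receivers, but not the count S_2 has already replied with. A minimal sketch (counters start from zero for simplicity; the DETECT control traffic itself is, as the algorithm requires, not counted):

```python
counts = {"S1": 0, "S2": 0, "S3": 0}

def multicast():
    # every multicast message is counted once by its sender and once
    # by each receiving site (control messages are excluded)
    for site in counts:
        counts[site] += 1

# S1 is passive and sends DETECT. S2 goes passive and replies with C_2.
reply_S2 = counts["S2"]      # 0
# S3 now sends a message (reactivating S2), goes passive, and replies.
multicast()
reply_S3 = counts["S3"]      # 1

# S1 compares the counts: they differ, so termination is, correctly,
# not reported, and a new DETECT call will be issued later.
print(reply_S2 == reply_S3)  # False
```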

It should be noted that the termination detection algorithm is not immune to failures. As has already been mentioned, we assume that the underlying RPC protocol detects failed sites and makes sure that all messages eventually reach their destination. If, for example, we allow sites to crash, only to re-enter the system at a later time, a more fault-tolerant protocol must be employed.

Distributed Inline Expansion

At the completion of the previous phase, the set of inline procedures has been partitioned over the s Slaves, and each inline procedure resides on the Slave which owns the module exporting it. The task of this phase is to ensure that, when the next phase commences, each Slave will have direct access to all inline procedures referenced by any module owned by the Slave. Once this has been accomplished, each Slave knows that it has local access to all the global entities it will need during the processing of the next phase: all expression values as well as all inline procedures. Consequently, that phase will require no inter-Slave communication.

The distributed phase must accomplish three things in connection with the inline procedures: cycles in the inline graph have to be discovered, inline-in-inline calls have to be expanded, and the resulting procedures have to be distributed to the Slaves where they will be needed during the final phase. The detection of inline cycles can easily be incorporated into the expression evaluation algorithms presented earlier. In the algorithm below, the Slaves start by exchanging information about which of their procedures are inline. This is accomplished by the remote procedures SendInline and IsInline. Thereafter, the auxiliary CET attribute icalls, given to each inline procedure, is initialized to the number of calls e_mi makes to external inline procedures.

Algorithm (inline cycle detection):
   I_i: SendInline ()
        I_left := {All e_mi such that S_k owns m, e_mi.inline}
        FOR EACH e_mi ∈ I_left DO
           FOR ALL e_nj ∈ e_mi.calls SUCH THAT
                   S_k does not own n, e_nj.inline DO
              e_mi.icalls := e_mi.icalls + 1
           END
        END
   E_i: I_new := {All e_mi ∈ I_left such that e_mi.icalls = 0}
        I_left := I_left \ I_new
   T_i: I_out := {All tuples <m, i> such that e_mi ∈ I_new, e_mi global}
   U_i: FOR EACH <n, j> ∈ I_in DO
           FOR ALL e_mi ∈ I_left SUCH THAT e_nj ∈ e_mi.calls DO
              e_mi.icalls := e_mi.icalls - 1
           END
        END

PROCEDURE SendInline ()
   I_out := {All tuples <m, i> such that S_k owns m, e_mi.inline}
   CALL S_j|j≠k .IsInline (I_out)
   FOR j := 2 TO s DO WAIT Incoming IsInline-call END
END SendInline

ENTRY PROCEDURE IsInline (I_in)
   FOR EACH <n, j> ∈ I_in DO Let e_nj.inline := TRUE END
   REPLY
END IsInline

(The algorithm as presented does not account for cycles contained exclusively on one Slave.)

It should be fairly obvious that this algorithm is really a distributed variant of topological sorting, as presented, for example, by Mehlhorn. The algorithm starts by determining the outdegree of each node in the inline call-graph (the icalls-attribute), and then proceeds to distribute the set of nodes whose outdegree is 0. On receipt of a set of such nodes I_in, a Slave decrements the icalls-attribute of each of its remaining inline procedures I_left that calls any of the procedures in I_in. The process continues until either I_left is empty on all Slaves, or all the processing comes to a standstill due to a cycle in the graph.
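The idea can be illustrated with an ordinary, centralized version of the scheme (the procedure names p ... t and their call edges are invented for the illustration; the distributed algorithm partitions this work over the Slaves):

```python
# Repeatedly retire procedures with no remaining inline calls, decrementing
# the icalls counter of their callers; procedures whose counter never
# reaches zero are on an inline cycle, or call procedures on one.
calls = {"p": ["q"], "q": ["r"], "r": ["p"],   # cycle p -> q -> r -> p
         "s": [], "t": ["s"]}                  # acyclic part

callers = {name: [] for name in calls}
for name, callees in calls.items():
    for c in callees:
        callers[c].append(name)

icalls = {name: len(callees) for name, callees in calls.items()}
ready = [name for name in calls if icalls[name] == 0]
done = []
while ready:
    n = ready.pop()
    done.append(n)
    for caller in callers[n]:
        icalls[caller] -= 1
        if icalls[caller] == 0:
            ready.append(caller)

stuck = sorted(name for name in calls if name not in done)
print(stuck)   # ['p', 'q', 'r'] -- the inline cycle
```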

We illustrate the actions of the algorithm using the call-graph figure above; the example assumes that all procedures are inline.

   Slave   SendInline                  Pass 1              Pass 2   Pass 3
   A       {e_22, e_23, e_24}          {e_22, e_23, e_24}  {}       {}
   B       {e_12, e_13, e_14}          {}                  {}       {e_14}
   C       {e_32, e_33, e_34, e_35}    {e_32}              {e_33}   {}

[Table: the outdegrees (icalls-values) of e_12 ... e_35 at each pass.]

In the table of outdegrees (icalls-values) above, we note that only calls to external procedures are considered, and that prior to the first pass there are four procedures of outdegree 0: e_22, e_23, e_24, and e_32. These are transmitted during pass 1, after which only one new procedure (e_33) gets outdegree 0. After the third pass there is no more communication, even though there are still four procedures with outdegree ≥ 1. These are either part of the inline call cycle, or call procedures on the cycle.

Once we have determined that the inline call-graph is acyclic, we can safely proceed with the actual expansions. It should be clear that it is quite possible to employ techniques similar to any of the ones proposed for expression evaluation.

The current implementation uses a technique similar to the multicast evaluation algorithm, with one exception. Since inline procedures can be fairly large objects (especially after having been subject to a few expansions), and the implementation of our remote procedure call protocol is slightly inefficient in multicasting large messages, an inline procedure is only distributed to those Slaves who actually need it, rather than to all Slaves. This scheme is likely to pay off in a system with a large module-to-Slave ratio (m/s), where the probability of a particular Slave needing a particular inline procedure is fairly small.

We can now add one more item to the list of tasks which have to be accomplished during this distributed phase: each Slave must be informed of which other Slaves need which of its inline procedures. Fortunately, this is a simple task which can easily be integrated with the calculation of the ref-attribute. We give each e_mi representing an inline procedure an (initially empty) set attribute needed, where S_j ∈ e_mi.needed on Slave S_k if S_k owns e_mi and Slave S_j needs e_mi. The U-step of the inherited-attribute algorithm is altered as follows:

   U_r: R_ref := {All e_mi such that <m, i> ∈ R_in, S_k owns m, NOT e_mi.ref}
        FOR EACH <m, i> ∈ R_in DO
           Let e_mi.ref := TRUE
           IF NOT e_mi.inline THEN R_sent := R_sent ∪ {e_mi} END
           IF e_mi.inline, S_k owns m THEN
              e_mi.needed := e_mi.needed ∪ {S_j}
           END
        END

The first IF-statement in the code fragment above ensures that each Slave will always inform the other Slaves of all the inline procedures it needs.

To complete this phase, we merely have to perform the actual inline expansions and distribute the results. We know, from the first part of the phase, that there are no inline cycles, and where the different inline procedures need to be replicated. The distributed inline expansion algorithm is sketched below.

Algorithm (inline expansion):

ENTRY PROCEDURE Expand ()
   Replace all CET references in all inline procs with the newly computed values
   I_left := {All e_mi such that S_k owns m, e_mi.inline}
   WHILE I_left ≠ {} DO
      Expand all calls to procedures whose code is available
      I_new := {All e_mi ∈ I_left without any remaining inline calls}
      I_left := I_left \ I_new
      FOR EACH e_mi ∈ I_new DO
         CALL e_mi.needed .InlineProc (m, i, code of e_mi)
      END
      WAIT Incoming InlineProc-calls
   END
   REPLY
END Expand

ENTRY PROCEDURE InlineProc (m, i, code of e_mi)
   REPLY
   Store the code of e_mi
END InlineProc

Summary

We are now ready to put together the bits and pieces of the algorithms of this phase. Note that in all algorithms presented in this chapter, all intra-process synchronization has been omitted. Since all incoming remote procedure calls execute in their own lightweight processes, global data (such as the CET, the inline procedure store, and the global variables calls and active below) must of course be protected using monitors and condition variables, or similar concepts.

Algorithm (evaluation):

ENTRY PROCEDURE Evaluate ()
   IF k ≠ 1 THEN REPLY END
   calls := 0
   active := TRUE
   I_i ; I_e ; I_r
   LOOP
      E_i ; E_e ; E_r
      T_i ; T_e ; T_r
      IF <E_out, R_out, I_out> ≠ <{}, {}, {}> THEN
         CALL S_j|j≠k .Values (k, E_out, R_out, I_out)
         calls := calls + 1
      END
      active := FALSE
      IF k = 1 THEN
         <C_2, C_3, ..., C_s> := CALL S_2 ... S_s .Detect ()
         IF calls = C_2 = ... = C_s THEN REPLY END
      END
      WAIT Incoming Values-calls
   END
END Evaluate

ENTRY PROCEDURE Values (k, E_in, R_in, I_in)
   active := TRUE
   calls := calls + 1
   REPLY
   U_i ; U_e ; U_r
END Values

ENTRY PROCEDURE Detect ()
   WAIT NOT active
   REPLY calls
END Detect

Algorithm (the Master):

ENTRY PROCEDURE MasterPhase ()
   CALL S_1 ... S_s .Evaluate ()
   CALL S_1 ... S_s .Expand ()
END MasterPhase

It is interesting to note that the problem of distributed evaluation of expressions arises in two other areas related to language implementation, namely parallel attribute evaluation and parallel evaluation of functional programs. There are, however, important distinctions between expression evaluation as it has been described in this section and the way it is treated in these two areas. The most important difference is the way expressions are assigned to the sites of the evaluating network. In our case, precious little is known about the structure of the expressions prior to their evaluation, and hence the expression-to-site mapping has to be done more or less at random. In the case of parallel evaluation of attribute grammars and functional programs, the expressions to be evaluated are often known a priori, and the goal of the research in these areas is to develop algorithms which compute optimal expression-to-site mappings.

Expression evaluation and graph traversals have also been studied for parallel random access machines; see, for example, Gibbons and Er. The distributed CET is also reminiscent of the replicated blackboards used in AI systems (Engelmore).

Phase

The design of the previous phase ensures that all global data (expression values as well as inline procedures) needed by a Slave during this phase has been replicated and is locally accessible. The design of this phase therefore becomes uncomplicated, in fact almost identical to its sequential counterpart. A further consequence is that during this phase no communication between Slaves is necessary; the only communication occurs from the Slaves to the Master (to gain access to the global program location counter) and from the Slaves to the Servant (to send the generated code). In other words, a Slave will not be interrupted by any communication during this phase other than that which it has initiated itself.

Algorithm (the Master): On completion of the previous phase, the Master sends a multicast DoPhase call to all Slaves. The Slaves return from the call when they have completed their processing of this phase. On receipt of a MachineCode (D) call from a Slave, the Master returns the current value of the program counter and increments it by D:

ENTRY PROCEDURE MachineCode (D : int)
   REPLY PC
   INC (PC, D)
END MachineCode

Algorithm (the Slave): On receipt of a DoPhase call from the Master, the Slave S_k starts processing the modules it owns, one at a time. Unknown values in the linear code are updated, calls to inline procedures are expanded, and machine code is generated. The Master is informed of the amount of code generated, and the Slave receives the code's new program location. The code, its location, its relocation information, and the new addresses of generated procedures are passed on to the Servant, who writes the code to the executable file:

ENTRY PROCEDURE DoPhase ()
   FOR EACH module N owned by S_k DO
      Open N's object file
      Read N's Deferred Code Section
      FOR EACH referenced deferred procedure in N DO
         Fill in calculated expression values
         Expand calls to inline procedures
         IF some calls were expanded THEN
            Recompute basic blocks and next-use information
         END
         Optimize
      END
      <C, Z, R> := Generate machine code C of size Z,
                   with relocation R, for the deferred procedures
      Close N's object file
      PC := CALL Master.MachineCode (Z)
      Update procedure addresses with the new PC
      A := Addresses of generated procedures
      CALL Servant.BinaryCode (PC, C, R, A)
   END
   REPLY
END DoPhase

Once all Slaves have returned from the DoPhase call, the Master will just have to wait for the Servant (using a Close call) to complete the writing and updating of the executable file.

Distributed Relocation

So far we have avoided describing the exact nature of the actions performed by the Servant; in fact, we have yet to produce a clear motivation for the existence of a Servant process. The reason for this lapse is that the issue of how the code produced by the Slaves is merged to form the resulting program file, and how this file is updated during relocation, is a very complex one, and one whose solution has a profound impact on the efficiency of the distributed Paster. We will devote this section to examining three different solutions, in particular the one currently employed. We label the three solutions as follows:

temporary: Each Slave stores the code it generates in a temporary file. At the end of the processing, the files are concatenated to form the executable program file. Relocation can either be performed on the complete program file by one dedicated Slave (in which case that Slave has to be sent all the relocation information and routine addresses generated by the other Slaves), or by each Slave on its own temporary file prior to concatenation. In the latter case the Slaves must share the addresses of the routines they have generated with all other Slaves.

shared: The Slaves share ownership of the executable file, allowing them to write to it and update it concurrently. Relocation can either be performed by one dedicated Slave, or by all Slaves concurrently.

servant: The Slaves send the generated code to a special process, a Servant, which maintains the executable file. Along with the code, the Servant must also be sent the relocation information and routine addresses generated by the Slaves.

The temporary solution is attractive because of its simplicity. One drawback, however, is that the concatenation of the temporary files has to be performed sequentially.

The shared solution runs into quite different problems, since it gives Slaves shared access to a global file in a network file system. Client file caching is an important element of most distributed file systems, which allows clients to store local copies of recently accessed file blocks. Changes to the file affect primarily the local copy, and the original file held by the server is updated according to some agreed-upon convention. When several clients concurrently modify the same file, it becomes necessary to provide mechanisms that assure that each client holds a consistent set of file blocks. The NFS distributed file system uses file caching on both clients and servers, and relies primarily on clients periodically checking the file modification date of remote files to protect against inconsistencies. For additional protection, NFS provides write-through files, uncached files, and a distributed file locking service. Write-through files force a client issuing a write operation to wait until the server has written the data to the disk. The locking service consists of a network lock daemon with which clients can communicate in order to reserve parts of files for exclusive use.

Programs such as the distributed Paster, where several clients make frequent concurrent modifications to the same remote file, cannot rely on modification date checks alone to protect against inconsistencies, since the checks are only done periodically (in NFS, every three seconds) and there are several sources of time lags. A combination of uncached files and file locking is sufficient, but only if the locking is done on the block level. In other words, blocks must only be modified using an atomic read-update-write sequence, lest simultaneous modification of the same block should result in inconsistencies. This is a serious drawback if file blocks are large (in NFS they are 8k bytes) and the average write operation only affects a small part of the block. In this case, clients will frequently be waiting to obtain the lock of a particular file block.

The servant approach avoids any problem arising from multiple concurrent accesses to the executable file, since only one process, the Servant itself, actually writes and updates the file. Neither is there any copying overhead, as is the case with the temporary approach. Furthermore, it is not difficult for the Servant to perform relocations concurrently with the production of code. The figure below gives a sketch of the Servant as it is currently implemented in the dPaster. The Servant consists of two concurrent lightweight processes, the Writer and the Updater. The Writer processes incoming code and data and writes it to the executable file. The Updater updates the executable file according to the procedure addresses and relocation information it receives. Naturally, the two processes must serialize access to the file blocks, in order not to write to the same part of the file at the same time. A simple and effective way to achieve serialization is for the Writer to lock all file blocks until they have been completely written. When a file block is complete, the lock is turned over to the Updater, which can perform relocations in the block as soon as the relevant information is received.
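The hand-over can be sketched with two threads and one lock per file block. This is a minimal sketch, not the dPaster implementation: the block size, the byte-patching, and the queue used to announce completed blocks are all invented for the example:

```python
import threading, queue

BLOCK = 16                 # hypothetical file-block size
blocks = {}                # block number -> bytearray (the "file")
locks = {}                 # one lock per file block
ready = queue.Queue()      # Writer -> Updater: completed block numbers

def writer(chunks):
    for n, data in enumerate(chunks):
        locks[n] = threading.Lock()
        locks[n].acquire()                        # held while writing
        blocks[n] = bytearray(data.ljust(BLOCK, b"."))
        locks[n].release()                        # block complete:
        ready.put(n)                              # turn the lock over
    ready.put(None)                               # no more blocks

def updater(patches):
    # relocate only blocks which have been completely written
    while (n := ready.get()) is not None:
        with locks[n]:
            for offset, value in patches.get(n, []):
                blocks[n][offset:offset + 1] = value

code = [b"CALL ????", b"JMP ????"]                # generated code
patches = {0: [(5, b"A")], 1: [(4, b"B")]}        # relocation information
w = threading.Thread(target=writer, args=(code,))
u = threading.Thread(target=updater, args=(patches,))
w.start(); u.start(); w.join(); u.join()
print(bytes(blocks[0]))   # b'CALL A???.......'
```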

Future Work

With the exception of the design of the final phase, we have so far only examined one possible approach to distributed module binding. It is our intention to investigate possible improvements of the current implementation, as well as fundamentally different overall designs. In this section we will examine alternatives to the load balancing algorithm described earlier, and present a distributed binder based on functional, rather than data, decomposition.

Load Balancing Revisited

Earlier we described the load balancing scheme used by the current dPaster implementation. This scheme, which assigns modules to Slaves based on estimates of the amount of binding-time processing required per module, is attractive primarily because of its simplicity. The main problem with the method is that it is in general impossible during the first phase to accurately estimate the amount of work that a given module will impose on the later phases. The reason is that it is not known until after the evaluation phase which deferred procedures contain calls to inline procedures. (Obviously, a procedure which makes many inline calls will be more expensive to process than one which does not: not only do we have to perform the actual expansions, but the calling procedure will be substantially longer after the integrations, and therefore more expensive to process during code generation.) Not surprisingly, the empirical test results in Sec-


Servant

CreateName

Master

Close

Addresses

Slave

BinaryCodePC C R A

Slave

Relocation

Writer Updater

Slave n

Executable

File

Figure Schematic structure of the Servant The BinaryCode calls re

ceived from the Slaves contain generated co de and data C relo cation in

formation R and addresses of generated routines A The co de is written

to the executable le by a lightweight pro cess Writer R and A are stored in

tables which are used by a lightweight pro cess Updater op erating concurrently

with Writer to up date the le

tion regarding the p erformance of the dPasters Phase indicate that the

current load balancing algorithm works adequately for programs with a small

amount of inlining The algorithm p erforms less well however when there is a

large number of inline calls and the number of calls dier signicantly b etween

mo dules The most imp ortant future developments of the dPaster will therefore

b e the introduction of new and more accurate load balancing schemes

Load balancing algorithms fall in two categories static and dynamic

Static algorithms are characterized by the fact that pro cessing tasks remain at

the pro cessor to which they were initially assigned for the lifetime of the pro

gram Dynamic algorithms allow tasks to migrate b etween pro cessors whenever

this improves eciency Since our current inadequate load balancing algorithm

is static it is natural to consider whether dynamic algorithms may fare b et

pro cedure will b e substantially longer after the integrations and therefore more exp ensive to

pro cess during co de generation

Future Work

ter There are several p ossible dynamic algorithms but we will restrict our

presentation to two metho ds which present themselves naturally

The first method, which we will call Global Scheduling, dispenses with the static load balancing method altogether and gives the Master full dynamic control over the module-to-slave allocation. Under the Global Scheduling strategy, the Master employs a similar approach for Phase as it currently uses for Phase : a pool of hitherto unprocessed modules is maintained and used to assign modules to Slaves on an as-needed basis. In other words, when a Slave has finished a module's Phase processing, a remote call is made back to the Master to retrieve a new module. Unlike the present static load balancing algorithm, which only partially replicates CET values and inline procedures on the assumption that a module will remain at the Slave to which it was originally assigned during Phase , the Global Scheduling method will require full replication of all global data.
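The Global Scheduling pool can be sketched minimally as follows. This is a hypothetical Python illustration; the actual Master/Slave RPC interface is not specified at this level of detail in the text:

```python
from queue import Queue

class Master:
    """Keeps a pool of unprocessed modules and hands them out on demand."""

    def __init__(self, modules):
        self.pool = Queue()
        for m in modules:
            self.pool.put(m)

    def next_module(self):
        """Remote call made by a Slave that has finished its module;
        returns None when the pool is exhausted."""
        return None if self.pool.empty() else self.pool.get()
```

A fast Slave simply calls `next_module` more often, and so automatically receives a larger share of the work.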

The second method, Module Migration, extends the basic static algorithm with facilities to dynamically handle cases in which the original module-to-slave assignment is found to be unbalanced. The basic idea is to let the Slaves perform Phase processing of the modules assigned to them without intervention, unless one or more Slaves finish long before the others. When that happens, the Master (or the Slaves themselves) orchestrates a redistribution of the remaining unprocessed modules. In other words, during the first part of Phase the Slaves work independently of each other and the Master, but towards the end of the phase they cooperate in order to allow modules to migrate from busy to idle Slaves. The advantage of this method compared to Global Scheduling is that when the original Phase module distribution is accurate, there will be no unwarranted extra communication. Furthermore, unlike the Global Scheduling method, which requires full replication of global data, the Module Migration method only requires migrating modules to bring along local expression values and inline procedures in the move.
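The migration step itself can be sketched as follows. This is hypothetical; the thesis leaves open whether the Master or the Slaves drive the redistribution, and the function and variable names are invented:

```python
def migrate_one(pending, idle_slave):
    """Move one unprocessed module from the busiest Slave to an idle one.

    `pending` maps each Slave to its list of not-yet-processed modules.
    The migrating module is assumed to carry its local expression values
    and inline procedures along with it.
    """
    busiest = max(pending, key=lambda s: len(pending[s]))
    if not pending[busiest] or busiest == idle_slave:
        return None                      # nothing worth migrating
    module = pending[busiest].pop()
    pending[idle_slave].append(module)
    return module
```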

Determining which of these methods will prevail will be the subject of future investigations. We will be particularly interested in finding out whether the full data replication required by Global Scheduling and the local data migration required by Module Migration will cancel out the advantages gained from the more precise load balancing afforded by these methods.

Functional Decomposition Solution

The distributed Paster heretofore presented is modeled on parallelization by data decomposition. In this section we will examine an alternative approach, the fPaster, which in its design resembles the functional decomposition model. The analogy with functional decomposition is not completely accurate, since processes do not communicate solely through pipelines, but the fPaster does consist of a number of units with dedicated functionality. The fPaster division of labor is as follows (see also Figure ):

Finder: The Finder reads the Module Table of the object files and recursively finds all modules which are to take part in the bind. When a new module is found, its name is passed on to the Allocator process and to one of the Data processes.

Data Reader: The Data processes (at most one per site) read the data and code segments of the modules received from the Finder and send them to the Writer.

Relocation Reader: The Relocation Reader processes (at most one per site) read the relocation segments of the modules received from the Finder and send them to the Updater.

Expression Reader: The Expression Reader loads the constant expressions and sends them on to the Expression Evaluator.

Inline Reader: The Inline Reader loads the inline procedures and sends them on to the Inline Expander.

Expression Evaluator: The Expression Evaluator computes the values of the expressions and sends the results on to the Coders and the Inline Expander using multicast calls.

Inline Expander: The Inline Expander performs the inline-in-inline expansions and sends the result to the Coders.

Allocator: The Allocator receives the names of the modules from the Finder, and information pertaining to the cost of code generation from the object file headers and from the Expression Evaluator. On the basis of this information, the Allocator informs the Coders of which modules to process.

Coders: The Coders (normally one per site) receive expression values and inline procedures from the Expression Evaluator and Inline Expander, and names of modules for which they should generate code from the Allocator. Generated code and relocation information is passed on to the Writer and Updater processes.

Figure: The structure of the fPaster.

Writer: The Writer receives code from the Data Reader and Coder processes and writes it to the executable file.

Updater: The Updater receives relocation information from the Relocation Reader and Coder processes and updates the executable file accordingly. File operations are synchronized with the Writer.

There are advantages as well as disadvantages to this configuration. On the minus side, it can be noted that the system contains more processes than the dPaster, and that, in contrast to the dPaster, where each object file is opened only twice, the fPaster opens each file a total of seven times (albeit concurrently). On the plus side are increased concurrency and possibly faster expression evaluation and inline expansion.
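The division of labor above amounts to a pipeline of dedicated units connected by message queues. A stripped-down sketch, with only three of the units shown and all names invented for illustration:

```python
from queue import Queue

def data_reader(modules, to_writer):
    """Data Reader: ship each module's code/data segments to the Writer."""
    for m in modules:
        to_writer.put(("code", m))

def relocation_reader(modules, to_updater):
    """Relocation Reader: ship relocation segments to the Updater."""
    for m in modules:
        to_updater.put(("reloc", m))

def writer(to_writer, out):
    """Writer: drain its queue and 'write' the code to the executable."""
    while not to_writer.empty():
        out.append(to_writer.get())
```

In the real fPaster each unit is a separate process running concurrently; here the queues merely make the dedicated-functionality structure explicit.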


Evaluation

Having given algorithms and arguments for the design of the dPaster, we will spend the rest of the chapter evaluating this design. We will first give the particulars of the current implementation, and then attempt an analysis of the dPaster's message complexity.

Implementation Particulars

There are some discrepancies between the algorithms described earlier in the chapter and the actual implementation whose characteristics we will be discussing. Some algorithms have not been fully realized, some have suboptimal implementations, and others have been somewhat optimized. This means that in some cases there is less message-passing in the implementation than might be anticipated from the algorithms; in other cases the implementation spawns more communication than would be necessary. The major sources of concern are outlined below.

- The distributed Paster currently does not check for cycles in the inline graph.

- The termination detection algorithm is controlled by the Master instead of Slave S_1. This leads to an extra message per phase, and possibly a few extra phases, since in Algorithm S_1 does not initiate a new round until it itself is passive, whereas the Master has to start a new round as soon as the previous one has failed to establish termination.

- From Algorithms and one would conclude that there is one BinaryCode call made to the Servant for each module processed. The Slaves are, however, able to save some communication by gathering enough data to fill a packet before a call is made.

- In Algorithm the data sent between Slaves is in the form of sets of tuples (m, i, a, V). However, a global renumbering scheme, which gives each expression e_mi a unique number j, allows Algorithm to reduce the amount of transmitted data by sending tuples (j, V) instead.

- In the present implementation of Phase the Master initializes the module map with the main program module and the modules the main program module imports directly. This information is taken from the main program module's Module Table. This allows the Master to immediately get several of the Slaves started, rather than just one as in Algorithm .


While these discrepancies are minor, one aspect of the implementation which is likely to significantly influence the efficiency of the dPaster is the realization of the RPC protocol. The present implementation uses an extension of Edenbrandt's UDP/IP implementation of Birrell and Nelson's protocol. We give some of the limitations and performance characteristics of the protocol implementation:

- A remote call without parameters is slower than Sun's own remote procedure call facility (Sun's RPC protocol does not support multi-packet calls, does not allow calls to take an arbitrarily long time to complete, and its servers cannot handle several calls simultaneously), and slower still than the corresponding low-level send and receive calls. The inferior performance can be attributed mainly to the fact that the Unix operating system does not have kernel support for lightweight processes.

- As in Birrell and Nelson's original implementation, large messages are sent as a sequence of packets, where each packet is individually acknowledged by the server. This means that the distribution of large data items, such as inline procedures, will generate considerably more network traffic than necessary.

- Since the Internet protocol does not have a true multicasting facility, the present implementation realizes multicasting in terms of the Internet broadcast protocol.

It would be interesting to compare the current implementation of the dPaster with one using a simpler communication protocol, or one running on an operating system with kernel support for lightweight processes. Such comparisons have not yet been possible.

Message Complexity

As a prelude to the empirical tests of Chapter , we will examine the proposed algorithms in terms of their message complexity. It must be noted that although message-passing accounts for a large proportion of the execution time of many distributed programs, there are many other factors involved.

For the most part, the number of messages generated by the different phases is completely deterministic. The exception is Phase , which we will discuss shortly. The table below shows the number of messages generated by the different calls performed by the dPaster processes. Note that, for reasons discussed in Section , there may be fewer or more BinaryCode calls than indicated.

Phase  From     To       Call           Messages
       Master → Slave    FindSlaves     s
       Master → Slave    GlobalData     s
       Master → Servant  Create
       Master → Slave    ProcessModule  m
       Slave  → Master   ModuleData     m
       Slave  → Servant  BinaryCode     m
       Master → Slave    Evaluate       s
       Slave  → Slave    IsInline       s·s
       Slave  → Slave    Values
       S      → S        Detect         s
       Master → Slave    Expand         s
       Slave  → Slave    InlineProc
       Master → Slave    DoPhase        s
       Slave  → Master   MachineCode    m
       Slave  → Servant  BinaryCode     m
       Master → Servant  Close

Phase

The analysis of the message complexity of Algorithm is far from trivial. There are three major sources of trouble. First of all, the structure of the expression DAG is determined by the relationships between the hidden objects in the source program. For example, a program whose hidden types are largely independent of each other will exhibit a flat expression DAG, whereas a program where hidden types are often implemented in terms of each other will yield a deeper and more complex DAG. Secondly, the distributed expression DAG is built more or less at random: the load balancing algorithm of Section , which is responsible for the assignment of modules to Slaves, takes into account only the size of each module's expression DAG, not its structure. Thirdly, Algorithm is asynchronous: each Slave executes its own compute-send cycle independently of the other Slaves. In other words, it is not possible to determine the number of values a Slave can compute during each round of the cycle, since this ultimately depends on the speed with which the other Slaves are executing and the speed of message-passing.

So, clearly, in order to be able to say anything about the message complexity of Algorithm , it is necessary to make some simplifications. We will assume a synchronous version of Algorithm , called Algorithm , where the compute-send cycle has been replaced by a compute-send-wait cycle, such that a Slave does not start a new computation round until it has received a Values message from the other Slaves. This presupposes that a Slave always sends a Values message, even if it has not computed any new values. We will also assume that the assignment of modules to Slaves is done solely to balance the number of expressions per Slave (i.e., the cost estimate cost(m, S) reduces to the number of expressions in m).

Figure: Type dependencies for a set of modules.

Furthermore, we will restrict ourselves to considering only the graphs generated by the translation of hidden types. Let us consider a set of modules M_1, ..., M_m, where each module M_i implements one hidden type T_i. Each type T_i is implemented in terms of a subset of the other types, subject to the restriction that recursive types are disallowed. The relationships between the types are reflected in the expression graph through the dependencies between the nodes which represent the sizes of the T_i's. Figure shows an expression graph generated from a set of modules, where each (possibly complex) expression is represented by a triangle. We will term such a schematic expression graph a dependency graph.

The number of compute-send-wait rounds which Algorithm will have to make depends on two factors: the height of the dependency graph, and the module-to-Slave assignment. (The height of a directed acyclic graph is the length of the longest directed path in the graph; see McKay. Note that this is different from the diameter, which is the length of the longest geodesic, i.e. the shortest path between any two nodes.) If, for example, the modules in Figure are grouped on Slaves as indicated by the dotted boxes, Algorithm will make more compute-cycles than if the grouping is done according to the dashed boxes.

Figure: Connected subgraphs residing on the same Slave are coalesced into supernodes.

We see that the number of compute-cycles performed by Algorithm is given by the height of the supergraph which results when all connected subgraphs made up of nodes residing on the same Slave are replaced by a supernode (see Figure ). Let H be the height of the original expression graph, and assume that the nodes of the original graph are evenly distributed among the Slaves and that the probability of any two nodes being connected is constant. Then the probability of a given edge connecting two nodes on the same site is 1/s, and the average height of the supergraph is H′ = (1 − 1/s)H. In the worst case, of course, H′ = H, corresponding to the case when no two adjacent nodes reside on the same Slave.

In Algorithm each round of the compute-cycle generates s² messages (s Slaves each multicast one message and receive s − 1 replies), and therefore the total number of messages generated is

    s²(H′ + 1) = s²((1 − 1/s)H + 1) = (s² − s)H + s².
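These estimates are easy to tabulate. A small sketch, assuming the reconstructed formulas H′ = (1 − 1/s)H and s²(H′ + 1) messages:

```python
def supergraph_height(H, s):
    """Average height H' of the supergraph: each edge is external
    (connects two sites) with probability 1 - 1/s."""
    return (1 - 1 / s) * H

def evaluation_messages(H, s):
    """s^2 messages per round (s multicasts, each answered by s - 1
    replies), over H' + 1 synchronized rounds."""
    return s * s * (supergraph_height(H, s) + 1)
```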

We must now ask ourselves what kind of dependency graphs we are likely to encounter, and what the height of these graphs might be. A dependency graph of height zero corresponds to the case when all hidden types are implemented in terms of primitive types, whereas H′ = n implies that the n types depend on each other in a sequential fashion and that no adjacent nodes in the dependency graph reside on the same Slave. While the height of a random DAG grows proportionally to n (McKay) and the height of a random binary tree is O(√n) (Vitter and Flajolet), our intuition tells us (and this is supported by the admittedly meager empirical results of Section ) that neither of these average graphs is a very likely candidate for an actual dependency graph. Rather, we suspect that the height of most graphs will be bounded by a small constant, which would tend to make Algorithm run in a small number of rounds. This conjecture is corroborated by the sample runs reported in Section . (In the message counts above, the fact that the message size is bounded has not been accounted for.)

The number of control messages used by the termination detection algorithm depends on the number of times the sites carrying out the basic computation enter a passive state. Let us consider a simpler version of the algorithm, called Algorithm , where the detection of termination is carried out by the Master rather than by one of the Slaves. Furthermore, let P_i be the number of times site S_i goes passive. Then, in the worst case, Algorithm runs in min_{1≤i≤s} P_i passes, using a total of s · min_{1≤i≤s} P_i messages. (The Master does not start a new round of termination detection until all Slaves have reported in; therefore the number of rounds is determined by the Slave which enters a passive state the least number of times.) If we also use the synchronized evaluation algorithm, we see that the Slaves will go passive H′ + 1 times, yielding a total of

    s(H′ + 1) = s((1 − 1/s)H + 1) = (s − 1)H + s

control messages for the termination detection algorithm.

To conclude this discussion, we will calculate the number of messages generated by the alternative expression evaluation algorithms. The sequential algorithm runs in two phases. First, sites S_2, ..., S_s send their expressions to S_1. This requires

    (s − 1)⌈n/(s·e_e)⌉

packets, since each site has ⌈n/s⌉ expressions and e_e expressions can be sent in each packet. In the second phase, S_1 broadcasts the results to sites S_2, ..., S_s, using

    s⌈n/e_v⌉

packets. Adding up, we get

    (s − 1)⌈n/(s·e_e)⌉ + s⌈n/e_v⌉.

In the redundant algorithm, each site broadcasts its expressions to the other s − 1 sites. The total number of messages sent will be

    s²⌈n/(s·e_e)⌉.

The number of messages used by the depth-first algorithm is twice the number of external edges in the graph. The probability that an edge is external is 1 − 1/s, and the number of edges is d·n, which gives a total of

    2(1 − 1/s)·d·n

messages, assuming that we do not have to worry about possible cycles.
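The three counts can be compared directly. A sketch using the reconstructed formulas above, where e_e and e_v are the numbers of expressions and values that fit in one packet and d is the edge density:

```python
from math import ceil

def sequential_msgs(n, s, ee, ev):
    """s - 1 sites ship their ~n/s expressions to S1; S1 broadcasts back."""
    return (s - 1) * ceil(n / (s * ee)) + s * ceil(n / ev)

def redundant_msgs(n, s, ee):
    """Every site broadcasts its ~n/s expressions to all the others."""
    return s * s * ceil(n / (s * ee))

def depth_first_msgs(n, s, d):
    """Twice the expected number of external (inter-site) edges."""
    return 2 * (1 - 1 / s) * d * n
```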

It must again be noted that we are only taking into account the number of messages, not their size. Depending on the implementation of the message-passing system, the cost of sending a large message may be virtually the same as sending a small one, or substantially higher. Further, we are assuming that broadcasts and multicasts are implemented as one outgoing message and one return message per recipient, and not simply as repeated simple calls. We must also keep in mind that message-passing only represents part of the total time used in evaluating the expressions. In Algorithms and the expressions are evaluated sequentially, whereas in the other algorithms evaluation proceeds in parallel on the different sites.

Summary

There have been several attempts at parallelizing the assembly and compilation stages of translation, but until now there have been no attempts at parallel linking, perhaps because link-editors are considered to be inherently sequential and system-specific. We believe that the design presented in this chapter, together with the empirical results of Chapter , will make it clear that distributed module binding is indeed feasible.

Although the dPaster was designed specifically to provide efficient support for the flexible abstraction primitives of Zuse, we also see it as a suitable framework for the implementation of expensive inter-modular optimizations. At the present time, the only such optimizations performed by the dPaster are inline expansion and the removal of unreachable procedures. In future versions we intend to include other optimizations, such as global register allocation and global constant propagation (see, for example, Richardson for a discussion of various interprocedural optimization techniques). We believe that it is possible to modify the CET and the distributed CET evaluation algorithms presented in this chapter to efficiently compute the global data flow graph and other kinds of inter-modular data which will be necessary in order to perform such inter-modular optimizations.

Chapter

Incremental High-Level Module Binding

Perfection is the enemy of the dissertation.
(Ivan Sutherland)

abstraction inversion: Layering inappropriate amounts of complex interfaces to obscure a simple system.
(Reported by Alex Blakemore)

Introduction

This chapter will examine the use of incremental techniques as a further avenue towards fast high-level binding. While the sequential techniques of Chapter will be sufficient for small programs, the hierarchical techniques of Section will be useful when some parts of a program are rarely changed, and the distributed techniques of the previous chapter promise fast turnaround times for major changes to large programs, all of these are likely to be outperformed by the incremental techniques presented here when only small and local changes have been made between two consecutive binds.

Incremental Translation Techniques

Most translators for languages in the Algol family are batch-oriented: they are essentially mathematical functions which take a number of arguments as input and produce a result as output. The input to the sequential binder of Chapter , for example, is the contents of the object files of the bound program's modules, and the output is an executable file. An important characteristic of such batch-oriented translators is that they are history-less: they do not retain any information between successive invocations. This can be seen as an attractive feature, since it greatly simplifies the design of the translator, but it fails to capitalize on the observation that between two consecutive invocations of a translation tool, it is often the case that most input arguments, and hence the major part of the output value, remain virtually unchanged.

A different class of translation systems has emerged over the last decade which, rather than recomputing the output value from scratch for every invocation, retains the result of the previous invocation (and any intermediate data used in calculating this value) and uses it as a basis for the new calculation. Such techniques are termed incremental, since they perform changes to the output value in small steps. An incremental semantic analyzer, for example, may store the attributed abstract syntax tree of a module and, when the module's source code is changed, only recalculate the affected attributes.

It is important to note that translation tasks can benefit to a greater or lesser degree from incremental techniques. The most important factors to consider may be summarized in the following questions:

- Are the most frequent kinds of changes to the input arguments likely to affect a small or a large part of the output value?

- What amount of state information needs to be stored between invocations in order to make the incremental updates efficient?

- How does one determine the parts of the output value which need to be recalculated after a particular change to the input arguments, and how expensive is this analysis?

It is obviously the case that there exist situations in which it is faster to evaluate a function value from scratch rather than incrementally, even for very slight changes to the input arguments. It should furthermore be clear that if a function needs to store exorbitant amounts of state information between invocations, incremental updating may become infeasible. Yellin's INC language for incremental computations addresses these points in a systematic way, by attaching to each construct the computational cost of static (i.e., from-scratch) and incremental evaluation. This scheme makes it possible to predict when a particular argument change should result in an incremental recomputation, and when evaluation from scratch is preferable.
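In INC's spirit, the prediction can be sketched as a simple cost comparison. This is a hypothetical illustration; INC's real cost model is per-construct and considerably more refined:

```python
def plan_update(changed, inc_cost, scratch_cost):
    """Choose incremental re-evaluation only when the predicted cost of
    updating the changed constructs beats evaluation from scratch.

    `inc_cost` maps each construct to its incremental update cost;
    `scratch_cost` is the cost of recomputing the whole result.
    """
    predicted = sum(inc_cost[c] for c in changed)
    return "incremental" if predicted < scratch_cost else "from scratch"
```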


Static semantic analysis can serve as an example of a translation task which will often benefit from incremental evaluation. In most block-structured languages, a change to a declaration at a particular level will often only affect the semantic correctness of uses of the declaration (Hedin), which can only occur in blocks at the same and lower levels. Similarly, a change to a statement will often only affect the static correctness of the statement itself. If changes at lower levels are more frequent than global changes, incremental updating will almost always be faster than exhaustive re-evaluation.

Interprocedural optimization and code generation, on the other hand, are examples of translation tasks which do not exhibit this kind of local-effect-to-local-change behavior. On the contrary, the work by Cooper, Pollock, and Bivens indicates that it is a non-trivial and potentially expensive task to determine which parts of a globally optimized program are affected by editing changes. One can imagine a worst-case scenario in which an edit to a single statement would invalidate all the collected global optimization information. This might, for example, occur in an incremental version of Wall's interprocedural register allocation algorithm.

As a general rule, we might thus say that if small changes to a function's arguments result in major changes to its result, then incremental techniques are unlikely to be beneficial. We also note the importance of not having to store an unreasonable amount of information between successive invocations, and of being able to easily determine which parts of a computed value will be affected by a change to the arguments. If computing the affected parts is very complicated and expensive, then these computations may outweigh the benefits of incremental updating.

Incremental Translation Tools and Environments

Translation tools which make use of incremental techniques are often, but not always, part of integrated programming environments. Such environments encourage an interactive, exploratory style of programming which, through the close coupling between different incremental tools (structure editors, semantic analyzers, code generators, optimizers, linkers, and loaders), can achieve fast turnaround times. Examples include the Rn FORTRAN programming environment, Magpie, Kaplan and Kaiser's incremental distributed environment, the Rational Ada environment, Gandalf, and the Synthesizer Generator. We refer to Dart for a more detailed overview.

The incremental systems which are of most interest to the work we will be presenting in the next section are Rn, which supports interprocedural optimizations, and Inclink, which does incremental linking. We will therefore examine these more closely.

Inclink is based on a study by Linton and Quong which concludes that in a tool-based programming environment such as Unix, on the average only one or two modules have been changed and recompiled between two successive links. They also note that the sizes of modules increase very slowly between links. The first time Inclink is called, it performs a regular link-edit session, after which it enters a dormant state. When it is later awakened for a new linking session, it first re-reads and inserts the object code files of any changed modules in place in the old executable. Thereafter, it relocates any new or modified references. In order to accommodate future increases in module size, some extra space (called the slop) is allocated after the code and data segments of each module in the executable program. If a changed module overflows its allocated area, all subsequent modules are shifted towards the end of the file. Hence, if there are no overflows, Inclink runs in time proportional to the size of the changed modules; otherwise, it runs in time proportional to the size of the resulting program. Quong reports that for typical relinks Inclink runs several times faster than the UNIX batch linker ld, while consuming several times as much memory (five megabytes for the measured test program). Part of the speedup can be attributed to the fact that Inclink keeps symbols, relocation information, and object code in memory between links, which obviates the need to re-read the object code of unchanged modules.
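The slop mechanism can be sketched as follows. This is a hypothetical Python illustration; Inclink's actual data structures are not described at this level of detail:

```python
def patch_in_place(layout, slop, module, new_size):
    """Update one module's segment in an Inclink-style executable layout.

    `layout` maps module -> (offset, size); each segment is followed by
    `slop` spare bytes. Returns True if later modules had to be shifted
    towards the end of the file (the overflow case).
    """
    offset, old_size = layout[module]
    if new_size <= old_size + slop:
        layout[module] = (offset, new_size)      # fits in the slop
        return False
    delta = new_size - (old_size + slop)         # overflow: shift the tail
    layout[module] = (offset, new_size)
    for m, (o, sz) in list(layout.items()):
        if o > offset:
            layout[m] = (o + delta, sz)
    return True
```

The no-overflow branch touches only the changed module, mirroring the claim that Inclink then runs in time proportional to the size of the changed modules.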

Rn is an integrated incremental programming environment dedicated to scientific programming in FORTRAN. The environment consists of a structure-oriented module editor, a program compiler, a module compiler, and an execution monitor. The program compiler is responsible for computing the interprocedural information and passing it on to the module compiler. The module compiler applies traditional code generation and optimization techniques, aided by the collected interprocedural information, to a single module or a collection of modules. The interprocedural optimizations performed include inline expansion and constant propagation, but the gathered interprocedural information also helps traditional optimization techniques such as common subexpression elimination. The module editor aids the compilation process by building an abstract syntax tree for each compiled module and collecting local dataflow information. The module editor also informs the program compiler of which changes the user has made to the program, and based on this information the program compiler can determine which interprocedural information needs to be incrementally updated and which modules need to be recompiled. One complication with the Rn approach to interprocedural optimization is that a module's object code becomes dependent on the program in which it is included. In other words, if a module is to be a part of several different programs, each program must have its own specialized version of the module's object code.

The Incremental Paster

In this chapter we will describe an incremental Paster, called an iPaster, which works along the same lines as Inclink. There is, however, a fundamental reason why incremental pasting will be considerably more difficult, and potentially less profitable, than plain incremental linking: the inter-module data handled by the iPaster is much more complicated than that handled by Inclink. Inclink works with one kind of inter-module data only, the addresses of the routines and data items exported by each module. When the address of a routine changes, Inclink will simply overwrite all references to the routine with the new address. The iPaster, on the other hand, will have to accommodate not only changes to the addresses of compiled routines, but also changes to the values of expressions and the code of inline procedures. We give two examples:

A change to the realization of an inline pro cedure P will make it necessary

for the iPaster to retrieve the unexpanded co de of all inline pro cedures

which call P and any which call them etc and redo all the expansions

All Phase pro cessing of every pro cedure which calls any of the aected

inline pro cedures will have to b e redone If the new version of P is larger

than the old one the size of the co de of the deferred pro cedures is likely

to increase

A change to the realization of an abstract type T which aects its size

may have farreaching consequences All pro cedures which reference a

variable of type T or a variable whose type dep ends on T will have to b e

repro cessed If one of these pro cedures happ ens to b e an inline pro cedure

we have a situation like the one describ ed ab ove

Our intention therefore is for the iPaster to b e employed primarily when

there has b een a small number of lo cal changes to the mo dules which make up

the program and to defer ma jor up dates to the dPaster
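The first example above is essentially a reverse-reachability problem over the inline call-graph. A minimal sketch (the graph and the procedure names are hypothetical, not taken from the thesis):

```python
# Which inline procedures must be re-expanded when "P" changes?
# Walk the inline call-graph backwards: the direct callers of P,
# their callers, and so on.
from collections import deque

calls = {"Q": ["P"], "R": ["Q"], "S": ["X"]}  # caller -> inline callees

def affected(changed, calls):
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)
    dirty, work = {changed}, deque([changed])
    while work:
        p = work.popleft()
        for c in callers.get(p, []):
            if c not in dirty:
                dirty.add(c)
                work.append(c)
    return dirty

print(sorted(affected("P", calls)))  # ['P', 'Q', 'R']
```

Every procedure in the returned set must have its expansions redone; procedures such as S, which never reach P, are untouched.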

Preliminaries

Similarly to Inclink, the iPaster will keep note of the state of the bound program between sessions. The saved information will include the names of the modules which make up the program, the location of their object files, the addresses in the executable of their code and data segments, the expressions and their values, the inline procedures in their expanded as well as unexpanded form, the address and relocation information of each deferred procedure, and two use maps which indicate the CET expressions and the deferred and inline procedures which will be affected by changes to individual expressions. Contrary to Inclink, which keeps the modules' object code in memory between links, the iPaster will not keep the Binary Code, Data, and Deferred Code Sections in memory, since we believe this would be prohibitive for large programs.

The iPaster will maintain the four phases of the sPaster, with some modifications. Phase 1 will load the expressions and inline procedures of the changed modules and update their binary code and data segments in the executable. Phase 2 will recalculate the values of affected expressions and expand inline procedures as necessary. Phase 3 will use the call-graph and the use-maps to determine the deferred procedures which have been affected, and will read and reprocess them. To allow for the growth of individual deferred procedures, slop will have to be appended to the code of each procedure. Slop management will be discussed further in a later section.

The incremental nature of the iPaster makes it well suited to situations in which a program has undergone a small number of local changes. The algorithms presented below will be able to handle changes with global consequences as well, but to make the iPaster design simple and efficient we institute three restrictions. Between two consecutive invocations of the iPaster:

1. no new modules may be added to the program,

2. no modules may be deleted from the program, and

3. no module specification units may be changed and recompiled.

The last point is essential, since it guarantees that the cross-module references, which all go through the CET, do not change between runs of the iPaster. For example, we can be assured that between two consecutive calls to the iPaster, e_{m.i}.size will always refer to the same type. A consequence of the rules above is that the iPaster is restricted to changes to the realization of abstract items and the hidden part of semi-abstract items, i.e., changes which occur in the implementation unit. We do not see this as unduly restrictive, since we believe that these are the kinds of changes which occur most frequently.

We have already discussed the major advantages that Inclink has over the iPaster, but we have yet to mention the one aspect in which the iPaster has the upper hand: while Inclink, like other link-editors, has to cater to the needs of many different compilers, the iPaster works in symbiosis with one compiler, zc, and can demand that it do some extra work. Determining the changes which have been made to a module since the last bind is an important and time-consuming task for the iPaster, but it can be made more efficient with some help from the compiler. We therefore add an integer CET attribute, checksum, to all expressions associated with deferred and inline procedures, and make the compiler compute it in such a way that it depends on every part of the corresponding Z-code block. To determine if a procedure has been altered since the last bind, the iPaster simply checks if the old and the new checksum attributes differ. In a similar vein, we give each module M the CET expressions e_{M.0}.b_checksum, e_{M.0}.d_checksum, and e_{M.0}.r_checksum, which are checksums for the Binary Code, Data, and Relocation Sections, respectively.

Lastly, we adopt the notion of a dirty expression, procedure, and module. An entity is said to be dirty if it needs to be reprocessed during an invocation of the iPaster or, in the case of modules, if it contains an object which needs to be reprocessed. For convenience, we mark each dirty CET expression with a star, as in e*_{m.i}.a.
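The checksum attribute can be any integer function that depends on every part of a procedure's Z-code block. A sketch using a CRC (the Z-code representation here is a stand-in, not the thesis's actual encoding):

```python
import zlib

def zcode_checksum(zcode_block):
    # An integer that depends on every instruction of the block; two
    # blocks that differ anywhere will, with high probability, differ here.
    return zlib.crc32("\n".join(zcode_block).encode())

old = zcode_checksum(["LOAD x", "ADD 1", "STORE x"])
new = zcode_checksum(["LOAD x", "ADD 2", "STORE x"])
print(old != new)  # True: the procedure would be marked dirty
```

A CRC can in principle collide, so a real binder might prefer a stronger hash; the point is only that comparing two small integers replaces comparing two whole Z-code blocks.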

The Phases of the iPaster

We are now ready to give the iPaster algorithms. The main program (Algorithm iPaster, below) starts with a regular sequential bind, using the algorithms developed for the sPaster. It then enters a dormant state and, when awakened, prompts the user for modules which have changed since the last run. These modules are processed by the iPaster's own Phases 1 to 4 to build a new version of the executable file, after which the iPaster re-enters its dormant state and waits for the process to repeat.

In the algorithms below we will make use of two maps, E and P, which store the expressions and procedures which will be affected by changes to individual CET expressions. The maps are built as part of Phases 2 and 3 of the first sequential bind, and are kept up-to-date during the subsequent incremental updates.

Algorithm iPaster:

1. Let E be a map from CET expressions to sets of expressions, such that if (e_{m.i}.a, {..., e_{n.j}.b, ...}) ∈ E, then e_{n.j}.b = f(..., e_{m.i}.a, ...) is a CET expression; that is, E maps each expression to the expressions computed from it.

2. Let P be a map from expressions to sets of procedures, such that if (e_{m.i}.a, {..., e_{n.j}, ...}) ∈ P, then e_{m.i}.a is referenced in the Z-code block of the procedure associated with expression e_{n.j}.

3. Perform the sPaster's Phase 1, but add slop to the Data and Binary Code Sections.

4. Perform the sPaster's Phase 2, but add slop to every generated variable, structured constant, and object template. Also construct the E map from the CET and the P map from the inline procedures.

5. Perform the sPaster's Phase 3, but add slop to the code of every deferred procedure. Also construct the P map from the deferred procedures.

6. Perform the sPaster's Phase 4.

7. Execute the following code:

       LOOP
          Perform iPaster's Phase 1
          Perform iPaster's Phase 2
          Perform iPaster's Phase 3
          Perform iPaster's Phase 4
       END
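Both maps can be built in a single pass over the CET and over the Z-code reference lists. A sketch, with invented expression and procedure names:

```python
# E: expression -> expressions computed from it (its dependents).
# P: expression -> procedures whose Z-code blocks reference it.
cet = {                      # expression -> the expressions it uses
    "e2.size": ["e1.size"],
    "e3.offset": ["e1.size", "e2.size"],
}
zcode_refs = {               # procedure -> expressions it references
    "M1.sort": ["e3.offset"],
    "M2.insert": ["e2.size", "e3.offset"],
}

E, P = {}, {}
for expr, operands in cet.items():
    for op in operands:
        E.setdefault(op, set()).add(expr)
for proc, refs in zcode_refs.items():
    for expr in refs:
        P.setdefault(expr, set()).add(proc)

print(sorted(E["e1.size"]))    # ['e2.size', 'e3.offset']
print(sorted(P["e3.offset"]))  # ['M1.sort', 'M2.insert']
```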

The Incremental Phase 1

One of the iPaster's most important Phase 1 tasks is to detect the modules which have changed since the last run. Quong notes that this is difficult to do efficiently, especially in a network file system, and that the obvious method (comparing the timestamp of each module's object file to the last time of modification) is much too inefficient. We therefore opt for a solution similar to Quong's, namely to leave it to the user to inform the binder of the modules which have changed.

Algorithm Phase 1:

NewModules := Prompt the user for changed modules
FOR EACH n ∈ NewModules DO
   Reopen n's object file and load the preamble
   Reload n's Expression Section
   Mark all e_{n.i}.a whose definition differs from old(e_{n.i}.a) dirty
   FOR ALL i SUCH THAT e_{n.i}.inline = TRUE AND
                       old(e_{n.i}.checksum) ≠ e_{n.i}.checksum DO
      Load inline procedure e_{n.i} from n's Inline Section
      Mark e_{n.i}.inline dirty
   END
   Update the E map from the CET and the P map from the inline procedures
   IF old(e_{n.0}.d_checksum) ≠ e_{n.0}.d_checksum THEN
      If n's Data Section fits in its old spot in the executable file,
      insert it in place; otherwise, append it to the end of the file
   END
   Same as above, but for n's Binary Code Section
   Same as above, but for n's Relocation Section
   Close n's object file
END
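The in-place-or-append decision at the end of the algorithm amounts to a capacity check against the section's old region (all sizes below are illustrative):

```python
def place_section(old_start, capacity, new_size, file_end):
    # A section's region in the executable includes its slop, so a new
    # version that still fits is written back in place; otherwise the
    # section is appended at the current end of the file.
    if new_size <= capacity:
        return old_start, file_end            # in place; file end unchanged
    return file_end, file_end + new_size      # appended

print(place_section(0x1000, 512, 500, 0x8000))  # (4096, 32768)
print(place_section(0x1000, 512, 700, 0x8000))  # (32768, 33468)
```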

When a new Data or Binary Code Section overflows its slop region, it is necessary either to shift the following code to make room for the new section or, as in Algorithm Phase 1, to append the section to the end of the executable file. Quong rejects the latter method on the grounds that it makes it impossible to separate the code and data segments in order to make the code write-protected. While this is a valid objection, we will ignore this complication in this presentation.²

The Incremental Phase 2

We are now ready for the incremental Phase 2, which is very similar to its sequential counterpart.

Algorithm Phase 2:

1. Evaluate every dirty CET expression e*_{m.i}.a, as well as any expression which, according to the E map, transitively depends on a dirty expression.

2. Rewrite any changed structured-constant data objects (such as object type templates and literal records, strings, sets, or arrays) to their original spots in the executable file, or to the end of the file if they do not fit.

3. Update any changed relocation information.

4. Fill in newly calculated expression values from the CET in the Z-code of any (according to the P map) dirty inline procedure.

5. Redo inline expansions as necessary, according to the E map.

² The current paster implementations produce the obsolete Unix impure format, which mixes code and data, and hence appending to the end of the executable file causes no problem.

The only part of Algorithm Phase 2 which may need further explanation is the incremental inline expansion. The main question is whether we keep the results of intermediate expansions between consecutive runs of the iPaster, or whether we only store the original inline routines and the final results. In the former case, which is memory-expensive, we only have to redo the expansion of dirty procedures, while in the latter case all expansions in all procedures which call a dirty procedure have to be redone.

While there exist a variety of incremental algorithms for keeping a set of expressions and the values they compute up-to-date in response to changes (see, for example, Alpern, Cohen, and Yellin), they are overly general for this application. In particular, it would be unnecessarily expensive for the iPaster to keep the values up-to-date after each new expression is read and inserted in the CET during Phase 1. Rather, it is more efficient first to perform all necessary changes and then to recompute any affected values.
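The batched strategy argued for here can be sketched as follows: apply all changes, compute the transitively affected set, and only then re-evaluate, in dependency order. The expressions and rules below are invented for illustration:

```python
deps = {"c": ["a", "b"], "d": ["c"]}                 # expr -> operands
vals = {"a": 1, "b": 2, "c": 3, "d": 6}              # current values
rules = {"c": lambda v: v["a"] + v["b"],             # how to recompute
         "d": lambda v: 2 * v["c"]}

def recompute(changes):
    vals.update(changes)                  # 1. apply all changes first
    dirty = set(changes)
    grew = True                           # 2. find transitive dependents
    while grew:
        grew = False
        for e, ops in deps.items():
            if e not in dirty and any(o in dirty for o in ops):
                dirty.add(e)
                grew = True
    done = set()
    def ev(e):                            # 3. re-evaluate, operands first
        if e in rules and e in dirty and e not in done:
            for o in deps[e]:
                ev(o)
            vals[e] = rules[e](vals)
            done.add(e)
    for e in dirty:
        ev(e)

recompute({"a": 10})
print(vals["c"], vals["d"])  # 12 24
```

Each affected expression is evaluated exactly once, however many changes were batched, which is precisely the saving over an eagerly incremental scheme.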

The Incremental Phases 3 and 4

As we have already mentioned, for the sake of memory conservation the saved state of the iPaster does not include deferred procedures. Therefore, it becomes necessary to re-read part of the Deferred Code Section of dirty modules, i.e., modules which contain one or more dirty, referenced, deferred procedures. The iPaster's Phase 3 is otherwise similar to that of the sPaster.

Algorithm Phase 3:

PROCEDURE Phase3
   FOR EACH dirty module n DO
      Reopen n's object file
      FOR ALL i SUCH THAT e_{n.i} is a dirty, referenced procedure DO
         Read e_{n.i} from n's Deferred Code Section
         Fill in newly calculated expression values from the CET
         Expand calls to inline procedures
         IF some calls were expanded THEN
            Recompute basic blocks and next-use information
         END
         Optimize
         Generate machine code
         Store generated relocation information
         Write the code to the executable file, at the old position
         or appended to the end of the file
      END
      Close n's object file
   END
END Phase3
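Selecting the deferred procedures that Phase 3 must reprocess is then a lookup in the P map over the dirty expressions. A sketch with hypothetical names:

```python
P_map = {                       # expression -> deferred procedures using it
    "e1.size": ["M1.sort", "M2.insert"],
    "e2.cap":  ["M2.insert"],
    "e3.len":  ["M3.scan"],
}
dirty_exprs = {"e1.size", "e2.cap"}

# Union of the procedures referencing any dirty expression.
to_reprocess = sorted({p for e in dirty_exprs for p in P_map.get(e, [])})
print(to_reprocess)  # ['M1.sort', 'M2.insert']
```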

The iPaster's Phase 4 is also similar to the sPaster's, only there is no need to update addresses which have not changed during the present invocation.

Slop Management

Quong examined and evaluated a number of sophisticated slop management strategies, but settled for a simple algorithm which allocates a fixed number of bytes of slop to each data and code segment, initially and after each overflow. Since the iPaster's unit of incrementality is the procedure, rather than a module segment as in the case of Inclink, we believe that in our case a slightly more complex strategy will be necessary. We propose that the slop assigned to a procedure P the n-th time it overflows its allocated area be given by

    slop_P^0 = C + Σ_{i=1..k} slop_{Q_i}
    slop_P^n = slop_P^{n-1} + C,   if n > 0

where Q_1, ..., Q_k are the inline procedures called by P. In other words, the amount of slop initially given to a procedure is the sum of the slop given to the inline procedures it calls and a small constant C; each time P overflows its allocation, its slop is increased by a small constant. The reasoning behind this scheme is, of course, that the size of the binary code of a deferred procedure is more likely to grow quickly if the procedure depends, directly or indirectly, on a large number of inline procedures than if it only calls non-inline procedures.
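The policy can be written directly as a recursive function; the value C = 16 is an arbitrary choice for the "small constant":

```python
C = 16  # the small constant (illustrative value)

def slop(proc, inline_calls, n):
    # n = 0: initial slop is C plus the (initial) slop of the inline
    # procedures called by proc; each later overflow (n > 0) adds C more.
    if n == 0:
        return C + sum(slop(q, inline_calls, 0)
                       for q in inline_calls.get(proc, []))
    return slop(proc, inline_calls, n - 1) + C

inline_calls = {"P": ["Q", "R"]}   # P calls inline procedures Q and R
print(slop("P", inline_calls, 0))  # 48: C + slop(Q) + slop(R)
print(slop("P", inline_calls, 2))  # 80: two overflows add 2*C
```

Note that the initial-slop term is itself recursive, so a procedure sitting on top of a deep inline call-graph starts out with proportionally more room to grow.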

Summary

Contrary to Quong's Inclink, which under almost all reasonable circumstances is faster than its batch counterpart ld, we conjecture that the iPaster will not always outperform the sPaster and the dPaster. The reason is, of course, that for a language like Zuse, with flexible encapsulation facilities, and a translating system like zts, which emphasizes global optimizations, it will often be the case that a seemingly small and local change (such as adding a field to the hidden part of an extensible record type, or changing the realization of an abstract inline procedure) will have far-reaching, global consequences. We envision a system which, based on an analysis of the changes made to the program, invokes either the iPaster (if the changes are small and localized) or the dPaster (if the changes have global effect). Such a system would have the potential to be fast regardless of the kinds of changes which are made between binds.

A possibility which has yet to be discussed is to instrument the dPaster with incremental capabilities, to form a diPaster, a distributed incremental paster. The question is, of course, whether there exist any circumstances under which such a contraption would be likely to outperform both the dPaster and the iPaster. We believe, at this point, that the answer is no. Distributed applications in general need to work on large data sets in order to make up for their high startup costs. Incremental applications, on the other hand, perform best when given a small amount of data to process. It would therefore seem that the combination of distributed and incremental high-level binding is unlikely to be profitable.

The iPaster differs from other incremental translators for modular languages which engage in inter-modular optimization in that it hampers neither modularity nor separate compilation. A Zuse module can easily be part of several programs at the same time without having to be recompiled, in contrast to the Rⁿ system, in which each program needs its own compiled version of each FORTRAN module.

Chapter

Evaluation

    2 + 2 = 5-ISM: Caving in to a target marketing strategy aimed at
    oneself after holding out for a long period of time. "Oh, all right,
    I'll buy your stupid cola. Now leave me alone."

    Douglas Coupland

Introduction

The following statement summarizes some of the objections which may be raised against the translating system designs presented in the previous chapters:

    The benefits afforded by zts over traditional translating systems,
    such as improved control over encapsulation, improved global code
    quality, and the reduced need for recompilations, are limited, and
    do not warrant either the addition of another complicated piece of
    software or the extra link-time cost.

We have already noted that the benefits of flexible encapsulation can only be conclusively determined by putting them to the test in a wide variety of programming situations. There is still some controversy over the potential profits of inter-modular optimizations (see Chow, Cooper, and Richardson), and one might argue that other methods for reducing the number of unnecessary recompilations (see Cooper, Tichy, Pollock, and Schwanke) may be more effective than the one presented here. In this chapter we will put these and all other considerations aside and concentrate on the last, and in our mind the most serious, objection: that zts-style high-level binding is inherently slow and will be unsuitable when applied to large programs.

This chapter will thus be concerned with determining the cost of pasting when applied to programs of different characteristics. In the first section below we develop a model of modular programs which allows us to generate a large family of Zuse programs differing in size, structure, and degree of encapsulation. We then describe the experimental methods used, and study the behavior of zts when applied to a variety of programs generated by the model. We also observe its behavior when applied to some real programs. The chapter concludes with a summary.

A Model of Modular Programs

There are several different ways of evaluating the efficiency (the time and space used) and the effectiveness (the speed and size of the produced code) of translation systems for a language:

1. Run each translation system on a representative collection of well-known programs in the language, and measure the resources used by the translator and the quality of the code it has produced. The programs should ideally be of varying size, from different application domains, and of different authorship.

2. Collect a representative set of programs, and use this set to construct a model which adequately describes salient aspects of programs in the language. This model might include both quantitative and qualitative measures. Run each translation system on a set of programs constructed so that their statistical properties reflect those suggested by the model, and evaluate the results as in the previous case.

3. In the absence of a large collection of real programs, one might still be able to construct a model such as the one above. Such a model will have to be based on the assumption that the language lends itself to a certain style of programming, and that this style will be the most prevalent.

The first approach is the one most commonly employed, and for many languages there exist well-known sets of programs which have been used to test the translation times and code quality of a variety of compilers. The advantage of the second method, however, is that one is not restricted to testing only the programs at hand, but can generate as many programs as is deemed necessary. The second method would make it possible, for example, to test the translator's behavior with the growth of program size. This is done by generating programs with statistical properties similar to the ones in the original set, but which are much larger than any of these programs.

The first method assumes the existence of an adequate sample of real programs, while the second method requires, at the very least, a body of knowledge of the structure of such programs. Unfortunately, relatively little is known about the statistical properties of programs. Knuth studied FORTRAN programs, Wanvik presents the results of a study of a fairly small CHILL program, Berry contains some rather trivial statistics of C programs, Borison gives quantitative data on three C and three Ada programs, and Linton and Quong studied changes in module and program size over successive compiles and links.

It has not been possible to use either of the first two methods in the evaluation of the Zuse translating system. First of all, Zuse is a new language, and only a few programs have been written from scratch in Zuse. Furthermore, apart from Borison's and Linton and Quong's work, which provide some insight into the average size of modular programs, there have not been any studies on the properties of modular programs. In particular, there is no body of work which sheds light on the way abstract and semi-abstract types, constants, and inline procedures (concepts which are important in Zuse, and whose use must be an integral part of a model of Zuse programs) might be used in real programs.

This evaluation of the Zuse translating system must therefore rely on the third of the approaches outlined above: we will construct a simple and artificial model of Zuse programs, based on our intuitions of what programs will probably be like. Programs will then be generated based on the model, using reasonable values of the model's parameters. These programs will be submitted to the Zuse translating system, and the behavior of the system will be reported. We cannot argue that the generated programs are representative of real Zuse programs, but we will argue that, when submitted to the translating system, they will provide some insight into the system's behavior.

The Model

One can think of many interesting quantitative and qualitative measures of modular programs. Examples include the average number of modules in a program, the average number of exported items in a module, the structure of the import graph formed by the modules (k-ary tree, DAG, sparse, full, etc.), and the structure of the graph formed by the exported items, such as the call-graph of exported procedures or the dependency graph of exported types.

The primary parameters of the simple model we propose are the structure of the import graph (particularly its height), the module-size (the number of exported items per module), and the item-size (the number of references each item makes to imported items). In order to exclude programs containing illegal circular declarations, we restrict the import graph to a hierarchical, level-organized structure, where each module may only import modules on lower levels. Although the design of the model is primarily motivated by our intuition of what a representative Zuse program should look like, it is also influenced by the translating system designs of the previous chapters. Since the running times of some of the proposed algorithms are proportional to the height of the import graph, some to the height of the CET, and others to the size and number of deferred and inline procedures, it is important that the model allows these parameters to be varied easily.

Formally, a program is described by an 8-tuple T:

    T = ⟨items_conc, items_semi, items_abs, ref, work, import, diff, modules⟩

The first three entries of a T-tuple describe the number of concrete, semi-abstract, and abstract items generated by the model. The next two entries describe the structure of the generated items, and the last three entries determine the structure and size of the module import graph; see the first table below for a more detailed description. Each value in a T-tuple may take on any one of four different forms, a, a..b, N(m, σ), and a..N(m, σ)..b, where a and b are integers and m and σ are real numbers. These different forms allow us either to produce programs which are almost completely deterministic (by using the a-form) or to introduce a level of non-determinism (by using any of the other three forms, which select values using either a rectangular or a normal distribution). The second table below shows how an actual value for a particular module or item is selected.

As we can see, our model is deliberately vague on the subject of the realization of the exported items. In particular, the model does not say anything about the structure of the exported types or the actions which should be taken by the exported procedures. It would, of course, be possible to parameterize the model with respect to these and other aspects of modular programs (static nesting depth of procedures, local items, variable usage, etc.), but we have, for the sake of simplicity, elected not to do so. Instead, we will make all types and structured constants records, make all scalar constants integers, and limit the statements in procedures and module bodies to procedure calls and a set of simple assignment, loop, and conditional statements operating on integers. While this simplistic view of the internal structure of modules may generate programs inadequate for any tasks other than the one at hand,¹ for our present purposes it works well.

items_conc   A tuple ⟨type, variable, const_struct, const_scalar,
             proc_notinline, proc_inline⟩ describing the number of
             concrete items of each kind each module should export.
items_semi   Similar to items_conc, but for semi-abstract items.
items_abs    Similar to items_conc, but for abstract items. items_abs
             also has an additional entry, module-body, giving the
             number of module bodies per module.
ref          A tuple, similar to the item tuples, describing the number
             of references an item may make to similar imported items.
work         A tuple ⟨proc_notinline, proc_inline, module-body⟩
             describing the amount of work (in number of statements)
             performed by these items.
import       The total number of modules a module may import.
diff         The possible differences in levels between an importer
             and an exporter.
modules      A list ⟨l_{L-1}, l_{L-2}, ..., l_0⟩ of the number of
             modules on each of L levels. Level L-1 contains only the
             top-level program module (hence l_{L-1} = 1); level 0
             contains only leaf modules, i.e., modules which do not
             import anything.

Table: Semantics of the entries in a T-tuple.

a               Constant distribution.
a..b            Rectangular distribution of integer values between a and b.
N(m, σ)         Integer normal distribution with mean m and standard
                deviation σ.
a..N(m, σ)..b   Truncated normal distribution.

Table: Different forms of values in a T-tuple.
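Drawing an actual value from the four forms might look as follows; the form tags and the redraw-until-in-range treatment of the truncated form are assumptions made for this sketch:

```python
import random

def draw(form, rng):
    kind = form[0]
    if kind == "const":              # form a
        return form[1]
    if kind == "rect":               # form a..b
        return rng.randint(form[1], form[2])
    if kind == "norm":               # form N(m, s), rounded to an integer
        return round(rng.gauss(form[1], form[2]))
    if kind == "trunc":              # form a..N(m, s)..b
        a, m, s, b = form[1:]
        while True:                  # redraw until the value lies in [a, b]
            v = round(rng.gauss(m, s))
            if a <= v <= b:
                return v

rng = random.Random(1)
print(draw(("const", 3), rng))  # 3
print(all(2 <= draw(("trunc", 2, 4, 2, 6), rng) <= 6 for _ in range(100)))
```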

An Example

A simple example is given by the tuple T_0 below.²

    T_0 = ⟨...⟩

Any program P generated from T_0 will contain between six and eight modules: one main program module, between three and five leaf modules, and two modules on the intermediate level.³ Each module except the main program module will export abstract items (an abstract type and an abstract inline procedure), and may import modules on the immediately lower level only. The number of imported types referenced by the realization of an exported abstract type is selected using a truncated normal distribution, and the number of procedure calls in the inline procedures is given by the same distribution. In addition to these calls, extra work is added to each inline procedure and module body. The two figures below illustrate a program generated from T_0.
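A generator for the level-organized import graphs of the model can be sketched as below. The level sizes mimic the T_0 example (one top-level module, two intermediate modules, three to five leaves), and the module names and import counts are invented:

```python
import random

def generate_imports(rng):
    levels = [rng.randint(3, 5), 2, 1]          # leaf, intermediate, top
    names = {lvl: [f"M{lvl}.{i}" for i in range(n)]
             for lvl, n in enumerate(levels)}
    imports = {}
    for lvl in range(1, len(levels)):           # level 0 imports nothing
        for m in names[lvl]:
            k = rng.randint(1, len(names[lvl - 1]))
            imports[m] = rng.sample(names[lvl - 1], k)  # lower level only
    return names, imports

names, imports = generate_imports(random.Random(7))
# Every import goes to the immediately lower level, so no cycles arise.
print(all(i.startswith(f"M{int(m[1]) - 1}.")
          for m, deps in imports.items() for i in deps))  # True
```

Restricting imports to strictly lower levels guarantees an acyclic import graph, which is exactly the property the model uses to exclude illegal circular declarations.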

Experimental Approach

In preparation for the next section, in which the results of the sPaster and dPaster test runs will be analyzed, this section will present statistics regarding the input programs and describe our experimental setup.

Experimental Configuration

The tests were carried out on a network of Sun workstations. The sites were connected by an Ethernet network, and the workstations ran the Unix operating system and the NFS distributed file system. During the tests the workstations were otherwise unloaded, and the Slaves' load-balancing function (see the discussion below) was selected so as to balance Phase 3 processing cost. Each program was processed several times, and the time spent in each phase, as perceived by the Master and each of the Slaves, was recorded. All times given are averages of all the runs, in wall-time seconds.

¹ Such as evaluating module compilers or structure-oriented editors, for example.

² ⟨X⟩, where X is any one of the four value forms described above, is an abbreviation of a tuple consisting of all X's.

³ Each module is named M(x.y), where x is the level of the module and y is the module's sequence number within its level.

Figure: A program generated from T_0 (the module import graph).

One problem we had to deal with during the tests was client file caching (discussed earlier). A program which is run several times on the same input data will often run faster the second and following times than the first time. The reason is that, the second time, the file cache already contains the program's load image and any files read by it during the previous run, and therefore these files will not need to be read from secondary storage. While this is, in general, the preferred behavior of a distributed file system, it renders unreasonable the results of test runs of I/O-intensive programs, such as the sPaster, which presumably will never be run more than once on the same input data. We therefore emptied the file cache on all machines prior to each test.

dPaster Process Allocation

Figure: Three of the modules of the program whose import graph is given in the previous figure.

During the tests, the dPaster processes (Master, Servant, and Slaves) were placed on the available sites so as to minimize the total execution time. The six possible allocation strategies, Strategy-A to Strategy-F, differ in whether the Master and the Servant are given sites of their own or share sites with Slaves: Strategy-A allocates the Master and the Servant each on their own site and the Slaves on the remaining sites, while Strategy-B places the Master and the Servant on sites shared with one Slave each. For a small number of sites, Strategy-B turned out to be the best one, while Strategy-A is to be preferred for six sites or more. The reason is that the amount of communication between the Slaves and the Master, and between the Slaves and the Servant, increases with the number of slaves, and at some point it becomes advantageous to reserve two sites for exclusive use by the Master and Servant, rather than using those sites for optimization and code generation.

It is likely (although we are not able to present any hard evidence for this) that the break-off point between Strategy-A and Strategy-B, the number of slaves at which Strategy-A becomes better than Strategy-B, is also dependent on the program being bound. The preferred allocation strategies above were arrived at through tests using the input program sPaster-B (see the next section), and we believe that slightly different results would have been obtained using some other input. However, even if that were to be the case, the dPaster could not itself determine the optimal allocation policy for a given program, since that would require it to know details of the program even before having looked at it. It is therefore reasonable to use a typical program, as we have done, to determine a best average process allocation strategy.

Characteristics of the Test Programs

In order to be able to relate the running time of the zts to different characteristics of the input program, we have instrumented the compiler and binder with operations to determine some statistical properties of the program being translated. Our intention is also to use these functions to provide insight into the average and extreme properties of real programs. This information can then be used to determine reasonable values for the parameters of the model presented in the previous section, or to design a more accurate model. Table gives the list of properties that zts can currently calculate, along with their abbreviations.

    M        Number of modules
    LOC      Total size, in thousands of lines of code
    S        Total number of statements, in thousands
    P        Number of procedures
    T_a      Number of abstract types
    C_a      Number of abstract constants
    I_a      Number of abstract inline procedures
    P_d      Number of referenced deferred procedures (procedures for
             which code is generated during Phase 3)
    S_d      Number of deferred context conditions
    I_2      Number of inline expansions performed during Phase 2
    I_3      Same as I_2, but for Phase 3
    I_c      Percentage of dynamic calls expanded, for typical input
    E        Number of constant expressions, excepting call-graph attributes
    H_e      The height of the CET, excepting call-graph attributes
    avg H_e  Average height of all CET nodes, excepting call-graph attributes
    H_c      The height of the call-graph, excepting recursive nodes
    avg H_c  Average height of all call-graph nodes
    H_i      The height of the inline call-graph
    avg H_i  Average height of all inline call-graph nodes

Table: Semantics of the measurements given in Table.

At this point in time there exist only a few real programs written in Zuse. The only large program is the sPaster itself, which has been translated, more or less mechanically, into Zuse from the language in which it was originally written, Modula-2. Since the sPaster was not written in Zuse from scratch, we cannot argue that it is a good approximation of what a real Zuse program might look like. Nevertheless, five versions of the sPaster were constructed, differing in their amount of encapsulation and inlining, and used as input to the sPaster and dPaster. A summary of the characteristics of each version is given in Table.

In addition to the five versions of the sPaster, we constructed four artificial programs (P129_4, P129_8, P129_16, and R257_8) from the model in the previous section. The programs in the P family (P129_4, P129_8, and P129_16) all consist of 129 modules organized in 4, 8, and 16 levels, respectively. Their dominant characteristics are a rectangular import graph, a large amount of encapsulation and inlining, large modules, and a large amount of work per procedure. R257_8 differs from these programs by being larger (more modules, more lines of code), but having fewer and smaller inline procedures and a much shallower inline call-graph. Furthermore, while the modules in the P family of programs are all very similar, R257_8's modules can vary significantly in size and structure. Table lists the T tuples of each program. More detailed statistics can be found in Table.

Empirical Results and Analysis

We conclude this chapter by reproducing the running times of the sPaster and the dPaster[4] for five versions of one real program (the sPaster itself) and four artificial programs generated by the model presented in Section. We also analyze this empirical data and compare it to the analytical results from earlier chapters.

Total Binding Times and Speedup

Figure shows the execution times (wall time, in number of seconds) for the dPaster and its sequential counterpart when binding the five versions of the sPaster and the four artificially generated programs. All graphs in this section use the following legend, unless otherwise specified:

sPaster-A  sPaster-B  sPaster-C  sPaster-D  sPaster-E  m2l  P129_16  P129_4  P129_8  R257_8

[4] The hPaster and iPaster have, regrettably, not been implemented at this point.

Program     M   LOC   S   P   T_a   C_a   I_a
sPasterA
sPasterB
sPasterC
sPasterD
sPasterE
P129_4
P129_8
P129_16
R257_8

Program     P_d   S_d   I_2   I_3   I_c
sPasterA
sPasterB
sPasterC
sPasterD
sPasterE
P129_4                              NA
P129_8                              NA
P129_16                             NA
R257_8                              NA

Program     E   H_e   avg H_e   H_c   avg H_c   H_i   avg H_i
sPasterA
sPasterB
sPasterC
sPasterD
sPasterE
P129_4
P129_8
P129_16
R257_8

Table: Source code statistics regarding five versions of the sPaster and four artificially generated programs. See Table for an explanation of the different measurements. The slight difference in average call-graph height avg H_c between the different sPaster versions is due to intra-modular inlining: local (non-exported) inline procedures do not take part in the binding process, and do not show up in these statistics.

P129_4
P129_8
P129_16
R257_8

Table: T tuples for the artificially generated programs. An ampersand indicates that a tuple entry is the same as the one in the previous tuple.

For comparison, we also give the execution time of the SUN Modula-2 linker m2l (a prelinker which calls the UNIX system's linker ld as part of the linking process) when linking the sPaster. Figure shows the speedup of the dPaster relative to the sPaster, and relative to itself running on one site. It is evident from Figure that the speedup of the dPaster varies with the amount of encapsulation and inlining: sPasterC, a completely concrete program, results in no speedup at all, while sPasterB and P yield speedups of and, respectively.

It is interesting to note that, for all five versions of the sPaster, the dPaster, given enough sites, will always outrun m2l. This is of course an unfair comparison, since the two programs differ in many ways: m2l is a prelinker, while the dPaster performs every part of the linking process itself; m2l, like most system link-editors, has to resolve external symbols through name lookup, while zts binds all names already at compile time. The dPaster performs inter-modular

optimizations, while m2l only does relocation. Furthermore, the SUN Modula-2 translation system produces better local code than zts, although inter-modular inlining can in some cases make up for zts's poor code quality: sPasterB is only slightly slower than the sPaster produced by the SUN Modula-2 system with full optimization.

Figure: On the left is shown the running time of the sPaster (point S) and the dPaster running on sites, when binding five versions of the sPaster itself. The timing for the SUN Modula-2 linker m2l linking the sPaster is also given. To the right is shown the total running time of the sPaster and the dPaster when binding the four artificial programs. Only sites were available for the test runs of the generated programs, which explains the missing data points.

Also interesting is that the dPaster is never more than marginally faster than the sPaster for programs with no encapsulation and inlining, such as sPasterC. This would seem to indicate the infeasibility of distributed link-editing. However, the current implementation of the dPaster is tuned to perform well for programs with large amounts of encapsulation and inlining, and its poor performance for other kinds of programs is by no means a proof that some level

of speedup cannot be achieved by some other techniques.

Figure: Speedup of the dPaster relative to the sPaster (left), and self-relative speedup of the dPaster (right).

Figure affords a further important observation, namely that sPasterD (which has no encapsulation, but full inter-modular inlining) and sPasterB (which has full encapsulation and inter-modular inlining) result in very similar binding times. The conclusion must be that a system which performs inter-modular inlining might as well support full flexible encapsulation, since the extra translation-time cost will be close to negligible.

Figure shows how the total sPaster time is distributed over the four phases for some of the test programs. Figure shows the same results, but for the dPaster running on sites. One thing we notice from these figures is that the dPaster spends a proportionally smaller part of its time on Phases 3 and 4 than the sPaster, but more of its time on Phases 1 and 2. This is because the dPaster achieves higher levels of speedup for Phases 3 and 4 than for Phases 1 and 2. We will examine this further in the next four sections, which we devote to discussions of each phase in turn.

Figure also gives the time spent by the sPaster in each of the four phases, but this time we also show the percentage of the time that is actually spent performing useful computation. This data is interesting because it indicates whether the sPaster's four-phase design is a reasonable one, or if a different design, which would interleave the actions of some of the phases in order to take advantage of unused CPU power, would be likely to perform better. One could

Figure: Percentage of the total sPaster time spent in each of the four phases.

Figure: Percentage of the total dPaster time spent in each of the four phases, when running on sites.


for example imagine a design in which Phase runs in parallel with Phase, evaluating expressions and expanding inline calls in inline procedures as these become available. Or one might let Phase 4 execute concurrently with Phases 2 and 3, much as is done in the dPaster. We can deduce from Figure that for some programs such designs may indeed perform better than the current one, but only marginally so. For example, the sPaster's Phase spends most of its time in an idle state when processing P[5], and it may prove beneficial to spend this idle time doing Phase or Phase computations. Similarly, very large programs such as R257_8 may have enough idle time during Phase 3 (due to the writing of the generated code) that it would pay off to perform relocations (Phase 4) in parallel. However, we believe that the greatest potential gain will come from concurrent execution within Phase 1: the source of almost all the idle time in Phase 1 is locating and opening new object code files, and it could be an improvement if this was done in parallel with the processing of the data read from the current file.

One must, however, keep in mind that there may be instances when the overhead of having several concurrent threads of control may very well cancel out much of the advantage of reducing the sPaster's idle time. Running Phases and in parallel may work well for programs like P, in which Phase contains much idle time, but is likely to perform less well for programs like sPasterB[6], in which Phase is completely CPU-bound.

Analysis Phase 1

Figure shows that during Phase 1 the sPaster spends most of its time searching for and opening object files. This is particularly true when the sPaster is given long search paths, or when the object files reside deep in the directory hierarchy (this was the case for the sPaster versions). For programs such as sPasterC, which have none or very little encapsulation and inlining (and hence few deferred procedures), the copying of code and data from the object files to the resulting executable file (the Process Data entry in Figure) also becomes time-consuming. Similarly, loading the Expression Section is expensive for programs, such as P, that contain a significant amount of encapsulation.

The Phase 1 timings and the dPaster's Phase 1 speedup are given in Figures and.

[5] The reason is that P contains a large number of abstract structured constants which, once they have been constructed, have to be written to the executable file. The idle time is simply spent waiting for these write operations to complete.

[6] sPasterB is the program for which the current implementation was tuned, which explains why it has the least idle time of all the test programs.


Figure: Percentage of the total sPaster time spent in each of the four phases. Each bar is divided into four parts, one per phase, with Phase 1 at the bottom and Phase 4 at the top. Each phase is divided into three categories: User Time (the CPU time used by the phase), System Time (the CPU time spent in operating system routines), and Idle Time (the time spent in a suspended state, waiting for system routines to finish executing).

Figure: Phase 1 processing times.

Figure: Speedup of Phase 1.

Figure: Breakdown of the sPaster's Phase 1 activity. Consult Algorithm on page for a description of the various operations. The Open Files entry includes the time for locating the files and loading the preambles. The Process Data entry is the combination of all the Algorithm operations concerned with the Data, Binary Code, and Relocation Sections.


Analysis Phase 2

Already in Chapter we noted that Phase 2 is the dPaster phase that is the most challenging to parallelize. This is also evident from Figures and, which show that in most cases the distributed Phase 2 is slower than its sequential counterpart. The reason is, of course, that in addition to evaluating expressions and expanding inline procedures, the dPaster's Phase 2 also performs the distribution of the resulting values and intermediate code. Particularly expensive is the replication of inline procedures and of the values of abstract structured constants and object type templates, since these tend to be large.

Figure: Phase 2 processing times.

Figures and break up the Phase 2 timings into evaluation and inlining. These figures show that the main cost of the dPaster's Phase 2 comes from the distribution of inline procedures, not from the evaluation of expressions. It is even the case that the dPaster, running on sites or more, achieves a small speedup when evaluating P's expressions.

Figure: Speedup of Phase 2.

Figure: Phase 2 expression evaluation times.

Figure: Phase 2 expression evaluation speedup.

In Section we analyzed the expected behavior of synchronized versions of the expression evaluation and termination detection algorithms. We

found that for both algorithms the number of rounds grows with the number of slaves, and with the height of the supergraph formed when internal paths in the expression graph are collapsed. This analysis is partially supported by Figures and. These graphs show that programs with high expression graphs, such as P, require more termination detection rounds and expression sends than programs of lower height, such as the sPaster versions. However, while the number of expression sends grows with the number of slaves, as predicted, this does not seem to be true for termination detection rounds. Rather, the number of such rounds decreases ever so slightly with the number of slaves. This is because the predictions in Section relate to a synchronized version of the expression evaluation algorithm, while the one actually implemented is asynchronous. In other words, a slave does not have to wait for the other slaves to finish a round before it starts its next one. Consequently, when there are many Slaves involved in the expression evaluation, each Slave is likely to receive a steady stream of incoming values to process, and will be less likely to have to enter a passive state.
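The asynchronous scheme can be illustrated with a small single-process simulation. Everything here is an illustrative assumption rather than the dPaster's actual protocol: the node names, the message format, the restriction to addition nodes, and the round-robin scheduling of slaves are all invented. The point is only that a slave drains its whole inbox instead of waiting at a round barrier, and counts as passive only when the inbox is empty.

```python
from collections import defaultdict, deque

def async_evaluate(exprs, owner, n_slaves):
    """Asynchronously evaluate a distributed expression DAG.

    exprs maps a node to ('leaf', value) or ('add', [operand nodes]);
    operands of a node are assumed distinct, and the DAG well-formed.
    owner maps each node to the slave that evaluates it.
    Returns (values, passive_count): the computed values and the number of
    times some slave found its inbox empty before global termination.
    """
    dependents = defaultdict(list)            # node -> consumers of its value
    for n, (kind, arg) in exprs.items():
        if kind == 'add':
            for a in arg:
                dependents[a].append(n)

    inbox = [deque() for _ in range(n_slaves)]
    value, received = {}, defaultdict(dict)

    def emit(n, v):                           # send n's value to its consumers
        value[n] = v
        for c in dependents[n]:
            inbox[owner[c]].append((c, n, v))

    for n, (kind, arg) in exprs.items():      # leaves are evaluated at once
        if kind == 'leaf':
            emit(n, arg)

    passive = 0
    while len(value) < len(exprs):
        for s in range(n_slaves):
            if not inbox[s]:
                passive += 1                  # nothing to do: passive state
                continue
            while inbox[s]:                   # asynchronous: drain everything
                node, opnd, v = inbox[s].popleft()
                received[node][opnd] = v
                kind, args = exprs[node]
                if len(received[node]) == len(args):
                    emit(node, sum(received[node][a] for a in args))
    return value, passive
```

A synchronized variant would instead make every slave wait at a barrier after each batch of sends, which is exactly what makes the round count grow with the number of slaves.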

Analysis Phase 3

The previous section shows that the reason that the distributed Phase 2 yields a speedup below one is that it has to do a fair amount of replication of global data.

Figure: Phase 2 inline expansion times. This includes the time for updating the intermediate code of inline procedures with the newly computed CET values.

But thanks to this replication, Phase 3 is freed of all inter-Slave communication, and makes up for the excessive cost of Phase 2 by yielding, for most programs, a substantial speedup (see Figures and).

Figure shows the relative cost of each of the sPaster's Phase 3 operations. Obviously, code generation takes up most of the sPaster's time during Phase 3. This is particularly true for programs with a significant amount of Phase 3 inlining, since inline expansion creates large procedures for which code generation (the Generate Code entry in Figure) is expensive. The data for the Read Deferred entry should be taken with a grain of salt: since zts uses memory-mapping techniques for the reading of the object code files (see Section), the actual reading of the deferred code will take place when the code is needed.
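The memory-mapping point can be sketched in a few lines (the file name and layout are invented; zts's actual object file format is described elsewhere in the thesis). Mapping the file is cheap: the operating system faults a page in from disk only when that page is first touched, which is why the cost of "reading" deferred code surfaces later, when the code is actually used.

```python
import mmap

def map_object_file(path):
    """Map an object code file into memory.  No code bytes are transferred
    from disk at this point; a page is read only when it is first accessed.
    On Unix the mapping stays valid after the file object is closed."""
    with open(path, 'rb') as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

Slicing the returned view (for example, reading just the preamble with `view[:16]`) touches only the pages that back those bytes; the deferred-code sections remain unread until Phase 3 walks into them.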

There is still one major problem with Phase 3, and that is load balancing. This is evident from Figure (left), which shows that although Phase 3 will perform quite well on the average, it will on occasion perform very poorly. The problem is that the simple load balancing scheme which runs during Phase (see

Figure: Phase 2 inline expansion speedup.

Figure: Average number of termination detection rounds initiated by the Master during Phase 2.

Figure: Average number of expression sends per Slave during Phase 2.

Sections and) cannot take into account the amount of inlining which will be performed during Phase 3, since this is not known until after Phase 2. As a result, the load balancing algorithm will sometimes unknowingly assign several modules which are high in inlining to the same Slave, and that Slave will end its Phase 3 processing much later than the other Slaves. This is clearly shown in Figure (right). We are currently considering alternative load balancing algorithms, which would allow modules to migrate from busy to unoccupied slaves.
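The failure mode can be made concrete with a sketch of a greedy static assignment (the module names, the cost figures, and the greedy rule itself are illustrative assumptions, not the binder's actual algorithm): the assignment is balanced with respect to the estimates available before Phase 3, but the true costs, which include inline expansion, can land on one slave.

```python
import heapq

def assign_modules(est_cost, n_slaves):
    """Greedy static load balancing: give each module, most expensive
    first, to the currently least loaded slave.  The estimates are fixed
    before binding proceeds, so they cannot include inlining costs."""
    loads = [(0, s) for s in range(n_slaves)]   # (estimated load, slave)
    heapq.heapify(loads)
    assignment = {}
    for mod, cost in sorted(est_cost.items(), key=lambda kv: -kv[1]):
        load, s = heapq.heappop(loads)
        assignment[mod] = s
        heapq.heappush(loads, (load + cost, s))
    return assignment

def makespan(assignment, true_cost, n_slaves):
    """Finishing time of the busiest slave under the *actual* costs."""
    per_slave = [0] * n_slaves
    for mod, s in assignment.items():
        per_slave[s] += true_cost[mod]
    return max(per_slave)
```

With estimates {A: 4, B: 3, C: 2, D: 1} on two slaves, the greedy rule pairs A with D and B with C; if B and C then turn out to be high in inlining, one slave finishes far later than the other, which is exactly the situation that module migration would repair.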

Analysis Phase 4

While the sPaster performs all its relocation during Phase 4, the dPaster performs its relocation concurrently with the Slaves' Phase 2 and Phase 3 processing. Therefore, all the dPaster's Master has to do during Phase 4 is wait for the Servant to finish the relocation. The length of this wait depends on the total amount of relocation information, on when that information is produced, and on the amount of processing resources available to the Servant.

Figure: Phase 3 processing times.

Figure: Speedup of Phase 3.

Figure: Breakdown of the sPaster's Phase 3 activity. Consult Algorithm on page for a description of the various operations. The Evaluate Deferred entry corresponds to the operation in Algorithm that fills in CET values. The Generate Code entry includes the time for recomputing basic blocks and next-use information. The Write Code entry includes the time for generating binary code from an internal symbolic machine code format.

Figure: The graph to the left shows the average, shortest (lowest point of vertical line), and longest (highest point of vertical line) Phase 3 processing times over all the test runs. The graph to the right shows the longest and shortest processing time of any individual Slave.

For programs such as P, with many deferred procedures, and where each procedure makes many calls, the Slaves may produce more relocation information during Phase 3 than the Servant can consume. The net result is that the Servant has to continue the relocation after the Slaves are finished with Phase 3.

Similarly, programs such as sPasterC, which produce all their relocation information during Phase 1 and need very little Phase 3 processing, leave the Servant with little time during Phases 2 and 3 to perform relocation.

If the Servant is allowed to execute on its own site (as in StrategyA and StrategyF), it will have more resources available, and will be much more likely to finish quickly than if it would share sites with the Master or one of the Slaves.

Figure: Phase 4 processing times.

Figure: Speedup of Phase 4.


The last point is illustrated clearly in Figures and, which show little Phase 4 speedup for StrategyB and StrategyD, but substantial speedup for StrategyA (six sites or more).

Summary

The results of the measurements presented in this chapter indicate that, for programs with medium to large amounts of encapsulation and inlining, the dPaster achieves substantial speedup factors, given enough processors. Even with just a few processors available, the dPaster is almost always faster than the sPaster. Furthermore, when enough processors are available, the dPaster will, for many programs, run faster than the SUN Modula-2 linker m2l. This is in spite of the fact that the SUN Modula-2 translating system does not support zts-style flexible encapsulation and inter-modular optimization. While this may be an unfair comparison, it does indicate that one can expect zts to achieve turnaround times similar to traditional translating systems. This supports the main claim of this thesis, namely that distributed high-level binding is an attractive solution to the problems inherent in modularization and separate compilation.

In addition to the measurements of the sPaster and the dPaster, this chapter makes one important contribution: a model of modular programs. Although this model is still in an embryonic stage, without it we would have been unable to say anything about the behavior of the zts for many classes of programs. To be a truly useful tool for other types of applications, however, the model needs to be extended in many ways. First of all, there needs to be much additional research into the structure of real programs. Not until such statistics are available will we be able to select reasonable values for the model parameters. Translating systems, such as zts, that are able to collect source statistics are a step in this direction. It will also be necessary to provide better control over the internal characteristics of modules and procedures. A test data generating system such as Maurer's, based on context-free grammars, seems ideal for this purpose. Finally, we believe that the tuple value forms of Table are overly simplified. A more reasonable approach would make each form a function of the module's height in the import graph. This would account for our intuition that lower-level modules often implement abstract data types (and therefore export many abstract items, particularly abstract types and abstract inline procedures), whereas modules at a higher level are more likely to be abstract state machines (and therefore mostly export non-inline abstract procedures).
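The proposed extension can be sketched as a height-dependent form. The linear shape and the endpoint values below are pure assumptions, chosen only to encode the stated intuition; a real model would fit such a function to measured statistics.

```python
def abstract_export_fraction(height, max_height, low=0.7, high=0.2):
    """Assumed fraction of a module's exports that are abstract types or
    abstract inline procedures, falling linearly with the module's height
    in the import graph: low-level ADT modules export many, high-level
    abstract state machines export few.  The endpoints are invented."""
    t = height / max_height if max_height else 0.0
    return low + (high - low) * t
```

Each tuple value form in the model would then be drawn from a distribution parameterized this way, instead of being the same for every module.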

Chapter

Conclusions and Future Research

Q: How many graduate students does it take to change a light bulb?
A: Only one, but it takes years.
-- Reported by Jan Yarnot

In this thesis we have pursued two different avenues of research: language design, for better control over abstraction and encapsulation, and translation system design, for better flow of inter-modular information in the presence of separate compilation. It should come as no surprise, however, that the pursuit of these two interests has gone hand in hand: the design of Zuse has provided motivation for the design of the zts, whose capabilities in turn have inspired further additions to Zuse.

Although it was the possibility of true orthogonal encapsulation that first sparked our interest in pursuing the research presented here, we believe that the spin-off results are of equal interest. In fact, one might not unreasonably view the thesis from three different perspectives: that it is all about language design, all about code optimization, or all about efficient translation. We will consider each perspective in turn.

Language Design

Like others before us, we have strived to design a language of least surprise, in our case with respect to encapsulation. For example, a beginning Modula-2 programmer is surprised to learn that types may be hidden but not constants, and he is even more surprised when he is told that the realization of hidden types is restricted to pointers. A beginning Ada programmer is in for similar surprises: the realization of procedures may be hidden, but the realization of types and constants may only be protected. The encapsulation system presented in this thesis is simple and regular, and should be met without surprise by beginning programmers: all aspects of all exported items may be hidden or revealed; all exported items may be protected or unprotected.
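The regularity can be modelled in a few lines. This is a model of the idea in Python, not Zuse syntax: the representation dictionary, the `revealed` set, and the operation names are all illustrative assumptions. It encodes the scheme described earlier in the thesis, where a concrete export reveals all representational details, a semi-abstract export some, and an abstract export none, while protection is an orthogonal restriction on how clients may manipulate an item.

```python
class ExportedItem:
    """An exported item whose representation is revealed per export mode,
    and which may orthogonally carry a protection clause."""

    def __init__(self, name, representation, mode,
                 revealed=(), protected=()):
        assert mode in ('abstract', 'semiabstract', 'concrete')
        self.name, self._rep, self.mode = name, representation, mode
        self._revealed, self._protected = set(revealed), set(protected)

    def view(self):
        """The representation details a client module may see."""
        if self.mode == 'concrete':
            return dict(self._rep)              # everything revealed
        if self.mode == 'semiabstract':
            return {k: v for k, v in self._rep.items()
                    if k in self._revealed}     # only the published part
        return {}                               # abstract: nothing

    def allows(self, operation):
        """Protection is orthogonal to hiding: any item, however much of
        it is revealed, may restrict particular client operations."""
        return operation not in self._protected
```

Every combination of mode and protection is expressible, which is the "no surprises" property: there is no special case like Modula-2's pointer-only opaque types.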

How do the Zuse encapsulation concepts compare to those found in other languages? We have already mentioned that encapsulation in Modula-2 and Ada is confusing and arbitrary, and influenced by implementation concerns more than anything else. Encapsulation in Modula-3 is much more interesting and regular, although basically restricted to object types. Object type encapsulation in Zuse is very similar to the Modula-3 design; in fact, Zuse's object-oriented features are more or less a subset of those found in Modula-3. The main difference is instead the way object type encapsulation is implemented in zts and SRC Modula-3. We will discuss this more thoroughly in the next section.

There is one important difference between Zuse and Modula-3 in regard to object type encapsulation, namely the ability of Modula-3 to present different views of a particular object type to different kinds of clients. This is a result of the Modula-3 many-to-many module system, and of the Modula-3 type system, which uses structural type equivalence. We are currently investigating the possibility of extending Zuse with a multiple view capability, possibly through a module system more general than one-to-one, but more restricted than Modula-3's many-to-many, which we find overly flexible and potentially confusing.

The language most closely related to Zuse is MilanoPascal (Section), which like Zuse supports hidden types and scalar constants. There are many differences, however. First of all, MilanoPascal, unlike all modern modular languages, uses import-by-structure rather than import-by-name. From a purely theoretical point of view, import-by-structure appears superior to import-by-name, since it grants modules complete autonomy: no module needs to know about the existence of any other module. Unfortunately, import-by-structure is completely unusable in practice, since it requires each client to give a complete description of all imported items it needs to use. A further difference between Zuse and MilanoPascal is of course Zuse's flexible encapsulation, which extends the rudimentary facilities found in MilanoPascal with type protection and various forms of semi-abstract export. Furthermore, MilanoPascal does not support abstract inline procedures, a concept which ought to be an integral part of any language concerned with efficient execution in the presence of strict encapsulation.
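The practical objection can be made concrete with a sketch of the two import styles (the descriptor tuples, module names, and item names are invented for illustration; neither language's actual syntax is shown). Under import-by-name the client states a module and an item; under import-by-structure it must spell out the full structural description of every item and let the binder search for a unique match.

```python
def bind_by_name(needed, modules):
    """Import-by-name: the client names the exporting module and item.
    needed: {local name: (module, item)}."""
    return {local: modules[mod][item]
            for local, (mod, item) in needed.items()}

def bind_by_structure(needed, modules):
    """Import-by-structure: the client gives a complete structural
    description of every imported item; no module is ever named.
    needed: {local name: structural descriptor}."""
    binding = {}
    for local, desc in needed.items():
        matches = [(m, item) for m, exports in modules.items()
                   for item, d in exports.items() if d == desc]
        if len(matches) != 1:       # unmatched or ambiguous
            raise LookupError('%s: %d structural matches'
                              % (local, len(matches)))
        binding[local] = matches[0]
    return binding
```

The autonomy is real (the client mentions no module), but so is the burden: the `needed` descriptor duplicates the exporter's entire declaration, and any two structurally identical exports make the import ambiguous.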


One question that is fair to ask at this point is whether the full flexibility afforded by Zuse in regard to encapsulation is really necessary. Could it not be that the wide range of choices available to a programmer might be more a source of confusion than a source of elation? It should be clear that Zuse, as it is presented in this thesis, is an experimental design, intended to extend flexible encapsulation to its limits, not a language engineered towards production programming. We would not be surprised if future research were to show that a simpler bag of encapsulation primitives will be sufficient for most programming situations. It must be made clear, however, that from an implementation perspective, full flexible encapsulation as presented here comes at little extra cost, compared to a language with scaled-down encapsulation consisting, for example, of abstract types and constants.

Code Optimization

It might seem that the pursuit of new and more effective code optimization techniques and the invention of new programming language constructs would be two diametrically opposed activities. This is, however, not the case. On the contrary, if new language constructs are to take root and blossom, it is imperative that their conception be followed by the development of efficient and effective translation techniques.[1] Many programmers are reluctant to use concepts which they feel (rightly or not) would seriously hamper the efficient execution of their programs. Therefore, any attempt at introducing new encapsulation concepts had better be accompanied by ample evidence that these concepts are indeed efficiently implementable.

We believe that this thesis has shown that full and flexible encapsulation is useful, desirable, and can be implemented efficiently. Furthermore, we have shown that interprocedural optimization and data encapsulation are simply different sides of the same coin, and that techniques which support the implementation of one may be used to implement the other. In fact, the evaluation results of Chapter show that a translating system which performs binding-time inter-modular procedure integration might as well support full data encapsulation: the additional binding-time cost is negligible. Procedure integration is in itself a particularly important companion optimization to encapsulation,

[1] Unfortunately, the availability of efficient implementation techniques is no guarantee that new and otherwise interesting language constructs will be immediately embraced by the programming language design community. This is evident from the sad history of CLU's excellent iterator construct, whose introduction was immediately followed by the description of an efficient implementation [Atkinsson]. Still, no other systems programming language has to date included a similar concept.

since it nullifies many of the inefficiencies introduced by procedural interfaces to abstractions.
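As an illustration, procedure integration on a toy three-address-style intermediate code might look like the sketch below. The instruction format, the procedure table, and the omission of temporary renaming are simplifying assumptions; the real zts works on its own intermediate code.

```python
def integrate(code, procs):
    """Expand every ('call', proc, actuals, dest) instruction by splicing
    in the callee's body, substituting actuals for formals.

    procs maps a procedure name to (formals, body, result temporary),
    where body is a list of (op, arg1, arg2, dest) tuples.  A real
    inliner would also rename the callee's temporaries to avoid clashes.
    """
    out = []
    for instr in code:
        if instr[0] == 'call' and instr[1] in procs:
            formals, body, result = procs[instr[1]]
            env = dict(zip(formals, instr[2]))   # formal -> actual
            for op, a, b, d in body:
                # substitute actuals for formal-parameter operands
                out.append((op, env.get(a, a), env.get(b, b), d))
            out.append(('move', result, None, instr[3]))
        else:
            out.append(instr)
    return out
```

After expansion the procedural interface is gone: the call, parameter passing, and return have been replaced by straight-line code that later optimization passes can work on directly.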

As a matter of fact, the zts design guarantees that, in spite of separate compilation, the code produced for Zuse programs will be as efficient as that produced by compilers for monolithic languages. Furthermore, statically allocated abstract types, inter-modular inline expansion, smart linking, and translation-time construction of object type templates guarantee that zts will produce better code than traditional translating systems for modular and object-oriented languages with strong encapsulation concepts.

It is interesting to compare zts to the SRC Modula-3 translating system, since, as we noted in the previous section, Zuse and Modula-3 have essentially the same encapsulation features in regard to object-oriented programming. As we described in detail in Section, the runtime system in the SRC Modula-3 implementation is responsible for performing many of the tasks performed at binding time in zts. The result is, of course, that other things being equal, a Zuse program translated by zts will always perform better than the equivalent Modula-3 program translated by the SRC system.

Translation Efficiency

The introduction of a new programming language construct requires, as we have said, proof that programs which use the construct will execute efficiently. Equally important, however, is to show that there exist translating-system techniques such that the construct may be efficiently processed. Programmers may very well spurn an otherwise desirable programming language feature if its use results in unacceptably long turnaround times. We have devoted the second half of this thesis to the presentation of techniques for the efficient processing of modular imperative and object-oriented languages with flexible encapsulation concepts. Our results show that distributed translation techniques produce efficient high-level module binders which support the concepts introduced in Zuse.

What is interesting, however, is that regardless of the translating techniques used, languages with strict encapsulation have an advantage with respect to translation efficiency over realization-protection languages such as Ada: since they require less information to be given in specification units, they run a much smaller risk of suffering from trickle-down recompilations. Thus, even a Zuse translating system which sports an unsophisticated sequential module binder, and hence may suffer long module binding times, may very well do better on the whole than a traditional translating system for Ada, since it


is likely to suffer fewer complete recompilations.
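The difference can be illustrated with a small model (the module names and the dependency graph below are invented for illustration): a change confined to an implementation unit dirties only that unit, whereas a change that surfaces in a specification unit trickles down through every transitive client.

```python
# Hypothetical module graph: each client lists the modules whose
# specification units it imports.
imports = {"editor": ["buffer"], "search": ["buffer"], "ui": ["editor"]}

def recompilation_set(changed, spec_changed):
    """Units to retranslate after a change to module `changed`.

    spec_changed is True when the change is visible in the module's
    specification unit -- e.g. a representation detail that a
    realization-protection language such as Ada forces into the spec.
    With strict encapsulation the same change stays in the
    implementation unit, and spec_changed is False."""
    dirty = {changed}
    if spec_changed:
        frontier = [changed]
        while frontier:                        # trickle-down recompilation
            m = frontier.pop()
            for client, deps in imports.items():
                if m in deps and client not in dirty:
                    dirty.add(client)
                    frontier.append(client)
    return dirty

# Changing the representation of "buffer":
print(sorted(recompilation_set("buffer", spec_changed=False)))
# -> ['buffer']                                  (strict encapsulation)
print(sorted(recompilation_set("buffer", spec_changed=True)))
# -> ['buffer', 'editor', 'search', 'ui']        (realization protection)
```

The model exaggerates — a real Ada compiler recompiles only units that depend on the changed specification — but the shape of the effect is the same: the smaller the specification unit, the smaller the dirty set.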

Furthermore, a Zuse translating system with a sophisticated module binder such as the dPaster will, as we have seen, combine a reduced risk of trickle-down recompilations with module binding times which are sometimes shorter than what can be expected from traditional link editors. Zuse and zts are, in essence, able to combine the best of several worlds: flexible encapsulation, excellent code quality through inter-modular optimization and static allocation, and fast turnaround times through distributed binding and fewer complete recompilations.

Sequential vs. Distributed vs. Incremental Translation

One of the implicit long-term goals of the research presented here is to compare the benefits of sequential, distributed, and incremental translation techniques. We are especially interested in the relative performance of these techniques when combined with interprocedural optimization and when subjected to minor and major source code changes.

The strength of incremental translation tools lies in their superior performance in the presence of small and local changes to the source code. Incremental translation techniques, in fact, rely on the assumption that local source code changes will have local effects on the object code. This is often true for tools such as syntactic and semantic analyzers, but may not hold true for aggressive optimizers and code generators. For example, a code generator which does interprocedural register allocation may, in the worst case, need to reprocess the entire program as a result of a small change to the body of a single procedure. The same is true of the expansion of inline procedures: a local change to the body of an inline procedure will affect at least all procedures which call it. While distributed tools are more immune to these kinds of worst-case scenarios, they will most likely be outperformed by their incremental counterparts in the cases when local changes only have local effect.
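This asymmetry between local edits and their global object-code effects can be made concrete with a sketch (the call graph and the choice of inline procedures are invented for illustration): editing a non-inline procedure dirties only its own object code, while editing an inline procedure dirties every caller whose body absorbed it, transitively through other inline procedures.

```python
# Hypothetical call graph: caller -> callees.  Procedures in
# `inline` have their bodies expanded at each call site.
calls = {"main": ["draw", "log"], "draw": ["clip", "dot"],
         "clip": ["dot"], "dot": [], "log": []}
inline = {"clip", "dot"}

def affected(proc, dirty=None):
    """Procedures whose object code changes when proc's body is edited."""
    if dirty is None:
        dirty = set()
    if proc in dirty:
        return dirty
    dirty.add(proc)
    if proc in inline:                 # callers have absorbed proc's body
        for caller, callees in calls.items():
            if proc in callees:
                affected(caller, dirty)
    return dirty

print(sorted(affected("log")))   # -> ['log']: a local change stays local
print(sorted(affected("dot")))   # -> ['clip', 'dot', 'draw']
```

An incremental tool wins when the dirty set is a singleton; a distributed binder wins when, as in the second call, the change fans out across module boundaries.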

If we apply these reflections to the translating system designs presented in the thesis, we might say that the dPaster will have good worst-case behavior but poor best-case behavior, while the iPaster will have good best-case behavior and poor worst-case behavior. In other words, when only local changes have been made to a program (the best case), the iPaster will perform well but the dPaster will perform poorly. When changes with global effects have been performed (the worst case), however, the opposite will be true. It remains an open question whether it is possible to combine incremental and distributed techniques to efficiently handle situations with local as well as global changes.


Contributions

The most important contribution of the first half of this thesis is, of course, the flexible encapsulation primitives of Zuse. We find it interesting that it has proven possible to blend type protection with abstract and semi-abstract export [2] to yield a coherent and regular type system. A further interesting, and to us somewhat surprising, discovery has been the wide range of static semantic conditions which cannot be tested at compile time.

The most novel, and in our mind the potentially most profitable, contribution of the second half of the thesis is the distributed module binder. We believe that distributed high-level binding provides a framework not only for the support of data encapsulation in modular programming languages, but also for the application of inter-modular optimizations which have hitherto been deemed prohibitively expensive. We will pursue this avenue of research, examining how the dPaster can be extended to support various interprocedural optimization techniques (e.g., interprocedural constant propagation and register allocation) as well as the efficient implementation of new language constructs (inline iterators/generators and code sharing among instances of generic modules). We also intend to investigate different techniques for speeding up the current dPaster implementation, particularly new load balancing algorithms and techniques for less rigorous separation into phases.
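Of the optimizations mentioned, interprocedural constant propagation is easy to sketch in miniature (the procedure names and call sites below are invented): a binder that sees every call site in the program can detect parameters that receive the same constant everywhere and specialize the callee accordingly.

```python
# Hypothetical whole-program view: every call site, as a
# (procedure, argument) pair for a single-parameter procedure.
# None marks an argument that is unknown at translation time.
call_sites = [("scale", 2), ("scale", 2), ("pad", 0), ("pad", None)]

def constant_parameters(sites):
    """Map each procedure to the constant its parameter receives at
    every call site, when such a constant exists.  This is the
    single-parameter core of interprocedural constant propagation:
    a meet over all call sites, with None meaning 'not a constant'."""
    value = {}
    for proc, arg in sites:
        if proc not in value:
            value[proc] = arg
        elif value[proc] != arg:
            value[proc] = None          # conflicting or unknown values
    return {p: v for p, v in value.items() if v is not None}

print(constant_parameters(call_sites))  # -> {'scale': 2}
```

A high-level module binder is a natural place for this analysis precisely because, unlike a per-module compiler, it sees all call sites at once.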

Until now, parallel and distributed translation has been met with comparatively little interest, and the number of fully implemented systems is very small. We believe this to be the result of triply misdirected efforts: most practical as well as theoretical work in the field has been geared towards tightly coupled architectures (which are not yet generally available), the compilation of monolithic languages (which are obsolete), and the early phases of translation, such as lexical and syntactic analysis (which have very efficient sequential solutions). Furthermore, with the advent of separately compiled modular languages, the amount of parallel work available within a compilation unit is reduced to the extent that the application of coarse-grained parallelism is not warranted [3]. We believe that distributed high-level module binding opens up interesting new areas of application of distributed translation techniques, since it is geared towards loosely coupled distributed systems (which are generally available), modular languages (which are in widespread use), and the late phases of translation (which are expensive for sequential architectures).

[2] However, much of the syntax leaves something to be desired; type protection clauses, for example, have been described as rather baroque.

[3] The work of Wortman and Junkin, however, shows that successful parallel compilation of modular languages on shared-memory architectures is possible.

Bibliography

I've got to think up bigger things.
I'll bet I can, you know.
I'll speed my Thinker-Upper up
As fast as it will go!

Then BLUNK! Her Thinker-Upper thunked
A double klunker-klunk.
My sister's eyes flew open
And she saw she'd thunked a Glunk!

He was greenish. Not too cleanish.
And he sort of had bad breath.
"Good gracious!" gasped my sister.
"I have thunked up quite a meth!"

Dr. Seuss

Accredited Standards Committee X3, Information Processing Systems, American National Standards Institute (ANSI), Washington, DC. Draft Proposed American National Standard for Information Systems — Programming Language C, X3J11 edition, September.

Ada 9X mapping: mapping rationale. Available by ftp from ajpo.sei.cmu.edu, August.

Ada 9X project report: Ada 9X requirements. Available by ftp from ajpo.sei.cmu.edu, December.

Ada 9X project report: Ada support for software reuse. Available by ftp from ajpo.sei.cmu.edu, October.

Rolf Adams, Annette Weinert, and Walter Tichy. Software change analysis, or: Half of all Ada compilations are redundant. In 2nd European Software Engineering Conference, Coventry, UK, September.

Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley.

F. E. Allen, J. L. Carter, J. Fabri, J. Ferrante, W. H. Harrison, P. G. Loewner, and L. H. Trevillyan. The experimental compiling system. IBM Journal of Research and Development, November.

Bowen Alpern, Roger Hoover, Barry K. Rosen, Peter F. Sweeney, and F. Kenneth Zadeck. Incremental evaluation of computational circuits. In First Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, January.

Allen A. Ambler and Charles G. Hoch. A study of protection in programming languages. Technical Report, University of Texas at Austin.

Allen L. Ambler, Donald I. Good, James C. Brown, Wilhelm F. Burger, Richard M. Cohen, Charles G. Hoch, and Robert E. Wells. GYPSY: A language for specification and implementation of verifiable programs. SIGPLAN Notices, March. ACM Conference on Language Design for Reliable Software.

M. Ancona, L. De Floriani, D. Dodero, and P. Thea. Program development by using a source linker. In Jerusalem Conference on Information Technology (JCIT): Next Decade in Information Technology, Jerusalem, Israel, May.

James E. Archer and Michael T. Devlin. Rational's experience using Ada for very large systems. In Proceedings of the First International Conference on Ada Programming Language Applications for the NASA Space Station, Houston, Texas, USA, June.

Russell R. Atkinson, Barbara H. Liskov, and Robert W. Scheifler. Aspects of implementing CLU. In Proceedings, ACM National Conference, December.

Erik H. Baalbergen. Parallel and distributed compilations in loosely-coupled systems: A case study. In Proc. Workshop on Large Grain Parallelism.

Erik H. Baalbergen. Design and implementation of parallel make. Computing Systems.

Erik H. Baalbergen, Kees Verstoep, and Andrew S. Tanenbaum. On the design of the Amoeba configuration manager. ACM SIGSOFT Software Engineering Notes, November. Proc. 2nd ACM International Workshop on Software Configuration Management.

Henri E. Bal, R. van Renesse, and Andrew S. Tanenbaum. Implementing distributed algorithms using remote procedure calls. In Proc. National Computer Conference, AFIPS.

J. Eugene Ball. Predicting the effects of optimization on a procedure body. In Proceedings of the Symposium on Compiler Construction.

F. L. Bauer and H. Wössner. The Plankalkül of Konrad Zuse: A forerunner of today's programming languages. CACM, July.

Leland L. Beck. System Software: An Introduction to Systems Programming. Addison-Wesley, second edition.

Manuel E. Benitez and Jack W. Davidson. A portable global optimizer and linker. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, Georgia, USA, June.

Marc Benveniste. Operational semantics of a distributed object-oriented language and its Z formal specification. Publication Interne, IRISA/INRIA-Rennes, Rennes Cedex, France, April.

R. E. Berry and B. A. E. Meekings. A style analysis of C programs. CACM, January.

Dimitri P. Bertsekas and John N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Prentice-Hall.

Andrew D. Birrell and Bruce Jay Nelson. Implementing remote procedure calls. ACM Transactions on Computer Systems, February.

Mary P. Bivens and Mary Lou Soffa. Incremental register reallocation. Software—Practice and Experience, October.

Dines Bjørner and Cliff B. Jones. Formal Specification and Software Development. Prentice-Hall.

Dines Bjørner and O. N. Oest. Towards a Formal Description of Ada. LNCS, Springer-Verlag.

Günter Blaschek, Gustav Pomberger, and Alois Stritzinger. A comparison of object-oriented programming languages. Structured Programming.

Hans-Juergen Boehm and Willy Zwaenepoel. Parallel attribute grammar evaluation. In R. Popescu-Zeletin, editor, International Conference on Distributed Computing Systems, Berlin, West Germany, September.

Grady Booch. Software Engineering with Ada. The Benjamin/Cummings Publishing Company.

Grady Booch. Object Oriented Design with Applications. The Benjamin/Cummings Publishing Company.

Ellen Ariel Borison. Program Changes and the Cost of Selective Recompilation. PhD thesis, Carnegie Mellon University.

Gabriel Bracha and Sam Toueg. Distributed deadlock detection. Distributed Computing.

Paul Branquart, Georges Louis, and Pierre Wodon. An Analytical Description of CHILL, the CCITT High Level Language. LNCS, Springer-Verlag.

Gary Bray. Sharing code among instances of Ada generics. SIGPLAN Notices, June. ACM SIGPLAN Symposium on Compiler Construction.

British Standards Institution. Third Working Draft Modula-2 Standard, October.

David Callahan, Keith D. Cooper, Ken Kennedy, and Linda Torczon. Interprocedural constant propagation. In Proceedings of the SIGPLAN Symposium on Compiler Construction. ACM, June.

Frank W. Calliss. A comparison of module constructs in programming languages. SIGPLAN Notices, January.

Luca Cardelli, James Donahue, Lucille Glassman, Mick Jordan, Bill Kalsow, and Greg Nelson. Modula-3 language definition. SIGPLAN Notices, August.

Luca Cardelli, James Donahue, Mick Jordan, Bill Kalsow, and Greg Nelson. Modula-3 report (revised). Technical Report, DEC SRC, November.

Luca Cardelli, James Donahue, Mick Jordan, Bill Kalsow, and Greg Nelson. The Modula-3 type system. In Sixteenth Annual ACM Symposium on Principles of Programming Languages, Austin, Texas, January.

Luca Cardelli and Peter Wegner. On understanding types, data abstraction, and polymorphism. Computing Surveys, December.

Alan Carle, Keith D. Cooper, Robert T. Hood, Ken Kennedy, Linda Torczon, and Scott K. Warren. A practical environment for scientific programming. IEEE Software, November.

A. Celentano, P. Della Vigna, C. Ghezzi, and D. Mandrioli. Modularization of block-structured languages: The case of Pascal. In Workshop on Reliable Software, Bonn, Germany, September.

A. Celentano, P. Della Vigna, C. Ghezzi, and D. Mandrioli. Separate compilation and partial specification in Pascal. IEEE Transactions on Software Engineering, July.

Craig Chambers. The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages. PhD thesis, Stanford University, March. Available via anonymous ftp from self.stanford.edu.

F. Chow, M. Himelstein, E. Killian, and L. Weber. Engineering a RISC compiler system. In IEEE COMPCON.

Fred C. Chow. Minimizing register usage penalty at procedure calls. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, Georgia, USA, June.

Norman H. Cohen. Type-extension type tests can be performed in constant time. ACM Transactions on Programming Languages and Systems, October.

Robert F. Cohen and Roberto Tamassia. Dynamic expression trees. Technical Report CS, Department of Computer Science, Brown University, December.

Christian S. Collberg and Magnus G. Krampell. Design and implementation of modular languages supporting information hiding. In Proceedings of the Sixth International Phoenix Conference on Computers and Communications, Scottsdale, AZ, USA, February.

Christian S. Collberg and Magnus G. Krampell. A property-based method for selecting among multiple implementations of modules. In Lecture Notes in Computer Science, AFCET, Springer-Verlag, February.

Reidar Conradi and Dag Heiraas Wanvik. Mechanisms and tools for separate compilation. Technical Report, Division of Computer Systems and Telematics, University of Trondheim, Norway. Presented at IFIP WG, Maine, October.

Keith D. Cooper, Mary W. Hall, and Linda Torczon. An experiment with inline substitution. Technical Report TR, Rice University, Houston, Texas, USA, September.

Keith D. Cooper, Mary W. Hall, and Linda Torczon. An experiment with inline substitution. Software—Practice and Experience, June.

Keith D. Cooper, Ken Kennedy, and Linda Torczon. The impact of interprocedural analysis and optimization in the R^n programming environment. ACM Transactions on Programming Languages and Systems, October.

Keith D. Cooper, Ken Kennedy, and Linda Torczon. Interprocedural optimization: Eliminating unnecessary recompilations. In Proceedings of the SIGPLAN Symposium on Compiler Construction, June.

James R. Cordy and Richard C. Holt. Code generation using an orthogonal model. Software—Practice and Experience, March.

Douglas Campbell Coupland. Generation X. Abacus.

Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark K. Wegman, and F. Kenneth Zadeck. An efficient method of computing static single assignment form. In Annual ACM Symposium on Principles of Programming Languages.

Susan A. Dart, Robert J. Ellison, Peter H. Feiler, and A. Nico Habermann. Software development environments. IEEE Software, November.

Jack W. Davidson and Christopher W. Fraser. Code selection through object code optimization. ACM Transactions on Programming Languages and Systems, October.

Mark Day. Personal communication.

Paul F. Dietz. Maintaining order in a linked list. In Proceedings of the Annual ACM Symposium on Theory of Computing, San Francisco, California, May.

Edsger W. Dijkstra. Termination detection for diffusing computations. Information Processing Letters.

Antoni Diller. Z: An Introduction to Formal Methods. John Wiley & Sons, Inc.

Anders Edenbrandt. RMC: a remote procedure call facility. Unpublished manuscript, Department of Computer Science, Lund University.

R. S. Engelmore and A. J. Morgan. Blackboard Systems. Addison-Wesley.

M. C. Er. A parallel computation approach to topological sorting. The Computer Journal.

Peter H. Feiler, Susan A. Dart, and Grace Downey. Evaluation of the Rational environment. Technical Report CMU/SEI-TR, Software Engineering Institute, July.

Michael B. Feldman. Data Structures with Modula-2. Prentice-Hall.

Stuart I. Feldman. Make — a program for maintaining computer programs. Software—Practice and Experience, April.

Jeanne Ferrante, Karl J. Ottenstein, and Joe D. Warren. The program dependence graph and its use in optimization. ACM Transactions on Programming Languages and Systems, July.

Ray Ford and Mary Pfreundschuh Wagner. Incremental concurrent builds for modular systems. Journal of Systems and Software.

David G. Foster. Separate compilation in a Modula-2 compiler. Software—Practice and Experience, February.

Nissim Francez. Distributed termination. ACM Transactions on Programming Languages and Systems, January.

Christopher W. Fraser. A language for writing code generators. In SIGPLAN Conference on Programming Language Design and Implementation.

Mahadevan Ganapathi and Charles N. Fischer. Attributed linear intermediate representations for retargetable code generators. Software—Practice and Experience.

David Gelernter. Generative communication in Linda. ACM Transactions on Programming Languages and Systems, January.

Charles M. Geschke, James H. Morris Jr., and Edwin H. Satterthwaite. Early experience with Mesa. CACM, August.

Carlo Ghezzi. Personal communication.

Carlo Ghezzi and Mehdi Jazayeri. Programming Language Concepts. John Wiley & Sons, Inc., second edition.

Alan Gibbons and Wojciech Rytter. An optimal parallel algorithm for dynamic expression evaluation and its applications. In Foundations of Software Technology and Theoretical Computer Science, Sixth Conference, New Delhi, India, December. Springer-Verlag LNCS.

Thomas Gross, Angelika Zobel, and Markus Zolg. Parallel compilation for a parallel machine. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, Portland, Oregon, June.

Rajiv Gupta, Lori Pollock, and Mary Lou Soffa. Parallelizing data flow analysis. In Proceedings of the Workshop on Parallel Compilation, Kingston, Ontario, Canada, May.

Jürg Gutknecht. Separate compilation in Modula-2: An approach to efficient symbol files. IEEE Software, November.

Mary Wolcott Hall. Managing Interprocedural Optimization. PhD thesis, Rice University, Houston, Texas, USA, April.

Samuel P. Harbison. Modula-3. Prentice Hall.

Robert Harper. Introduction to Standard ML. Technical Report ECS-LFCS, Laboratory for Foundations of Computer Science, Department of Computer Science, University of Edinburgh, November.

Robert Harper, David MacQueen, and Robin Milner. Standard ML. Technical Report ECS-LFCS, Laboratory for Foundations of Computer Science, Department of Computer Science, University of Edinburgh, March.

Görel Hedin. Incremental Semantic Analysis. PhD thesis, Department of Computer Science, Lund University, Lund, Sweden, March.

Jean-Michel Hélary, Claude Jard, Noël Plouzeau, and Michel Raynal. Detection of stable properties in distributed applications. In Proceedings of the Annual ACM Symposium on Principles of Distributed Computing, Vancouver, British Columbia, Canada, August.

Mark Himelstein, Fred C. Chow, and Kevin Enderby. Cross-module optimizations: Its implementation and benefits. In Proceedings of the Summer USENIX Conference, June.

Daniel Hoffman. Practical interface specification. Software—Practice and Experience, February.

Daniel Hoffman and Richard Snodgrass. Trace specifications: Methodology and models. IEEE Transactions on Software Engineering, September.

Anne M. Holler. A Study of the Effects of Subprogram Inlining. PhD thesis, University of Virginia, Charlottesville, Virginia, USA, March. Computer Science Report No. TR.

Richard C. Holt. Data descriptors: A compile-time model of data and addressing. ACM Transactions on Programming Languages and Systems.

Richard C. Holt and James R. Cordy. The Turing language report. Technical Report CSRI, Computer Systems Research Institute, University of Toronto, August.

Richard C. Holt and James R. Cordy. The Turing Plus report. Technical Report CSRI, Computer Systems Research Institute, University of Toronto, August.

Richard C. Holt and Philip A. Matthews. The formal semantics of Turing programs. Technical Report CSRI, Computer Systems Research Institute, University of Toronto, May.

Richard C. Holt, Philip A. Matthews, J. Alan Rosselet, and James R. Cordy. The Turing Programming Language: Design and Definition. Prentice Hall.

Richard C. Holt and David B. Wortman. A model for implementing EUCLID modules and prototypes. ACM Transactions on Programming Languages and Systems.

Michael Lee Horowitz. Automatically Achieving Elasticity in the Implementation of Programming Languages. PhD thesis, Carnegie Mellon University.

Shing-Tsaan Huang. Detecting termination of distributed computations by external agents. In International Conference on Distributed Computing Systems, Newport Beach, California.

Jean D. Ichbiah. On the design of Ada. In Information Processing. Elsevier Science Publishers B.V. (North-Holland).

D. C. Ince. An Introduction to Discrete Mathematics and Formal System Specification. Oxford University Press.

M. A. Iqbal, J. H. Saltz, and S. H. Bokhari. A comparative analysis of static and dynamic load balancing strategies. In Proceedings of the International Conference on Parallel Processing.

Anita K. Jones and Barbara H. Liskov. An access control facility for programming languages. Computation Structures Group Memo, MIT, April.

Anita K. Jones and Barbara H. Liskov. A language extension for controlling access to shared data. IEEE Transactions on Software Engineering, December.

Anita K. Jones and Barbara H. Liskov. A language extension for expressing constraints on data access. CACM, May.

Cliff B. Jones. Systematic Software Development Using VDM. Prentice Hall, second edition.

Cliff B. Jones and Roger C. F. Shaw. Case Studies in Systematic Software Development. Prentice Hall.

Kevin D. Jones. LM3: A Larch interface language for Modula-3 — definition and introduction. Technical Report, Digital Systems Research Center, June.

Simon L. Peyton Jones. Parallel Graph Reduction. Prentice-Hall.

Mick Jordan. An extensible programming environment for Modula-3. SIGSOFT, December.

Martin Jourdan. A survey of parallel attribute evaluation methods. In Attribute Grammars, Applications and Systems, June. LNCS.

Michael D. Junkin and David B. Wortman. The implementation of a concurrent compiler. Technical Report CSRI, Computer Systems Research Institute, University of Toronto, December.

Gail E. Kaiser, Simon M. Kaplan, and Josephine Micallef. Multiuser, distributed language-based environments. IEEE Software, November.

Simon M. Kaplan and Gail E. Kaiser. Incremental attribute evaluation in distributed language-based environments. In Proceedings of the Annual ACM Symposium on Principles of Distributed Computing, New York, USA.

Howard P. Katseff. Using data partitioning to implement a parallel assembler. SIGPLAN Notices, September. ACM/SIGPLAN Parallel Programming: Experience with Applications, Languages and Systems.

David Keppel, Susan J. Eggers, and Robert R. Henry. A case for runtime code generation. Technical Report TR, University of Washington, Seattle, WA, USA, November. Available via anonymous ftp from cs.washington.edu.

S. Khanna and A. Ghafoor. A data partitioning technique for parallel compilation. In Proceedings of the Workshop on Parallel Compilation, Kingston, Ontario, Canada, May.

R. B. Kieburtz, W. Barabash, and C. R. Hill. A type-checking program linkage system for Pascal. In 3rd International Conference on Software Engineering, Atlanta, GA, USA, May.

Eduard F. Klein. Attribute evaluation in parallel. In Proceedings of the Workshop on Parallel Compilation, Kingston, Ontario, Canada, May.

Eduard F. Klein and Kai Koskimies. Parallel one-pass compilation. In International Workshop on Attribute Grammars and their Applications (WAGA), Paris, France, September. LNCS.

Jørgen Lindskov Knudsen, Ole Lehrmann Madsen, Claus Nørgaard, Lars Bak Petersen, and Elmer Sandvad. An overview of the Mjølner BETA system. Technical report, Mjølner Informatics ApS, Science Park Aarhus, Århus, Denmark, March.

Donald E. Knuth. An empirical study of FORTRAN programs. Software—Practice and Experience.

Peter Kornerup, Bent Bruun Kristensen, and Ole Lehrmann Madsen. Interpretation and code generation based on intermediate languages. Software—Practice and Experience.

Magnus Krampell. Information Hiding: Design and implementation of modular programming languages. Licentiate Thesis, Department of Computer Science, Lund University.

Antoni Kreczmar, Andrzej Salwicki, and Marek Warpechowski. LOGLAN — Report on the Programming Language. LNCS, Springer-Verlag.

Bent Bruun Kristensen, Ole Lehrmann Madsen, Birger Møller-Pedersen, and Kristen Nygaard. Object oriented programming in the BETA programming language. Technical Report, Datalogisk Afdeling, Aarhus Universitet, January.

Matthijs F. Kuiper. Parallel Attribute Evaluation. PhD thesis, University of Utrecht, Utrecht, The Netherlands, November.

Matthijs F. Kuiper and S. Doaitse Swierstra. Parallel attribute evaluation: Structure of evaluators and detection of parallelism. In International Workshop on Attribute Grammars and their Applications (WAGA), Paris, France, September. LNCS.

B. W. Lampson, J. J. Horning, R. L. London, J. G. Mitchell, and G. J. Popek. Report on the programming language Euclid. SIGPLAN Notices, February.

Butler W. Lampson. Hints for computer system design. IEEE Software, January.

David B. Leblang and Robert P. Chase, Jr. Parallel software configuration management in a network environment. IEEE Software, November.

Eliezer Levy and Abraham Silberschatz. Distributed file systems: Concepts and examples. Computing Surveys, December.

Mark A Linton and Russel W Quong A macroscopic prole of program compi

lation and linking IEEE Transactions on Software Engineering

April Cited on pages and

Barbara H Liskov Russell R Atkinson Toby Blo om Eliot Moss J Craig Schaf

fert Rob ert Scheier and Alan Snyder CLU Reference Manual LNCS

Springer Verlag ISBN X Cited on pages and

Barbara H Liskov and John Guttag Abstraction and Specication in Program

Development The MIT Press ISBN Cited on page

Barbara H Liskov A Snyder and Russell R Atkinson Abstraction mechanisms

in CLU CACM August Cited on pages and

M D Maclaren Inline routines in VAXLN Pascal SIGPLAN Notices

June ACM Sigplan Symp osium on Compiler Construction Cited

on page

Friedermann Mattern Exp erience with a new distributed termination detection

algorithm In nd International Workshop on Distributed Algorithms pages

Amsterdam The Netherlands July LNCS Cited on page

Peter M Maurer Generating test data with enhanced contextfree grammars

IEEE Software pages July Cited on page

Scott McFarling Pro cedure merging with instruction caches In ACM SIGPLAN

Conference on Programming Language Design and Implementation pages

Toronto Canada June Cited on page

Brendan D McKay On the shap e of a random acyclic digraph Mathematical

Proceedings of the Cambridge Philosophical Society Cited

on pages and

Kurt Mehlhorn Data Structures and Algorithms Graph Algorithms and NP

Completeness Springer Verlag ISBN X Cited on page

Wen-mei W Hwu and Pohua P Chang. Inline function expansion for compiling C programs. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages, Portland, Oregon, June. Cited on page.

Bertrand Meyer. Genericity versus inheritance. In OOPSLA, pages, September. Cited on page.

Bertrand Meyer. Object Oriented Software Construction. Addison-Wesley. ISBN. Cited on pages and.

Bertrand Meyer. From structured programming to object-oriented design: The road to Eiffel. Structured Programming. Cited on page.

Bertrand Meyer. Introduction to the Theory of Programming Languages. Addison-Wesley. ISBN. Cited on pages and.


Bertrand Meyer. Eiffel: The Language. Prentice-Hall. ISBN. Cited on page.

Mjølner Informatics ApS, Science Park Aarhus, Gustav Wiedersvej, DK Århus C, Denmark. The Mjølner BETA Compiler, Reference Manual, mia edition, September. Cited on page.

Mjølner Informatics ApS, Science Park Aarhus, Gustav Wiedersvej, DK Århus C, Denmark. The Mjølner BETA Fragment System, Reference Manual, mia edition, October. Cited on page.

Hanspeter Mössenböck and Josef Templ. Object Oberon: a modest object-oriented language. Structured Programming. Cited on page.

Hanspeter Mössenböck and Niklaus Wirth. Differences between Oberon and Oberon-2. Technical report, Institut für Computersysteme, ETH Zürich. Cited on page.

Hanspeter Mössenböck and Niklaus Wirth. The programming language Oberon-2. Technical report, Institut für Computersysteme, ETH Zürich. Cited on page.

Greg Nelson. Systems Programming with Modula-3. Prentice Hall. ISBN. Cited on pages and.

David Notkin. The Gandalf project. The Journal of Systems and Software. Cited on pages and.

Stephen M Omohundro. The Sather language. Technical report, International Computer Science Institute, Center Street, Suite, Berkeley, California. Cited on page.

Jukka Paakki, Anssi Karhinen, and Tomi Silander. Orthogonal type extensions and reductions. SIGPLAN Notices, July. Cited on pages and.

M Paganini, A Sacco, L Tartaglino, and A Tazzari. A linker for a modularized Pascal.¹ Technical report, Dipartimento di Elettronica, Politecnico di Milano. In Italian. Cited on pages and.

David L Parnas. Information distribution aspects of design methodology. In Information Processing, pages. North Holland Publishing Company. Cited on page.

David L Parnas. A technique for software module specification with examples. CACM, May. Cited on page.

Robert M Pirsig. Zen and the Art of Motorcycle Maintenance. The Bodley Head Ltd. ISBN. Cited on page.

¹ This work has not been available for dissemination.


Lori L Pollock and Mary Lou Soffa. Incremental global optimization for faster recompilation. In IEEE International Conference on Computer Languages, pages, New Orleans, Louisiana. Cited on pages and.

Lori L Pollock and Mary Lou Soffa. Incremental global reoptimization of programs. ACM Transactions on Programming Languages and Systems, April. Cited on page.

Leon Presser and John R White. Linkers and loaders. Computing Surveys, September. Cited on page.

Russel W Quong. The Design and Implementation of an Incremental Linker. PhD thesis, Stanford University, May. Cited on page.

Russel W Quong. Linking programs incrementally. ACM Transactions on Programming Languages and Systems, January. Cited on pages and.

Eric S Raymond, editor. The New Hacker's Dictionary. The MIT Press. ISBN. Cited on page.

Stephen A Rees and James P Black. An experimental investigation of distributed matrix multiplication techniques. Software: Practice and Experience, October. Cited on page.

Thomas Reps and Tim Teitelbaum. Language processing in program editors. IEEE Software, November. Cited on page.

Stephen Richardson and Mahadevan Ganapathi. Code optimization across procedures. Computer, pages, February. Cited on pages and.

Stephen Richardson and Mahadevan Ganapathi. Interprocedural analysis vs procedure integration. Information Processing Letters, pages, August. Cited on page.

Stephen Richardson and Mahadevan Ganapathi. Interprocedural optimization: Experimental results. Software: Practice and Experience, February. Cited on page.

M W Rogers, editor. Ada: Language, Compilers and Bibliography. Cambridge University Press. ISBN. Cited on pages and.

Jonathan Rosenberg. Generating Compact Code for Generic Subprograms. PhD thesis, Carnegie-Mellon University. Cited on page.

Donald L Ross. Classifying Ada packages. SIGAda Letters, July. Cited on pages and.

Paul Rovner. Extending Modula-2 to build large integrated systems. IEEE Software, pages, November. Cited on page.

A H J Sale. Optimization across module boundaries. The Australian Computer Journal, August. Cited on pages and.


A H J Sale. Personal communication. Cited on page.

Donald Sannella. Formal specification of ML programs. Technical Report ECS-LFCS, Laboratory for Foundations of Computer Science, Department of Computer Science, University of Edinburgh, November. Cited on page.

Vats Santhanan and Daryl Odnert. Register allocation across procedure and module boundaries. In SIGPLAN Conference on Programming Language Design and Implementation, pages, White Plains, New York, June. Cited on pages and.

M Satyanarayanan. A survey of distributed file systems. Annual Reviews of Computer Science. Cited on page.

Robert W Scheifler. The analysis of inline substitution for a structured programming language. CACM, September. Cited on pages and.

David A Schmidt. Denotational Semantics: A Methodology for Language Development. Wm C Brown Publishers. ISBN. Cited on page.

Robert W Schwanke and Gail E Kaiser. Smarter recompilation. ACM Transactions on Programming Languages and Systems, October. Cited on pages and.

Mayer D Schwartz, Norman M Delisle, and Vimal S Begwani. Incremental compilation in Magpie. SIGPLAN Notices, June. ACM Sigplan Symposium on Compiler Construction. Cited on page.

Robert W Sebesta. Concepts of Programming Languages. The Benjamin Cummings Publishing Company Inc. ISBN. Cited on page.

Dr Seuss. I Can Lick Tigers Today And Other Stories. Collins. ISBN. Cited on page.

Mary Shaw, editor. ALPHARD: Form and Content. Springer Verlag. ISBN. Cited on pages and.

Mary Shaw, William A Wulf, and Ralph L London. Abstraction and verification in Alphard: Defining and specifying iteration and generators. CACM, pages. Cited on page.

Mukesh Singhal. Deadlock detection in distributed systems. COMPUTER, pages. Cited on page.

C H Smedema, P Medema, and M Boasson. The Programming Languages Pascal, Modula, CHILL, Ada. Prentice-Hall. ISBN. Cited on page.


Alan Snyder. Inheritance and the development of encapsulated software components. In Bruce Shriver and Peter Wegner, editors, Research Directions in Object-Oriented Programming, pages. MIT Press. ISBN. Cited on page.

J M Spivey. The fuzz Manual. Cited on page.

J M Spivey. Understanding Z. Cambridge Tracts in Theoretical Computer Science. Cambridge University Press. ISBN. Cited on pages and.

J M Spivey. The Z Notation: A Reference Manual. Addison-Wesley. ISBN X. Cited on page.

Richard M Stallman. Using and Porting GNU CC. Free Software Foundation Inc, edition, January. Cited on page.

Bjarne Stroustrup. Personal communication, reported in Booch. Cited on page.

Richard E Sweet. The Mesa programming environment. In SIGPLAN Symposium on Language Issues in Programming Environments, pages, Seattle, Washington, June. ACM. Cited on pages and.

R B Tan and J van Leeuwen. General symmetric distributed termination detection. Technical Report RUU-CS, Vakgroep Informatica, Rijksuniversiteit Utrecht, January. Cited on page.

Lars-Erik Thorelli. A linker allowing hierarchic composition of programs. In Information Processing, IFIP. Elsevier Science Publishers BV. Cited on page.

Lars-Erik Thorelli. A language for linking modules into systems. BIT. Cited on page.

Lars-Erik Thorelli. Modules and type checking in PLLL. In OOPSLA, pages, October. Cited on page.

Walter F Tichy. Smart recompilation. ACM Transactions on Programming Languages and Systems, July. Cited on pages and.

Mads Tofte. Four lectures on Standard ML. Technical Report ECS-LFCS, Laboratory for Foundations of Computer Science, March. Cited on page.

S Vankatesan. Reliable protocols for distributed termination detection. IEEE Transactions on Reliability, April. Cited on page.

Jeffrey Scott Vitter and Philippe Flajolet. Average-Case Analysis of Algorithms and Data Structures, chapter. Elsevier Science Publishers BV. ISBN. Cited on page.

Kim Walden. Automatic generation of Make dependencies. Software: Practice and Experience, June. Cited on page.


David W Wall. Global register allocation at link time. In Proceedings of the SIGPLAN Symposium on Compiler Construction, pages, New York, July. Cited on pages and.

David W Wall. Register windows vs register allocation. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages, Atlanta, Georgia, USA, June. Cited on page.

David W Wall. Global register allocation at link time. Research Report, Digital Western Research Laboratory, Hamilton Avenue, Palo Alto, California, USA, September. Cited on pages and.

David W Wall. Experience with a software-defined machine architecture. ACM Transactions on Programming Languages and Systems, July. Cited on pages and.

David W Wall and Michael L Powell. The Mahler experience: Using an intermediate language as the machine description. Research Report, Digital Western Research Laboratory, Hamilton Avenue, Palo Alto, California, USA, September. Cited on pages and.

Mark N Wegman and Kenneth Zadeck. Constant propagation with conditional branches. ACM Transactions on Programming Languages and Systems, April. Cited on page.

Peter Wegner. Learning the language. BYTE, pages, March. Cited on page.

Alan L Wendt. Fast code generation using automatically-generated decision trees. In SIGPLAN Conference on Programming Language Design and Implementation, pages, White Plains, New York, June. Cited on page.

Thomas R Wilcox and Howard J Larsen. The interactive and incremental compilation of Ada using Diana. Technical report, Rational, Mountain View, California, USA. Cited on pages and.

Niklaus Wirth. Programming in Modula-2. Springer Verlag, second edition. Cited on page.

Niklaus Wirth. From Modula to Oberon. Software: Practice and Experience, July. Cited on page.

Niklaus Wirth. The programming language Oberon. Software: Practice and Experience, July. Cited on pages and.

Niklaus Wirth. Type extensions. ACM Transactions on Programming Languages and Systems, April. Cited on pages and.

Niklaus Wirth. The programming language Oberon. Technical Report, ETH, September. Cited on page.

Niklaus Wirth and Jürg Gutknecht. The Oberon system. Software: Practice and Experience, September. Cited on pages and.


Alexander L Wolf, Lori A Clarke, and Jack C Wileden. A model of visibility control. IEEE Transactions on Software Engineering, April. Cited on pages and.

Jim Woodcock and Martin Loomes. Software Engineering Mathematics. Pitman. ISBN. Cited on page.

David B Wortman. A concurrent Modula compiler. In Proceedings of the Workshop on Parallel Compilation, Kingston, Ontario, Canada, May. Cited on pages and.

David B Wortman and James R Cordy. Early experiences with Euclid. In International Conference on Software Engineering, pages, San Diego, California, USA, March. Cited on page.

David B Wortman and Michael D Junkin. A concurrent compiler for Modula. In SIGPLAN Conference on Programming Language Design and Implementation, pages. Cited on pages and.

W A Wulf. Abstract data types: A retrospective and prospective view. In Proceedings of the Symposium on the Mathematical Foundations of Computer Science, pages, Rydzyna, Poland, September. Springer Verlag. Cited on page.

Xerox Corporation, Hillside Avenue, Palo Alto, CA. Mesa Language Manual, XDE edition, November. Cited on pages and.

Daniel M Yellin. Speeding up dynamic transitive closure for bounded degree graphs. Technical Report RC, IBM T J Watson Research Center, October. Cited on page.

Daniel M Yellin and Robert E Strom. INC: A language for incremental computations. ACM Transactions on Programming Languages and Systems, April. Cited on page.

Ignacio Trejos Zelaya. An experimental language definition. Master's thesis, Oxford University, Keble Road, Oxford OX QD, United Kingdom, September. Cited on pages and.

Konrad Zuse. Der Plankalkül. In Berichte der Gesellschaft für Mathematik und Datenverarbeitung, volume. Cited on pages and.

Index

The best index to a person's character is
a) how he treats people who can't do
him any good, and
b) how he treats people who can't
fight back.
Abigail Van Buren

Neither abstraction nor simplicity is a
substitute for getting it right.
Butler Lampson

Abstract new
private
constant
Aggregation
Albericht
export
Alphard
initialization
Amoeba
procedure
Anna
Assignment semantics
type
Attribute grammars
Abstract data type
Beta
Abstract state machine
Binding-time support
Abstraction
control
Broadcast
data
Activation record
C
Ada
C-Mesa
generic package
Call graph
inline procedure
language
Rational environment
Case statement
static allocation
CET
type
deferred access


Executable file attribute
operand
operator
Extensible declaration
CHILL
Class-based language
Extension declaration
CLU
Code generation
File caching
File server
For statement
Fragments in Beta
runtime
Functional decomposition
Code sharing of generic modules
Gandalf
Concrete
Garbage collection
constant
Generic module
export
initialization
global CET attribute
procedure
type
CSP
Hidden
constant
Data decomposition
inline procedure
object type
Dead-code elimination
pointer type
Deadlock
record field
Default value
subrange type
Deferred
type
context condition
enumerable type
procedure iii
Hierarchical binding
Immutable declaration
Denotational semantics
Depth-first evaluation
Implementation ordering
Distributed compilation
Import-by-structure
Distributed termination
Impure format
Dynamic allocation
Inclink
Dynamic linking
Independent compilation
Information hiding
Eiffel
Inheritance
Encapsulation iii
applications of
Equivalence types
implementation
Ethernet
in Eiffel
Euclid in Modula


in Zuse Milano-Pascal
multiple
specification MIPS compiler
Initialization modified CET attribute
Inline procedure
Modula
in Ada Modula
inline expansion Modula
Interprocedural
constant propagation
optimization SRC implementation
Interface Module migration
Intermediate code Module system
Iterator many-to-many
Ivory one-to-at-least-one
one-to-many
Larch
one-to-one
Lightweight process
zero-to-one
Linda
Multicast
Link editor
incremental
Next-use information
prelinker
Link-time
Oberon
code generation
optimization
Object-based language
logical
Object Oberon
Load balancing
Object-oriented language
Loader
Loosely-coupled
Opaque type
ML
Mesa
Orthogonality iii
Overloaded import
Method template Partial revelation
PIC
PLLL


Polymorphism Subtype test
Pre- and postconditions subtypes CET attribute
Process allocation supertype CET attribute
Public projection type Symbol file
Synthesizer Generator
Qualified import
Timestamp checks
timestamp CET attribute
Rational Ada environment
Trickle-down recompilation
Realization protection
Turing
Realization restriction
Type coercion
Type protection clause
Realization revelation
Recursive declarations
Typecode
Register allocation
UDP/IP
Uncached file
Relocation
Unique type
Unix
Unqualified import
Usage count
RPC protocol
VDM
Virtual method
RTL (register transfer lists)
Runtime support
Workstation-server network
SELF
Write-through file
Self-relative speedup
Semiabstract
Z
constant
Z-code
export iii
initialization
attribute
procedure
instruction
Zuse, Konrad
type
Slop allocation
Smart linking
Smart recompilation
Software contract
Specialization
Standard ML
Static allocation