Design and Implementation

of a

Partial EvaluationBased Compiler

for an

Asynchronous Realtime

Markus Freericks

May

Abstract

This thesis describ es a compiler for the asynchronous realtime programming language ALDiSP Though

the language has a complex semantics not suited for easy compilation the compiler has to generate co de

for target platforms that have stringent space limitations and for target applications that have to satisfy

hard realtime requirements To accomplish this feat the compiler is based up on an abstract interpreter that

simulates all p ossible evaluation paths of the program In a reconstruction phase the program is then totally

recreated from the information gained during this simulation The abstract interpreter is an extension of

the formal semantics of ALDiSP

Diese Dissertation b eschreibt einen Compiler fur die asynchrone EchtzeitProgrammiersprache ALDiSP Ob

wohl diese Sprache eine komplexe Semantik hat die eine direkte Ub ersetzung ausschliet soll der Compiler

Co de fur Zielarchitekturen erzeugen die starken Sp eicherplatzRestriktionen unterliegen und fur Applikatio

nen die harte Echtzeitanforderungen erfullen mussen Um dieses Ziel zu erreichen enthalt der Compiler als

zentrale Komp onente einen abstrakten Interpreterdersamtliche moglichen Ausfuhrungspfade eines Quell

programms analysiert In einer Rekonstruktionsphase wird das Programm aus den vom Interpreter erzeugten

Annotationen vollstandig neu aufbaut Der abstrakte Interpreter ist eine Erweiterung der formalen Semantik

von ALDiSP

Contents

Intro duction

RealTime Digital Signal Pro cessing

Application Areas for ALDiSP

Overview of Compiler and Thesis

Goals and Results

Related Work Languages

Related Work Partial Evaluation

Related Work Partial Evaluation for Numerics and DSP

Stylistic Questions Thanks c

ALDiSP A Primer

Lexical Structure

Overall Structure

Evaluation Strategy

Basic Typ es

AutoMapping

Susp ensions

Blo cking Access

Sequences

Multiple Return Values

Conditionals

Lo cal Denitions

Chec king and Casting Typ es

Exceptions

Lists Streams Pip es

Declarations

Mo dules

Functions

CONTENTS

The Parser

Overview

Lexical Analysis

Prepro cessing

The Parser

Blind Alleys

Representation of Literals

Implementing The Preprocessor

Error Hand ling

Line Numbers and File Names

CPS Intermediate Form

CPS Transformation The Morale

The Lambda Form

Semantic Domains

Conventions and Notation

Data Domains

Results

Evaluation Functions an Overview

eval

expr

apply

eval

decl

eval

program

State

Auxiliary Semantic Functions

eval

exprs

deref

ob js

block

strict strict

exprs

eval norm

thunk result

Evaluating Expressions

Lambda ClosureCreation

LvarVariable LookUp

Lit Literal Values

LappFunction Application

Lcheck Asserting Types

Lcast Casting Types

Lcond TwoWay Conditional

Lselect NWay Conditional

LseqSequences of Expressions

Lcatch Hand ling Exceptions

LetLocal Denitions

Evaluating Declarations

Ldecl The Simple Declarations

Lpard Paral lel Declarations

LseqdSequential Declarations

LfixdRecursive Declarations

Application Rules

apply

closure

apply

overloaded

apply

primitive

apply

array

apply

typ e

AutoMapping

apply

mappable

TopLevel Evaluation

eval

program

schedule

Discussion of Selected Problems

Macros

AutoMapping

Blocking

Implementing Recursion

CONTENTS

Transformation of the AST into Lambda Form

Transforming Programs T

program

Transforming Declarations T

decl

Asimpledecl

Aparamdecl

Amultidecl

Amoduledecl

Aimportdecl

Aabstypedecl

Parameters

An Example Transformation

The Rules

Arecdecl

Transforming Typ e Expressions T

typ e

Afunctype

Aexprtype

Transforming Expressions T

expr

Avar

Aapp Acond Acheck

Acast

Adelay

Asuspend

Atuple

Aseq

Astring Aint Afloat Aconst

Aguard

Areturn

Alocal

Postpro cessing Passes

Implementation of postprocessing steps

Expanding Macros

Declaration Simplication

Removing unnecessary Checks and Casts

Conversion

Problems with Rearranging Expressions

da Hoisting Lamb

Transforming FreeVariables into Parameters

Strictness Analysis

Partial Evaluation Denitions and Known Results

Denitions

Partial Evaluation Residual Program

SelfApplication and the Futamura Pro jections

Internal Sp ecialization

OnlineOline Partial Evaluation

Static and Dynamic Binding Times

Partially Static Values

Sp ecialization

Generalization

FoldingUnfolding

Fixed Points

Applying the Techniques of Abstract Interpretation to ALDiSP

CONTENTS

Abstract Interpretation of LF Expressions

Goals of Abstract Interpretation

Abstract Values

General Remarks

Domains

Basic Domain Constructions

Basic Examples

Union of Domains

The Ideal Abstract Domain and its Subsets

Size of Domains

Sources of Abstract Values

An Example

Abstract Conditionals

Loop Detection Finding Fixed Points

Speed

Generalizing the Argument Lists

Conclusion of the Example

Implementing the Lo op Detector

Na ve Loop Detection Schemes

Problems with the Nave Scheme

Not Enough Generalization

Too much Generalization

Mutual Recursion

Multivariant Specialization

Restarting Approximations

Generalizing Loop Detection

Concrete Arguments

EvaluationOrder Eects

ALDiSPsp ecic Problems

State and Side Eects

Promises

Recursive Denitions

Exceptions

Function Values

Generalizing Numeric Types

Raising and Catching Exceptions

The Abstract Scheduler

Susp ensions and other Schedulable Entities

NonDeterminism

Sequentializing the Schedule

Garbage Collection

Finding Lo ops

Alternative States

Blo cking

CONTENTS

Execution Trace Reconstruction

The Co de Form

Dierences between CF and LF

Abstract Syntax of CF

Notation

Closures Tuples Combinators

TopLevel Bindings

Semantics of the Code Form

Programs

Atoms

Primitives

Conditionals

Function Application

Throwing and Catching Exceptions

Evaluating Declarations

Evaluating Code expressions

Relation b etween Abstract Results and Co de Attributes

The Tr eatment of State

atomaCode Attribute for Obj s

Example StraightLine Co de

InterFunction Optimizations

Inlining

Grouping

Tabulation and Caching

Elimination of Parameters and Return Values

Variable Splitting

Basic Optimizations

Variable Propagation

DeadCode Removal

ReferenceTracing

Tuple Tracing

Example Blo cking Co de

Whats in a Block

Alternatives to Blocks CPS

Alternatives to Blo cks Syntax Transformations

The Blocking Example

Sp ecication of Co de Generation

eval

exprs

deref

ob js

block

strict and strict

exprs

eval and norm

thunk result

Lit

Lexical Lvar s

Dynamic Lvar s

Lambda

Lcheck

Lcast

Lcond

Lselect

Lseq

apply

apply

typ e

apply

primitive

apply

closure

Related Work

CONTENTS

The BackEnd

Dierences b etween M and CF

Disjoint Address Space Index Types and HardwareLoops

Data Formats Special Instructions and the ALDiSP Library

No Implicit State

Semantic Entities of M

Function Denitions

Structure of M

Transforming CF into M

The C backend

The Future Automatic BackEnd Generation

Conclusions and Outlo ok

DevelopmentHistory

Design Alternatives

Explicit State Passing

ContinuationPassing Style

Compilation by Graph Reduction

Redesigning ALDiSP

A Literature

B Index

Chapter

Intro duction

Whenever anyone says theoretically they really mean not really

DaveParnas

This work describ es the overall design and the internal structure of ac the rst compiler for the ALDiSP

programming language ALDiSP was develop ed by the author and Alois Knoll in at TU Berlin

within the context of the CADiSP pro ject The goal was to create an interactive visually oriented

system for the development of digital signal pro cessing DSP algorithms based on a dataow paradigm

In such a system the user describ es the algorithms by connecting parameterizable blackboxes These

basic comp onents were to b e describ ed in ALDiSP which in turn should b e compiled into the imp erative

language ImDiSP whichnow has a life and a compiler of its own

It so on b ecame clear however that the issues concerning the compilation of b oth applicative and imp erative

languages into high quality co de for realtime applications were much more interesting than the original

pro ject goal This also applies to the adaptation of those languages b oth to the requirements of realtime

pro cessing and ecient compilation For this reason the whole pro ject was retargetted to the development

of ALDiSP and ImDiSP

RealTime Digital Signal Pro cessing

Digital signal pro cessing DSP is concerned with the generation transformation and analysis of physically

interpreted signals eg in audio applications Typical DSP algorithms are

lters with constant coecients with nite or innite resp onse b ehaviour FIR and I IR

adaptive lters used to cop e with timevarying input signal characteristics

signal transforms esp ecially the family of fast Fourier transformations FFT algorithms whichare

used to convert signals into the frequencyphase domain and back

correlation functions with tabulated sample signals and

detection and estimation of statistical signalssp ectra

These algorithms are usually dened as functions of equitemp orally sampled signals Their overall structure

tends to b e synchronous and quite simple Algorithms working with asynchronous signals or with signals of

dierentorvarying sample rates are usually a lot more complex to describ e implement and analyze

Chapter Introduction

Another imp ortant application eld of DSP techniques is image pro cessing computer vision CV ie the

analysis and transformation of video signals CV algorithms usually need so much computing p ower and

memory bandwidth that realtime applications must b e implemented using dedicated hardware

ALDiSP was designed as a language esp ecially suited for the functional sp ecication of DSP applications

ALDiSP is describ ed in full detail in and less detailed in Chapter will explain the essentials

necessary for understanding the sp ecial problems encountered in compiling this language

ALDiSP provides all the constructs necessary for realtime programming RTP In particular IO transac

tions and time constraints can b e sp ecied We consider the domain of realtime DSP the most demanding

case of RTP not only do es it have all the problems of general DSP numb er crunching on minimal sp ecialized

target hardware platforms it also has to cop e with severe limits on timing

In many resp ects ALDiSP is an op enlanguage as dened there is a basic k ernelthat has to b e pro vided by

any implementation a lot of additional data typ es eg complex numb ers and higherdimensional matrices

and functions are dened but do not havetobeprovided byevery implementation since they maynotbe

needed in some of the p ossible application elds

Conversely each implementation will havetoprovide extra functions and ob jects not dened by the standard

namely those necessary for realtime IO Because it is imp ossible to dene a uniform IO interface that

is optimally suited for all p ossible applications of ALDiSP only the basic entities synchronous and asyn

chronous IO ob jects and their b ehaviour are describ ed in the standard everything else is implementation

dep endent

Finallyitmust b e emphasized that not every ALDiSP program can b e compiled for every target architecture

As the language allows for the sp ecication of realtime b ehaviour it is p ossible to write programs that need

more computation p ow er than can b e provided byagiven machine

In theoryanALDiSP compiler should verify that compiled programs conform to their timing sp ecication

in practice this is not done Computing the timing b ehaviour needs a very precise machine mo del and can

only b e done for fully synchronous programs For most practical programs a simple test run should provide

sucient information as to whether or not a compiled program is fast enough

Application Areas for ALDiSP

There are two more or less distinct areas where a language like ALDiSP can b e employed

Specication and Simulation

During algorithm development sp ecications are intended to b e as succinct and precise as p ossible

At this stage of its lifecycle the algorithm itself may still b e little optimized and the simulation

pro cess is primarily concerned with correctness esp ecially concerning the numerical b ehaviour not

with sp eed Signals are represented in oating p oint so that rounding and cuto errors need not b e

considered Then the costerror tradeos of various xedp oint implementations are studied To make

this p ossible the development system has to provide facilities for bittrue simulation of algorithms

on the development hardware usually some kind of workstation

Code Generation

RealTime DSP applications are often implemented using otheshelf DSP hardware These are

pro cessors adapted to a wide variety of DSP algorithms but generally built around a MACmultiply

with a co ecient and add the result to an accumulator datapath Examples of these are the Motorola

x and x and the Texas Instruments xx families of pro cessors

When generating co de for such pro cessors the output of a compiler must b e of comparable quality

to that a human programmer would write As common C compilers do not generate co de that meets

Chapter Introduction

the sp eed requirements most DSP programming is still done directly in assembly language usually

supp orted by large libraries of macro denitions

The ALDiSP compiler ac is therefore targetted at low to mediumthroughput algorithms as they are em

ployed in typical telecommunications audio and control applications Such systems are often implemented

on a basis of commercial DSP pro cessors or customized programmable cores

The ac compiler can b e used for b oth bittrue simulation and ecient compilation which is of course

the main goal of the compiler b ecause the central transformation and optimization phase of the compiler

consists of a partial evaluator based on an abstract interpreter The latter is for all practical purp oses a full

interpreter for ALDiSP Enhanced with a user interface and a simulation environment that provides an IO

mo del this interpreter can b e used as the core of a simulation engine

Overview of Compiler and Thesis

The compiler consists of the following phases eachofwhich is describ ed in one or more chapters of the

thesis

Parsing An ALDiSP program is parsed into an abstract syntax tree AST No optimizations or

simplications are done at this level AST syntax exactly mirrors the ALDiSP syntax Chapter

describ es the internals of the parser and howitinterfaces with the following compilation stages

Conversion into Lambda Form The Lambda Form LF is the intermediate representation employed

throughout most of the following stages of compilation LF is more general than the AST and employs

fewer syn tactic forms A formal semantics for LF is given in chapter the semantics of ALDiSP itself is

given indirectly by the transformation rules in chapter These rules sp ecify how an AST is translated

into an LF program of equivalent semantics The LF semantics employs a state that is passed around

as a parameter of the evaluation functions This state contains the current denitions of all references

Simplication of the LF Program A set of semanticspreserving transformations is presented in the

second half of chapter Some of these transformation are applied to the initial LF program others

are used during various stages of the partial evaluation pro cess

Abstract Interpretation The abstract interpreter AI is a generalization of the standard interpreter

which in turn is the direct implementation of the formal semantics The AI executes the program on

abstract input and together with an abstract scheduler AS generates the set of all states that

can b e encountered when the program is run on actual data

Rewriting the Program During its run the AI has accumulated information ab out all function ap

plications in a cal l cache the AS has likewise created a graph that contains all p ossible states Using

this state graph as a skeleton the program is recreated The call cache provides the information

Due to the presence of aliasing problems unrestricted p ointers C is a language very illsuited for the ecient compilation

of DSP algorithms singleassignment languages like SILAGE SISAL or the functional subset of FORTRAN

are much b etter equipp ed for this task

Typical lowthroughput algorithms are sp eechprocessing tasks that have to cop e with sampling rates of kHz or less and

orbitwords Typical mediumthroughput algorithms work on highquality sound signals CDquality sound needs

kHz bit words Everything in the megahertz range is considered highthroughput

Suchaninterpreter would b e to o slow for practical simulation work since it still has to lug around the extra load of internal

b o okkeeping needed for co de generation and recursion detection

Both abstract interpretation and abstract interpreter will b e abbreviated as AI this should not cause undue confusion

Chapter Introduction

necessary to create sp ecialized function denitions The program is rewritten in a new intermediate

representation the Code Form Chapter explains the Co de Form and how it is generated

Dumping the Program Finally the Co de Form program is transformed into the backend format M

that represents an abstract machine language The state variables of the Co de Form are absentfrom

the M program M uses only basic data typ es integers and oating p oints comp ound data structures

are either atomized or globalized A prototypical implementation of an MtoC backend rather a

virtual M machine written in C is used to test the M programs Chapter gives an overview of M

and the MtoC backend

The combination of abstract interpretation abstract scheduling and rewriting is an instance of Partial

Evaluation PE It makes up the ma jority of the compiler Chapter presents a general overview of PE

techniques Chapter gives an intro duction into abstract interpretation and explains the algorithms used

for lo op detection and generalization Chapter shows the integration of state into the AI and how the

abstract scheduler works

Goals and Results

This work was undertaken with a numb er of goals in mind

There was the very practical need of having available a stable ALDiSP compiler that works correctly

and adequately fast This goal was not b e fully reached since the current compiler is still to o slow

for interactivework

It should b e shown that a language like ALDiSP ie a functional language with many inherently

slow features a complex evaluation mechanism and realtime constructs can b e compiled into

ecient co de This goal was fully reached all of the advanced language features can b e removed

alb eit at high costs in compilation resources and compiler complexity

The feasibility of partial evaluation as a fully automatic centerpiece of an optimizing compiler should

b e explored The literature on AI and PE presents many cases where these techniques are used

to realize very sp ecic optimizations such as compiletime garbage collection deforestation strictness

analysis and numerical optimizations Some of these are restricted in their applicability others are not

fully automated ie need the supp ort of additional annotations that guide the sp ecializing pro cess It

was found that partial evaluation is an enable technology for languages like ALDiSP since it is probably

imp ossible to compile a language with a complex semantics into ecientcode without a large amount

of compiletime reduction Still PE is no magic bullet the design of the abstract domain directly

mirrors the optimizations that can b e achieved and the common optimizations inlining etc still

have to b e done alb eit at a later stage of the compiler

This Co de Form started out as a restricted Lambda Form ie a prop er syntactical and semantical subset of LF but it

underwent ma jor mo dications during the actual implementation of the co de rewriting phase One ma jor dierence b etween

the Lambda Form and the Co de Formis that the latter supp orts explicit statestatevariables are threaded throughout Co de

Form expressions

A previous simulator was written in C and Scheme It did no precompilation ie was written to implement a direct

semantics for ALDiSP The negative exp eriences made with this simulator it was b oth buggy and horribly slowserved as

imp etus to intro duce the Lambda Form that is central to ac

This mighthavebeenavoided by using a graphbased intermediate representation but the ensuing tasks related to linearizing

the graph would have solved the same problems in a dierentguise

Chapter Introduction

Related Work Languages

In the industrial eld many DSP programs are still handcrafted in assembly languages Thirdgeneration

languages like C and ADAhave b een adapted to some DSP architectures by adding new keywords eg mem

ory layout directives andor restricting the languages by banning p ointers recursion and sometimes even

datadep endent lo ops Many sp ecialpurp ose DSP languages are describ ed in literature

most of them are based on the synchronous streampro cessing mo del There is also some overlap with hard

ware description languages since the synchronous dataow mo del is also a natural description of

clo cked digital cicuits Only few approaches are based up on asynchronous dataow concepts Theoret

ical foundations in dataow languages are based up on languages like Lucid thathave b een develop ed

indep endently from the DSP application area One of to days most common DSP languages is SILAGE

and its commercialized successor DFL It is characteristic for a class of languages that oer sim

ple translation into ecient co de at the cost of reduced linguistic abstractions no recursion no overloaded or

p olymorphic function denitions constant or constantb ound lo op ranges synchronous base mo del with up

and downsampling extensions and sampled asynchronicity asynchronous events mo delled by synchronous

streams using no data tokens

Related Work Partial Evaluation

Partial Evaluation PE is not a new invention the term was coined in in the context of coping with

incomplete information in LISP evaluation First PElike optimizations can b e traced backtothe

REFAL compiler andTurc hins sup ercompiler one theoretical milestone was the denition of

the Futamura pro jections cf section Due to its origins in LISP evaluation most early PE systems

were based on the online principle ie they reconstructed the program during its execution An example

is the REDFUN system Online PE is slow and errorprone most of the later researchwentinto

the direction of oline PE which is based up on a few transformations the two transformations foldand

unfoldare sucien t that are guided byabinding time analysis BTA Oline PE

is vastly sup erior in terms of sp eed and simplicity of the partial evaluator itself oline systems also were

the rst to implement nontrivial selfapplication whichisanimportant theoretical goal It is however not

p ossible for oline PE to achieve the level of optimizations that are p ossible in online PE Multivariant

oline PE tries to reclaim some ground but involves more complicated BTAschemes Current researchis

mostly concerned with improving the sp eed robustness and precision of BTAschemes most of whichare

based up on abstract interpretation

Related Work Partial Evaluation for Numerics and DSP

Berlins articles are the rst widely known references that describ e the application of PE metho ds

to the area of scientic programming They describ e a LISP framework in which PE is mainly employed

to unroll lo ops and intro duce software pip elining The application is an nb o dy simulator that is partially

applied to the known numb er and masses of the b o dies By concentrating on the inner lo ops that are

typical for numerically intensive programs impressive p erformance increases are achieved

Meyer applies PE to an imp erative language targeted at DSP applications The language is Pascallike

but restricted in the usual problem areas namely recursion and p ointer handling

The distinction b etween oline and online PE blurs when multivariant oline PE is considered the BTAandmultivariant

generation part of the oline PE can b ecome as complex and timeconsuming as a full online PE

The nb o dy problem is to predict the vector and p osition of n masses at a time t k when their p ositions and vectors at

time t are known

Chapter Introduction

Stylistic Questions Thanks c

Never express yourself more clearly than you think

Niels Bohr

Ihave tried to make this text as readable as p ossible To this extent many examples and explanatory

fo otnotes are provided I am indebted to the p eople that help ed in pro ofreading in alphab etic order

Guido Dunker Andreas Fauth Alois Knoll Carsten Muller Susan Wegner for p ointing out manytyp os

misleading details and takenforgranted assumptions Sp ecial thanks to Carsten for mentioning the A

normal form without it I would have reinvented a minor wheel another time to Guido for providing a

nave reader to Andreas for patiently listening to my ramblings and to Alois for the initial pro ject idea

and the realworld p oint of view None of the remaining factual or stylistic errors are theirs

This work was supp orted by an ErnstvonSiemens stip end Sp ecial thanks to Dr Andreas von Zitzewitz

for helping to provide this funding and Dr Gross for extending it another year

Chapter

ALDiSP A Primer

A language that do esnt haveeverything is actually easier to program in than some that do

Dennis M Ritchie

In this chapter the most imp ortant syntactic and semantic features of ALDiSP are describ ed An extensive

suite of typicalexample programs can b e found in The original ALDiSP denition and rep ort are

and

Lexical Structure

An ALDiSP program is a single ASCI I le Comments start with and end with the newline symbol

Prepro cessor declarations start with a p ercent symbol in column and include op erator symb ol syntax

declarations like INFIX and a le inclusion directive INCLUDE

Identiers are alphanumeric or symb olicstrings consisting of An identier in single quotes

loses its inx status any sequence of characters b etween backquotes isasymbolic constant of undened

semantics

All identiers are casesensitive while keywords are not

Overall Structure

An ALDiSP program consists of an arbitrary number of sequential denitions followed byanet list a

initializing statement recursive set of denitions followed byan

denitions

net

denition of a data ow net

in

expression

The INCLUDE directive is new ie not mentioned in it turned out to b e convenient for implementing libraries

Restricting parser declarations to start in column makes it p ossible to use as the mo dulooperator

The concept of symb olic literals is also new It is used by the systems implementor as a means of communicationbetween

library functions and the compilerinterpreter For example primitive functions are denoted by such literals

This feature is unique to ALDiSP no other language known to the author b ehaves this way While keywords can b e written

upp ercase lowercase or mixed all identiers havetobewrittentheway there were dened

Chapter ALDiSP A Primer

Evaluating the denitions that precede the net maynotchange the state the net list and the initializing

expressions can have sideeects This guarantees that the sequential denitions can b e compiled indep en

dently

Evaluation Strategy

The evaluation of function applications is p erformed callbyvalue the arguments are evaluated lefttoright

The sp ecial op erator delay has the same functionalityasthe delay in Scheme it freezes the evaluation

of its argument expression and creates a placeholder ob ject called a promise While in Scheme a sp ecial

primitive force is required to access the value of the promise ie to initiate the evaluation in ALDiSP

the evaluation is started automatically the rst time a strict primitive is applied to the promise Upon

completion of the evaluation the result replaces the promise

Basic Typ es

ALDiSP provides a variety of builtin typ es and typ e constructors

signed and unsigned integers oating p oint xed p oint and complex numb ers All numeric typ es can

b e parameterized by size

two dierent kinds of arrays vectors onedimensional arrays and matrices twodimensional arrays

Arrays may contain elements of arbitrary typ es including other arrays Vector literals are written as

x x

n

nite lists that maycontain elements of arbitrary typ e List literals are written as x x Basic

n

list op erations are head tail and

innite lists that maycontain elements of arbitrary nite typ es Inputdriven innite lists are called

pip es outputdriven innite lists are streams

AutoMapping

When a function of typ e is applied to an argument list of typ e list list the

n n

function is automatically mapp ed to the lists with typ e list resulting For this to work correctly the

lists must b e of equal length

Sideeectsarechanges to the state The concept of state is explained in section Examples for suchchanges are the

creation of susp ensions and all IO op erations

Such a separate compilation is not implemented in the current compiler

The alternative callbyneed lazy evaluation was considered harmful b ecause the presence of side eects makes an explicit

ordering mandatory While it is p ossible to mix lazy evaluation and side eects we consider such a mixture to b e extremely

confusing to the programmer In addition common wisdom is that strict ie callbyvalue languages are easier to compile

eciently than lazy ones Compilation of lazy languages tends to b e based on a graph reduction mo del that is extremely

unsuited for realization on DSP hardware since it relies on the presence of a large heap and an ecient garbage collection

scheme

Some Scheme implementations do so since allows it It is however a nonstandard b ehaviour on which the programmer

cant rely

All ob ject sizes are measured in the numb er of bits used to representanobject

This restriction creates a small problem when innite lists are considered It is neither p ossible to check the length of such

a list nor in general to prove that it is innite The semantics takes a practical stance Each list that contains a susp ension or

promise in tail p osition is assumed to b e an innite pip e or stream If in the course of further evaluation an end is encountered

this is considered a runtime error

Chapter ALDiSP A Primer

If only some arguments are lists of equal length then the nonlist arguments are expanded to lists ie

in LISP notation evaluates to

The same mechanism applies when the arguments are arrays vectors or matrices the arrays must b e of

equal size

The original ALDiSP denition did provide for nested automapping ie things like

This turned out to b e to o confusing detecting programming errors b ecame much harder b ecause to o many

erroneous programs were accepted by the compiler

Susp ensions

The suspension is ALDiSPs sole allencompassing realtime construct It plays a central role in the language

enabling callbyavailability inputdriven computation asynchronous event handling and timing b ehaviour

to b e mo delled

An expression of the form

suspend expr until cond within lower limit upper limit end

evaluates to a placeholder ob ject the susp ension When after some time has elapsed the cond b ecomes

true conds are usually tests dep ending up on global state tests for the evaluation status of other susp ensions

or the literal true the expression will b e evaluated within the time range dened by the limits The time

range is taken relative to the p oint of time when cond b ecomes true if it is true to start with this is the

time the susp ension is created After the evaluation of expr the result replaces the placeholder just lik ea

promise is replaced by its value

The evaluation mo del of the program as a whole is depicted in g The current state is made up from

?

Waiting for Waiting for Evaluate Finished

Condition Time

Figure susp ension state mo del

three p o ols containing susp ensions with one currently executing susp ension Newly created susp ensions are

placed in the waiting p o ol When their cond b ecomes true they are transferred to the ready p o ol a

timing counter is attached to each of the ready susp ensions and initialized with an arbitrary p oint of time

within the time range of its susp ension

When the expression of the current susp ension is nished a pro cess in whichmany new susp ensions may

have b een created and put into the waiting p o ol the result is placed in the result p o ol and some

Chapter ALDiSP A Primer

arbitrary susp ension whose timing counter is is selected from the ready p o ol to b e evaluated next If

there is no such susp ension the time is advanced the counters of the waiting susp ensions are decremented

and the conditions are tested Also all IO is p erformed during the timeadvance The whole program

stops when the rst two p o ols are empty and the evaluation of the current susp ension terminates Most

realtime programs never terminate

If this description seems to evince a very inecientevaluation mechanism it should b e rememb ered that

a susp ension is roughly equivalenttoaninterrupt handler creating the susp ension installs an interrupt

vector evaluating it means that the interrupt is raised and handled Susp ensions that wait up on IO events

can indeed b e implemented as interrupt handlers

Blo cking Access

Whenever a strict primitive function accesses an unevaluated susp ension the current pro cess is susp ended

until the data dep endence can b e resolved While there is no such notion as pro cess in ALDiSP the net

eect of cascaded blo cking amounts to the same This blo cking access prop erty of primitives can b e mo delled

as a textual transformation

prim obj suspend prim obj until isavailableobj within ms ms end

Here isavailable is the primitive function that tests whether an ob ject is accessible or an unevaluated

susp ension As an example for howthismechanism works imagine the following program fragment

let a suspend fx until g within ms ms end

b a

c ab

d

in

cd

end

This evaluates as follows a ev aluates to a susp ension s this shall b e written a s The expression a

evaluates to a susp ension s that waits for s to complete this shall b e written a s s s The

whole evalution pro ceeds as follows

a s

a s s s s

a s s s s

b s

ab s s s s s s

c s

d

cd s s s s

The virtual time advance of the evaluation mo del is implemented as waiting for the next clo cktick It dep ends up on

the sp eed of the compiled program whether this virtual time can b e as fast as the real time

In a way isavailable is the nondeterministic primitive without it there would b e no way to implement the infamous

fair merge function

mergesigsig

suspend if isavailablesig then hdsig mergesigtlsig

else hdsig mergesigtlsig

end

until isavailablesig or isavailablesig within ms ms end

Chapter ALDiSP A Primer

The end result is a chain to b e precise a directed acyclic graph of susp ensions If the rst susp ension

s is resolvedtherestuptos will follow sequentially This b ehaviour of blo cking chains resembles

that of a pro cess

Sequences

A sequence contains a list of declarations andor expressions that are evaluated in sequential order Its main

use is in statechanging co de b ecause there is a synchronization mechanism built in If an expression in

a sequence returns a susp ension the rest of the sequences is blo cked until the susp ension is resolved An

exemplary sequence is

seq

xxxhandleopenfilexxx

xreadfromfilexxxhandle

closefilexxxhandle

x

endseq

The statements are evaluated sequentially the result of the last expression is returned The closefile

line will probably evaluate to a susp ension that is resolved when the le is closed

Multiple Return Values

Functions mayhave more than one return value Multiple return values are not tuples whichwould b e

values of their own Instead each function can return anynumb er of result values A function that returns

more than one value can only b e used in a multiple variable denition

That is in the following example

func divmodab a div b a mod b

xy divmod OK

divmod ERROR NOT div mod

xy divmod ERROR

the last two lines are erroneous

Conditionals

ALDiSP supp orts the traditional conditional construct if The elsif symb ol is written as the else

clause is not optional since if is an expression not a statement The syntax is

if cond then expr

cond then expr

cond then expr

n n

else expr

endif

The conditions are evaluated from top to b ottom if a condition evaluates to a susp ended value the whole

expression b ehaves as a primitive function ie forces an evaluation or blo cks

Chapter ALDiSP A Primer

Lo cal Denitions

The lo cal denition construct let has the syntax

let decl decl

n

in expr

end

Declarations are evaluated sequentially from left to right They cannot blo ck if in the course of the evalu

atiuon of a declaration

let var expr

in expr

end

the expr blo cks the var has as its value the resulting susp ension The expression expr will only blo ckif

its evaluation contains a strict use of var

Checking and Casting Typ es

ALDiSP supp orts a runtime typ echecking mo del that allows verication and testing of arbitrary values

and typ es An expression

typ expr

is a typedeclaration in which the programmer describ es the typ e of the ob ject that results from the evaluation

of the expression If the ob ject is not of the declared typ e a runtime error should b e signalled if an

interpretationsimulation is in progress or the program may b ehave undened if the program is compiled

with the resp ectivechecks turned o Typ e checking can also b e used to reduce the domain of functions by

giving them a restricted typ e

typ expr

is a typecast or conversionEgReal converts the integer to the semantically equivalent oat

ing p oint representation Additional conversion functions may b e installed by the user by declaring the

overloaded function castgeneral with one argument

Checks and casts are strict if a typ e checkcast is applied to a promise it is forced if it is applied to a

susp ension the whole op eration blo cks

Exceptions

Like many other programming languages ALDiSP provides constructs to cop e with exceptional b ehaviour

In addition to this purp ose exceptions are used to mo del lo cally deviating numerical b ehaviour This means

There exists quite a notational confusion in this area it is by no means obvious what the dierences b etween conversion

co ercion and casting are Usually casting is the retyping of a value by giving its representation a new interpretation

while conversioncoercing actually implements some kind of valuepreserving function that changes the representation of a

value and its typ e

As an exception checking and casting against Obj ie the typ e of all p ossible ob jects is a noop

Chapter ALDiSP A Primer

that not only error conditions division by zero etc are handled by exceptions but that the same language

construct also handles rounding mo des and overow handling For example

guard abc

on Roundxn x

on AdditionOverflowx writeToPortstderrOverflow

returnMaxInt

end

evaluates abc with a cuto rounding mo de and a signalling o verow handler While the Round exception

returns to the place where it was called the Overflow exception ab orts the evaluation of abc and returns

MaxInt as the total result of the guard form

There is no sp ecial syntax for raising an exception it is called likeany other function

Lists Streams Pip es

The basic abstraction in any DSPoriented language is the signalsynchronous languages like Silage

use current sample and kth previous sample signal variables ALDiSP uses the list abstraction

the head of the list is the current sample the following element is the next one and so forth

As lists are usually nite ob jects some imagination is necessary to understand the notion of lazy innite

lists

ALDiSP has b orrowed from other functional languages the stream approach to implement innite lists a

stream is a p otentially innite list whose rst k elements are explicitly present the rest are mo delled by

a generator function that can deliver anynumb er of additional elements

A simple example is a sine generator

func sinusndelta

let newval ndelta

in

sinn sinusif newvalPi then newvalPi else newval end

delta

end

sin sinus

Here is a lazy list constructor with its second argumen t delayed The op erator is dened in terms

of delay and the primitive list constructor cons

RINFIX infix right associative

macro ab consadelay b

Rounding and overow handling is of paramount imp ortance in DSP esp ecially in maximizing output accuracy of the

system One of the most frequently used mo des b esides simple cuto is saturation arithmetic where the deleterious eects

of an overow are reduced by causing the output value to saturate at the maximum p ositivenegativevalue The ALDiSP library

also provides round to zero round to even etc To our knowledge ALDiSP is the rst language that uses exceptions to

implement dierent rounding mo des Usually some global state parameters or parameterized numeric functions are needed to

mo del dierent b ehaviours

Round is the predened exception to b e called when a result cannot b e represented exactly within the sp ecied precision

its arguments are the expressible part of the result numb er plus the rst following bit In the example this bit n is simply

ignored

There is a change from the original ALDiSP denition where return marked the returning case and the jump backtothe

guard was the default

Version of Silage handles multirate and asynchronous systems byintro ducing explicit upsampling and downsam pling facilities

Chapter ALDiSP A Primer

It is necessary to dene as a macro b ecause function calls are b yvalue A callbyvalue op erator would

of course defeat the whole idea of a lazy datatyp e constructor

For asynchronous realtime applications streams provide insucient expressivepower input or data

driven applications cannot b e implemented using them sopipes were intro duced Pip es are inputdriven

streams implemented as chains of susp ensions For example

proc pipeFromRegisterreginterval

suspend consreadFromRegisterreg

pipeFromRegisterreginterval

until true

within intervalinterval end

net

input pipeFromRegisterreg ms

output somefilterinput

in

writeToRegisterregoutput

pipeFromRegister delivers a pip e generated by sampling the contents of a register reg at dened p oints

in time some lter function is then applied to this stream b efore it is written back to an output register

reg

Due to the blo cking prop erties of pip es which are basically a consequence of the blo cking prop erties of

the susp ensions that implement them they can b e used b oth syntactically and semantically like streams

This similarity makes it p ossible for the programmer to subsitute streamgenerating functions for realtime

inputs whichcanbevery convenient when testing algorithms

Declarations

Declarations can b e used to intro duce arbitrary values abstract datatyp es exceptions new functions and

general typ es

maxint

abstype Nat Null SuccPredNat

abstype ListOfx Empty ConsCarxCdrListOfx

exception Overflowx erroroverflow maxint

func multOfnx Intx and x mod n

type multOf multOf

The last lines of the example present one of the most unique and irritating to implement features of ALDiSP

Just as every typ e can serve as the predicate that detects the ob jects of this typ e this is used in the Intx

expression any predicate can b e used as a typ e

Combined with the typ e declaration construct this dualitymakes it p ossible to intro duce arbitrary assertions

into the program These can b e used b oth during simulation as a testing to ol and during compilation as a

source of hints to the compiler

Indeed macros were only intro duced to make it p ossible to abstract from nonstrict expressions

They can but only with one of the following hacks

sp ecial noninputvalues can b e provided this amounts to busy waiting

an explicit global scheduler that emits a stream of there is input on p ort x information can b e installed

Both approaches cannot provide acceptable p erformance in systems consisting of manyasynchronous signal sources that are

triggered with a low probabilityAtypical example for such a system is a telephone exchange

The current implementation do es not utilize typ e annotations in this way

Chapter ALDiSP A Primer

Mo dules

ALDiSP supp orts a simple mo dule facility as illustrated by the following example

module xx

export a Int Int

b

casd

func an n

b

c a

endmodule xx

import a Card Card from xx

import b as bb from xx

y xxdbb y

When an ob ject are imp orted and exp orted it can b e given a new name or it can b e retyp ed Retyping

amounts to typ e checking using the type op erator

Functions

Wehave already seen how function symb ols can b e dened as inxop erators which is a purely syntactic

feature

Functions can also b e overloaded

import from Integers

overloaded import from Floats

import overloaded PointInfinity from Complex

An overloaded importoverloads all values that are imp orted while an overloaded marker in the

imp ort list overloads only the immediately following ob ject

Overloaded functions need not have the same aritynumb er of parameters

func mapfa map

let func loopa if a is null then null

else fhead a looptail a end

in loopa

end

overloaded func mapfab map

let func loopab

if a is null or b is null

then null

else fhead ahead b looptail atail b end

in loopab

end

The co de shown ab ove uses the predened op erator is to test whether its rst argumentwas generated using

the datatyp e constructor that is its second argument head and tail are the usual list selectors

Overloaded functions are nonrecursive As a consequence map has to b e dened in terms of an ancillary

loop function b ecause the only map visible in the scop e of map is the old map

I had dened a semantics for overloaded recursive functions but it got to o complicated for practical use The restriction

shouldnt cause any practical problems

Chapter

The Parser

A formal parsing algorithm should not always b e used

D Gries

Implementing the frontend scanner and parser of a compiler is usually considered a more or less trivial

task Lo oking at the issue in greater depth though scanners and parsers are found to b e full of details

that tend to b e managed byintro ducing adho c solutions the scanner and at the other end of the

compilation tra jectory the co de generator is one of the few parts of a compiler that has to interact directly

with the op erating system scanner and parser havetoprovide essential information for error diagnostics

many languages have dark corners in their lexical or syntactical structure that make trouble when using

automatic frontend generators like lex and yacc lastly those to ols often tend not to b e rich enough in

their functionality so that hacks are needed to reach the desired goals

The development of the parser describ ed here was inuenced by the rst ALDiSP parser that was develop ed

in using lex and yacc in C That parser created a textual output representing the syntax tree in

LISPlike notation for example

net

in

aif bcd then efg

h then

else

endif

endnet

was transformed into

PROGRAM

LOCAL

APPLY ID a

IF APPLY ID b ID c IDd

APPLY ID e ID f ID g

IF ID h

CONST

CONST

This tree was used as input to an ALDiSPsimulator written in Scheme

This simulator wasavery complex aair b ecause the abstract syntax was not simplied prior to simulation the goal was

to write a direct ALDiSP evaluator The simulator worked and was used to test the interaction of the basic language features

but its slowness rendered it impractical for interesting programs When using it Turners ap o cryphal comment ab out an early

version of Miranda executing with the sp eed of continental drift came to mind

Chapter The Parser

In the next sections this rst parser will b e mentioned whenever a design decision was made b ecause of

exp eriences with it

Overview

The ac compiler is implemented entirely in Standard ML and has thus a welldened b ehaviour except

in the area of IO which is systemdep endent

The frontend was develop ed using the smllex and smlyacc to ols these have sp ecication languages

similar to their C counterparts but their implementation interfaces are quite dierent UNIX lex and yacc

generate two main functions yylex and yyparsewhich deliver resp ectively the next token co ded as a

small integer or the next parse result Their SML counterparts communicate via token streams A great deal

of time was sp ent in learning how to manage this interface esp eciallyhow to implement the prepro cessor

as a lter function for these streams

The parser consists of three parts lexical analysis prepro cessing and parsing A symb ol table is maintained

by the prepro cessing phase which also handles the prepro cessor declarations these are the xity declarations

infix prefix and postfix and the le inclusion directive An overview is given in gure

ASCI I abstract

token token

program lexer prepro cessor parser syntax

stream stream

text tree

symb ol table

Figure Parser Submo dules

Lexical Analysis

Lexical analysis is trivial prepro cessing declarations like infix are tokenized but otherwise ignored The

resulting token data typ e is

datatype lexresult

STRING of string

CONST of string

FLOAT of real

INT of int

ID of string

QID of string

SEP of string

ELIF TYPDECL VECTORCONST LISTCONST

INFIX PREFIX POSTFIX RINFIX INCLUDE

IGNORED NEWLINE EOF

The implementation used is SMLNJ those few additions to the standard that it oers and are used by ac concern

the arrayinterface and conform with an unocial agreement of all ma jor SML implementors to day ac should therefore b e

p erfectly p ortable

The designers of smllex and smlyacc seem to have missed the necessity of putting a lter b etween these two to ols and

therefore forgot to b etter do cument this interface Some exp erimentationwas needed to create a typ esafe solution that did

not violate the complex mo dule interface

Chapter The Parser

term INFIXL of SymbolT INFIXR of SymbolT

PREFIX of SymbolT POSTFIX of SymbolT

ID of SymbolT

CONST of string

STRING of string

INT of int

FLOAT of real

NET IN ENDNET LET WHERE ENDLET MODULE

ENDMODULE IF THEN ELSE ELIF ENDIF DELAY SUSPEND

UNTIL WITHIN ENDSUSPEND IMPORT FROM AS EXPORT

GUARD ON ENDGUARD SEQ ENDSEQ RETURN VAL

FUNC PROC TYPE EXCEPTION ABSTYPE OVERLOADED

MACRO REC END ENDREC TYPCAST VECTORCONST

LISTCONST TYPDECL ARROW BRACKCL PARCL

BRACEOP BRACECL BAR PAROP COMMA EQUAL

COLON DOT SEMI EOF

Figure Parser Input Token Typ e

All separators comma semicolon etc are represented as SEP tokens Tabs and spaces are absorb ed New

lines are passed along as tokens b ecause they might terminate lines that contain prepro cessing declarations

A QID is an identier enclosed in single quotes which disso ciates it from any xity attribute The token

stream that enters the prepro cessor knows no tokens that representkeywords as keywords are recognized

during prepro cessing A CONST is a symb olic constant ie a string enclosed in backquotes

Any input line that starts with a whic hisnot followed by one of the known prepro cessor declaration

keywords is ignored an IGNORED token is passed along instead and the rest of the line is skipp ed The lexer

do es not generate error messages only the prepro cessor and parser do so

Prepro cessing

The main function of the prepro cessor is the analysis of prepro cessor declarations of which there are currently

two kinds textual le inclusion include and symb ol xity declarations infix prefix postfix

The include directive is easy to implement The prepro cessor nds op ens and closes all les it calls the

lexer with the op ened le handles and transforms the resulting token stream into the token stream suitable

for the smlyacc parser To include another le a new incarnation of the scanner is applied to a character

stream representing the included le the resulting tok en stream is then inserted into the original token

stream replacing the include directive

The xity declarations dene the asso ciativity and precedence of tokens that can b e used as unary and

binary op erators The smlyaccgenerated parser needs distinct tokens for the separators and the keywords

it cannot cop e with literal strings like standard C yacc can The typ e of tokens cf g suitable as

input to the parser is generated by the following rules

SEP COMMA SEP SEMI etc for all punctuation characters

IDnet NET IDif IF etc these string comparisons are caseinsensitive cf chapter

Chapter The Parser

IDx IDSymbolinternx if there is no xity declaration for x The structure Symbol

implements the global symb ol table Symbolintern enters a string into the table

IDx INFIXLx if the prepro cessor has encountered a xity declaration in this case left

asso ciative binding strength for x

Fixity declarations are parsed by hand Errorhandling is unsophisticated in case of an unexp ected

token an error message is emitted and everything up to the next NEWLINE is discarded The xity

information is stored in the symb ol table

A few tokens are passed along as they are TYPDECL stands for TYPCAST for VECTORCONST

for and LISTCONST for

The handling of prepro cessing directives do es not employany formal metho d it is written as a functional

automaton ie a set of tail recursive functions The standard action is to copy input to output elementwise

applying the transformations Only when encountering a prepro cessing directive the arguments are parsed

upto the next newline The xity declarations are stored as ags in the symb ol table

The Parser

The parser directly mo dels the grammar as it is presented in and The grammar is alse reprinted in

app endix B Each nonterminal rule returns an abstract syntax tree AST expression The abstract syntax

mo delled as a set of recursiv ely dened data typ es is given in g in the form of an SML data typ e

declaration

The parser converts the concrete syntax into the abstract syntax without p erforming any signicant trans

formations There is only one nontrivial transformation p erformed by the parser which is connectedd with

the treatmentoftyp ed inx function declarations

When a function declaration is parsed that is written in inx form and contains an explicit typing annotation

the parser is in trouble The abstract syntax for function declarations mirrors the concrete syntax of the

standard noninx function declaration which has the general form

overloaded func name arg typ typ expr

Note that typ is the result typ e of the function ie the typ e of values that result from evaluating expr The

generated term mo dels this form directly

AparamdeclposisOverloadedAfuncarg typ nameexpr

In contrast an inx declaration has the form

overloaded arg op arg typ expr

Notice that typ denotes the typ e of the function as a whole The problem results from the choice of distinct

argumen t and result typ e expressions in the abstract syntax how can the separated typ es typ typ be

n

generated from this one typ e expression After all typ can b e an arbitrary expression

This table is a very nonfunctional thing but passing it along as an extra parameter would have complicated the imple

mentation of the parser without gaining much in abstraction

For those unfamiliar with SML a short example datatype A BC D declares a typ e A to havetwo constructors B is

a unary constructor with an argumentoftyp e C while D is a nullary constructor

Chapter The Parser

datatype Aprogram

Aprogram of Adecl list Adecl list Aexpr

and Adecl

Asimpledecl of pos bool id Aexpr

Aparamdecl of pos bool Aheader Aparam list list id Aexpr

Amultidecl of pos id list Aexpr

Amoduledecl of pos id Aexport list Adecl list

Aimportdecl of pos id Aexport list

Aabstypedecl of pos Aparam list list id Aconstructorlist

Arecdecl of pos Adecl list

and Aparam

Aparam of id Atype

and Aexport

Aexport of pos bool id id Atype

and Atype

Afunctype of Atype list Atype list

Aexprtype of Aexpr

and Aexpr

Aapp of pos Aexpr list

Aseq of pos Astmt list

Atupel of pos Aexpr list

Acond of pos Aexpr Aexpr Aexpr

Avar of pos id

Asymbolic of pos string

Aconst of pos string

Aint of pos IntReprT

Afloat of pos FloatReprT

Astring of pos string

Acast of pos Atype Aexpr

Acheck of pos Atype Aexpr

Adelay of pos Aexpr

Asuspend of pos Aexpr Aexpr Aexpr Aexpr

Aguard of pos Aexpr Adecl list

Alocal of pos Aexpr Adecl list

Areturn of pos Aexpr list

and Astmt

Aexpr of Aexpr

Adecl of Adecl

and Aconstructor

Aconstructor of id Aparam list

and Aheader

Afunc Aproc Atype Aexception Amacro

Figure The AST Data Typ e

Chapter The Parser

The solution as implemented by the parser is somewhat inelegant but correct The parser generates the

abstract equivalent of the following expression

overloaded op

let func tmp arg arg expr

in typ tmp

end

This is correct since the semantics of an externalt yp e declaration typexpr is the same as that of an

internal one func nametyp e expr

Blind Alleys

During the development of the frontend quite a few design errors were made detected and corrected Some

of these were detected later when scanner and parser had to b e integrated into the compilers framework

Representation of Literals

How are literals esp ecially numb ers represented internally This question seems trivial but it has far

reaching consequences since many phases of a compiler have to deal with literal values

In most compilers literals are transformed to an internal representation in the lexer strings stay strings

numb ers are transformed to machine form b o oleans are transformed into references to global variables

enumeration literals or numeric representations Esp ecially when semantic actions are p erformed on the

parse tree it is handy to havenumb ers available in a format that corresp onds to the one used within the

compiler itself

This approach leads to problems if a crosscompiler is written as is the case with ac On any given machine

most programming languages supp ort the same numeric formats those directly supp orted by the hardware

Usually it is only in crosscompilers that the implementation language provides a numeric format diering

from that of the ob ject language

ALDiSP allows very large and precise numb ers declarations like

largeconst

pi

are allowed and welldened If only those numeric representations were used that are provided by the

language the compiler is implemented in in this case SMLNJ precision would b e restricted to bit

integers and standard IEEE oatingp ointnumb ers

Alternatively literals can b e represented as strings ie the way they were written in the program This strat

egy while b eing safe turns out to b e quite annoying when transformation rules need to b e implemented

when a lot of integer constants are created and parsed in the transformation phase calls to stringToInt

and intToString crop up everywhere Both for eciency and legibility reasons this is an unsatisfactory

solution

In ac the rightway turns out to b e the usage of the abstract interpreters semantic value representation

cf chapter The abstract interpreter AI implements a generalization of the standard semantics

and provides the backb one for co de generation Because the AI has to b e able to deal with these values

anyway it will knowhow to represent literals and how to apply the primitive op erators to them

An untyp ed declaration is the same as a declaration of Obj the typ e of all ob jects

The AI should at least b e able to do correct constant folding for which it has to b e able to representallvalues that can o ccur in literals

Chapter The Parser

In the current implementation the semantic values dened in the AI do not directly representmultiprecision

integer and xp ointnumb ers Instead the employvalue data typ e dened by separate libraries one each

for integer and oatingp ointmultiprecision values The parser passes the input strings to these libraries

which transform them into their internal format The resultantvalues are stored in the parse tree

Implementing The Prepro cessor

The rst parser had no separate prepro cessor prepro cessing was done using lex states This approach

proved to b e complicated The essential part was quite hard to understand cf g

BEGIN parserline

parserlineINFIXiws

parserlineLINFIXiws basetypeBINOPLcntfixBEGIN firstname

parserlineRINFIXiws basetypeBINOPRcntfixBEGIN firstname

parserlinePREFIXiws basetypePREOP cntfixBEGIN firstname

parserlinePOSTFIXiws basetypePOSTOP cntfixBEGIN firstname

parserlineINCLUDEiws BEGIN includefile

ytextyyleng includefile nt strncpyfilenamebuffery

filenamebufferyyleng

includeFilefilenamebuffer

BEGIN

firstnamenextnameid fixescntfixinternyytextyyleng

BEGIN nextname

firstnamenextnamefixescntfixinternyytextyyleng

BEGIN nextname

firstnamenextnamefixescntfixinternyytextyyleng

BEGIN nextname

nextname BEGIN nextname

nextname if basetypePREOP basetypePOSTOP

yyerrorNo precedence for Pre Postfix

else basetypeyytext

int i foriicntfixi

checkTokenfixesibasetype

tokentypefixesibasetype

BEGIN

nextnameioptwsn int i

foriicntfixi

checkTokenfixesibasetype

tokentypefixesibasetype

BEGIN

Figure An early combined lexerprepro cessor

The current implementationof ac employs a separate prepro cessor which is a lter b etween the lexical

analyzer and the real parser This approach turns out to b e more exible and easier on mo dications

As an extra b enet the prepro cessor can handle the keyword recognition task which earlier was done by

In an earlier version AI semantic values were generated and incorp oratedinto the AST This incurred a problem of

mo dularity since function closures are amongst the values mo delled bytheAIvalue data typ e it has to have access to the

Lambda Form data structures These in turn are generated from the abstract syntax tree Since SML do es not allow recursive

mo dule denitions this cyclic dep endency had to b e broken One p ossible solution was to implement closures as integer tags

to a global closure table thus voiding the need for explicit references to the Lambda Form typ e The separation whichisnow

employed is more elegant since it employs a hierarchy of mo dules at the lowest level are the mo dules that provide for literal

values and op erations on them the AST data typ e uses these values as do es the AI value data typ e

Chapter The Parser

priming the symb ol table a metho d whichwas complicated by the caseinsensitive comparison for keywords

b ecause the knowledge ab out this o ddity had to b e buried in the intern function of the symb ol table

Error Handling

The rst parser had extra grammar rules to supp ort error recovery eg instead of the simple grammar

fragment

if IF exprs THEN stmts else ENDIF

else ELSE stmts

ELIF exprs THEN stmts else

some error pro ductions were intro duced

if IF exprsTHEN stmts elseENDIF

exprsTHENexprs THEN

error THEN

elseENDIFELSE stmts ENDIF

ELSE error ENDIF

ENDIF

ELIF stmts THEN stmts elseENDIF

Theses error rules bloated the whole grammar and made it harder to understand and mo dify

Luckily smlyacc supp orts its own errordetection mechanism that is at least as go o d as the handwritten

error rules used in the rst parser The grammar could therefore b e implemented directly without any

changes

Line Numb ers and File Names

The rst abstract syntax tree did not contain any references to line numb ers First exp eriences with the

abstract interpreter showed that it was quite a chore to lo cate errors in the source program

Surprisinglyittookonlytwodays to add line numb ers to the parser and all subsequent mo dules at that

time CPS conversion and abstract interpretation Each no de of the AST has now an additional attribute

of typ e PosTwhich contains two line numb ers that name the span of a construct and the source le

name The latter is necessary since practical ALDiSP programs tend to include many library les directly

and indirectly

The biggest practical implementation problem was the inability to access a currentlinenumb er of non

terminals within smlyacc semantic rules It was known that the parser keeps track of p ositions but it was

not do cumented how they can b e accessed Some p oking in the generated co de showed what variable names

were generated by smlyaccTopick an arbitrary example the typCast rule is now dened as

YtypCastTYPCAST Ytype ECKZU YcExpr

AAcastmkPosTYPCASTleftYcExprright YtypeYcExpr

The rst line represents the grammar rule

typCast type Expr

closed

The second line contains the SML co de that creates an Acast no de out of three arguments The second

and third arguments are the casts typ e and expression the rst argument represents the current p osition

Chapter The Parser

The constructor mkPos takes two line numb ers as arguments these are the line at whichthe starts

TYPCASTleft and the line at which the YcExpr ends YcExprright The mkPos constructor takes the

current le name from a global variable

CPS Intermediate Form

In the b eginning the parser was intended to create co de in continuationpassing style CPS All necessary

simplication should b e done in the parser ie the CPStransformation should b e distributed over the

semantic actions of the grammar rules

This approach seemed to have quite a few advantages At that stage I thoughtthataCPSformwould b e

the only intermediate representation throughout all symb olic evaluation and co de generation Because each

new data structure requires its own debugging supp ort at least a print function I planned to use only one

data representation whichwas to b e generated directly in the parser The alternative whichisnow used

leads to the creation of a whole data typ e ALDiSP AST that is used only for communication b etween the

parser and the succeeding mo dule That mo dules only task is to simplifying the AST and it into Lambda

Form it is describ ed in chapter

CPS is a form of calculus in which functions never return to their call site Instead each function is given

an explicit continuation which is called with the result of the function As an output language for a parser

CPS has distinct advantages

continuations

Using a CPS form allows to explicitly represen t nearly all p ossible kinds of control ow esp ecially the

continuation capture needed to mo del the b ehaviour of the dierent exception handling and exception

raising scenarios

multiple outputs

Using a CPS manyvalued functions can b e represented easily where usually a continuation function

of one argument is called an nary continuation can also b e used This approach to handling multiple

return values avoids intro ducing a tupletyp e if it is not present in the language itself

Implementing the transformation rules in the parser is straightforward if they can b e sp ecied as attribute

grammar rules with recursivedescentevaluation order since such attribute grammars can b e implemented

very nicely in SML using the named struct facilities oered by the language For example a set of equations

expr appexpr expr def use tree

n

def def

use expr use expr use

n

tree appnodeexpr tree expr tre e

n

with inherited attributes def and synthesized attributes use and tree can b e implemented using a func

tion that takes the inherited attributes and returns the synthesized attributes each wrapp ed up in a data

structure

There are many p ossibilities to sp ecify and implement a CPS transformation cf for a discussion

The exception handler a guard expression has to capture the currentcontinuation which is called when an exception

is raised and returns

Chapter The Parser

app expr exprs

fn INdefSymbolSet

let val funcout expr IN

val argsout map exprs IN

in

use reduce union map use funcout argsout

tree apptree funcout map tree argsout

end

In this co de expr is the value of the expr that was matched by the parser This value is itself an attribute

transforming function Applying the expr and exprs functions to IN amounts to passing the inherited

attributes down funcout and argsout are the resulting bundles of synthesized attributes that were

passed up Using use and tree sp ecic attributes can b e selected from these bundles

To implement a CPS transformation only a single extra attribute is needed

the currentcontinuation expression usually a simple variable But nave CPS transformation algorithms

generate bloated co de A trivial expression like

a

might b e transformed to

lambda tmp tmp

lambda tmp tmp

lambda tmp tmpa

tmp tmp tmp K

a

where K is the surroundingcon tinuation A more sophisticated algorithm knows the dierence b e

tween atomic and nonatomic expressions and generates a K as it should b e To imple

ment such a more rened CPS conversion on a distributedlev el while parsing is done a new data typ e

AtomicOrComplex must b e intro duced and the co de for the transformation rules doubles in some cases

even quadruples in size

CPS Transformation The Morale

The CPS transformer worked and it w as buggy The generated output was to o big to checkbymanual

insp ection

The generation of source co de for the parser with smlyacc was quite fast it to ok less than a minute to

compile the parser sp ecication Compilation time for the generated parser to ok ab out minutes Sun Sparc

I I Mb The mo difycompiletest cycle was therefore in the order of minutes

Hence the rst parser was a complex incorrect piece of software that to ok a long time to compile and that

was very hard to test

After twoweeks of minute turnaround times a complete rewrite was done in three days one day for the

data structure and supp ort functions one day for the parser and one day for implementing a compilation

into Lambda Form A LFtoCPS transformer to ok an additional three days

The ma jor morale is therefore one should try to keep the compilelinktest cycle as low as p ossible and

split the mo dules accordingly

CPS as an intermediate form was later abandoned

Chapter

The Lambda Form

Then anyone who leaves b ehind him a written manual and likewise

anyone who receives it in the b elief that such writing will b e clear

and certain must b e exceedingly simpleminded

Plato Phaedrus

This chapter and the following one formally dene the semantics of ALDiSP programs Since ALDiSP is

much to o complex to describ e in a direct wayanintermediate representation the Lambda Form LF is

intro duced This chapter describ es the semantics of LF programs the next chapter shows the set of functions

that transform abstract ALDiSP programs as generated by the parser into LF programs

The syntactic denition of LF programs is given by a set of SML datatyp e declarations

datatype Lprog Lprog of Ldecl Lexpr

datatype Ldecl

Ldecl of bool Var Lexpr

Lpard of Ldecl list

Lseqd of Ldecl list

Lfixd of Ldecl list

datatype Lexpr

Lambda of int Var Lexpr list Lexpr

Lvar of Var

Lit of SemValsT

Lapp of Lexpr list

Lcast of Lexpr Lexpr Lexpr Lexpr

Lcheck of Lexpr Lexpr

Lcond of Lexpr Lexpr Lexpr

Lselect of Lexpr Lexpr Lexpr list Lexpr

Lcatch of string list Lexpr

Let of Lexpr Ldecl

Lseq of Lexpr list

The semantics are dened in the denotational semantics style A denotational semantics is based on the idea

that eachentity of the language that is to b e describ ed denotes a mathematical en tity eg numeric values

This is the abstract syntax given in the form of an SML data typ e declaration Concrete examples will b e in a hybrid

syntax consisting of ALDiSP abstract LF and semanticlevel expressions The actual data structures employed in the

compiler also contain source le p ositioning information this detail has b een omitted from most of the following material

A detailed intro duction into denotational semantics of programming languages is given bySchmidt

Chapter The LambdaForm

denote numb ers and pro cedures denote continuous functions on domains A sp ecial trickintro duction of

b ottom elementsisusedtomake the functions dened in the face of nontermination In practice many

denotational denitions lo ok like functional interpreters and can b e directly implemented as suchinany

language One of the b estknown denotational denitions is given in the Scheme

rep ort the authors felt condent in its correctness since the semantics in this section was translated by

machine from an executable version of the semantics written in Scheme itself

The ac semantics do es not fully conform to prop er denotational semantics style it contains some global

data structures and mo dels LF functions as explicit inert closures instead of semanticslevel functions

Many languages for which denotational semantics are written are statical ly typed and havetwo semantics

the static or typ e semantics denes the typ es of expressions while the dynamic or runtime se

mantics denes the execution b ehaviour The dynamic semantics is only dened on welltyp ed programs

ie programs for which the static semantics has found correct typ es An example for such a language is SML

Since there are no static typing rules for ALDiSP only an execution semantics exists This semantics contains

explicit and implicit tests for welltyp edness

Tokeep the semantics legible it is written in the style of an SML program It is not directly executable

since it sometimes employs nonconstructive notation such as quantors and metasyntax suchas

In the rst section the domains used by the semantics will b e presen ted In the following sections evaluation

functions are dened for all syntactic forms In sp ecic cases there will b e discussions concerning some of

the more dicult p oints like automatic dereferencing blo cking susp ensions and automatic mapping

The semantics was develop ed in parallel with an interpreter thus they are b oth tested In form and

contents the currentsemantics are a ma jor revision of the semantics published in whichwas done prior

to the interpreter implementation The most imp ortantchanges are

The Result typ e has b een simplied the earlier version had four kinds of results simple results

multiplevalue results exceptions and mismatches the current one only two results and exceptions

Tuples have b een intro duced to remove the need for multiplevalue results

Generalized arrays have replaced the vectors and matrices

The representation of the semantic equations has b een converted from the usual mathematical style

to the SMLlikestyle

Semantic Domains

In denotational semantics the elements of computation what we usually call values do not form a simple

set but rather a domain A domain is a set with an internal structure given by a partial ordering relation

and a least element Together they dene a complete partial order CPO The domain of all ALDiSP

values is called Obj Itisaat domain ie all data elements are uncomparable under the ordering

relation and bigger than the b ottom element

x Obj x w

In a statically typ ed language each expression has a unique typ e that can b e computed without actually running the program

Chapter The LambdaForm

Conventions and Notation

A homogenous list typ e using SML notation is assumed the typ e list of x is notated as x list List

literals are written as x x the nth elementofalistx is denoted as nthxn the length of x

n

is x A list can b e constructed elementwise using the listconstructor ie can b e written as

Environment substitution is written as Envsymbolvalue environment lo okup is Envsymbol Lo oking up

undened names returns

Unimp ortant semantic domains and primitive functions those that playnorole in the evaluation pro cess

are not describ ed eg the existence of oating p ointnumb ers is mentioned but not their semantics

For many purp oses unique tags are needed These are drawn from a set Tags that remains unsp ecied the

only op eration dened on tags is comparison for equality

Whenever an error situation o ccurs an error is returned An erroneous program will deliver an un

dened result the error pseudofunction generates Sometimes the error is attributed with a cause

eg errormismatch this is for do cumentation purp oses only

Variable naming conventions are that E represents environments S states and o or obj ob jects If some

value is incrementally changed this is usually denoted byticking the successivevalues eg S S S

The symbol K is used for continuation functions

If a set consists of one constructor only the constructor will have the same name of the set but in lowercase

Conditional denitions havetheform

if then

then

then

else

The case conditional destructures values

case expr

of pattern expr

pattern expr

else expr

else

The lambda symbol is used in mo deling functions The expression

xyz expr

denotes a function of three arguments that when applied evaluates by substituting the arguments in the

expression That is the op erator denotes anonymous functions muchlikethe lambda sp ecial form in

Scheme or the fn of SML Because the op erator is lexically scop ed parameter names do not matter ie

x x y y

For practical reasons tags can b e thoughtofasmappedtotheintegers In chapter tags are needed that are ordered by creation date

Chapter The LambdaForm

Usually though the names are chosen to have mnemonic value

Whenever an expression like

x x expr

n

is encountered it is assumed that the function dened by the lamb da is only dened on lists names of the

form x o ccuring in expr refer to the ith element of the list and n is the size of the list

i

Data Domains

The basic data domain is that of Objects

Obj Bools Numbers Arrays Tuples References Functions Types fg

It consists of a numb er of standard and some ALDiSPsp ecic sets and is made into a partially ordered

domain byintro ducing a b ottom element The abstract semantics cf chapter will intro duce another

suchvalue top which is more dened than any other ob ject

Bools ftrue falseg

Numbers Sc alars Complices

Complices Scalars Scalars

Scalars Integers Floats Times Durations

Integers FiniteIntegers InniteIntegers

InniteIntegers f g

FiniteIntegers fintn size j n InniteIntegers size MAXINTWIDTH g

Floats IEEE oating point numbers

Times ftimen j n InniteIntegers g

InniteIntegers g Durations fdurationn j n

These are the standard data domains There are two kinds of integers nite and innite ones Innite

integers are provided as a semantic base on which nite integers can b e dened A nite integerisaninteger

with a size The size determines the rounding and overow b ehaviour a size of n implies a representation

as an n bit twoscomplementnumb er Floatingp ointnumb ers can b e dened analogously details can b e

found in

A sp ecial problem is p osed by the existence of numeric literals in an expression like xthe do es not

have a unique typ e This fact is mo delled by representing it as an InfiniteInteger

Typ es Times and Durations are needed to representpoints in time and durations b etween such p oints Both

are isomorphic to the p ositiveintegers a programs execution starts at a time and advances in time in

time steps

Arrays farrays s e e j s MAXSIZE i e Obj g

dim s s  i i

dim

In the compiler innite integers are represented as nite integers with a zero size Thus any op eration that has a literal

as one of its arguments will havethetyp e of the nonliteral arguments as result typ e following the result typ e biggest

argumenttyp e convention

The time steps are supp osed to b e evenly spaced This only matters when the scheduler is implemented ie when a

correlation b etween Times and some physical time is made To the semantic functions time is a blackbox

Chapter The LambdaForm

Arrays may b e of arbitrary dimension There is a maximum size for each dimension Zerowidth dimensions

are not allowed All elements of an array should b e of the same typ e The semantics do es not dene the

physical placement of elements in an array it treats an array as a bag of values accessible via index tuples

Tuples ftupletag v v j tag Tags i v Obj g

i i

A tuple is a record ie a set of values that may b e of dierenttyp es and are accessed bya eldname

which is represented by its p osition when written down as a list All tuples sharing the same tag should

have the same numb er of elements

Tuples are primarily employed in mo delling abstract data typ e constructors As a side b enet they are

used to implementmultiplevalued functions and exceptions Later in the compiler they are also used to

explicitly representenvironments

References freftag j tag Tags g

Tags unsp ecied source of unique tags

References are placeholders that refer to promises p ending callbyneed computations and susp ensions

Susp ensions are held in the global state more ab out the state in section This nonfunctional

b ehaviour is necessary b ecause promises are cached and susp ensions have sideeects so all accesses to them

have to b e co ordinated

Functions Primitives Closures Overloadeds

Primitives fintadd intsuboverloadg

Closures fclosuretag tag env vars types body j tag tag Tags

c l c l

env Env

vars Var list

types Obj list

body Expr g

Overloadeds foverloadedxs j xs Closures list g

There are three typ es of functions primitives closures and overloaded functions While primitives are not

explicitly typ ed closures are Closures are rstorder functions with a lexical scop e There is no sp ecial

ob ject typ e for Pascallike stack functions closures that are not exp orted from their lexical scop e An

overloaded function is just an ordered collection of closures Closures are used to mo del mo dules cf section

so no sp ecial mo dule data typ e exists

Closures consist of a tag whichmakes them unique a tag that identies the Lambda expression of which

c l

they are an instance an expression that denes the b o dy of the function an environment holding the static

bindings of the free variables o ccuring in the b o dy and a list of types that dene the acceptable arguments

Type f Obj Bool Tupletag types Time Duration Intn Floatm n String

Predobj Arraysize size type Proctypes type Closuretag

dim

j n m size size Int type Type types Type list obj Obj tag Tag

dim

g

Types denote sets of semantic values There is a xed set of primitivetyp es which corresp ond to the

semantic value sets presented here Some of the typ es are parameterized Intn describ es the set of all

At the surface ALDiSP supp orts only one and twodimensional arrays The intro duction of a general array concept proved

to b e a simplication

This requirement can lead to problems when numeric values of dierenttyp es are concerned It might not b e sensible to

force the user to write instead of

Chapter The LambdaForm

twoscomplement integers that can b e enco ded in n binary digits Tupletagtypes is the set of all tuples

having the tag tag and elements conforming to types andClosuren is the set of all simple functions with

a tag equivalentto n

The typ e Pred is sp ecial it intro duces arbitrary predicates into the set of typesItissyntactically necessary

since typ es suchasTuple or Proc havesubtyp es whichmay b e userdened

Due to the presence of Predtyp e checking is a nonatomic op eration and typ einference is generally im

p ossible when two given typ es t and t contain Pred terms one cannot even determine whether they are

in a subtyp e relationship Subtyping for atomic typ es is welldened and reects the usual arithmetic typ e

hierarchy

Results

The evaluation of an expression leads to a Result which can b e of two kinds

Result Exception SimpleR esult

Exception fexceptiontag obj state j tag Tags obj Obj state State g

SimpleResult fresultobj state j obj Obj state State g

Their resp ective usage is

exceptiontag obj state is returned by ab orting exceptions It acts as a b ottom value which forces

most semantic functions to ab ort anyany further computation once an exception was generated The

exception thus travels up the return stack to the p oint of the innermost guard that can handle it

Exceptions are identied by their tag

resultobj state represents the normal case where a single value is returned

The internals of State are discussed in section The syntactic domains expressions declarations

programs are denoted by their implementation datatyp e names Lexpr Ldecl Lprog

Evaluation Functions an Overview

The semantics of LF is presented as a set of semantic functions These can b e understo o d as evaluation

functions

Originally there were two additional result typ es mismatch was returned by mistyp ed function applications and

resultsobjs was returned bymultiplevalued functions The removal of these two simplied the semantics signicantly

The rst intermediate representation ie the precursor of the Lambda Form used a continuationpassingstyle Because

this made exception handling easier to formalize the return jump of an ab orting exception was a simple function call This

approachwas abandoned for a numb er of reasons

The resulting co de was bloated and hard to read transformationswere errorprone since each function had to b e

transformed so as to accept and pass along two extra continuation parameters

Iwas not sure whether ecient co de could b e generated for this

At one place in the semantics evaluation of promises and susp ensions it is necessary to implement a catchall for

exceptions and it is not clear how to express this in a CPSevaluator

Lastly the semantics are more natural using the present approach though this is a sub jective observation

Chapter The LambdaForm

eval

expr

The main evaluation function is

eval Lexpr Context Result

expr

The Context is comp osed of three parts

Context Env Env State

static dynamic

There is an imp ortantconvention if a semantic equation contains a context C then references to Env

static

Env andS will refer to the parts of this context Likewise the expression CS means that the

dynamic

state of context C is replaced by S

The static environmentcontains lexical bindings the dynamic environmentcontains exception bindings The

dierence b etween the two is that Env is passed on to a function when it is applied while Env is

dynamic static

encapsulated in the functionsclosure

Environments are mappings from variables to ob jects

Env Var Obj

apply

Function application is the most complex part of the language the large numb er of auxiliary functions that

surround the apply reects this The signature of the core apply is

apply Obj Obj list Context Result

eval

decl

The evaluation of declarations results in twonewenvironments

eval Ldecl Context Env Env Context Result Result

decl static dynamic

The third argumentto eval is an explicit continuation function called with the newly created environ

decl

ments This is necessary so that exceptions o ccurring during the evaluation of the declaration can b e handled

correctly

eval

program

At the program level the evaluation semantics entails additional complexities b ecause side eects and the

progression of susp ensions in time havetobemodelled

While the evaluation of an expression is deterministic with regard to the current state the execution of the

program as a whole may b e nondeterministic Even when all wait times are p oints ie not really intervals

the order of execution of concurrent susp ensions is undened

Chapter The LambdaForm

The semantic function that mo dels the scheduler exhibits this by b eing explicitly nondeterministic

eval Lprog State list

program

Program evaluation consists of two phases First the program instal ls a set of susp ensions Simply evaluating

the program as a declaration and an initial expression results in some return value which is of no further

interest and a changed state containing susp ensions waiting for time to pass andor input to o ccur This

state is then handed over to the scheduler which controls the further execution of the program abstract time

elapses IO transactions are p erformed susp ensions are evaluated and new susp ensions are installed Each

susp ensions evaluation is purely functional in its context but creates a mo died state that may contain

events and new susp ensions

schedule State State list

This is the p oint where nondeterminism o ccurs since p ending susp ensions have no natural order of

execution that could b e used to dene such an ordering

State

At this p oint the concept of state should b e claried Conceptually state describ es

the currenttimeClock

the two sets of nonevaluable and waiting susp ensions Ususps and Wsusps

the set of unevaluated promises Uproms

the set of evaluated references EvRefs

the set of IO events Events

The previous seman tics further contained an IO state comp onent The actual implementationhas

shown that no such thing is needed instead IO can b e mo delled via faked susp ensions called events

a writing IO primitive will create an eventthatcontains the information what and where to write and the

scheduler will execute the output op eration at some p oint of time Likewise input op erations will create

susp ensions that are transformed into EvRefsbythescheduler when data arrives Events can b e treated like

opaque Ususps with a time frame of zero only the scheduler knows when an event might b e p erformed if

it is p erformed it is transformed into an EvRef that holds some unsp ecied value

State RefDefs Clock

RefDefs tag Ususps Blocks Wsusps EvRefs Uproms Events

Ususps fUsuspcond expr t t j cond expr Closures

t t Durations g

Wsusps fWsuspexpr t t j expr Closures

t t Durations g

Blocks fBlocktag f j tag Tags f State Result g

wait wait

EvRefs fEvRefobj j obj Obj g

Uproms fUpromobj j obj Closures g

Events fEventu j u implementation specic event description g

Clock Times

In the semantics of the semantic function returns a set of p ossible state sequences The current semantics do es not do

so since the use of explicit nondeterminism makes the semantics shorter and easier to read Besides the allstate approachis

practically nonimplementable

Chapter The LambdaForm

Ususp terms represent unevaluated susp ensions Wsusp s are susp ensions with an condition that is trueand

EvRefs representevaluated susp ensions or promises Uprom s are unevaluated promises

The Blocks are sp ecialcased Ususp s whenever the interpreter blo cks it generates a Block reference deni

tion that contains the blo cked interpreter as a function from State to Result The second parameter is the

tag that caused the blo ck The blo cked evaluation can continue when the susp ension asso ciated with the

blo cking tag has b een evaluated

The State can b e accessed likeanenvironment S t gets the denition asso ciated with the tag t S t v

intro duces or redenes a denition

Auxiliary Semantic Functions

This section lists some auxiliary functions that will b e employed all over the rest of the semantics

eval

exprs

eval evaluates a list of expressions carrying along the context C and a continuation function K K is

exprs

called with the resulting ob ject list and context when the last expression of the list has b een evaluated

eval C K KC

exprs

eval exprexprsC K

exprs

case eval exprC

expr

of exceptionxoS exceptionxoS

resultobjS if obj

then resultS

else

eval exprsCS

exprs

objsC KobjobjsC

eval takes care of exceptions and values if an exception or a b ottom is encountered it is propagated

exprs

upward in this case the continuation function is never called In the case of this is necessary to

guarantee the strict b ehaviour of the language

deref

ob js

deref takes an ob ject list and delivers an ob ject list of equal length that is guaranteed not to contain

ob js

references Since deref can blo ck it calls a continuation argument with the result instead of simply

ob js

returning it The parameter C holds the context rememb er that the state S is dened implicitly as a

comp onent of the context CsothatStag refers to the value of tag in the context C

deref C K KC

ob js

deref objobjsC K

ob js

case obj

In the rst drafts of the semantics there were no Wsusps but that implied that the schedule had to traverse all Ususps to

lo ok for the evaluable subset Also there is the p ossibility of a once true expression b ecoming false again during the waiting

time without an extra waiting p o ol it would have b een imp ossible to correctly execute such susp ensions Of course correct

here means intuitively without there b eing any prior formal denition

For those readers who have no exp erience with CPSsemantics it should b e noted that the expression denotes or creates

the continuation function It corresp onds to an explicit reference to the stack p osition at which the lo cal variable obj is

stored

Ie the evaluation ab orts and returns the exception value

Chapter The LambdaForm

of reftag

case Stag

of EvRefx deref xobjsC K

ob js

Upromx

case eval xS

thunk

of xS

deref xobjsCStagEvRefx K

ob js

else

blocktagC

S deref objobjsCS K

ob js

else

deref objsC

ob js

objsC KobjobjsC

When deref dereferences an EvRef it calls itself with the dereferenced value This lo op is necessary to

ob js

correctly cop e with nested susp ensions or delays

When deref encounters an unevaluated promise Uprom its value is computed by eval The state

ob js thunk

is up dated ie the Uprom is replaced byan EvRef containing the result value

When deref encounters a reference that p oints to a unevaluated susp ension it blo cks the current pro cess

ob js

by calling block with the currentcontinuation The blo cked continuation captures the current context and

uses it to provide the lexical and dynamical environments needed to continue the evaluation

block

block generates a Block denition that encapsulates the continuation K installs it in the state and returns

a reference to the blo ck

blocktag C K

wait

resultreftag

new

Stag Blocktag K

new wait

strict strict

exprs

Avariation of eval is strict which extends the functionalityofeval by asserting that none

exprs exprs exprs

of the resultant ob jects is a reference It dereferences all EvRefs forces all Uproms and blo cks when there

are susp ended arguments

strict exprsC K

exprs

eval exprsC

exprs

objsC

deref objsC K

ob js

The simple strict takes a result a context and a continuation it calls the continuation with the unpacked

and dereferenced ob ject and state

strict res C K

case res of

resultxS deref xCS xC KxC

ob js else res

Chapter The LambdaForm

eval norm

thunk result

A thunk is a parameterless function Thunks are employed to freeze the evaluation of expressions that are

susp ended or delayed

eval xS

thunk

norm deref xS xC applyxC

result ob js

norm res

result

case res

of resultxS xS

exceptionxoS S

Athunk is exp ected to return a result therefore all exception results are caught and replaced by The

function norm p erforms this normalization it transforms a result into an ob ject and a state The

result

tuple S is a context that contains a state but no variable bindings

Evaluating Expressions

Lambda Closure Creation

A Lambda expression creates a closure with a typ ed argument list The closure represents a function that

when applied to an argument list conrms that each arguementmatches its corresp onding typ e and evalu

ates the function b o dy within the current lexical environment up dated with the argument bindings

The transformation phase creates Lambda expressions when it encounters ALDiSP function denitions cf sec

tion mo dule denitions cf section and abstract data typ e denitions cf section Fur

thermore they are used to implement the thunks in delay and suspend expressions cf sections and

The semantics itself creates new Lambdas on the y as expands automapp ed applications to nested

standard applications cf section

eval Lambdatag v p v p exprC

expr lambda n n

eval p p C

exprs n

t t C

n

resultclosuretag tag E C freevarsexpr

closure lambda static

v v t t exprS

n n

tag is a freshly allo cated tag that uniquely identies each closure generated at runtime

closure

freevarsexpr denotes the set of syntactically free variables in expr The static environmentwhichis

encapsulated in the closure is restricted to this set it is not necessary to put bindings of variables into the

closure that do not o ccur in the co de that it closes over

The argumenttyp es p p are evaluated at closure creation time the resulting typ e ob jects t t

n n

are made part of the closure

LvarVariable Lo okUp

Evaluation of variables is p erformed by lo oking them up in the static or dynamic environments These

environemnts are part of the evaluation context

eval LvarvarC

expr

case E Cvar static

Chapter The LambdaForm

of case E Cvar

dynamic

of errorundefined variable

x resultxS

x resultxS

Only if a variable name is not dened in the lexical environment it is lo oked for in the exception environment

If it is not found there either this is an error Some of these errors can not b e detected statically the

implemented transformation phase lo oks for statically unbound variables that are never b ound dynamically

this check can cheaply b e done as part of the conversion cf section It cannot b e statically

guaranteed that dynamically b ound variables are presentintheenvironment

Lit Literal Values

eval LitxC resultinject xS

expr ob j

The Lit construct directly represents a semantic value Avalue has to b e injected into the Obj

domain this is done by the otherwise unsp ecied inject function While this function lo oks likea

ob j

mere techniciality in the semantics it b ecomes a real op eration in the actual abstract interpreter where the

domain of abstract values may dier greatly from the domains of the semantics

LappFunction Application

Toevaluate a function application the function and all its arguments must rst b e evaluated in some order

This order is rather arbitrary in principle here it sp ecied as lefttoright but any sequential order would

suce Then the application itself is p erformed

apply handles all details of application it conrms that the function can b e applied at all to these

mappable

arguments and calls the real apply if there is an argumenttyp e mismatc h automapping is tried

eval LappexprsC

expr

if exprs

then errorempty application

else eval exprsC

exprs

obj obj C

func args

deref obj C

ob js func

obj C

func

This can b e shown by example

func needsfb fb

func mightcrashx

if complexpredicatex then needsfx

else guard needsfx

on fa a

end

end

If f is not dened in the lexical context or in the dynamic environment whichisactivewhen mightcrash is called a semantic

error will o ccur if complexpredicate is false for x

Such constructs are often called constants which is totally wrong A constant is an expression guaranteed to havethe

same value in all evaluation contexts a literal is the textual representationofa value Of course all literals are constants

except in old FORTRANs but not all constants need b e literals Most programming languages allow only literals as constants

but that should not obscure the terminology

It is not only bad practice to rely on order of evaluation in function applicationsuch practises also make parallelization

nearly imp ossible It would b e a nice frill if the compiler could nd anysuch dep endencies

Macros can b e incorp orated as an evaluation mechanism byrstevaluating the functional p osition and if it turns out to

b e a macro substituting the expressions and evaluating the result The reason why this strategy was not adopted is elab orated in section

Chapter The LambdaForm

apply obj obj C

mappable func args

The ob ject that represents the function is pip ed through deref to ensure that it is not a reference If the

ob js

other ob jects were also dereferenced prior to application the execution b ehaviour would change drastically

it would b e imp ossible to pass p ending susp ensions to functions

Lcheck Asserting Typ es

Lcast and Lcheck are the LF equivalents of ALDiSP typ ecasts and typ echecks

In ALDiSPatypecheck is intended to make an assertion type expr asserts that expr is of typ e type If

this assertion is violated the further execution of the program is undened An interpreter should convert

all Lchecksinto tests a compiler can assume that the typ e given in a Lcheck is correct

If expr denotes a data value checking amounts to simple function application Since typ es are inter

changeable with the predicates that dene them the type can b e applied to the expr if the result is true

everything is ne otherwise there is an error

If expr denotes a function and type a function typ e the situation is dierent It is in general imp ossible

to assert that a given function f has some requested typ e For a numb er of reasons predicatesastyp es

overloading etc ALDiSP do es not supp ort typ e inference It is therefore imp ossible to implementan

Lcheck as a compiletime verication

The only safe way of implementing typ e checks on functions works as follows an expression a bf

Casting ftothetyp e function from a to b is transformed to lambdatmpabf tmpAnew

function of argumenttyp e a is constructed that when called executes fandveries that the result is

of typ e b By using this transformation strategy the argument and result typ e checks are deferred to the

functions application time Furthermore the domain of the function is now guaranteed not to exceed the

type sp ecication

eval Lcheckexpr expr C

expr t e

eval expr expr C

exprs t e

obj obj C

t e

deref obj C

ob js t

obj C

t

case obj

t

of Proca a r check a a robj C

i pro c i e

Obj resultobj S

e

else

strict applyobj obj C C

t e

xC

if x true

then

resultobj S

e

else

errortype mismatch

check a a robj C

pro c i e

resultclosuretag tag v v a a

i i

closurenew lambdanew

Dereferenced application is a p ossible optimization comparable to callbyvalue in a lazy language The strictness analysis

outlined in chapter could b e used to generate annotations to guide sucha behaviour

Most Lchecks are nevertheless eliminated at compile time since manys typ es are veried by the abstract interpreter and

the partial evaluator removes the matching Lchecks asdeadcode

This is imp ortant since a typ e checkcanchange the b ehaviour of a checked value if that value is a function that is later

overloaded and the checked typ e denotes a function of reduced domain

Chapter The LambdaForm

LcheckLitobj LappLitobj Lvarv Lvarv

r e i

S

The typ e has to b e dereferenced since its typ e must b e known The ob j to test is not dereferenced since

this would destroy the semantics of

Obj suspend end

This should not blo ck since any p ossible result is an ob ject

Lcast Casting Typ es

The Lcast form tries to force or coercean expr to a typ e type For example the ALDiSP expression

Realwhich translates to an Lcast should result in the oating p ointnumber In other words

the Lcast form is a general way to sp ecify an appropriate conversion function

The two predened variables coerce and coercebase implement the underlying functionality Both can

b e redened by the user They are given to the Lcast form as third and fourth parameters This explicit

treatment is necessary since the transformation phase reorders all declarations in the program if no explicit

connection b etween the magic cast function variables and the Lcast forms would b e made a declaration

involving an Lcast could b e moved to a p osition where the casting function is not yet dened

While coerce has to b e b ound to an overloaded function that takes one argument the ob ject to co erce

coercebase should b e b ound to a function of two arguments the ob ject and the goal typ e

If Lcast is called with a basic nonuserdened typeasgoalcoercebase is called with the obj and

type

obj as arguments The standard environment has to provide appropriate functions to convert b etween

expr

base typ es Since the set of base typ es is implementationdep endent to provide for applicationsp ecic

numeric or IO typ es the library writer may need this extra exibility

If Lcast is called with a predicate as goal typ e the overloaded coerce function is dismantled into its closures

and each of those in turn is applied to the obj The rst such application that delivers a result that tests

e

true under the typ e is returned as the value of the Lcast The applications are ordered according to the

overload sp ecication ie the most recently dened co erce functions are tried rst

eval Lcastexpr expr e e C

expr t e cf cbf

eval expr expr e e C

exprs t e cf cbf

obj obj obj obj C

t e cf cbf

deref obj obj obj C

ob js t cf cbf

obj obj obj C

t cf cbf

if obj Obj then resultobj S

t e

obj Types then cast obj obj obj C

t Basic base t e cbf

else cast obj obj obj C

general t e cf

cast obj obj obj C

base t e cbf

applyobj obj obj C

cbf t e

cast obj obj obj C

general t e cf

case obj

cf

of overloadedc c

n

let

Exactly this was the problem with an earlier approach in which coerce and coercebase were magic variable names

that were picked out of the static environmentby the semantic function A rst try to repair the semantics bymoving these

variables into the dynamic environmentworked in that it protected them from unwanted conversion and other transformation

rules The nal current mo del of a fourparameter cast simplies things a lot

This rather inecientmechanism is the only way to cop e with the existence of arbitrary predicate typ es

Chapter The LambdaForm

i n

try S normapplyc obj C

i i i e

test checkobj try CS okCok

i t i i

in

if ktest trues j ktest trues

k j

then try

i

else

errorno cast found

The cast function creates two families of indexed temp oraries namely try and test Their order of

general i i

execution is not imp ortant since each application is done in a clean context and only the return state of

the result that is nally chosen is returned

The check auxiliary function applies a predicate to an ob ject and returns a b o olean value It is dened in

section

LcondTwoWay Conditional

Lcond mo dels the standard if op erator It is not automapp ed therefore the ALDiSP expression

if truefalse then else end

is erroneous and do es not evaluate to The rst expression ie the condition must b e available as

a basic value and is therefore passed through strict This will blo c k the whole Lcond if e is a p ending

exprs

susp ension

eval Lconde e e C

expr

strict e C

exprs

o C

case o

of true eval e C

expr

false eval e C

expr

else errornonboolean argument

LselectNWay Conditional

Lselect op erates in a similar manner to the switch of C or the case in Scheme a selection value e is

sel

evaluated the result is taken as index into a list of expressions eachofwhichistaggedbyakey value The

rst matching case is chosen for evaluation if no case matches the default e is taken Lselect is

default

strict in the selection value

eval Lselecte c e c e e C

expr sel n n default

strict e c c C

exprs sel n

o o o C

sel n

if io o jio o

sel i sel j

then

eval e C

expr i

else

eval e C

expr default

This forgetting of state changes can b e very hard to implementinaninterpreter that maintains the state via side eects

Chapter The LambdaForm

Lseq Sequences of Expressions

Lseqexpressions mo del expression sequences with automatic synchronization Expressions are evaluated

from left to right the value of the last expression is the value of the sequence If an expression returns a

reference to an unevaluated susp ension the execution of the rest sequence blo cks

eval LseqexprsC

expr

case exprs

of errorempty sequence

expr eval exprC

expr

exprexprs

strict exprC

exprs

objC

eval Lseqexpr expr C

expr n

In order to avoid an unnecessary auxiliary function the rest of the sequence is packed up in a new Lseq and

evaluated

Lcatch Handling Exceptions

An Lcatch catches exceptions The Lcatch contains one subexpression and a list of exception tags The

subexpression is evaluated if it generates an exception that is listed amongst the tags the exception is

unpacked and its value is returned as an ordinary result Otherwise the value b e it result or unmatched

exception is returned unchanged

eval Lcatchtag tag exprC

expr n

case eval exprC

expr

of resultxS resultxS

exceptiontagobjS

if tag tag tag

n

then

resultobjS

else

exceptiontagobjS

Let Lo cal Denitions

The evaluation of lo cal denitions is mo delled b y the evaluation rules for declarations eval eval

decl decl

evaluates a denition whichmight consist of subdenitions and calls its continuation with twosetsof

bindings one for Env one for Env and a new context That context contains state changes only

static dynamic

the environments are merged with the ro ot environmentwhenthe expr is applied

Like eval eval takesacontinuation argument in order to b e able to handle exceptions when one

exprs decl

of the declarations raises an exception evaluation ab orts

eval LetexprdeclC

expr

eval declC

decl

E E C

s d

eval exprCE E

expr s d

Chapter The LambdaForm

Evaluating Declarations

Ldecl The Simple Declarations

An Ldecl is the atomic declaration An expression is evaluated the result is b ound to a name If the

expression evaluates to an exception or further evaluation is ab orted otherwise the continuation function

K is called with the generated environments and the up dated context The b o olean parameter determines

whether the generated binding shall b e part of the static or the dynamic environment

eval LdeclvarstaticexprCK

decl

eval exprC

exprs

objC

let

env x if x var then obj else

in

if static

then

KenvC

else

KenvC

LpardParallel Declarations

A Lpard is a declaration consisting of a list of declarations to b e evaluated in parallel In parallel means

that the denitions are evaluated in the same environment the declarations are ordered in time so that side

eects can accumulate

eval LparddeclsC K

decl

eval declsCK

pard

eval CE E KKEE C

pard s d s d

eval decldeclsCE E K

pard s d

eval declC

decl

E E C

s d

eval declsCE E E E K

pard s s d d

The indicates that the environments are added not replaced No order is implied a parallel denition

par

x

x

end

creates a lexical environment with an undened value of x either or

Lseqd Sequential Declarations

A Lseqd is a declaration consisting of a list of declarations to b e evaluated sequentially ie each declaration

can access the variables dened in the preceding declarations

eval LseqddeclsC K

decl

eval declsCK

seqd

Such situations cannot o ccur in ac since ALDiSP programs contain only sequential denitions and the transformation rules

would not parallelize denitions of the same variable

Chapter The LambdaForm

eval CE E K KEE C

seqd s d s d

eval decldeclsCE E K

seqd s d

eval declCE E

decl s d

C

eval declsC S K

seqd

One small problem is the accumulation of environments in the context In a sequence of declarations the

visibility of a new declaration starts with the next declaration in the sequence This is mo delled by adding

the new declarations to the context b efore evaluating each new declaration However the continuation must

b e called with the unchanged context so the added bindings havetoberemoved b efore continuing with the

rest sequence This is mo delled by the recursive call with C S ie only the changed state is carried over

Lfixd Recursive Declarations

A Lfixd is a recursive xp oint declaration The subdeclarations are evaluated in their own visibility scop e

eval LfixddeclsC K

decl

let

CE E fix CE E

s d s d

eval declsCE E

pard s d

CE E

s d

CE E

s d

in

KCE E

s d

The new environment is dened to b e the xp oint under the parallel declaration ie the environmentthat

do es not change when the bindings of the declaration are added This implies that it already contains said

bindings and is therefore recursive

The fix op erator can either b e taken as primitive or dened as an iterative approximation pro cess

fix f

let

approxx

let

xfx

in

if x x then x else approxx

in

approx

The actual interpreter may employ a dierent metho d cf section

Application Rules

The main bulk of the apply function comprises the automapping mechanism treated later The core apply is

a dispatch rule that divides the domain of apply into manageable p ortions eachofwhich has quite distinct

b ehaviour

applyfuncargsC

if func Closures then apply funcargsC

closure

Formally fix computes the xp oint of a function of one argument To apply it to a function of n arguments these can b e

treated as tuples with  x x

n

Chapter The LambdaForm

func Overloadeds then apply funcargsC

overloaded

func Primitives then apply funcargsC

primitive

func Arrays then apply funcargsC

array

func Type then apply funcargsC

typ e

else erroris no function

It is imp ortant to note that the core apply is unchecked The test for applicability is isolated in an extra

function test

apply

test funcargsC K

apply

if func Closures then test funcargsC K

applyclosure

func Overloadeds then test funcargsC K

applyoverloaded

func Primitives then test funcargsC K

applyprimitive

func Arrays then test funcargsC K

applyarray

func Type then test funcargsC K

applytyp e

The test functions cannot generate errors they are total They are isolated from the apply b ecause the auto

mapping mechanism has to search for mapping p ossibilities by sp eculatively disassembling data structures

and applying the results to functions By dening an extra test function tree tests can b e done without

causing unnecessary errors In the following test functions are presented next to the apply functions they

guard

apply

closure

A closure is applied by adding the argument bindings to the static environment that is part of the closure

and evaluating the b o dy of the closure in the up dated environment

apply closuretag tag envv v t t expr

closure closure lambda n n

a a C

n

eval exprCE envv av a

expr n n

static

A closure can b e applied to an argument list if all arguments match

test closuretag tag Ev v t t expr

closure lambda n n

applyclosure

a a C K

k

if n k

then

check t a t a C K

list n n

else

KfalseC

check testsCK

list

case tests

of KtrueC

ta rest checktaC

okC

if ok

then

check rest C K

list

else

KfalseC

checkobj obj C K

typ e arg

deref obj C

ob js typ e

obj C

typ e

Chapter The LambdaForm

strict applyobj obj C C

typ e arg

resC

case res

of true KtrueC

false KfalseC

else warningis no type true

The check function will b e used again in the automapping section If one of the parameter typ es cannot

b e applied to its arguments a warning should b e given It mightaswell b e an error This is one of the

situations where an error is signalled much to o late it should have b een given when the closure was created

The call to apply p oses a small problem since even this simple ALDiSP function drives the interpreter into

a lo op

func semanticKillerxsemanticKiller

Another very similar problem stems from the recursive dereferencing

rec

x delayx

end

In the current implementationof ac these problems are considered so absurd that no provisions are made

to catch them as errors

apply

overloaded

An overloaded function is applied to the rst memb er closure that matches the arguments

apply overloadedc c argsC

overloaded k

match overloadedc c argsC

overload k

indexC

apply c argsC

closure index

match nds the rst matching closure and returns its index or if there is none The latter b ehaviour

ov erload

is needed for the test function when apply is called a preceding test has veried that a valid

overload overload

index exists

match overloadedc c argsC K

overload k

find c c C K

rst k

find closuresindexargsC K

rst

case closures

of KC

ccs test cargsC

applyclosure

okC

if ok

then

KindexC

else

find csindexargsC

rst

test funcargsC K

applyoverload

match funcargsC

overload

indexC KindexC

Chapter The LambdaForm

apply

primitive

There are two kinds of primitives those that are internal to the interpreter and may access and change

the state and those that p erform sideeect free op erations on semantic values The latter ones share the

common prop erty that they are strict ie access the values of all their arguments They are of no further

interest to this semantics since strict primitives are atomic and cannot raise exceptions or otherwise

disturb the ow of control

There are only a few nonstrict primitives namely those needed to access and test references to create

tuples and to create arrays

apply pargsC

primitive

if p Primitives

strict

then

deref argsC

ob js

argsC

resultapply pargsS

primitivestrict

else

apply pargsC

primitivenonstrict

apply pargs

primitivestrict

case pargs

of overloadclosurexclosurey overloadedclosurexclosurey

overloadclosurexoverloadedy y

n

overloadedclosurexy y

n

selectindextupletagx x

n

if index Integers index indexn then error

index then tag

x

else x

index

orbool bool bool bool

n n

andbool bool bool bool

n n

tupletypetagtype type Tupletagtype type

n n

functypetype type type

n

Proctype type type

n

istupletag argsclosuretag tag

tuple closure lambda

if tag tag listofconstructors

lambda tuple

then true

else false

istupletag tupletag

tuple tuple

if tag tag

tuple tuple

then true

else false

ioopparameters

resultreftag Stag parameters

io

ioEvent

where

tag a new reference tag

io

intns intms intnmmaxs s

n m n m

many more

else errorprimitive unknown or does not match

apply pargsC

primitivenonstrict

case pargs

of derefreftag case Stag of EvRefx resultxS

Chapter The LambdaForm

else error

isavailreftag case Stag

of EvRefx resulttrueS

Upromx resulttrueS

else resultfalseS

isavailnonReferenceValue resulttrueS

tupletagv v resultTupletagv v S

n n

delayclosurex

resultreftag

delay

Stag closurex

delayUprom

where

tag a new reference tag

delay

suspendclosurexclosureyt t

resultreftag

susp

Stag closurexclosureyt t

suspUsusp

where

tag a new reference tag

susp

mkexctagvalue exceptiontagvalueS

errorprimitive unknown or does not match

Primitives is the implementationdep endent set of primitive function names For all primitives that

strict

are not listed in the semantics b ehaviour is unsp ecied In particular it is not guaranteed that all primitives

are total

The is primitiveprovides typ e testing of constructed values A global list of constructors is created by the

transformation phase cf section It asso ciates each tuple tag with the lamb da tag of the constructor

that was used to create it

The ioop primitiv e creates an IO event The arguments to ioop are implementationdep endent as are

the parameters stored in the Event reference denition

test funcargsC K KtrueC

applyprimitive

Primitives are inherently untyp ed they are not to b e used directly by the programmer It is the resp onsibility

of the libraries to provide typ esecure interfaces to the primitives For example the intadd primitive should

b e employed in a context like

func aIntegerbInteger intaddab

apply

array

ALDiSP has no op erator or sp ecial syntax for array lo okup Instead an array is treated like a function when

it is applied to appropriate indices it delivers the indexed ob ject

apply arraysize size elementsarg arg C

array dim n

deref arg arg C

ob js n

i i C

dim

elementsi i

dim

The internal rowcolumn structure of two or moredimensional arrays is not sp ecied The denition mo dels

wbelooked up Correct typing of the arguments an array as a bag of indexed elements that can someho

integers in the range of the array is guaranteed by the test function

test arraysize size elementsarg arg C

dim n

applyarray

deref arg arg C

ob js n

Chapter The LambdaForm

arg arg

n

dim n i dim arg Int arg arg size

i i i i

An array is applicable to an argument list if all arguments are p ositiveintegers in the range of the array

apply

typ e

The application of basic typ es is hardwired into the semantics One sp ecial case is the typ e Obj whichis

true for all arguments Due to this fact there is no need to dereference the argument if the typ e is Obj This

sp ecial case treatment is more than a mere optimization since all arguments to functions are typ ed and an

ob jects typ e is tested by applying the typ e to the ob ject default dereferencing would make it imp ossible to

write functions that accept p ending susp ensions as their arguments

apply typeobjC

typ e

if type Obj then true

type Predt then checktobjC

else

deref objC

ob js

objC

is objtype

base

is trueBool true

base

is falseBool true

base

is tupletag v v Tupletag t t

base x n y m

tag tag nm checkt v C checkt v C

x y n n

is timexTime true

base

is intxsIntn s n

base

is floatxsmsnFloatmn sm m sn n

base

is String true

base

is arrayArray true

base

is xProctypest true

base

is closuretag Closuretag tag tag

base x y x y

is xPredobjx checkobjxC

base

Application of a typ e is handled by listing the p ossible cases One typ e warrants sp ecial treatment Proc

It is forbidden to apply function typ es since they are undecidable The sole function of them is therefore

to declare a functions typ e not to checkit

test typeargsC K

applytyp e

Kargs C

All typ es are applicable to all singleton argument lists

AutoMapping

The automapping mechanism is the most complex part of the semantics The general idea b ehind auto

mapping is this There exist a numb er of automappable homogenous collection typ es namely dierent

kinds of lists and arrays When a function not dened on these typ es is applied to one of them it is mapp ed

to each of the comp onents A simple example is the application which results in

Though a simple paradigm the concepts can b e extended to more complex examples

The actual implementationemploys slightly dierent means apply generates co de for a lookuparray primitive

array

The current implementation of ac emits a warning when a Proc is applied to an ob ject and approximates it safely for

functiontyped arguments true is returned for nonfunctiontyped arguments false is returned If aBool were returned the

reconstruction phase would try to implement a runtime typ e from the Proc

Chapter The LambdaForm

If one argument is of a nonmapp ed typ e it is extended

Automapping may b e nested

The extension rule still applies

There is one precedence rule Lists map b efore arrays That is

This precedence was chosen b ecause lists of arrays are a more natural data typ e in DSP applications

than arrays of lists

There are a few restrictions lists have to b e of equal length and arrays have to b e of same rank and size

ie adding a threeelement list to a twoelement list is not p ermitted Similarlyvectors cannot b e added to

matrices

The automapping semantics is dened via co de generation A function application is mapp ed by rewriting

it and evaluating the generated LF co de A few examples for the kind of generated co de are

maparrayv v v v

maparrayv v

maplist w maparrayv v w

The existence of primitive functions maparray and maplist is assumed Their implementation is somewhat

machinedep endent since the layout of arrays is not sp ecied in the semantics and will dep end on the target

architecture

apply

mappable

The entry p ointtothe apply complex is apply it is called by eval Lapp

mappable expr

apply checks the arguments and determines whether a simple application is present or not In the

mappable

latter case it attempts to nd a way of mapping the function to its arguments

The test function and all its subfunctions havetotake explicit continuations since they may force the

apply

arguments in the pro cess of testing them ie they can blo ck

apply funcargsC

mappable

test funcargsC

apply

okC

if ok then

applyfuncargsC

else

automapfuncargsC

The automap function needs to insp ect the args in order to recognize mappable typ es hence it rst removes

any references

automapfuncargsC

deref argsC

ob js

argsC

automap argsC list

Chapter The LambdaForm

There are only two basic mappable typ es Lists and arrays This is reected by the fact that there are two

automap functions one for lists and one for arrays Each searches for o ccurrences of its mappable typ es if

it nds any it generates a mapping function with appropriate typ es and tests this function for applicability

If the test succeeds an automapping is found if it fails a recursivecalltoapply is tried there may

mappable

b e another nesting level to unmap if that fails the function application is erroneous

automap funcarg arg C

list n

if i n arg pairhead tail

i i i

ij i j arg List arg List arg arg

i j i j

then

let

loop

loopargargs

let

params varstypesparams loopargs

inner outer

in

if arg pairheadtail

then

Lvarvar params

tmp inner

var vars

tmp

Objtypes

argparams

outer

else

Litargparams

inner

vars

types

params

outer

params varstypesparams loopargs

inner outer

c closuretag tag varstypesLappLitfuncinners

closurenew lambdanew

in

apply map closureparams C

mappable list outer

else

automap funcarg arg C

array n

The transformation lo op traverses the arguments and creates four lists from it the inner arguments the

outer argumentsthetypes and the parameters The outer arguments are those arguments that are to b e

mapp ed there is one parameter for each of the outer variables The inner arguments are made up from the

original arguments with the mapp ed arguments replaced by their corresp onding parameters The typ es are

the parameter typ es of the generated lamb da expression

automap funcarg arg C

array n

if i n arg arraysizes elems

i i i

nif arg arraysizes elems sizes sizes j

j j j i j

then

let

loop

loopargargs

let

params varstypesparams loopargs

inner outer

in

if arg pairheadtail

then

Lvarvar params

tmp inner

Chapter The LambdaForm

var vars

tmp

Objtypes

argparams

outer

else

Litargparams

inner

vars

types

params

outer

params varstypesparams loopargs

inner outer

c closuretag tag varstypesLappLitfuncinners

closurenew lambdanew

in

apply map closureparams C

mappable list outer

else

errorfunction arguments do not match

The automap function works nearly exactly like its counterpart for lists only dimension checking is

array

added

TopLevel Evaluation

The top level evaluation of a program pro ceeds in two steps rst the program is evaluated within an

initial context then the scheduler is applied to the resulting state The scheduler lo ops and generates a

sequence of states as its output terminating when there are no p ossible successor states

eval

program

eval Lprogdeclexpr

program

let

S

C S

res

init

eval declC

decl

E E C

s d

eval exprCE E time

expr s d

resS norm res

result init

in

S scheduleS

S is the initial emptystateItcontains no reference denitions its time is set to an invalid value This

helps to implement a restriction outlined in the informal language denition that is not explicitly mo delled

in the semantics the decl should not change the state ie S S mo dulo nonruntime denitions

Before the expr can b e evaluated a valid initial time is added to the context From that momenton

susp ensions can b e created

schedule

The scheduler is the entity that drives the evaluation once the initial susp ensions have b een created

The scheduler generates a list of states There is some nondeterminism involved this is mo delled in the

probability of Wsuspselection

Chapter The LambdaForm

The timing mo del of the scheduler is a quasistatical one It is assumed that all expressions evaluation in

zero time the scheduler manages a virtual clo ck that is incremented stepwise by advance at certain

time

p oints in the schedule In the generated co de an increment in time corresp onds to a synchronization

ie waiting for that time It cannot b e guaranteed that the real execution timing will adhere to the idealized

timing presupp osed bythescheduler

The idealization of zerotime evaluation is a necessity the alternativewould need some real execution

time mo del Instead the ordering of susp ensions is chosen in suchaway that there is a maximum of slack

between susp ensions of dierent time when more than one susp ension can b e evaluated at a certain p oint

of time the susp ension with the least slack is done rst

The semantic function schedule do es not represent this strategy explicitly it rather gives a nondeterministic

description of all p ossible schedule sequences ie all sequences that adhere to the timing sp ecication of the

user program By weighting the sp ecications with a probabilitythescheduling strategy is implied

scheduleS

if tagStag Ususps Stag Wsusps Stag Blocks

then

S

tagStag Blocktagcont Stag EvRefv

blo cked

then

case norm cont S

result blo cked

of objS SscheduleStagEvRefobj

tagStag Wsuspexprtime t timet t

then with a probability of t

case eval exprS

thunk

of objS SscheduleStagEvRefobj

tagStag Ususpexprcond timet timet

eval exprS trueS

thunk

then

SscheduleStagWsuspexprtimet timet

else

Sadvance S

time

advance S

time

decrement all time values in Wsusp

and increment the system time generating S

perform S

io

perform S

io

tag Stag Eventparameter s Oracleparameters

S StagEvRefValueOfIOparameters

scheduleS

The rst case handles termination which happ ens when there are no susp ensions or blo cks left The second

case lo oks for evaluable blo cked pro cesses and if it nds some cho oses one to evaluate The third case

lo oks for waiting susp ensions that maybeevaluated in the current time frame From the set of p ossible

Wsusps one is chosen with a probabilitythatisinverse to its slack ie the time range in whichevaluation

has to take place If the slack is zero the susp ension will denitely b e evaluated The last case lo oks for

susp ensions that havebecomeevaluable in the curren t time frame it promotes them to Wsusp status If

none of the cases match time is advanced As a sideeect of advance the time ranges of Wsuspsare

time

decremented and some p ending IO op erations are p erformed IO is implementationdep endent this is

The step size is arbitrary it is an artefact of the sp ecication that needs discrete time Each actual implementationwill declare a smallest p ossible time unit

Chapter The LambdaForm

mo delled by an oracle function that cho oses the IO events that are to b e p erformed and by the unsp ecied

function ValueOfIO that determines the value of the EvRef that replaces an Event

Susp ensions b ecome evaluable if their time ranges are shifted into the now p oint or if blo cks are resolved

that were caused by IO events

The ordering of cases intro duces some sequentiality that is not strictly necessaryTheBlock case precedes

the general Wsusp case since blo cking susp ensions are implicitly timed as immediate ie havetobe

executed in any case

By p ermitting IO only on time advances it can b e guaranteed that the set of op erations accessing a single

IO device can b e controlled The natural semantics of IO devices is that they can at most pro duce

or consume one value p er time unit it is therefore an error if more than one value is written to a device

simultaneously likewise all read accesses p erformed at one p oint of time should return the same value

These conditions are not explicitly tested for though

Discussion of Selected Problems

This section shows the problems asso ciated with selected parts of the semantics

Macros

The exact semantics of macros was not mentioned in the previous sections In the currentcombination

of transformationrules and semantic functions macros are eliminated during the transformation from the

abstract ALDiSP program tree to the LF As already mentioned it would b e easy to intro duce a semantic

typ e that denotes macro ob jects as callbyname closures and a primitive mkmacro that transforms a

closure into a macro The evaluation mechanism could b e mo died so that the functional p osition of an

application is evaluated rst In fact such an exp erimental change was made to the reference implementation

of the standard semantics Some annoying problems came up

What is the type of a macroob ject It is not a pro cedure

Can macros b e overloaded

Are macro parameters typ ed

Is a macro closed over its environment Consider the following fragment in whichthe op erator is

redened

macro fab abab

func ab ab

func gx fxx

Now using textual expansion for macros these denitions are equivalentto

func ab ab

func gx xxxx

Under a macroasruntimeob ject implementation of macros which of the will b e applied to the

arguments The macroclosure will provide one denition for and and the closure of g will

provide a dierent denition It seems natural to use the gdenition but then the other horn of the

dilemma presents itself where do es the denition of come from Since a closure environment holds

only bindings for those variables o ccuring free in the closure b o dy no binding for willbefound

A static search for macro applications in the b o dy of a closure is also imp ossible b ecause the macro

mayhave b een given to the function as a parameter The consequence would b e that closures must

Chapter The LambdaForm

save their total creationtime environment Foravariety of reasons not the least b eing the problem

of garbage collection this approach is unsatisfactory

Toobviate these problems all macro o ccurrences are resolved by textual substitution during the transfor

mation phase

AutoMapping

The automapping semantics in the currentsemantics is much simpler than the one employed in earlier

denitions since it is not as general as p ossible It do es not include retrys ie o ccurrences of more than

one mappable typ e amongst the arguments when only one combination is correct An example expression

that demonstrates suchbehaviour is

let

func tvListxInteger lengthv x

in

t

Here automap will generate a mapping

list

maplistw mapvectorv twv

and fail Under a general automapping denition a mapping would b e found

mapvectorv tv

Under the current implementation the example expression is erroneous This is considered a feature Auto

mapping incurs the pragmatic problem that it might nd an interpretation for an application that was never

intended by the programmer If automapping were to go to all lengths to nd a mapping this problem

would growbeyond need

Blo cking

A related problem arises from the combination of automatic dereferencing and blo cking The semantics

is infested with o ccurrences of derefobjs and related functions that ensue the dereferencing of strict

parameters On the other hand there are sp ecial case treatments for situations in which said dereferencing

is to b e avoided typ echecking against Obj for example The semantics would b e much simplied if the

programmer had to call a function deref whenever the value of a susp ension or promises was needed

The real problem b ehind the dereferencing is the blo cking b ehaviour that follows an access to an unevaluated

susp ension For this to work an explicit continuation of the currentinterpreter state has to b e created and

reied Reication is the pro cess of lifting semanticslevel ob jects into the ob ject language It creates a

serious followup problem b ecause there is no way to lo ok into the reied continuation function it not

p ossible to compare twosuchcontinuation functions and they cannot b e traversed by a garbage collector

The consequences of this problem are are exp ounded in section

Earlier semantics did not use a reication mechanism but created suitable LF expressions that represented

the continuation expression While that approachsolved the comparability and GC problems it blew up

the size and complexity of the semantics and was hard to debug

Each closure would then bind a large numb er of junk items which the garbage collector could not disp ose of

This might b e called the TECO problem when nearly anystringofcharacters is a valid command sequence the compiler

will not catchmany errors

Chapter The LambdaForm

Implementing Recursion

The denition of the recursion primitive fix as given b e the semantics is not very ecient There exists a

cheap alterative that employs the existence of state

eval LfixddeclsC K

decl

let

v v vars decls

n dened

t t fresh tags

n

C C int EvRef

i

inv Reft

i i

in

eval declsC

pard

CE E

s d

KC int EvRefE E v

i s d i

E E

s d

Eachvariable dened in the recursivecontext is given a reference as its denition The reference is initially

dened to b e if there is an access to it during the evaluation of the declarations this is erroneous Then

all declarations are evaluated in parallel The resulting values are then stored in the references as the nal

values This metho d has two disadvantages Each successive access to one of the recursivevalues is indirected

via the reference this costs time A bigger problem is the increased size of the state the state comparison

that is an exp ensive part of the abstract scheduler might b e slowed down

Chapter

Transformation of the AST into Lambda Form

This chapter shows how the abstract syntax tree AST that is generated by the parser is transformed into

an LF program This transformation provides for an indirect semantics of ALDiSP since LF programs have

a dened meaning and the transformation rules are total and unique

The transformation is partitioned into one one ma jor transformation function that generates a rough trans

lation followed by a sequence of small lo cal transformation rules that p olish the LF program

The three main transformation functions T T T and T are of typ e

program decl expr typ e

T Aprogram Lprog

program

T Adecl Ldecl

decl

T Aexpr Lexpr

expr

T Atype Lexpr

typ e

The transformations are sp ecied to b e contextinsensitive they translate AST expressions generated by

the parser cf gure into equivalent LF expressions without referring to their lexical context or other

compile state This approachwas chosen for simplicity but incurs some ineciencies These ineciencies

are removed by the p ostpro cessing phases

These transformation functions do not take as parameters any kind of environment continuation expressions

or other compile state Since some amount of global data must b e collected eg the tags that identify

abstract data typ e constructors a global data base is used to store such information

The most imp ortant p ostpro cessing step is macro expansion It is p erformed by a LF tree tra versal that

collects all macro denitions and removes them from the LF program then all macro o ccurrences are

substituted To minimize scoping problems this happ ens after conversion In the LF program generated

from T allALDiSP macro declarations havebeenaggedwithamkmacro primitive each denition

program

of the form

macro axy

has b een transformed into

a mkmacrolambdaxy

This intermediate representation was chosen to simplify p ostpro cessing which has to nd all macro deni

tions and expand all macro applications

This do es not violate the comp ositional structure of the transformation since these global information are all either sources

of unique names or singledenition asso ciativearrays

Chapter Transformation of the AST into LambdaForm

Most p ostpro cessing transformations are semanticpreserving LFtoLFfunctions Some of them deadco de

removal conversion detecting unnecessary casts and checks are lo cal cleanup optimizations that can

also b e applied later in the compiler others lamb da hoisting macro substitution are more exp ensive global

optimizations that need to b e p erformed only once

Transforming Programs T

program

The toplevel transformation rule is T

program

T Aprogramd d d d expr

program s sn r rm

LprogLseqdT d T d

decl s decl sn

LetT expr

expr

LfixdT d T d

decl r decl rm

The two declaration lists have to b e treated dierently the rst is to b e interpreted sequentially the second

one representing the net declarations is recursive The distinction is made explicit by encapsulating

the declarations within Lseqd and Lfixd declarations resp ectively Since recursive denitions are more

exp ensive at the implementation level all Lfixd denitions are sequentialized in a p ostpro cessing step

Transforming Declarations T

decl

In ALDiSPeverything of interest is a declaration The Lambda Form needs only one basic declaration form

plus three declaration combinators sequential parallel and recursive combination The T functions

decl

have to decomp ose the sometimes baro que ALDiSP declaration syntax into these four declaration forms plus

lambda and typ echecking no des At the ALDiSP level lexical and exception bindings are distinct forms at

the LF level a b o olean ag distinguishes b etween them Since only two ALDiSP forms intro duce exceptions

guard bindings and parameter declaration that are qualied with an explicit exception keyword most

transformation rules set the rst parameter of Ldecl forms to true

Asimpledecl

There are basically two cases of simple declarations overloaded and nonoverloaded To implementover

loading a primitive function overload is employed overload takes two arguments that must b e closures

or overloaded functions and creates a new overloaded function from them

T Asimpledecloverloadedvarexpr

decl

if overloaded

then

LdecltruevarLappLitoverload

T expr

expr

Lvarvar

else

LdecltruevarT expr

expr

Aparamdecl

Declarations with parameters are hard to transform esp ecially in the context of recs

Chapter Transformation of the AST into LambdaForm

At the AST level the concept of parametric declaration is somewhat confused each declaration that is

preceded byatyp e keyword func proc type exceptionormacroisparsedasanAparamdecl There is

no requirement for the declaration to have a parameter list

An example of such a degenerated declaration is

func add

The typ e keyword forces a typ e check the declaration shown ab ove is equivalentto

add Function

A second problem area is recursion At the ALDiSP level a sp ecial rec end construct marks mutually

recursive denitions rec is only needed for mutually recursive functions since all functions are supp osed

to b e recursive in themselves This do es not have to b e handled bythe T the parser has already

decl

encapsulated simple nondegenerate Aparamdeclsinto a recs

A p ostpro cessing stage is necessary to rename old denitions of overloaded variables Consider the case

of a simple overloaded recursive denition A denition of the form

overloaded func f f

has to b e transformed into a cluster of denitions

Lseqd f f

old

Lfixd

f overloadf f

old

so that the old name is not shadowed by the recursive denition This transformation cannot b e done at the

Aparamdecl level since it dep ends on the context the enclosing Lfixd can b e many nesting levels removed

The transformation rule for Aparamdecl navely applies the overload primitive if the ag is set and oth

erwise cares only ab out transforming the parameter lists into nested Lambda forms The rule is somewhat

complicated by the fact that ALDiSP syntax allows for multiple argument lists a Aparamdecl therefore has

as its primary argument a list of parameter lists The auxiliary function T creates one nested Lambda

params

expression for each parameter list

T Aparamdecloverloadedheaderparams params varexpr

decl n

let

statictypemacro

case header

of Afunc true Type false

Function

Aproc true Type false

Pro cedure

Atype true Type false

Typ e

Aexception falseType false

Ob j

Amacro true Type true

Ob j

expr T params params expr

params n

expr LcastLittypeexprLvarcastgeneralLvarcastbase

expr if macro

then

LappLitmkmacroexpr

else

expr

in

Ldeclstaticvar if overloaded

Chapter Transformation of the AST into LambdaForm

then

LappLitoverload

expr

Lvarid

else

expr

T expr expr

params

T Aparamp t Aparamp t paramsexpr

params n n

Lambdatag p T t p T t

new typ e n typ e n

T paramsexpr

params

The list of header cases lters out the typ e of the declared ob ject whether it is a macro or not and whether

it is to b e b ound dynamically or lexically

The typ e is used to generate an Lcheck that encapsulates the expression If the expression is a macro a

primitive macro is applied to the typ echecked expression The typ es Type Type and Type

Function Pro cedure Typ e

are not mentioned explicitly in the previous chapter it is assumed that they are present as primitive functions

The typ e Type is the same as Obj

Ob j

Amultidecl

Multiplevalued declarations are mo delled via tuples that are explicitly dismemb ered a declaration of the

form

v v expr

n

is translated into LF co de of the form

var expr

tmp

v selectvar

tmp

v selectnvar

n tmp

One new temp orary variable var isintro duced for each Amultidecl binding

tmp

T Amultidecloverloadedvar var expr

decl n

let

i n

select LappLitselectLitiLvarvar

i tmp

decl Ldecltruevar

i i

if overloaded

then

LappLitoverloadselect Lvarvar

i i

else

select

i

in

LseqdLdecltruevar T expr

tmp expr

decl decl

n

The selectn t primitive extracts the nth element from a tuple t select is the expression that extracts

i

is the declaration for var the ith element from expr and decl

i i

Chapter Transformation of the AST into LambdaForm

Amoduledecl

Mo dules are implemented as functions that enclose a name space The encapsulated values are accessed via

an Lselect expression that uses string representation of the exp orted names to nd the matching ob ject

The export Aexport phrase is used to decomp ose the ith exp ort clause

i

T Amoduledeclvar export export decl decl

decl mo dule k n

let

i k

export Aexportold new type

i i i i

expr

select

LselectLvarvar

new

Litstringnew LcheckT type Lvarold

typ e

Litstringnew LcheckT type Lvarold

k typ e k k

Lit

in

Ldecltruevar

mo dule

Lambdatag var LitString

new new

expr

select

LetLseqdT decl T decl

decl decl n

The list of exp orted names is transformed into a list of alternatives of an Lselect An exp ort declaration

Aexportold new typed exp orts the ob ject declared originally as old under the new name new and imp oses

the type on it If no value is matched is returned

The function string converts a variable name into its string representation

Aimportdecl

An ob ject is extracted from a mo dule by application of its name string to the mo dule function ie

import A as B from X end

is translated into co de that corresp onds to

B XA

An Aimportdecl is transformed into a sequence of declarations The only complexity is the added typ e

checking and the ubiquitous handling of overloading

The import Aimport phrase is used to decomp ose the ith imp ort clause

i

T Aimportdeclvar import import

decl mo dule n

let

i n

import Aimportoverloaded old new type

i i i i i

access LcheckT type

i typ e i

LappLvarvar

mo dule

Litstringold

i

An alternative metho d would b e the use of some textual means to distinguish mo dule scop es Such a metho d is considered

for the next implementation since it is exp ected that it lowers the costs and simplies some analyses

Chapter Transformation of the AST into LambdaForm

access if overloaded

i i

then

LappLitoverloadaccess Lvarnew

i i

else

access

i

decl Ldecltruenew access

i i i

in

Lparddecl decl

n

The clauses of an imp ort declaration are group ed into a set of parallel declarations If the declarations were

sequentialized no warning could b e given if a name is multiply dened

Aabstypedecl

Each datatyp e declaration is translated into a set of functions to create tagged tuples to access their contents

and to check their typ es A declaration like

abstype BinTree Leaf ValuedLeafvalue

NodeleftBinTreerightBinTree

intro duces a set of names and ob jects

constructors Leaf ValuedLeaf and Node

selectors value left and right

atyp e BinTree implemented as a predicate

In addition the range of the is predicate has to b e extended to recognize the data ob jects created by the

constructors as in

func countLeafsxBinTree

if x is Leaf then

x is ValuedLeaf then

else countLeafsleftxcountLeafsrightx

end

This is handled via a global list of constructors that asso ciates constructor functions with tuple tags To make

this comparison secure it b ecomes necessary to uniquely identify the constructors While it is p ossible to

identify constructors by comparing their b o dies this is b oth inecient and errorprone since later phases

of the compiler change the denitions of functions Instead the constructors are identied with their tags

indeed this is the very reason for the intro duction of these tags in the rst place

Parameters

A translation problem comes when constructor parameters typ ed The abstype declaration can takearbi

trary parameters as in

abstype BinTreeOfX LeafvalueX

NodeleftBinTreeXrightBinTreeX

type IntTree BinTreeOfInteger

type BoolTree BinTreeOfBool

Once lamb da and closuretags are intro duced they prove to b e convenientina numb er of places where ob jects havetobe

sorted compared and cached

Chapter Transformation of the AST into LambdaForm

How are constructors and typ e tests to b ehave Imagine a context where the ab ovegiven declarations are

visible and the following constructors applications are executed

x Leaf

y Leaftrue

z Leaf error

There are two issues here

is the third applications an error There is no instantiated typ e corresp onding to the constructor

On the other hand the typ e BinTreeOfFloat can b e created somewhere else ie later in the

program

Should the typ e of x b e dierent from the typ e of y That is should the Leaf constructor b e

sensitive to the typ e of its argument and incorp orate them into the tuple tag

If this is not done a typ e test like IntTreex will havetotraverse the entire tree to lo ok at the typ es

of all no des

On the other hand the Leaf constructor cannot know all p ossible typ es it will applied to later in the

program This is esp ecially true when predicate typ es o ccur

In the currentversion of the transformation rules the following strategy is employed

If a constructor takes an argument that is restricted to b e of typ e expr and expr contains one of

T T

the parameters to the typ e declaration no typ e restriction is built into the constructor

For each constructor a test function is created that is parameterized by an argument list identical to

that of the abstype This test function will checkthetyp es of all constructor arguments that were

dismissed due to the previous rule

An Example Transformation

What follows is an example translation in which all problem cases that might o ccur in am abstype decla

ration are exemplied

abstype TestTypeP ALeafp fooP

BLeafq intp

The declarations generated are in pseudoco de

ALeaf pObj tupletag p

A

BLeaf qintpObj tupletag qp

B

p xTupletag Obj selectx

A

q xTupletag ObjObj selectx

B

p overloadxTupletag ObjObj selectx

B

p

isALeafPx x is ALeaf fooPpx

isBLeafPx x is BLeaf

TestTypePx isALeafPx isBLeafPx

The is function will transform into a simple tag check eg

x is ALeaf

translates into

Tupletag Objx

A

Since the constructor parameter of is do es not havetobeknown at translation time a global list is

maintained that asso ciates constructor tags with their tuple typ es Tupletagt t is the typ e of all

n

ntuples with tag tag and comp onenttyp es tt n

Chapter Transformation of the AST into LambdaForm

The Rules

T Aabstypedeclparams params var constructor constructor

decl p typ e n

let

vars A v t A v t v v

params params params n n n

vars vars params vars params

params params i

i n

constructor Aconstructorid p t p t

i i k k

k

type T t

j typ e j

type if freetype vars

j

intrinsicj

then

LitObj

else

type

j

type if freetype vars

j

parametricj

then

type

j

else

LitObj

sel Lambdatag var expr

j new sel

typ etuple

LappLitselectLitjLvarvar

sel

decl Ldecltruep sel

j j

selj

expr LappLittuple Littag type type

typ e i

typ etuple intrinsic intrinsicn

decl Lparddecl decl

selsi sel selk

cons Lambdatag p type p type

i k

consi intrinsic intrinsick

LappLittupleLittag Lvarp Lvarpl

i

decl Ldecltrueid cons

i i

consi

test LappLappLitmktupletypeLittag

i i

test test

parametric parametrick

Lvarvar

test

side effect add tag tag to the listofconstructors

i

consi

test T params params

typ e params p

Lambdatag var LitObj

new test

LappLitor test test

n

in

LseqdLparddecl decl

cons consn

decl decl

sels selsn

Ldecltruevar test

typ e typ e

The T transformation is dened in the context of the Aparamdecl transformation The primitive or is

params

the logic or function extended to an arbitrary numb er of arguments

The list of params is the aforementioned global structure that implements the is primitive

Arecdecl

Recursive declarations can b e turned directly into Lfixd forms

T Arecdecldecl decl

decl n

LfixdT decl T decl

decl decl n

Chapter Transformation of the AST into LambdaForm

In a p ostpro cessing step recursive declarations are simplied further cf section

Transforming Typ e Expressions T

typ e

Transforming a typ e results in an expression denoting a predicate There are two kinds of typ e expressions

function typ es and everything else ALDiSP do es not supp ort sp ecial syntax to distinguish b etween typ e

expressions and value expressions the Aexprtype is therefore needed to lift ordinary expressions into the

syntactic typ e domain

Afunctype

This sp ecial expression mo dels the sp ecial syntactic form t t t t that expresses the typ es of

functions The transformation generates a call to the primitive mkproctype that creates a Proc typ e

ob ject

T Afunctypea a r r

typ e n m

LappLitmkproctypeT a T a T r r

typ e typ e n typ es n

Since there are no multipleva lued functions at the LF level a tuple typ e has to b e created for m

T x T x

typ es typ e

T x x

typ es n

LappLittupletypetag T x T x

multi typ e typ e n

A sp ecial tag tag is reserved for multireturn values

multi

Aexprtype

An expression typ e is an ordinary expression

T Aexprtypeexpr T expr

typ e expr

No check is made the expr really denotes a typ e value or a predicate

Transforming Expressions T

expr

The next few transformations mo del onetoone corresp ondences b etween ALDiSP and LFexpressions

Avar

T Avarvar Lvarvar

expr

No distinction is made b etween dynamic and lexical variable lo okup at either the ALDiSP or the LF level

Aapp Acond Acheck

T Aappexpr expr

expr n

LappT expr T expr

expr expr n

T Acondexpr expr expr

expr

LcondT expr T expr T expr

expr expr expr

T Achecktypeexpr

expr

LcheckT typeT expr

typ e expr

Chapter Transformation of the AST into LambdaForm

Acast

T Acasttypeexpr

expr

LcastT typeT exprLvarcastgeneralLvarcastbase

typ e expr

The cast expression has two invisible parameters castgeneral and castbase These variables dene

overloaded functions that implement the cast functionality They are provided by a system library and can

b e mo died by the user The cast only serves as a generalized dispatch so that the programmer do es not

have to rememb er the names of dozens of conversion functions

Adelay

T Adelayexpr

expr

LappLitdelayLambdatag T expr

new expr

The primitive function delay transforms a thunk into a Ususp A new parameterless lamb da expression

is generated to encapsule the delayed expression

Asuspend

T Asuspendexprcondt t

expr

LappLitsuspendLambdatag T expr

new expr

Lambdatag T expr

new expr

T tT t

expr expr

This works analogously to the delay transformation The two tags are distinct Both the susp ended

expression and the conditional expression have to b e converted into thunks There is no syntactic check for

the typ es of t and t they should b e of typ e Duration such tests are deferred to run time

Atuple

T Atupleexpr expr

expr n

LappLittupleLittag T expr T expr

multi expr expr n

Here the tag mentioned in section resurfaces

multi

Aseq

T Aseqstmts T stmts

expr stmts

ALDiSP sequences can contain b oth declarations and statements Statements ie expressions are synchro

nized if a statementevaluates to a susp ension further evaluation blo cks until this susp ension is available

Sequences are translate into nested declarations and LF sequences

T errorempty sequence

stmts

T Adecldecl stmts

stmts

LetT decl

decl

Traditionally a function that takes no arguments and is used to delay the evaluation of an expression is called a thunk

The origin of the term is obscure it was p ossibly coined when thunks were rst used to implement callbyname in ALGOL

cf p

Chapter Transformation of the AST into LambdaForm

T stmts

stmts

T Aexprexpr T expr

stmts expr

T Aexprexprstmts LseqT expr

stmts expr

T stmts

stmts

An alternative translation mightdoaway with LFlevel sequences and sequentialize the statements using

explicit susp ensions This was not done in ac b ecause there are situations in whichitisconvenientto

generate LF sequences In eect this transformation is nowdoneby the semantics

Astring Aint Afloat Aconst

T Astringx Litx

expr

T Aintx Litx

expr

T Afloatx Litx

expr

T Aconstx Litlookup x

expr const

These transformation rules do not convey the exact transformation The stringintoat values are injected

into the Obj domain Aconst values which are symb ols just likevariables are lo oked up in a table of

literals This table contains suchvalues as Obj and true ie the names of basic ob jects that are not

b est describ ed as primitives

If a constant is not found in the table it is considered to b e the name of a primitive The set of primi

tive op erations is not xed primitives are only lo oked up b y the semantic function that implements their

application b ehaviour

Aguard

A Aguard expression has two purp oses it intro duces a set of dynamic bindings and it catches exceptions

The bindings are mo delled using normal declaration transformations The Lcatch form mo dels exception

handling

To construct the Lcatch expression the list of exception tags that can b e caught is needed This list

is extracted from the Aguard expression by searching the exception declarations for o ccurrences of the

application sequence generated by the return statementmkexctag

T Aguarddecl decl expr

expr n

let

in

d T decl

i decl i

tags collectd d

n

in

LetLpardd d

n

LcatchtagsT expr

expr

collect

collectdecldecls

if LappLitmkexcLittag decl

then

tag collectdecls

else collectdecls

Chapter Transformation of the AST into LambdaForm

Areturn

The Areturn ab orts the currentevaluation and raises an exception The exception can b e parameterized with

arbitrary data values In the semantics exceptions are a sp ecial result typ e the LF co de that implements

the Areturn therefore only constructs suchavalue No control ow manipulation is needed

T Areturnexpr expr

expr n

LappLitmkexcLittag T expr T expr

new expr expr n

The primitive mkexc constructs an exception value

Alocal

A lo cal declaration in ALDiSP has exactly the same semantics as a Let expression in LF The declarations

are implicitly ordered sequentially

T Alocalexprdecl decl

expr n

LetT expr

expr

LseqdT decl T decl

decl decl n

Postpro cessing Passes

After T has b een applied to the abstract syntax tree a numb er of p ostpro cessing passes make small

program

changes to the LF program Some of these passes are needed to x problems that were not treated correctly by

the transformation rules others are classical compiler transformations that remov e no des from the program

tree or rearranges them to minimize execution characteristics such as time and space consumption Not all

of these optimizations are implemented in the currentversion of the compiler

There is no real need for optimizations at this p oint of the compilation tra jectory as the partial evaluator

can work on the unoptimized programs and should generate functionally equivalent co de but there are

some practical reasons for including them

Since abstract interpretation is muchslower than standard interpretation it is advantageous if the

program tree is pruned byremoving all excess op erations as early as p ossible

Debugging the later stages of the compiler is much simpler if the programs are small

Implementation of p ostpro cessing steps

A general treewalker has b een written that collects the lexical environments for all variables while walking

the tree the p ostpro cessing transformations are instantiations of this walker A transformation rule is

applied at eachmatching no de in b ottomup order the current static environment is given as a parameter

In the environment eachvariable is either dened in terms of some expression or marked as b ound in a

Lambda parameter list

Expanding Macros

After the main transformation rules have b een applied to the abstract syntax tree all macro declarations are

collected and removed from the programs and the macros are replaced by textual substitution everywhere

No attention is paid to name clashes ie the fragment

Chapter Transformation of the AST into LambdaForm

gg

macro fab let tmp gadelay b in hgab end

gg

tmp

fxtmp

will b e transformed into

gg

gg

tmp

let tmp gxdelay tmp in hgxtmp end

with all problems involved ALDiSP macros are not meanttoprovide hygienic macro expansion

The expansion function can b e describ ed by the lo cal transformation function

transform nodeenv

macro

case node of

LappLvarxarg arg

n

if envx LappLitmkmacroLambdatagp t p t body

m m

then

retag bodyp p

lamb das n

arg narg

else

node

else

node

One nasty problem is solved bytheretag function if the body contains Lambda expressions and the

lambdas

macro is instantiated more than once the nal program will contain multiple Lambda expressions with the

same tag retag walks a tree and gives a new tag to each Lambda expression it encounters

lamb das

Declaration Simplication

The Lambda Form has three forms of comp ound declarations parallel sequential and recursive From

these building blo cks deeply nested declaration structures can b e built The transformation rules generate

many such groups that contain only one declaration the macroexpansion even generates empty groups

to remove a macro declaration by a lo cal transformation it is replaced byanempty declaration group

One pass therefore compacts declarations the transformation rule is as follows

simplify decl

declaration

case decl of

Ldeclsve x

Lparddecls simplify traverse decls

pard pard

Lseqddecls simplify traverse decls

seqd seqd

Lfixddecls simplify traverse decls

xd xd

The three traverse functions merge nested declarations of the same typ e

traverse done todo

pard

case todo of

done

decltodo

This approach could cause real problems if there are references to sp ecic tags but luckily these o ccur only in abstype

declarations Putting such a declaration into a macro do es not makemuch sense since this would intro duce constructors with

multiple denitions These would not haveawelldened semantics anyway

Chapter Transformation of the AST into LambdaForm

case decl

of Lparddecls traverse done decls todo

pard

Lfixd traverse done todo

pard

Lseqd traverse done todo

pard

else traverse decl done todo

pard

traverse done todo

seqd

case todo of

done

decltodo

case decl

of Lseqddecls traverse done decls todo

seqd

Lfixd traverse done todo

seqd

Lpard traverse done todo

seqd

else traverse donedecl todo

seqd

traverse done todo

xd

case todo of

done

decltodo

case decl

of Lfixddecls traverse done decls todo

xd

Lpard traverse done todo

xd

Lseqd traverse done todo

xd

else traverse donedecl todo

xd

simplify views the decls as a set of declarations it tries to separate it into two indep endent subsets If

xd

they are found a parallel declaration consisting of two recursive declarations is created

simplify decls

xd

if decls decls decls decls decls decls decls

defdecls usedecls

usedecls defdecls

then

Lpardsimplify decls simplify decls

xd xd

else

if decls then decl

else Lfixddecls

simplify lo oks for declarations that can b e lifted from the sequential contexts b ecause they do not

seqd

refer to previous denitions or provide denitions needed by the following denitions

simplify decl decl

seqd n

if ik ikn

defdecl decl usedecl decl

i  i k

defdecl decl usedecl decl

i k k n

then

Lpardsimplify decl decl decl decl

seqd i  k n

simplify decl decl

seqd i k

else

if n then decl

else Lseqddecl decl

n

simplify replaces oneelement Lrecd no des by their contents

recd

fun simplify decls

pard case decls

Chapter Transformation of the AST into LambdaForm

of x x

else Lparddecls

Removing unnecessary Checks and Casts

During the transformation phase some Lcheck and Lcast op erations were intro duced that have Obj as their

typ e They can b e removed safely

remove expr

cast

case expr

of LcastLitObjxc c x

general base

LcheckLitObjx x

else expr

Conversion

Conversion is the pro cess of giving each statically b ound variable a unique name Once this prop ertyhas

b een achieveditismuch easier to rearrange parts of the program since name clashes are now imp ossible

Implementing the conversion is trivially achieved by an LF syntax tree traversal that keeps track of the

lexical environmentandintro duces appropriate renamings As a side eect lexically unbound variables are

found and appropropriate warnings issued when no matching exception denition exists

Problems with Rearranging Expressions

It would b e nice if prepro cessing steps would b e allowed to rearrange function applications eg to hoist them

out of lo ops to merge common sub expressions and to remove dead co de But such transformations are not

secure since in ALDiSP function application mayhave side eects it would certainly b e deleterious to a

programs semantics if an application of write would b e optimized away or hoisted from a lo op b ecause the

result was ignored Likewise rearranging the order of sideeects has to b e avoided

Therefore only those changes to the program are allowed that are guaranteed not to change any sideeects

Amongst the allowed transformations are

moving Lit expressions

moving Lambda expressions they only capture parts of the lexical environment and can otherwise b e

treated as literals ie no action is involved in their evaluation

application of known sideeectfree primitives to arguments that are guaranteed not to b e susp ended

Every other application mayinvoke the creation of a direct side eect or a Block and

moving declarations that contain only sideeectsecure expressions

Lamb da Hoisting

This optimization tries to moveeach Lambda expressions up to the p oint where it is nearest to the denition

of its free variables

One place where this optimization helps is mo dule declarations since the declarations do not dep end up on

the selector variable they can b e hoisted to the global level More opp ortunities for lamb da hoisting will b e

mentioned in chapter

Chapter Transformation of the AST into LambdaForm

The implementation of lamb da hoisting is much simplied by the preceding conversion

hoist expr

case expr

of Lambdatagparamsexpr

let

lambdas collectlambdaexpressionsexpr

defs collectLdeclsexpr

vars params paramslambdas

ro ot

needs vars d d defs freed vars

vars

defs hull needs vars

need transitive vars ro ot

defs defsdefs

hoist need

in

LetLambdatagparamsremove exprdefs

defs hoist

Lfixddefs

hoist

else expr

This transformation is lo cal to each Lambda expression First all o ccurrences of Ldecl no des are extracted

from the b o dy of the Lambda Then the set of all declarations that refer to the Lambdas parameters is

enumerated The hull function extends the needs function so that b oth direct and indirect variable

transitive

dep endencies will b e found The ro ot of the dep endency search consists of the variables dened in the current

Lambda no de and those dened byallLambda no des found within the expression

Finally all declarations that are found to b e indep endent from the parameters can b e hoisted The hoisting

itself is implemented byintro ducing a new Let no de that contains a recursive subdeclaration that surrounds

the hoisted declarations The recursivityisintro duced since it is not known whether the declarations come

from a recursive context or not due to the conversion the Lfixd will not clobb er names

Further simplication transformations will remove unnecessary Lfixd decls and empty Let no des which

o ccur when no hoistable declarations are found

Transforming Free Variables into Parameters

If a function a has a free variable x and each call site of a is statically known x can b e passed as an

additional parameter to a This is p ossible since all call sites of a are only known when it do es not escap e

the lo cal scop e which implies that x is visible wherever a is visible For example in

func bxy

let

func loopz x loopz

in

loopy

end

the variable loop cannot escap e since it is not returned as a result Therefore the fragment can b e rewritten

as

func bxy

let

func loopxz x loopxz

in

loopxy

end

This transformation which mightormight not b e an optimization is not implemented in the current compiler

Chapter Transformation of the AST into LambdaForm

Strictness Analysis

Finally there is a very ALDiSPsp ecic use of strictness analysis SA SA tries to determine for each

parameter of a function if the parameter is accessed in all execution paths of the function If this is the

case the function is said to b e strict in this parameter

SA is mostly employed in lazy languages to nd applications that can b e called byvalue without risking

nontermination In ALDiSP the goal of SA is to avoid blo cking If a function is strict in all its arguments

any susp ended argumentwillblock the function eventually Consider

func fabc abc

fsuspend end

According to the semantics a chain of blo cks will b e created

block c

block bblock

block block

block ablock

Strictness analysis would infer that f is strict in all its arguments and would representthisbyintro ducing

typ es

func faStrictObjbStrictObjcStrictObj abc

Any application of f with a reference argumentwould then directly force the blo ck

block fsuspend end

In typical DSP applications nearly all functions are strict

By avoiding blo cks the state space created by the abstract scheduler is minimized in advance which reduces

b oth the run time and the memory consumption of the compiler

Due to correctness problems this optimization is not implemen ted in the current compiler even if a function

is strict in an argument it mighthave side eects b efore that argument is accessed

Chapter

Partial Evaluation Denitions and Known Results

Before lo oking into the implementation of the combined abstract interpreter and partial evaluator AIPE

that is incorp orated in the ac compiler the theoretical underpinnings of PE shall b e sketched out A go o d

summary of the state of the art can b e found in the pro ceedings of the rst workshop on partial

evaluation and mixed computation Recent literature can b e found in the pro ceedings of the Partial

Evaluation and SemanticsBasedProgram Manipulation workshops

The rst comprehensive textb o ok on PE is It is mostly concerned with selfapplicable oline partial

evaluators These consist of extensive binding time analyses that annotate the program text with dy

namicstatic tags These tags guide the following text rewriting phase that inlines and sp ecializes function

denitions

Anumb er of denitions presented in the following pages are quoted from the terminology chapter of

Denitions

In the following the semantics of a language L is considered a function

L program input output

L program program

where program is the set of Lprograms A language is therefore identied with its interpreter The

L

semantics of any program is given by its functional b ehaviour the semantics of p p program can b e

L

derived by the application Lp

Partial Evaluation Residual Program

To quote from

Partial Evaluation A pro cess that given a program and part of its input will pro duce a residual

program which when executed with the remaining input of the original program will givethe

same result as the original program would if it was executed with the complete input Partial

evaluation is also called pro jection Partial evaluation can b e expressed by the equation

L L mix program v v v v Lprogram v v v v

n n m n n m

where mix is a partial evaluator for the language L A dierentway of stating the same is that

Chapter Partial Evaluation Definitions and Known Results

resid Lmix program v v

n

implies

Lresid v v Lprogram v v v v

n m n n m

In analogy to the residual program there are residual functions created by sp ecializing single functions with

resp ect to some parameters

Dened in this way PE is a programtoprogram transformation that do es not change the language of

the transformed program but only its eventual runtime costs and the numb er of parameters it takes at

runtime Neither time nor space optimization is promised

The most trivial partial evaluator is the identity function

mix program v v v v program v v v v

id n n m n n m

SelfApplication and the Futamura Pro jections

To quote from

SelfApplicable Partial Evaluator A partial evaluator that is written in the same language that

it pro cesses giving the p ossibility of applying it to itself Also called an autopro jector

There are three interesting cases of selfapplication in PE First describ ed in they are commonly known

as FutamuraprojectionsGiven a partial evaluator mix written in L and transforming L programs an

L

interpreter int written in L and interpreting some arbitrary language and a program written in the

L int

language that int interprets Then the rst pro jection

L

L mix int program program

L L int L

is the specialization of an interpreter with regard to the program This equation do es not mention the

runtime inputs of the interpreter or the in terpreted program

L L mix int program inputs Lprogram inputs

L L int L

That is an interpreter takes two inputs a program that has to b e interpreted and a set of input data for

that program

The result of the PE is a program that has the same semantics as the original interpreted program but is

written in L instead of the int language A translation has taken place Again it shall b e mentioned that this

translation need not in anywayimprove the eciency of the translated program this equation holds for the

identityPE to o It just has the opp ortunity to b e ecient ie it is possible but not necessary that the run

time cost of Lprogram inputs are for some or all inputs lower than those of Lint program inputs

L L int

Even if some costs are lower most likely time others maywell b e higher usually space

The next pro jection is

L mix mix int compiler

L L L L

terpreter The result is a compiler ie Now the mix sp ecializes itself with regard to the in

Lcompiler program program

L int L

where the program is functionally identical to the one generated bythe mix int combination Just as the

L L L

rst pro jection has the p otential of optimizing the cost of running program this second pro jection can

int

Formally sp eaking applying the interpreter int to a program program results in a function that takes inputs

L int

Chapter Partial Evaluation Definitions and Known Results

optimize the cost of running the compiler The partial evaluator sp ecializes itself to the task of compiling

int programs

The third and last pro jection is

L mix mix mix compiler

L L L

L

The result is a compilercompiler That is

Lcompiler int compiler

L L

L

Here mix is sp ecialized to the task of compilergeneration There are no more pro jections b ecause a xed

L

p oint is reached Further selfapplications results in the same compiler

L

Internal Sp ecialization

ac is an application of the rst Futamura pro jection There is no separate PE and interpreter instead

interpreter and PE are mixed into one program The technical details and the motivation b ehind this

design decision are explicated in the following chapters

The PE employed in ac do es not implement full partial evaluation but merely internal specializationas

dened in

Internal Sp ecialization Sp ecialization of comp onents of a system to exploit their internal contexts

of use while preserving overall system functionality

OnlineOlinePartial Evaluation

Partial evaluators can b e dierentiated into online and oline PEs An oline PE transforms a given

program without actually running it ie it transforms a passiv e program image Online PE works by

running the program or simulating its execution and emitting residual co de while doing so Online PE is

thus a sp ecial kind of abstract interpretation where abstract values are generated that contain b oth value

descriptors and program co de Oline PEs usually consist of two distinct phases a bindingtime analysis

BTA and a sp ecialization phase

Static and Dynamic Binding Times

To quote from

Binding Time The time at whichavariableexpression is b ound to a denite value In partial

evaluation variables b ound known at partial evaluation time are called static as opp osed to

dynamic Alternatively the terms known and unknown are used

Binding Time Analysis An algorithmic analysis that nds approximations of the binding times

of variablesexpressions in a program

A simple BTA can b e implemented using an abstract interpretation on a twoelement data domain f g

mo dels the values that are statically known those that are dynamic BTA b ecomes interesting when

partial ly static values have to b e considered

Rep eated selfapplication is indeed a measure for the quality of a PE system An optimal PE should generate identical

copies of itself mo dulo renaming up on selfapplication

Chapter Partial Evaluation Definitions and Known Results

Partially Static Values

To quote from

Partially Static Value A structured value that contains b oth static and dynamic parts

In ALDiSP these structured values are mostly lists and higherorder functions Functions are data struc

tures in that they consist of constant co de and an environment These environments will contain bindings

to static and dynamic values and to other functions Annoying problems o ccur if partially static values can

b e of p otentially innite size streams recursive pro cedures Imp ortant sp ecial cases of partially static data

structures are streams In many stream applications constant prexes o ccur If these are treated sp ecially

unwanted lo op unwinding for the rst k might o ccur The opp osite eect can also create problems certain

optimizations might b e imp ossible when a partially static value is approximated as totally dynamic

Sp ecialization

The sp ecialization phase of an oline PE surveys all function denitions and their call sites and either

expands unfoldsinlines the function applications ie substitutes a call site with the functions de

nition

specializes a function ie removes a parameter that is b ound to a static value in an application and

generates a residual function in which that parameter is replaced by that value or

leaves the function call interface as it is

Both excessive unfolding and sp ecialization can result in code explosionintheworst case nontermination of

the PE

Generalization

A generalization step is often p erformed to avoid oversp ecialization the abstract values encountered in

dieren t call sites of the function will b e merged so that a more general call interface is found that still leads

to the same sp ecialized function b o dy The main goal here is the minimization of co de space

FoldingUnfolding

To quote from

Unfolding The substitution of a name by the structure it denotes Most often used to describ e

the substitution of a function call with an instance of the denition of the function

Folding Up on recognizing that a conguration is similar to a previous conguration a recursive

denition can b e made by dening a function or predicate denoting the previous conguration and

making the new conguration into a call or reference to that denition If the two congurations

are not exactly identical a generalized denition encompassing b oth congurations can b e made

Chapter Partial Evaluation Definitions and Known Results

While b oth denitions are formulated in a general way the structure or conguration nearly always

refers to functions and expressions containing function calls They could as well refer to data structures but

those are usually directly sp ecied by the user and less malleable

A program transformation system written by Burstall and Darlington was the rst to use the

foldunfold op erations as primitives to p erform extensive program manipulation This system established

the fact that these two transformations together with reduction ie application of primitive functions

can b e combined to accomplish all p ossible program transformations

Unfolding is a generalization of function inlining Most compilers that p erform inlining restrict it to very

simple functions eg functions that comprise one basic blo ck

Folding is a generalization of common sub expression elimination CSE Here the congurations are

expressions and the denition is the intro duction of a new variable holding the value of the multiply

used denition While CSE can b e done in nearlinear time by employing hashconsing strategies

generalized folding has a much higher complexity

Fixed Points

One of the central problems of AI is the computation of xedpoints or xpoints of recursive functions In

the context of PE this means the mo delling of p ossibly nonterminating executions traces ie dynamically

b ound lo ops and recursive function calls Since an abstract interpreter should terminate for all programs

even programs that are nonterminating for all inputs a nave simulation has to b e ab orted by a timeout

mechanism returning an approximation to the real semantics

The xed p oint theorem suggests a direct way to compute the FP namely o return the rst time a function

is called recursively and to iterate with the result thus obtained un til it do es not change anymore

Finding a xp ointisavery timeconsuming pro cess the worstcase complexity for rstorder functions is

exp onential in the depth of the abstract domain

ApplyingtheTechniques of Abstract Interpretation to ALDiSP

The next chapters will presenteach of these steps abstract interpretation abstract scheduling co de recon

struction and machine co de generation

At the core of the ALDiSP compiler lies an abstract interpreter for the LF language This abstract interpreter

works on a domain that is detailed enough for constant folding yet abstract enough to terminate on all input

programs The abstract interpreter is derived from the standard semantics

The AI consists of two largely indep endent mo dules To separate the pure functional subset of the

language from the realtime parts an abstract scheduler handles the set of p ossible states a program can

encounter To the scheduler the abstract interpreter is essentially a blackb ox subroutine that transforms one

state into another byevaluating one susp ension The scheduler cho oses which susp ension when to evaluate

and how to merge states that are similar

To generate co de the abstract values which are generated within the abstract interpreter are given code

attributes The idea is that every output value generated by the AI pro cess has a computational history

attached to it from which the co de necessary to compute it can b e generated These co de attributes are not

visible to the AI or the AS ie they have no semantic relevance

In many languages eg C data structure denitions may not b e rearranged or otherwise changed by the compiler at all

Chapter Partial Evaluation Definitions and Known Results

Based on the co de attribute is the code generationwhich reconstructs the program from the co de attribute of

the abstract results The reconstruction phase takes the graph of states whichwas generated by the abstract

scheduler and generates for each transformation b etween two states a piece of co de that implements that

transformation at runtime This co de the Code Form or CF is wholly disjoint from the LF co de on which

the AI was run It mostly resembles singleassignment dataow languagess likeSILAGE orSISAL

Finally this CF program is converted into a assemblylevel output format the Machine Language or M

which can b e translated into any imp erative co de needed A sample C backend has b een written

Chapter

Abstract Interpretation of LF Expressions

This chapter intro duces the basic mechanisms of abstract interpretation AI Using an example some

problems are discussed in detail The example uses standard functional language notation that corresp onds

to a functional subset of the Lambda Form The example is interpreted in an abstract data domain that

utilizes interval arithmetic

In the following all problems related to the treatment of state are ignored Chapter will showhow the

abstract scheduler AS manages the set of p ossible states a program can encounter The abstract scheduler

provides the sup erstructure of the compiler the AI is a subroutine of the AS The AI can ignore the

state since the abstract interpreter is always invoked in a context where the state is xed The lo op detection

cf section ignores functions that change the state lo ops that go across states are handled by the

state lo op detection scheme that is part of the abstract scheduler cf section

Goals of Abstract Interpretation

The goal of AI is generally sp eaking to acquire information ab out functions by executing them on abstract

data Information can b e gathered b oth during the execution or afterwards by analyzing the results

Furthermore b oth forward and backward AI is p ossible forward analysis mimics the normal evaluation

function backward evaluation starts with a result and recreates the p ossible execution traces that might

have generated it Forward AI is the more natural approach since it closely corresp onds to normal

interpretation Some analyses are most naturally implemented in a backward framework The AI

employed in ac is of the forward kind since it has b een develop ed byevolutionary means ie by extending

the standard semantics Necessary backward optimizations are p erformed in separate optimization steps

cf section

Dierent kinds of information can b e obtained by AI

How often is a function called

What are its arguments

In the term program p oint is used for those programming constructs that are analyzed by an AI Program p oints

must have a dened argumentresultinterface For example the entry p oint of a lo op construct in an imp erative language

ts into this category and an AI could determine how often a lo op is entered or whichinvariants hold for all lo op entries

In ALDiSP as in most functional languages userlevel functions provide the only program p oints of interest since lo ops are

expressed by tailrecursive function application

Typical backward applications are deadco de elimination the related strictness analysis and compiletime

garbage collection

Chapter Abstract Interpretation of LF Expressions

What do es the call stack lo ok likeateachinvo cation

How are the arguments used Is the function strict in all its parameters Are some arguments never

used or only rarely

Howmuch timespace do es the functions execution consume

Is the function recursive If yes is there a b ound to the recursion depth

In general AI do es not guarantee to deliver exact results to do so would require excessiveoreven innite

run time since all p ossible combinations of arguments to a given function would havetobesimulated

Esp ecially in the presence of recursion AI faces the problem of nontermination If a p ossibly nonterminating

execution has to b e simulated the abstract interpreter will cho ose some appropriate approximation or

generalization to ensue termination Such generalizations can b e guaranteed to b e correct but are usually

imprecise information will b e lost

In the context of ac the abstract interpreter has the additional task of computing co de parts of the

partial evaluator are implemented within the AI framework and each abstract value has a code comp onent

that denotes the CF co de needed to compute it at runtime chapter gives the details

Abstract Values

General Remarks

Probably the most imp ortant issue in designing an abstract interpretation scheme is the choice of an abstract

value domain There are two conicting goals in cho osing this domain

The size and run time complexity of the AI will b e adversely aected by a highly rened abstract value

domain It is also more work to correctly implement it

The precision of analysis will suer if the abstraction is to o coarse

Eac h abstract value represents a set of concrete values One metho d to characterize an abstract domain is

by dening two functions the abstraction function and a concretization function The abstraction function

maps the standard data domain to its abstract counterpart the concretization function do es the opp osite

The concretization function is usually a relation ie abstract data ob jects typically represent more than one

concrete ob ject

The abstractionconcretization approach do es not takeinto account that there may b e orthogonal at

tributes attached to the abstract value ie attributes that leavetheevaluation semanticsofthevalue

unchanged but convey extra information These attributes are often inserted by primitives andor syn

thesized by the abstract interpreter One example at hand is the code attribute used in ac other p ossible

uses of such attributes are enco ding storage arena reference count or strictnessannotations In the

following discussion such attributes can b e ignored

The most simple kind of abstract value is a nonabstract value an abstract value domain can b e constructed

on top of the standard value domain This approachavoids loss of information since all completely

determined computations ie all computations involving only standard values will deliver concrete results

The most precise abstract value domain is one that employs abstract values that represent arbitrary sets of

standard values The standard semantics can then b e directly extended to abstract values

Each function on abstract values can b e reduced to its counterpart on nonstandard values byenumerating the p ossible

argument sets represented by an abstract argument list applying the standard function to each of these lists and merging the standard result to one abstract result

Chapter Abstract Interpretation of LF Expressions

The problem with this general abstract value typ e are its implementation cost Esp ecially when the concrete

data domain contains nonnite typ es such as data structures and functions arbitrary sets of these cannot

b e represented extensionally even nite concrete typ es incur prohibitive costs exp onential in the bitwidth

needed to represent an instance of the concrete typ e The run time of the primitive functions grows even

worse since argumentcombinations are to b e considered

Instead of implementing all subsets of a standard domain most AI systems employa predetermined set

of subsets One obvious choice for such subsets are the data types for eachtyp e one abstract value is

intro duced to mo del the set of all values of this typ e In the currentversion of ac general set abstract

values are not supp orted and all abstract values either mo del one concrete value or all values of a data

typ e

Domains

A domain is a set with an internal structure given byanpartial ordering relation v ie a relation that

is reexive antisymmetric and transitive A domain is a complete partial order CPO if there is a smallest

element For the purp oses of abstract interpretation the data domains of a programming language must

b e CPOs

The ordering relation is conventionally interpreted as a v bia is less dened than b The most

dened value if it exists is denoted by the symbol top the least dened value is b ottom

Abstract interpretation is a pro cess in which the least upp er value for an expression is computed If the

expression is recursive as in the case of a recursive function application an iterative pro cess is needed

to approximate the value Starting with a rst guess of b ottom each iteration step will improve the

approximation Formally sp eaking improving means that each new approximation is at least as dened

as the previous one The top element if it exists is a trivial solution for each expression with no information

content whatso ever

Such an approximation pro cess will b e shown in the example of section in detail

Basic Domain Constructions

Any set S can b e interpreted as a domain having the minimal ordering relation the equality relation

is also called the discrete ordering

Any domain S can b e turned into the CPO S by providing a distinct element and dening the ordering

 S

as

x y S x v y x v y x

 S S S

Ie in addition to the ordering derived from S eachelement is less than the newly intro duced Sucha

S

S denition is known as lifting the domain S IfS is a set S is called a at domain

 



Analogously the domain S has the extra element with

S



x y S x v  y x v y y

S S

S



Combining these two constructions S can b e dened as





x y S x v y x v y x y

S S



In many abstract interpretation applications at domains are all that is needed In denotational semantics

ternal structure are necessary eg to represent rstorder for programming languages domains with more in

functions or recursive data structures

An earlier version provided them only for very small sets less than four elements and for sets that could not b e generalized

to a common data typ e more sp ecic than 

The restriction to CPOs is only necessary if recursive structures have to b e given a semantics

Chapter Abstract Interpretation of LF Expressions

Basic Examples

There is some confusion as to the interpretation of semantic domains in terms of sets esp ecially where the

top and b ottom elements are concerned Some of this can b e cleared up by lo oking at some simple domains

and by considering the rules used to combine domains

T F MAXCARD

a b c

Figure a p oint domain b p oint domain Bo ol c MAXCARDp ointdomain Card

The most trivial useful domain depicted in g a consists of just twovalues and This domain

is simply called The domain is often employed when only one relevantcharacteristic of each construct

is of interest eg in dataow analyses such as strictness analysis

The simplest domain that includes real values and the complete set of and elements is the b o olean

value domain



f T F g Bool fT F g



cf g b In the approximation pro cess of the abstract interpreter these values are used as follows

b ottom a value without any prop erties

T thisisthevalue of an expression that if it terminates evaluates to T true

F thisisthevalue of an expression that if it terminates evaluates to F false

topavalue that means b oth T and F

That is combines all prop erties of the values b elow it in contrast has none of the prop erties of the

values ab ove it It is imp ortant to note that during a xp oint computation all approximation steps go

upward toward top but never downward toward b ottom

Practically sp eaking and convey ab out the same amount of information we dont know what the actual

value is but due to this directedness of values a b ottomvalued expression still holds the promise of nally

b eing approximated to a distinct value T or F In contrast a topvalued expression cannot b e lowered

anymore

The domains  the empty set and  containing only  are quite useless

This is really the basic idea of the xp oint theorem If the domain of values is of nite height the longest path b etween

b ottom and top is of nite length and all functions are monotonea xpointforanyvalue exists and can b e found in a nite

numb er of steps

Chapter Abstract Interpretation of LF Expressions

To use a dierent example we can reinterpret the Bool domain instead of b o olean values T and F can

b e used to mo del even and o dd numb ers Then is a numb er which is neither even nor o dd T describ es

anynumberthatiseven F describ es anynumberthatisoddand describ es anynumberthatisboth

even and o dd

Domains can often b e interpreted as p owersets Under suchaninterpretation is the empty set T the

set of even numb ers F the set of o dd numbers and the set of all numb ers

The last example is Card a at cardinal domain consisting of a set of numbers M AX C ARD together

with and cf gure c It is explicated just like the b o olean domain stands for the number

without any prop erties ie the empty set eachnumb er stands for itself ie a singleton set and stands

for the numb er that merges the prop erties of all other numb ers ie the set of numb ers If weinterpret

each element as a set the partial ordering is nothing but the subsetrelation

Union of Domains

Two or more domains D D can b e merged to a domain C by dening new top and b ottom elements

n

and andby dening

C C

x D i n x v v x

i C C C

x y D i n x v y x v y

i D C

i

This corresp onds to the union op erations on sets The denednessrelations within the C domains are left

i

untouched and values from C are incomparible with values from C assuming i j

i j

In such a hierarchic domain structure the dierent top and b ottomvalues appropriate a meaning with ner

distinctions Now each has conveys some typ e information which do esnt contain Likewise

D C C

i

lost the typ e that each contains

B

i

int bool

MAXCARD T F

int bool

Figure Union Domain Bo olCard

Figure depicts the merging of the domains Bool and Card

The Ideal Abstract Domain and its Subsets

Obj

Assuming a given data domain Obj taken from a language semantics the ideal abstract object domain A

Obj Obj

is dened as the p owerset of Obj ieA BecausejObj j tends to b e quite large often even innite

Chapter Abstract Interpretation of LF Expressions

Obj

for nontrivial typ e systems A is even more so thus it is generally imp ossible to represent all p ossible

Obj

Obj

elements of A either explicitly or implicitly A simplied abstract ob ject domain A has to b e used

simpl

Obj

instead Having this in mind all abstract domains can b e viewed as predetermined subsets of A

Figure shows an ideal abstract domain structure for the value data domain Card The top element

corresp onds to the set f g the b ottom elemeent to the empty set Elements of this domain could b e

represented by bit bitvectors

Note that there are no distinctions made at the level b elow the oneelement abstract values While it

is p ossible to mirror the structure ab ove the oneelementlevel there is no need for it The two basic

op erations that create abstract values are

injection of data values this creates oneelement abstract values and

nding the least upper bound of two ormorevalues this merges the arguments into a value higher

up in the hierarchy

The b ottom value itself is sometimes injected as a sp ecial unique constant that initializes approximation

chains and serves as a default value for erroneous computations

For larger data domains the ideal abstract domain cannot b e eciently represented In the case of bit

numb ers each abstract v alue would b e represented a kbyte bitvector Even worse arithmetic primitives

would take a time at b est prop ortional to the numb er of bits at worst prop ortional to the squared number

of bits in the case of binary op erations

Obviously nontrivial data domains call for simplied abstract domains

Card

Figure Ideal Abstract Domain

One practical approximation employs ranges

Int

A fa b ja b Int a b gf g

range

A range ob ject a b isinterpreted as the set fx jx a x b gAninteger n is represented as n n

and need not b e extra elements can b e interpreted as the unique emptyinterval and as the

interval MinInt MaxInt

Chapter Abstract Interpretation of LF Expressions

Likewise an abstract representation of the IEEE real numbers

n

Float fm j m Int n Int gf NaN g

mantwidth expwidth

could b e made using values from the domain

Float

A fa b j a b Float a b gf g

range

This direct range domain has some small problems namely ranges involving the unreal numb ers f

NaN gItmight b e b etter to split up the Float domain and dene

n

Float fm j m Int n Int g

mantwidth expwidth



Float

A fa b j a b Float a b gf NaN g f g

range

There is a pitfall in designing abstract domains esp ecially when numb ers are involved one could argue that

there are values of sp ecial interest that ought to b e handled separately eg the set

n

Frac f j n m Int g

m

or even things like

n

e j n m Int g Frac f

e

m

n

Frac f j n m Int g

m

By adding such abstract values though one can easily create situations in which the abstract interpretation

will show a dierent b ehaviour than the standard interpretation due to dierences of rounding and precision

Obj Obj

Using A sinsteadofA s results in a loss of precision but one can usually adapt the approximationto

range

the needs of the application In the context of abstract interpretation such a loss takes the form that the

actual result is less precise than the optimal result ie res v res

optimum actual

Size of Domains

Many applications most notably strictness analysis and binding time analysis mentioned in can achieve

signicant results using only the most primitive of domains the p oint domain

One motivation b ehind keeping domains small is the exp ense incurred by complex abstract domains imple

menting the primitive functions b ecomes an arduous and errorprone task and the time needed to compute

xp oints may grow exp onentially in the depth of the domain Depth here means the length of the

Int

longest chain of abstract values For example the A domain of ranges over bitvalues has a

range

depth of since

Similar problems o ccur whenever oatingp oint expressions are reordered or otherwise optimized at compile time Only

a few of the standard algebraic transformations are valid on oatingp ointnumbers

Most functions that o ccur in practical programs do not b ehave exp onentially under iterative approximation On the other

hand even linear growth can b e quite sucienttomake AI impractical

Chapter Abstract Interpretation of LF Expressions

The most simple abstract value domain in which sensible computations can b e p erformed is probably that

of basic types for each distinct typ e in the domain one abstract value is intro duced For the following

examples we could dene

Bool fTrue False g

Int f MININT MAXINT g

Obj Bool Int

Obj

A Obj fanInt aBoolgf g

basic

Here the depth of the domain is

x Obj x anInt

int

x Obj x aBool

bool

A more precise approximation could b e one oriented along machine types

Obj

A Obj fBit Byte Word Long g

machine

with Bit aBool and a depth of

x f g x Bit Byte Word Long

A go o d compromise b etween eciency and precision and the one employed by the abstract interpreter in

ac is a domain built around the bitwidths of numeric values

Obj

Obj fInt n jn maxwidth gfCard n jn maxwidth gfaBool g A

width

Sources of Abstract Values

Where do the abstract values come from It is certainly imp ossible for a source program to contain abstract

literals and all nonsideeect op erations taking concrete arguments deliver concrete results

There are three sources of abstract values

Sideeecting op erations esp ecially input functions In ALDiSP a primitive function like read applied

to eg a Cardinalp ort will return an abstract value representing all p ossible unsigned bit

values

Typ e annotations If a function is annotated with explicit argumenttyp es that information can b e

used to synthesize abstract values This approach is used when abstract interpretation is done on a

functionbyfunction basis ie when not all call p oints are known in advance In a framework that

supp orts separate compilation an abstract interpreter will have to use such a strategy for all functions

that are exp orted

With the p ossible exception of nondeterministic functions like random or choose Since no such function is part of the

ALDiSP denition this can b e ignored

For the purp ose of equality tests sucha value should b e annotated with a source attribute eg the value of input p ort

x at time t In ac this is implied by the atomcode attribute cf section

Chapter Abstract Interpretation of LF Expressions

Manually inserted values Some exp erimental partial evaluators are targeted at manual intervention

a lot of the published work b elongs to this class

In ac input primitives are the sole initial source of abstract values The program is abstractly interpreted as

a whole libraries are linked by textual inclusion of their source so the p ossible call p oints for all functions

are known at compile time The abstract interpreter is not visible to the programmer therefore manual

intervention is not needed

An Example

Consider the wellknown example function

func fakn

if n then

else nfakn

end

and how the AI may pro cess it For a start we assume that the AI works with an abstract domain of b o oleans

and integer intervals

A fa b j a b a b Int gffalse true aBool g f g

example

Let fak b e called with the argument ie some unknown value b etween and inclusive

Evaluation starts o with the comparison which reduces to aBool What is the if to do

Abstract Conditionals

When the test of a conditional expression do es not evaluate to a concrete result all p ossible branches covered

by the abstract selector must b e evaluated and the results must b e merged Abstract values are merged by

applying the concretization function to them building the union of the resulting sets and nding the b est

approximation from the domain of abstract values ie that abstract value which maps to the smallest set

of standard values of which that union is a subset For example merging and would result in

Other values eg or would b e as correct but needlessly imprecise In the context of

the partial order of the domain the merge op eration corresp onds to the least upper bound lub on domains

ie the least dened value from which and are sp ecializations The least upp er b ound is denoted

by the inx op erator t

While the basic idea b ehind the treatment of conditionals is obvious there are a numb er of practical problems

In the example b oth branches of the if must b e evaluated evaluates to the singleton interval and

n evaluates to

Here the rst problem emerges In real execution n can never evaluate to this is consequence of the

test n being false in the else path The abstract interpreter has ignored this precondition What are

the consequences of this loss of information

fak will b e called recursively with arguments

It might however b e visible to the library writer The library can communicate with the compiler by employing primitive

functions that directly manipulate the abstract interpreter heuristics or that dep end up on the compilation state

While determining the set of p ossible branches is trivial in Lcond expressions it b ecomes interesting in the case of

Lselect expressions and numeric range abstract values

Chapter Abstract Interpretation of LF Expressions

When evaluating and all following comparisons the result will b e false the AI will not

terminate on fak while the standard interpretation of fak terminates after a maximum of

recursive calls

This is one consequence of information loss terminating lo ops may b ecome nonterminating

What happ ens if the abstract interpreter is smart enough to split the abstract value of n in the dierent

execution paths of the if The initial value of can b e split into and so that fak is

called with and safely terminates Now the merging starts b ecause for all nn

merged with is n the merges with can b e ignored The result of the call chain will b e

ie This is indeed a p erfect result not quite as go o d as p ossible

b ecause the range representation denotes num b ers while there are but p ossible real

results but the b est we can hop e for within the context of the chosen abstract domain

A splitting mechanism would thus b e a very go o d thing to incorp orate in the AI its implementation

though is in general imp ossible There are two reasons for this rst the inverse function of a predicate

has to b e computed If we allow arbitrary recursive predicates and innite domains this is imp ossible If

we restrict ourselves to nite domains the inverses can b e computed but not necessarily represented an

ideal abstract domain would b e needed for this In practice this second problem is far worse than the rst

one at least in DSP programming conditions tend to b e simple comparisons and nonrecursive predicates

But even the most trivial predicate comparison against a literal can easily create splittings that cannot b e

represented by an abstract domain of consecutive subranges

It can therefore not b e assumed that a splitting device exists Since an abstract interpreter will always

need some safetymechanism to stop runaway lo ops after all a source program maycontain a genuine

nonterminating function and an AI should terminate on any input we assume that some kind of loop

detector guards all function calls

Lo op Detection Finding Fixed Points

In ALDiSP and most other functional languages lo ops can only o ccur through recursive function calls

It is therefore guaranteed that any lo oping path of control has to pass at least one function application

A lo op detector installed at the apply p oint can therefore preempt all nonterminating b ehaviour thus

guaranteeing the termination of the abstract interpreter

We assume that the lo op detector decides that fak is suciently similar to the still unnished

evaluation of fak to call it a lo op

terpreter therefore ab orts the further evaluation of fak and returns at once with The abstract in

a faked return value the empty result set is chosen as the initial value since it is the weakest or

least dened approximation to the real as yet unknown result

Evaluation continues with the op erator is dened as This b ottompreserving b ehaviour

is one of the foundations of the semantic machinery it can also b e derived from the denition of abstract

values via sets

The evaluation of b oth arms of the if is now nished the results are combined t since

is the neutral element for t

Thus ends the rst attempt to determine the value of fak The result is somewhat dis

app ointing But the pro cess can b e iterated The crucial step returning as the value of fak

was only a rst approximation now can b e returned since we think that is similar to

w e assume that fak is also similar to fak this is still not the correct result

but b etter than is The approximation continues the next cycle delivers

as result of fak is Nowwe see why the depth of the abstract value

Chapter Abstract Interpretation of LF Expressions

domain must b e nite if there is no upp er b ound to the size of an integer and thus to the depth of the

domain the approximation will not terminate The domain can b e restricted to nite depth bya number

of means As far as numeric representations are concerned the base domains are nite in practice so the

intro duction of an upp er limit MaxIntmakes sense Any left or right side of an abstract result exceed

ing this is transformed to an interval containing MaxInt Primitive op erations are closed over this system

eg MaxIntMaxIntisMaxInt The fak example would then return an abstract result of MaxInt

The result MaxIntisaxedpoint of the fak function assuming

x Int f x MaxInt

it can b e veried that

x Int if x then else n f x end MaxInt

If there are only nite approximation c hains approximation by iteration is guaranteed to nd a xed p oint

in nite time

Sp eed

The mere existence of a xed p ointdoesnothelpvery much if the approximation takes a long time b ecause

it consists of to o many steps say MaxInt While it is encouraging to knowthat

the partial evaluator will eventually stop it is pragmatically unsatisfactory if there is no small upp er b ound

to the numb er of iterations it takes

The computation of xed p oints is of a complexity exp onential in the depth of the abstract domain

Metho ds of sp eeding up the iteration pro cess are describ ed in the literature It is usually quite easy to nd

a safe approximation to the xed p oint if the domain contains a top element this element is suchan

approximation But of course no knowledge at all is gained by this approximation mo dels any result

A practical way to compute the xed p oint of a function indeed the approachtaken by ac istoswitch

to a coarse abstraction when to o much time has b een sp ent on a call The coarsest useful abstraction

level is that of implementation typ es ie the typ es that corresp ond to basic machine representation

categories

The heuristic that is implemented in currentinterpreter to b e more sp ecic in the similar function

group

tlistsworks as follows if a certain it denes the similarity measure and generalization of groups of argumen

depth of the call stack is exceeded the argument lists are generalized to their basic typ es This threshold

is usually set to values b etween and If all arguments are already at the level of basic typ es or if a

second threshold usually set to is exceeded the arguments are generalized to Thus termination is

enforced

Generalizing the ArgumentLists

The example as explained ab ove did include some handwaving what is the legitimation of using the result

of fak as an approximation for the result of fak There is none b ecause a crucial step

has b een left out of the example namely argument generalization The lo op detector b elieves to have found

a lo op when there is a call to a function f with an argumentlista a when another call to f with a

n

The exp onential worstcase complexity is for rstorder functions the situation for higherorder functions is even

worse

Chapter Abstract Interpretation of LF Expressions

similar argument list a a is already on the call stack The approximation pro cess returns an answer

n

for

f a t a a t a

n

n

ie for b oth calls the one at hand and the one in the call stack Since it is always correct to replace an

abstract value a by an abstract value a if a v a it is correct to rerun the approximation pro cess On

each run the argumentdomainmay b ecome less sp ecic since the initial argument list is generalized each

time but a xed p ointwilleventually b e found

Conclusion of the Example

These twomechanisms handling of conditionals and nding xed p oints through iterativeapproximation

and generalization are sucienttoconvert a standard semantics interpreter for a rstorder language into

an abstract interpreter Sections will show sp ecial problems that do not emerge in a rstorder

staticallyb ound function like fak

Implementing the Lo op Detector

The presence of a lo op detection scheme is probably the most imp ortant dierence b etween a simple and an

abstract interpreter As already mentioned suchascheme is based on a function application cache

The call cache keeps track of all applications of closures Only closures have to b e considered since appli

cations of primitive functions and arrays cannot give rise to lo oping b ehaviour and overloaded functions

are eventually reduced to closures For each application the call cache has to know the argument list the

current state and the exception environment State and exception environment can b e handled as implicit

arguments to a function call

Each call in the cache is marked as either op en closed or approximated The call cache can b e

describ ed as a partial function

CallCache func args context status

with

status fundefined open closedresult approxresult g

The undefined status is only needed to makethe CallCache function total

When a function is rst called with a given list of arguments it is marked op en The call is closed

up on return from the function the return value can but need not b e memoized A lo op is detected

whenever a function is called with a set of arguments for whichitismarked op en The call cache can either

b e implemented as a global data structure maintained by sideeects or in a more functional manner by

keeping track of the current call stack in the interpreter The latter approach while b eing more elegant

and general has some drawbacks

In typical DSP programs nearly all of the program co de consists of rstorder function denitions and all but an insignicant

amount of the actual run time is sp ent in these functions The example can therefore b e treated as representative

It is interesting to p onder a design that lives without sucha cache One viable alternativeistointegrate the xp oint

op erator cf sec into the lo op detection scheme The xp oint op erator is usually considered as a device that ties a

knot into the environment structure It can also b e understo o d and implemented as a function that generates new incarnations

of a recursive function by need A xp oint function that knows ab out the argument and result typ es and keeps track of its the

previously generated sibling incarnations can then replace the actual co de of the function b o dy with a call to a matching already

generated incarnation There is still a small residue of the call stack but it can b e lo calized in each functions xp oint op erator

incarnation An implementation with known and xed strategies and heuristics mightemploy sucha integrated approachto

semantics lo op detection and co de generation For the present implementationwhich is rather exp erimental and op en with

resp ect to new heuristics and strategies this approachwas considered to o complex and not mo dular enough

Chapter Abstract Interpretation of LF Expressions

eachinterpreter function has to b e extended to handle an extra parameter but that can b e incorp orated

into the catchall Context

there is the danger of a space leak the call cache is easily the biggest data structure constructed

during the AI If it is kept functional old copies of it mightbekept around far b eyond their needed

lifetime Even with call cache implemented by sideeects the AI consumes vast amounts of space

single runs easily allo cate Mbyte a functional cache could only worsen allo cation b ehaviour

Since most of the contents of the call cache are needed anywayby the co de reconstruction phase cf chapter

the cache can as well b e held as a global data structure

Nave Lo op Detection Schemes

First a nave lo op detection scheme is presented It is capable of handling direct lo ops correctly and

eciently

The call cache is assumed to b e globally accessible and mo diable with the op erator The iteration control

is given as an extension to the standard semantics The function apply of the standard semantics is

closure

renamed to apply and a new apply is dened to b e

closure

closurestandard

apply fargsC

closure

case CallCachefargsC

of open CallCachefargsC approx

resultS

approxr

closedr r

undefined CallCachefargsC open

apply fargsc

approx

apply fargsC

approx

r apply fargsC

closurestandard

case CallCachefargsC

of open CallCachefargsC closedr

r

approxa a r CallCachefargsC closedr

a

approxa a r CallCachefargsC approxr t a

apply fargsC

approx

When a lo oping function is called the rst time with a new set of arguments its call cache entry is undened

A new entry is marked as open and the standard application is computed If a second call with the same

arguments happ ens during this op en state an initial approximation isentered into the cache and the

evaluation ab orts with this value All succeeding lo oping calls will directly return the approximation value

without changing the cache

Up on return from apply the call cache status is insp ected If there is no change ie the

closurestandard

entry is still marked open no lo op has b een detected during the application and no approximation is

necessary the cache can b e closed the result memoized and returned

If there is an approx entrythatentry has to b e compared with the actual result r If they are identical

the approximation is stable and the entry can again b e closed If the approximation is not stable the

new approximation is merged with the old one and apply is called again No generalization of the

approx

arguments lists as describ ed in section is necessary since a lo op is only detected up on total argument

matches

Chapter Abstract Interpretation of LF Expressions

Problems with the NaveScheme

For several reasons the recursion approximation scheme presented in the previous section is unable to cop e

with all programs

Not Enough Generalization

The biggest problem is the treatment of near misses ie small variations in the argument lists Take for

example the FindFirst function which nds the index of the rst o ccurrence of a value in a vector

func FindFirstobjSamplevecVectorSample this aldisp function

let will really tax a

func loopiCard naive loop scheme

if i sizeOfvec then

veci obj then i

else loopi

end

in

loop

end

Assume that the size of the vector is not known at compile time ie that it is input dep endent then

i sizeOfvec will evaluate to aBool every time Under the navescheme the abstract interpretation

of the loop function will never terminate since loop is called with a dierent argument each

time

If the vector size is statically known the navescheme will lead to a total unrolling

A simple solution to this problem consists of p erforming a generalized comparison ie to replace the

case CallCachefargsC

of

in apply with a

closure

if argssimilarargsar gs CallCachefargsC then

The similarity relation can b e dened in terms of a generalization function

similarargsargs

args args i argsgeneralizeargs generalizeargs

i i

One obvious choice for a generalization function is the base typ e function Two argument lists would then

b e considered similar when their corresp onding elements are of the same base typ e

Too much Generalization

But it is not enough to replace the equality test with a similarity relation after all wewant some level of

lo op unrolling in the abstract interpreter Consider the function

The interpretation will not even stop at or since the literal is not typ ed the counting will therefore b e done

in inniteprecision arithmetic

This proves to b e catastrophical in practice since the abstract interpreter will most likely exceed the resource limitations

of the system it runs on if not in time then in space In ac unrolling up to or iterations seems to b e reasonable

Chapter Abstract Interpretation of LF Expressions

func powxn

if n then

n then x

else

let

powxx powxxn div

in

if n mod then powxx

else powxx x

end

end

end

A function like pow is a prime target for inlining since the second argument is often known at compile time

and unrolling removes most of the exp ensive co de division and mo dulo comparison and conditional

A call to pow with known second argument eg powx should later expand to CF co de cf chapter

powxx x x

powxx powxx powxx

powxx poxxx x

powxx

but under a similarity relation based on generalization to base typ es no such unrolling would o ccur On most

target architectures the generalized co de will b e much more exp ensive than the unrolled ie sp ecialized

version To facilitate unrolling the lo op detector should only b e triggered when some depth of recursion is

exceeded To accommo date this the apply function has to b e extended to handle groups of arguments

closure

lists

if args args i nCallCachefargs C open

n i

similar argsargs args matchthreshold then

group i

similar can b e built as an extension of the similar predicate it should takeinto account the size of

group

the set to b e compared On a large set the pressure to nd a lo op is bigger and a more general similarity

predicate can b e used than on a small set As a limit if an exact match is found a set of size is large enough

for the similarity to b ecome If the set is b eyond a certain size say strong generalization up to

can b e allo wed While no realistic program will have a need for generalization ie generalization b eyond

the basictyp e level its presence guarantees that AI eventually nds a lo op in all recursive functions

Mutual Recursion

What happ ens if two or more functions call each other recursively Consider the case of simple mutual

recursion as exemplied by the evenodd example

func evenn if n then true

else oddn

end

func oddn if n then false

else evenn

end

When even is called with an abstract argument the ensuing call stack will consist of alternating applications

of even and odd The xed p oint nding mechanism may b e triggered at the k th call to even for some

This is not a guarantee that the AI as a whole terminates since the set of p ossible functions is not nite By generating

new functions ie closures on the ynontermination is still p ossible A second source of nontermination lurks in the implicit

dereferencing which can cause lo ops not caused by recursive function applications

Chapter Abstract Interpretation of LF Expressions

k dep ending up on the internal structure of the abstract value b ound to n and the heuristics employed by

the similarity predicate In each iteration of the xed p oint nder odd will b e called and a value will b e

returned Since that value will only b e an approximation it would b e erroneous to store it in a closed call

cache entry CCE for odd if this were done following iterations might b e ab orted b efore encountering the

new approximation It is therefore necessary to ush the call cache of all entries that were constructed

during an approximation iteration

After the iterations have found a xed p oint the CCE for even will b e closed its code attribute will refer

to a CCE for oddwhich is also closed and which is dened in terms of the recursive CCE of even odd

mighteven b e inlined since it do es not refer to itself but only to even

Multivariant Sp ecialization

Some functions can havetwoormorespecializations A sp ecialization is characterized by an argumentset

for which more ecient co de can b e generated than for other argument sets Consider this transformation

of the evenodd example into one function with a b o olean parameter

func eoneflag

if eflag then if n then true else eonfalse end

else if n then false else eontrue end

func evenn eontrue

func oddn eonfalse

An application of even or odd to an unknown n will result in recursive calls that can b e sp ecialized in

tw o distinct ways since eflag is always known co de has to b e generated for only one of the forks of the

if eflag condition That is sp ecialization should generate the original eveno dd function pair since the

eflag parameter can then b e dropp ed To facilitate suchmultivariant sp ecialization the generalization

strategy must avoid merging the eflag p osition of a generalized argument list to aBool In an abstract

domain that consists only of literals and base typ e values this would b e no problem since every third

recursive call to eo would b e an exact matchinany case but in a generalization that employs a more

detailed abstract value hierarchy eg interval arithmetic a match might o ccur only after a recursion depth

Since the similarity predicate employs the generalization function anywayitmakes sense that it also generates

the appropriate generalized argument lists similar is then of typ e

group

similar Argument list Set Similarity Argument list

group

Similarity

The similarity relation might then detect that the elements of a subgroup of the argument lists are more

similar to each other than to the other argument lists on the stack is and return generalized arguments for

that subgroup only

Restarting Approximations

In the nave lo op detector describ ed in section the rst iteration started on the wayback from the

p oint where the lo op was detected This is not p ossible in a generalized scheme since an arbitrary n umber

of recursion steps some of them unrelated eg through multivariant sp ecialization to the lo oping function

may reside on the call stack The generalized scheme therefore do es a restart from the b eginning each time

This example is obviously inecient When the lo op detector is applied to the optimized variation

if n then eflag else eonnot eflag endmultivariant sp ecialization do es not gain anything The example was cho

sen for its simplicity only As for most optimizing program transformations there is no guaranteed win in multivariant

sp ecialization In realworld cases candidates for multivariant sp ecialization can often b e found by lo oking for array index

parameters used with a constant stride or oset

Chapter Abstract Interpretation of LF Expressions

a lo op is detected Also all op en calls that are b etween the entry p oint of the lo op and the p oint where

the lo op was detected havetoberemoved since they would only p ollute the cache and misguide the lo op

approximation of other functions

How can the restart b e implemented One p ossibility is to store the continuation of each application in

the open call cache entry Restarting can then directly b e mo delled and implemented as call to the restart

continuation This approach has two problems

Memory consumption each open entry would contain the whole state of the abstract interpreter at

the p ointofapply Garbage collection of the underlying SML runtime system that supp orts

closure

the data structures of the abstract interpreter implemented for acwould b e unable to removesuch

old interpreter state Manual removal of dead open entries would b ecome necessary since in many

ALDiSP programs functions change state b etween lo op iterations For such functions apply lo op

detection is subsumed by the state lo op detection implemented in the scheduler

How is the currentcontinuation accessed The whole semantics would have to b e rewritten if the

apply needs an explicit continuation parameter Implicit continuations are not supp orted b y

closure

the semantics Even if they were SMLNJ has a callcc primitive but it is not part of the SML

standard and would render the compiler nonp ortable

To circumvent these problems ac resorts to a strategy dubb ed inband signal ling since the control infor

mation is enco ded within the data stream up on encountering a lo op computation is ab orted and a loop

exception is returned Each CCE is now uniquely tagged and the lo op exception contains the tag of the CCE

from whichthelooporiginates The originating CCE is determined by lo oking for the earliest entry among

those that are found to b e similar To make this op eration p ossible the CCE tags are timestamp ed Each

apply function up on recognizing the lo op exception value either invalidates its CCE if the tag do esnt

match or restarts the computation if it do es match

Generalizing Lo op Detection

The following lo op detection semantics contains all the changes describ ed in the previous sections The

similarity relation is assumed to b e sp ecied in terms of the generalization function No assumptions are

made ab out the generalization function

The call cache is now indexed by tags it is of typ e

CallCache Tag Function Arguments Context Status fundefinedg

g Status fopen closedResult approxResult

The apply function is now supplied as an extra parameter to the lo op detector which has b een

closurestandard

renamed call The new denition of apply function which mo dels the actual implementation used

closure

in the compiler is

applyfargsC callfargsCapply

closurestandard

By providing the actual primitive application function as a parameter to the call and loop functions

approx

the lo op detector can b e sp ecied and implemented within a separate mo dule This enhances mo dularity

and simplies testing since dierent strategies can b e attached to the same abstract semantics

Chapter Abstract Interpretation of LF Expressions

callfargsCapplyfn

if tag CallCachetag fargs Capproxr

approx approx approx

similarargs args

approx

then

CallCachetag fargs t argsCapproxr

approx approx

r

else

if match match then

degree threshold

tag new

new tag

CallCachetag fargsCopen

new

res applyfnfargsC

case res

of exception tagargs

lo op generalized

if tag tag then

new

loop applyfnftagCargs

approx generalized

else

CallCachetag undefined

new

res

else

CallCachetag fargsCclosedres

new

res

else

exception MinT args

lo op b estmatch generalized

where

match args similar args args args

degree generalized group s sk

T

T t t Max

b estmatc h b bj simset

Ttag tag

n

where

i nCallCachetag fargs Copen

i i

simsett t sim

s sk degree

where

sim args similar argsargs args

degree generalized group s sk

The call function rst lo oks for call cache entries that match the function to b e applied and the current

context state and exception environment If there is suchanentry and it contains an approxresult

value evaluation ab orts with the stored approximation result The approxvalue is compared to the current

argument list using the simple similarity relation If the application is ab orted the argument list of the

approximation is generalized by b eing merged with the current argument list cf section

If no approximation is in progress the match is computed First the set T of all tags of call cache

degree

t b e part of a lo op is determined This set consists of all op en applications of the current entries that migh

T

function within the currentcontext For each subset of T ie for eachelementof thesim is

degree

determined the details are deferred to the similar function The subset with the b est highest

group

degree of similarityischosen as T If this maximum similarity exceeds the match a lo op is

b estmatch threshold

detected and the exception is returned This exception contains twovalues the generalized argument

lo op

list of args and the CCE tag that corresp onds to the lo op entry p oint To determine this tag the

generalized

CCE tags have to b e ordered in time ie in order of their generation The earliest tag in T

b estmatch

corresp onds to the entry p oint

If no approximation is in progress and no lo op is detected a new CCE tag is allo cated the current arguments

and context are entered into the call cache and the function is applied using the applyfn If the result is

not exception the call cache entry can b e closed

lo op

If the applications result is of the form exception tag args a lo op has b een detected further

lo op entry

Chapter Abstract Interpretation of LF Expressions

down the call cache and tag denotes the CCE tag of the lo op entry p oint If tag do es not

entry entry

match the current tag the currententry must b e discarded and the exception handed up as result If the

tag matches the iterativeapproximation pro cess as sp ecied by loop starts

approx

loop applyfnftagCargs res

approx current current

let

t TagsCallCache

if t tag then CallCachet undefined

CallCachetag fargs Capproxres

current current

res applyfnfargs C

new current

fargs Capproxres CallCachetag

new current

in

if args args

current value new

res res

current value new

then

CallCachetag fargsCclosedres

new

res

new

else

loop applyfnftagCargs res

approx new new

loop takes ve paramters

approx

the semantic function that implements the application applyfn

the LF function for which an approximation is to b e found f

the tag that identies the call cache entry which is the ro ot of the approximation pro cess ie the

rst application of the function with the generalizable argument set

the currentevaluation context C

the argument list that was used to compute the last approximation args and

current

the result value that was generated by the last approximation res

current

The b ehaviour of the approximation lo op is determined by the convergence of argument list and result

approximation during the iterations If the results of two successive iterations have the same semantic value

and if the argument lists have not b een generalized by that last iteration a xed p oint has b een found

and the call cache entry can b e closed Note that the abstract values will not b e identical since nonvalue

b earing attributes like atomcode maychange only the semantic contents of the values and argument lists

are compared This is indicated bythe comparison op erator

value

Concrete Arguments

Calls containing only concrete arguments deserve sp ecial attention Esp ecially in the context of numerical

applications optimal compilation requires large amounts of compiletime computations The FFT algorithm

provides a go o d example it employs twidd le factors complex ro ots of that are inputinvariant and should

therefore b e tabulated in advance A typical sp ecicationstyle algorithm of an FFT algorithm mightnot

describ e the twiddle factors as a table of literals but will simply employ the arithmetic expressions that

dene them It is up to the compiler to decide whichvalues can and should b e tabulated in advance and

how to organize them

Esp ecially when computing the trigonometric exp onential and sp ecial functions standard metho ds heavily

dep end on iterations to rene their approximate results The similarity predicate used by the lo op detector

Chapter Abstract Interpretation of LF Expressions

func fragment func fragment

let let

x fak x fak

x fak x fak

x fak x fak

in in

xxx xxx

end end

Figure Two program fragments

should not consider these lo ops as similar otherwise one of the main goals of the compiler freeing the

user from explicit lo op unrolling will not b e met

The similar function employed in ac is tuned to avoid these problem It dierentiates b etween concrete

group

and abstract values When comparing small groups nonequal concrete values are treated as b eing dierent

only in larger groups comparison is done by base typ e

EvaluationOrder Eects

Due to the stack depth limittriggered generalization pro cess a new phenomenon o ccurs the evaluation

order inuences the precision of the abstract interpreter

Consider the approximation b ehaviour of the function fragment from g in the contextofaloop

detector that ab orts after a recursion depth of which just happ ens to b e the generalization threshold

The computation of x would b e approximated since the call depth of exceeds the lo op limit The result

would b e anInt and a call cache entry mapping fakanInt anInt would b e established The subsequent

calls to fak would therefore hit the cache and return anInt too

Now consider the reordered sequence in fragment

The computation of x hasadepth and will therefore terminate with a concrete value Furthermore

call slots will b e lled with the results for fakfak The computation of fak will therefore hit

the cache at fak and terminate Likewise the computation of fak will hit the cache at fak

and deliver a concrete result

As the example shows b oth the precision and run time of the abstract interpreter dep end up on details of

the evaluation history This is not a problem that concerns the correctness of the abstract interpreter but

its eciency and precision and the eciency of the nal generated co de

ALDiSPsp ecic Problems

State and Side Eects

A problem that has b een ignored up to now is the state While the matching of calls against callcache entries

is determined by an arbitrary similarity function for the argument lists exact matches are required for the

function and context What happ ens when a recursive function changes the context either by p erforming a

side eect or bychanging the exception environment This happ ens rather often eg in streampro cessing

functions

func streamFromPortpaPort readpstreamFromPortp

Chapter Abstract Interpretation of LF Expressions

which by automapping expands to the equivalentof

func streamFromPortpaPort

let

x readp

in

suspend xstreamFromPortp

until isAvailablex

within ms ms

end

The answer is that this function is not considered recursive at all That is it do es not call itself since it

only returns a closure that contains a reference to it After all the suspend form expands to a suspend

primitive function which installs a closure as a susp ension cf sections and To totally mix up

ALDiSP LF and the semantics

func streamFromPortpaPort

let

x readp

in

apply suspend

primitive

xstreamFromPortp

isAvailablex

ms ms

end

Equivalent things happ en to all sideeecting functions since each sideeect results in a synchronization

Promises

Still there are the nonreal side eects namely promise evaluation and reference lo okups whichmust b e

cop ed with Accessing previously evaluated references is a side eect that do es not mo dify the state and

can therefore b e ignored safelyEvaluating promises is somewhat problematic since a promise should only

be evaluated once The bruteforce metho d of guaranteeing this is to treat a promise like a sp ecialized

susp ension ie to blo ck the evaluation until the promise is available The scheduler is then extended to

handle promises sp eciallyby preferring them and not advancing the time The actual implementation

employs explicit up date and dereference primitives to mo del promises this is explained in detail in chapter

Recursive Denitions

Fixp oint detection happ ens at two places in the abstract interpreter The preceding discussion describ ed

the actual application of a recursive function ob ject to a given set of abstract argumentvalues A dieren t

problem is the creation of the recursive function in the rst place The semantics cf section assumes

a semantic primitive fix that computes the xp ointofany function This fix is then applied to the

semantic function that computes an environment from a set of declarations The direct iterative approach

of implementing fix has a severe p erformance problem even in the b est p ossible case the evaluation of the

declaration set takes place at least twice In worst cases fix can cycle forever eg when encountering a

denition without a nite xed p oint suchas

rec

ones consones

end

Even some simple cases for which a xed p oint exist can cause problems For example the evaluation of

Chapter Abstract Interpretation of LF Expressions

rec

ones consdelaylambda ones

end

do es not terminate each incarnation of the lambda will have a dierent lexical environment ie the

variable ones will b e b ound rst to then to the pair Promisen then to the pair Promisen

and so on Since unique promises are generated in each iteration equality can not b e guaranteed by

the comparison employed in loop

value approx

This eect is circumvented if the tag counter of promises is reset on each iteration of the fix function

since the same unique tags will b e allo cated for new promises if the iterations are structurally equivalent

ie if promise and susp ension tags are allo cated in the same order The need to explicitly reset the counter

is a consequence of the sideeecting nature of tags if the tag counter would b e carried along as a part of

the context it would b e reset automatically since the context of the iterated applications stays the same

There exists an alternative metho d of implementing recursiveenvironments which is based on mutable

cells In the semantics of LF reference values can provide this service A denition likethatof ones would

then b e evaluated as if it were of the form

n freshtag allocate tag

update n define reference n with some initial value

tag

ones refn preliminary value of ones

ones consones ie consrefn

update nones final value of reference n

tag

Exceptions

The exception environment is part of the context of evaluation Most functions do not redene any excep

tions but at least in theory it is p ossible to change the values of exceptions within lo ops Currentlyno

sp ecial treatment for such cases exists ie the lo op detector detects changes in the exception environments

and treats them as unequal values but it do es not attempt them

Function Values

Higherorder languages p ose a sp ecial problem for an abstract interpreter they allow treating functions as

values Function values are notoriously hard to handle since even the most basic op erations are not dened

on them for example it is in general imp ossible to compare functions for equality

The most immediate problem is exemplied by the following example

func fxy xy

func gxy xy

func hx if x then fx

else gx

endif

The if expression delivers a value that denotes one of two functions When called with a nonconcrete

argumentfor x the abstract interpreter has to nd the least upp er b ound of two functions In the ALDiSP

typ e hierarchy there is a typ e aFunction that denotes all functions but an abstract value that would

denote only any function cannot b e considered precise enough since any application of it would haveto

return a result including a top state something which the current abstract interpreter is unable to

handle

It is not necessary that the cells b e arbitrarily mutable it is enough that they can b e up dated once The cells havean

initial value of which is then replaced by a more sp ecic value I know of no language that supp orts such an up date concept

This is another p ossibility to drive ac into nontermination

The state is not a data ob ject but a heterogenous collection of data ob jects maintaind in semanticslevel data structures

To allow abstract states and much more useful partially abstract states a typ e StateObjwould havetobeintro duced

While p ossible in theorythiswas considered to o much trouble

Chapter Abstract Interpretation of LF Expressions

There are a few approaches to merging functionvalued ob jects

Arbitrary sets as explained in the b eginning the most general abstract value concept is that of sets

of arbitrary values As long as these sets stay small using them is p ossible Nothing forbids sets of

functiontyp ed values

Onthey co de generation when the functions that are to b e merged take the same number of

arguments a new function can b e generated In our example this would corresp ond to a transformation

of the function h into

func hx tmp

if x then fx

else gx endiftmp

If the functions do not take the same numb er of arguments as in

func ax

func bxy

if then a else b endif

they can b e overloaded

func ax

func bxy

overloadab

A conditional is not necessary to distinguish b etween the dierent cases the dierent arities are

sucienttodisambiguate any application of the overloaded result function

Closure generalization if the functions that are to b e merged are dierent closures derived from the

same lamb da expression only their environments dier Since environments are bindings of variables

to abstract values they can b e merged A new closure tag has to b e allo cated for this merged function

so that the comp onents can b e sp ecialized separately A problem of this approach is that co de has to

b e generated that implements the op eration

closure if cond then closure

new true

else closure

false

end

This cannot b e directly represented in the backend representation since closures are not represented

as rstclass values They are instead mo delled as environmen t tuples held in a global table and

indexed by the closure tag To mo del the conditional closure assignment a new primitivemust b e

intro duced for example

conditionalcopyclosuretag tag tag cond

new t f

Prepro cessing purely syntactical lo cal transformations can remove some o ccurrences of this problem

in advance This amounts to recognizing some typical sp ecial cases

As there exist only a nite numb er of lamb da expressions in each program the closure generalization

approach can b e taken to its extreme bywritinganoverfunction that encompasses all other functions

Its rst argumentwould b e a selector that indicates the function to b e computed For an example consider

the three functions

func fa ga

func gb hb

func hc c

They can b e merged into one function that takes an extra selection parameter

Chapter Abstract Interpretation of LF Expressions

func fghxabc

if xf then fghgabc

xg then fghhabc

xh then abc

There are some problems due to the handling of dierent arities but these can b e avoided by using tuple

values instead of argumentlistsorby lling up all argument lists with dummy arguments

The implementation of ac implements function merging by onthey co de generation Attempts to merge

functions of dierentarity are agged as errors

Generalizing Numeric Typ es

In ALDiSPnumeric typ es are parameterized with their width A consequence of this is that the base

typ e of a numeric literal is not unique For example can b e an nBitInteger of any width or

an nBitCardinal of anywidth For the similarity measure that is built into the lo op detector this

raises the question whether typ e matches have to b e exact or not For example are nBitInteger and

nBitInteger to b e considered nearly the same If this is the case is nBitInteger nearly the

same as nBitInteger The current implementation demands exact matches on numeric typ es but

has sp ecial provisions for literals cf section

Raising and Catching Exceptions

Exception values in ALDiSP are b ound dynamically When an exception is raised and returns it travels

up the call stack and is caught at the rst matching guard declaration Since there is no explicit call stack

in the semantics the diverse semantic functions have to co op erate to ac hieve this eect At the LF level

the ALDiSP guard construct has b een separated into an exceptionenvironment declaration and an Lcatch

expression

While it is easy to implement the exception binding b ehaviour it suces to intro duce an extra environment

comp onentinto the context together with the appropriate primitive functions the raising and catching of

exceptions creates some problems

In the standard semantics raised exceptions are mo delled by exceptiontag obj state tuples All semantic

rules co op erate bychecking all subresults for exceptionness These checks are implemented in the deref

ob j

function Only the Lcatch form can transform exception tuples into real results

In the abstract interpreter these exception tuples cannot b e treated like ordinary values Extending the

abstract value domain with sp ecial forms exceptionOrValuetag obj state realValue would indirectl re

intro duce general set abstract domains with all the related eciency and termination problems since

multiple exceptions require exception value sets An expression suchas

switchInt c

of case

case raise Overflow

case raise Round

case raise StupidExampleGotc ha

end

would have to b e represented by listing all three p ossible susp ension tags plus their dierently typ ed and

sized tuples

It hardly makes sense to complicate the very basics of the abstract interpreter just to guarantee a correct

handling of exceptions Instead a hack is used

Exceptions will hardly o ccur in the inner lo ops of realistic realtime co de anyway they are mostly needed to deal with onetime initialization and IO problems

Chapter Abstract Interpretation of LF Expressions

First whenever an exception value is merged with a real result value it is treated like and completely

ignored

A catch expression is treated almost like an identity function by the abstract interpreter but the return

value is generalized to its base typ e This follows from the premise that an exception handler will return a

value of the same typ e as that whichwould b e returned if no exception had o ccured

The abstract interpreter thus b ehaves as if dynamic o ccurence of an Lcatch might actually catch an exception

The result is generalized since there is no information available ab out the caught ob ject It can b e assumed

that the caught ob ject has the same basic typ e as the regular result so a simple generalization up to the

base typ e suces The generalization is needed for correctness as can b e seen from the following example

guard if x then stopx else end

in stopx return x

end

If the guard were treated as an identity function the result would b e regardless of the value of x Because

of the generalization an Integer is returned

A more sophisticated strategy can b e implented bykeeping track of a global exception state indexed by

the exception tags Up on encountering an Lcatch the resp ective slots of the exceptions that can b e caught

by it would b e cleared the mkexc primitivewould store the state and ob ject of the exception in its slot so

that a typ esafe merge can b e done up on return to the Lcatch In the case where no exception has o ccured

no generalization is necessary If more than one exception for a given slot o ccurs the resp ective ob jects have

to b e merged

There is a small problem in this example the typ e of x should b e used to determine the typ e of since the latter is a

literal and therefore of malleable typ e

Chapter

The Abstract Scheduler

The imp erative asp ects of ALDiSP programs are dened in terms of a state mo del in which a global

scheduler controls which susp ensions to evaluate and when The scheduler also controls all IO op erations

It determines the order of susp ension evaluation as well as the relation of this order to the virtual time The

scheduler is part of the compiler it has no access to the actual execution times

The scheduler exerts its controloftimeby issuing timeadvance state transitions In the co de that is later

emitted each timeadvance is translated into a wait instruction that is intended to synchronize the instruction

stream with a real time clo ck All IO op erations are p erformed at p oints of timeadvance ie all IO

events that have accumulated b etween two timeadvances are p erformed simultaneously at the latter one

Each timeadvance is therefore a synchronization p oint

The abstract scheduler diers from the standard semantic scheduler in in that it also intro duces renamings

between similar states at dierent p oints of time and thus collapses the tree of all p ossible state transitions

into a nite graph The set of state transitions that form the edges in this graph is later used to provide the

skeleton of the emitted co de cf chapter

Susp ensions and other Schedulable Entities

The standard semantics of the scheduler is given in section as part of the denition of eval

program

Here its implementation shall b e outlined

In the ALDiSP language mo del susp ensions themselves do not o ccur as semantic entities of computation

Instead references to susp ensions are employed References or reference objects for they form a subset of

the domain of values Obj havemultiple uses they may also refer to blo cks promises and events The state

consists of a set of denitions for these references In the current implementation there are the following

typ es of reference denitions semantic domain RefDef

An Ususp is an unevaluated susp ension The evaluation of a suspend form in LF the suspend

primitive function generates a reference that has a Ususp denition

A Wsusp is a waiting susp ension It contains a time frame ie information ab out the interval of

time during which it has to b e executed When the scheduler determines that the condition part of an

Ususp has b ecome true it is transformed into a WsuspEach following timeadvance decrements the

time frame counters in all Wsusps

An EvRef is an evaluated reference ie a value After a Wsusp or an Uprom has b een evaluated an

EvRef holds the result

Chapter The Abstract Scheduler

A Block is the result of a blo cking access to a reference Blo cks can b e treated as sp ecial cases of Ususps

that have a condition of the form isAvailablex and a time frame of Blo cks dier from all

other reference denitions in that they contain semanticslevel continuations instead of parameterless

closure ob jects

A Uprom is an unevaluated promise The rst access to it forces its evaluation generating an EvRef

Besides these computational reference values there are pseudosusp ensions used to mo del IO These have

the form Eventdevice kind obj Thedevice names the register or p ort that is accessed by the event The

obj is the ob ject to b e written or in case of a read or test op eration a dummy There are three kinds of

event

An Input event is generated from primitive functions like readPort If the input device is a register

the input is p erformed at the next timeadvance inputs from a p ort device are p erformed at arbitrary

times After the event has b een p erformed the Input is replaced byanEvRef pointing to an

abstract input value Until then all accesses to the reference p ointing to the Input will blo ck

An Output event likewise mo dels an output ie a true side eect Output to registers is p erformed at

the next timeadvance output to p orts maywait for an arbitrary amount of time The return v alue of

an Output event isadummy it can b e used using the isAvailable primitive to detect whether the

output transaction has b een p erformed or not

A Test event tests the accessibility of a p ort it evaluates to either true or false without blo cking

While such a test could b e mo delled by a nonblo cking primitive since the accessibility of an IO p ort

isapropertyofagiven state it is b etter mo delled as an event so that the decision is lifted to the

scheduler level In the abstract scheduler such a decision will fork the set of successors to a state

into those in which the test return true and those in which it returns false

The core of b oth the standard and the abstract scheduler is the succs function that given a state A

computes the set of all its successor states

Dep ending up on the contents of a state none one or a large numb er of the following state transitions may

b e used to deriveavalid successor state

An Ususp with condition evaluating to true can b e advanced to a Wsusp

An Ususp with condition evaluating to aBool can b e assumed to return either true or false

tpoint of virtual time can b e evaluated If the A Wsusp with a timing interval that includes the curren

upp er limit of its timing interval is it must be evaluated b efore the next timeadvance since now

is that last chance to evaluate it

Time can b e advanced all p ending Input OutputandTest events are then performed This may

generate more than one result state if there are asynchronous events present for which the actual

execution time can not b e determined at compile time

The only real restrictions imp osed on the scheduler are that Ususps should b e advanced to Wsusp status

as early as p ossible otherwise the sp ecied timing interval starts to o late and that no timeadvance is

p ossible when there is a Wsusp that must b e scheduled at the currentpoint of time

Arbitrary means that the programmer has no guarantee ab out when the event is p erformed IO that employs asyn

chronous registers is transformed into nondeterministic state transitions when a state contains k asynchronous events at the

k

moment of its timeadvance there will b e successor states with all dierent combinations of p erformedstill p ending events

n

Otherwise each abstract timeadvance would havetocreate successor states where n is the numb er of IO p orts that

are referenced anywhere in the program With Test events these states are constructed ondemand

Chapter The Abstract Scheduler

NonDeterminism

There is a lot of nondeterminism in the abstract scheduler esp ecially where the evaluation of Wsuspsis

concerned When a Wsusp has a nonp oint range ie a range of the form m n m n itmaybe

nm

evaluated at p oints of virtual time where the time is assumed to b e quantized by some duration

t

t

Furthermore even if all ranges in a program are p ointlike there may b e situations in which more than one

Wsusp can b e evaluated at one p oint in time so that the scheduler has to cho ose an order

It would b e nice if a program that has a welldened IO b ehaviour ie in which no to output op erations

haveoverlapping p erformance intervals would act the same regardless of the ordering that the scheduler

imp oses on it Nondeterministic programs that do not employ IO are however simple to write Consider

the following expression whichmay return either or

let

s suspend until true within ms ms end

s suspend until true within ms ms end

in

suspend if isAvailables then s else s end

until isAvailables or isAvailables

within ms ms

end

end

It would b e p ossible to let the scheduler enumerate all p ossible states A later phase could then guided by

a cost function cho ose a particular execution path and warn the user ab out nondeterministic b ehaviour

The problem with such an exhaustive approach to nondeterminism is the size of the state space it would

grow rapidly If the program is IO deterministic the growth is less than exp onential since there are

synchronization p oints at which similar states will b e merged together again

An example is shown in g which graphically represents the state graph of the following program

func demo

let

s suspend until true within ms ms end

s suspend until true within ms ms end

s suspend until true within ms ms end

in

suspend

demo

until isAvailables

or isAvailables

or isAvailables

within ms ms end

end

The scheduler can cho ose b etween any of the susp ensions sss at rst and any of the remaining three

afterwards Scheduling sequences like ss and ss lead to the same result s and s evaluated the

others not evaluated and can b e shared This sharing is expressed via a renaming lo op transition

Sequentializing the Schedule

A lot of the nondeterminism can b e eliminated at compiletimebycho osing an arbitrary scheduling order

This early ordering minimizes the fanout of states and thus cuts down the size of the state graph The

The state space diagram was generated directly by ac via the graph option The output is a graph description that was

layed out afterwards by the GraphEd to ol with minimal manual intervention

Chapter The Abstract Scheduler

state1 state3

eval/12 eval/14

state6 state4

eval/14 eval/13 eval/12 state7

loop

state12

eval/13 eval/13 eval/13

state13 state5

eval/14

state9

loop eval/12 loop

state11

root eval/12

state8 state14

loop adva loop state10

eval/14

state15

updt

state16

adva

state17

eval/15

state18

loop

state0

updt

state2 Figure NonDeterminism and Merging in a State Graph

Chapter The Abstract Scheduler

state graph takes on a more linear form and contains only those forks that are a semantical necessity

In the current implementation the transitions are sequentialized by the following rules

If there are evaluable Blocks they are evaluated Otherwise

If there are advanceable Ususps one is cho osen and advanced Otherwise

If there are evaluable Wsusps one is chosen to b e evaluated Otherwise

If there are assumable Ususps one is assumed with two result states Otherwise

If there are no UsuspsorWsusps at all there are no successors and this part of the state tree termi

nates Otherwise

Time is advanced

The only nondeterminism left is in the assumptions of Ususps with an unknown condition and the IO

assumptions that are intro duced when asynchronous IO is p erformed during the timeadvance

Garbage Collection

States can b e garbagecol lected at compile time Garbage collection GC disp oses of ob jects that are guar

anteed not to b e used byany future computation GC do es this by computing a conservativeapproximation

to this set of unneeded ob jects it enumerates all ob jects that are structurally accessible alive from

a known set of ro ots which are part of the current program state and eliminates all ob jects that are

inaccessible dead The ob jects of interest when garbage collecting an ALDiSP state are the reference

denitions

The ro ots ie the references that are assumed to b e alive are

all InputOutputTest events

all Ususps

all Wsusps

The only references that may b e removed by the garbage collection b ecause they can b ecome inaccessible

are thus EvRefsandUproms

Garbage collection is a necessary prerequisite for successfully nding lo ops since it removes old EvRefs

that would without GC accumulate Without this trimming states would growover time and never b e

considered similar by the state lo op detector since the no structural corresp ondence would b e found for the

dead EvRefs

States that contain Blocks cannot b e garbage collected since a blo ck encloses an opaque interpreter state

whichmay address arbitrary references This is another reason to minimize the number of Blocks generated

by the abstract interpreter

This greedy scheduling of Wsusps can b e replaced by a randomized placement strategy with a commandline option

Only Ususps that have a condition that can p ossibly evaluate to true at some time are really ro ots Since this condition

cannot b e tested it is ignored One waytobringac to its knees is therefore to pro duce susp ensions with a false condition

these will never b e advanced to Wsusp status nor removed by the garbage collector It might b e p ossible to determine whether

a condition dep ends up on the current state but such an analysis is probably not worth the complexity

Chapter The Abstract Scheduler

The garbage collector is implemented using a straightforward markandsweep algorithm b eginning with

the ro ots all reachable ob jects are enumerated All references that were not reached are removed Since

the garbage collection happ ens at compile time only no sp ecial tricks are needed to sp eed it up Tond

the references reachable from a given ro ot a function has b een written that enumerates all subvalues of a

value While this is easily done in the nonabstract value domain abstract values p ose some slight technical

challenges Abstract values can contain attributes that are themselves values or contain references to values

For example the atom and code attribute intro duced in chapter maycontain literal values whichcanbe

reference ob jects A co de walker is needed to nd these references Also the compiler phase that generates

these co de fragments which is mostly the reconstruction phase cf chapter ought to minimize the amount

of ob jects that are touched in a piece of co de

Garbage collection is a timeconsuming pro cess Tominimize these compiletime costs collections are only

p erformed after each timeadvance

Finding Lo ops

Beginning with the initial state a tree of states will b e enumerated by successive application of the succs

function For most DSP applications this tree will b e of unb ounded size representing a nonterminating

program b ehaviour

To create a nite schedule a loop detection scheme is employed that nds similar states There is a re

semblance b etween the state lo op detector and the function lo op detector that approximates xp oin ts in

the abstract interpreter but it is only sup ercial The current state lo op detector do es not involveany

approximation hence needs no iteration steps

A state A is similar to a state B if there is a valid renaming that maps A to B Arenaming between two

states establishes a structural corresp ondence b etween al l the data structures that make up the states To

representsuch a mapping data ob jects must have a unique identity Only closures and references havesuch

identities the state mapping is therefore represented as a mapping from tags to tags A renaming is valid if

eachvalue is mapp ed ontoavalue that is structural ly equalScalarvalues are considered structural ly equal if

they have the same base typ e nonscalar values ie data structures must also have the same shap e and

structurally equal comp onents Determining structural equalitythus b ecomes nontrivial when recursive

structures such as function environments are considered Functions are considered structurally equal if

they are derived from the same function denition this is easily determined by comparing their Lambda

expression tag and their environments contain structurally similar values

The abstract scheduler enumerates the state tree in a lazy manner It maintains a list of front states

whic h initially contains only the start state Eachscheduling step consists of computing the set succfront

of all p ossible successor states of all front states These successors are the new front of the state space

The new front is simplied bysearching for similar states within the front and amongst the previous states

When a numb er of states in the front are found to b e similar one is chosen as representative and the others

are mapp ed to it If a similar state is found that is not in the front a mapping to the earliest state is

intro duced This corresp onds to nding a lo op in the program

A large b o dy of literature describ es tricks and techniques to increase the sp eed and decrease the memory consumptionof

runtime systems and runtime data enco dings that supp ort garbage collection or one of its alternatives like reference counting

or linear logic

In general a traversal scheme for each attribute is needed in the actual ac only codeatom attributes maycontain values

A later version might incorp orate a state generalization scheme as a means to further reduce the state space

This amounts to partitioning the front set into a set of equivalence sets

There maybemorethanonesuch state in the history of the graph in which case the inb etween states will already b e

connected to the earliest similar state Lo oping back to the earliest state instead of an arbitrary earlier state removes jump

chains

Chapter The Abstract Scheduler

Each time a mapping is found one entry is deleted from the front When the front nally b ecomes empty

the abstract scheduler terminates There is no guarantee for this to happ en so maximum sizes can b e

intro duced for the state space and the current front to force a termination Such a size limitation has b een

proven to b e a necessary debugging aid

After the scheduler has terminated the state graph is given to the reconstruction phase cf chapter

which recreates the program in the form of executable co de

As with garbage collection lo op detection do es not work on states that contain Blocks A state space in

which all states contain one or more Blocks can therefore not b e transformed into a nite graph

Alternative States

To simplify the abstract interpreter the design decision was made that each run of the interpreter must

pro duce exactly one new state This is imp ortant if one considers how to treat an expression like

let runTimeVal readinPort

in

if runTimeVal then suspend until within end

else suspend until within end

end

end

Dep ending up on the runtime value of runTimeVal whichisanaBool in the abstract interpreter one

susp ension or the other is to b e evaluated Quite dierently shap ed states may emerge from such a situation

There is no obvious way to create an abstract state value that describ es these alternative The place to

handle this decision is the scheduler which is already equipp ed to handle assumptions

Therefore a new reference typ e IfSusp is intro duced that mo dels this b ehaviour an

IfSuspcondexpr expr

true false

will b e returned to the scheduler level The two expr s are adho c closures that enco de the alternatives of the

conditional The mo dications to the scheduler are minimal only the succs function has to b e extended

with a rule similar to that needed for assumable Ususps

Blo cking

One allp ervading b ehaviour of ALDiSP is the automatic dereferencing of references and the resulting blo ck

ing b ehaviour that o ccurs every time an unavailable reference is accessed For example

suspend until true within ms ms end

will create susp ensions that are linearly dep endent up on the innermost expression The resulting state

space and the generated co de are equivalent to that whichwould b e generated for the explicit variation

On a Sparc with M memory a run that enumerates states can easily ll up all the core memoryItwould b e useless

to pro ceed after this happ ens as thrashing reduces the p erformance to an unacceptable level

Chapter The Abstract Scheduler

let

tmp suspend until true within ms ms end

in

let

tmp suspend dereftmp

until isAvailabletmp within ms ms end

in

let

tmp suspend dereftmp

until isAvailabletmp within ms ms end

in

suspend dereftmp

until isAvailabletmp within ms ms end

end

end

end

While this b ehaviour is ne for sp ecication purp oses it uses a lot of resources Atthescheduler level there

will b e four states involved in the evaluation of this expression the evaluation of each susp ension necessitates

one state transfer Each of these states will b e compared with previous states to nd lo ops and will b e the

target of comparisons with all future states

It is clearly necessary to minimize the impact on blo cking b ehaviour on the abstract interpreter by reducing

suchchains of blo cks For example transformation of the example expression to

let

tmp suspend until true within ms ms end

in

suspend dereftmp

until isAvailabletmp within ms ms end

end

would approximately halvetheevaluation costs The currentinterpreter contains two adho c optimizations

that will catch some situations like these The rst optimization is lo cated at the top of the eval function

expr

Whenever an eval eC evaluates to a blo ck that waits for a susp ension already presentin C the result

expr

is discarded and an explicit susp ension is created instead The second optimization do es essentially the

same for sequences

At the interpreter level the problem is one of software complexity and mo dularity the need for forced

values ie values that are guaranteed not to b e references o ccurs frequently

When applying strict primitive functions some functions like isSuspended or cons are nonstrict

When applying a function the function itself has to b e known to b e applied

when testing the condition of an if expression

when testing for a typ e esp ecially when resolving an overloaded function application

So as to centralize the handling of blo cking b ehaviour in the interpreter it would b e nice to haveafunction

force that is guaranteed to returns a real value Force will havetointro duce a call to deref if the

value is a reference to an EvRef this is the simple part and it will evaluate Ususps and replace them by

EvRefs But what is to happ en if the reference p oints to a real susp ension Luckily the interpreter is

implemented in SML which supp ort exceptions force will raise an exception and ab ort and the exception

That also includes IO events

Chapter The Abstract Scheduler

will b e handled as late as p ossible ie at the earliest p oint of time p ossible What is this earliest p ointof

time Obviously it should b e a p oint where it is reasonably easy to insert a susp ension b est a p oint where

it is done anyway One of these p oints is the if handler which will have to raise a IfSusp if the states

that result from the arms of the if dier Another place has to b e the p oint where the susp ension that

causes the blo cking is dened ie the piece of co de that mo dels function application Indeed the blo cking

exception may not backtrackbeyond any p oint where the state has b een changed since then the state that

surrounds the blo ckwould b e dierent from the state in whichtheblockishandledwhich could makeit

useless

The correct handling of blo cks can b e develop ed by lo oking at all p ossible cases

Lsimpledeclpos symbols expr Here the blo ck can o ccur when evaluating the expression The fact

that the blo ck o ccurs here implies that the evaluation of the expression did not intro duce a side eect

Therefore the blo ck exception can b e passed on ie nothing has to b e done at all

Llocalpos decls this is translated into LetLsimpledecl combinations

Lambdapos tag params body inanaiveinterpreter no susp ensions can o ccur here De facto they

can since the typ e expressions in the parameter declaration list are evaluated at this time so as to

prevent unnecessary multiple evaluation Typ e expressions are restricted to b e sideeect free so it is

an error if a Blo ck o ccurs here

Lvarpos sym Variable access cannot blo ck

Litvalue Literal access cannot blo ck

Lapppos expr expr if one of the expressions blo cks the application is translated into the

n

canonical form

let tmp expr

tmp expr

n n

in

tmp tmp tmp

n

end

Lccpos expr expr if the expr evaluates to Ob j nothing happ ens at all this is a

type value type

consequence of the semantics and guarantees that nullLccs do not change the evaluation order Again

the exprtype itself is not allowed to blo ckortochange the state If the exprvalue blo cks the blo c k

can b e passed along since the Lcc would not change the state

Lcondpos expr expr expr ifexpr blo cks the blo ck can safely passed up If the result is of

expr is true or false expr or expr canbeevaluated in isolation the if can b e ignored If expr

is aBool b oth alternatives havetobeevaluated and merged If the alternatives return with diering

states an IfSusp will b e returned that was already mentioned If one alternative blo cks and the

other not this is considered to indicate dierent states to o If b oth alternatives blo ck the if is

translated into b e the canonical form

let

tmp expr

in

if tmp then expr

else expr

end

end

and the whole if is considered to blo ck

Chapter The Abstract Scheduler

Lselectpos expr alternatives default this is considered equivalent to a set of nested ifs

Lcatchpos strings expr this can simply pass the blo ckon

Letpos expr decl decl There maybetwo situations one of the decli may blo ck or the

n

expr may blo ck A blo cking decli may not blo ck the whole Let since it in naive interpretation

the resulting susp ension would b e b ound to a variable not blo cking the further evaluation at all Ergo

such a susp ension has to b e created and assigned to the resp ectivevariable b efore pro ceeding

The other p ossibility is that of a blo cking exprIftheevaluation of the decls did not change the state

the blo ck can b e passed on Otherwise if one decl did change the state here state changes due to

i

the intro duction of blo cks do not count the Let can b e splitted as

let decl

decl

i 

decl

i

in

let

decl

i

decl

n

in

expr

end

end

and a susp ension can b e wrapp ed around the inner LetThisway a minimum of workisdonebeyond

the last state change and blo cks that mighthave o ccured in decl decl may b e deferred

i n

Lastlyifthe expr blo cks and all statechanges done in the decls are due to blo cks the blo ck can b e

passed up

Lseqpos expr expr Sequences are used esp ecially for the mo delling of blo cking b ehaviour

n

if an expr returns a susp ension the rest sequence has to wait for its evaluation Since the very

i

intro duction of a susp ension will havechanged the state this has to b e implemented faithfullyas

equivalentto

let

tmp expr

in

expr suspend seq expr

nd

until isAvailable tmp

end

end

If a blo ck o ccurs during the computation of an expr usually the rst expression since all but the

last one will return susp ensions anyway and thus b e removed from the head of the expression list a

susp ension has to b e intro duced to mo del it

Chapter

Execution Trace Reconstruction

This chapter describ es how the program is reconstructed once abstract scheduling and interpretation have

nished

The abstract scheduler has generated a state graph the abstract interpreter has written a call cache The

state graph consists of state descriptions and annotated transitions b etween states the call cache contains

an entry for every function call made during the AI Each state of the state graph is characterized byaset

of reference denitions Eachcallcache entry contains the arguments to the function and the residual co de

that was generated by its execution

Starting with state graph and call cache the reconstruction phase recreates the program The reconstruced

program is represented in Code Form The Co de Form CF started out as a subset of the Lambda Form

but evolved into a separate intermediate representation The Co de Form is designed to make dataow

prop erties of the program explicit and to intro duce names for all temp oraries

The shap e of the reconstructed program is determined by the state graph Each state and statetransition

is mo delled by a function a state transition function s s that leads from state s to state s is called

x y x y

within the function that represents s and ends with a tailrecursive call to the function that implements

x

s The program as whole is the set of state and statetransfer function denitions amongst whichis

y

one distinct initial state

There are dierent kinds of state transitions cf chapter and a dierent kind of co de is generated for

each of them Transitions can b e purely administrativeUsusp to Wsusp advancement no co de involved

branches Ususp assumptions and IfSusps a conditional is generated evaluations of lamb da closures for

which a function application is generated or continuations of Blocks

The application of a function closure caused byaWsusp evaluation mayintro duce many reference denition

up dates as a side eect Reference denitions are initially mo delled by mo dications of one state variable

that is passed around as argument to the state functions eventually this one variableissplitupinto a

separate variable for each reference denition

Most of the reconstruction eort takes place during the abstract interpretation phase of the compiler The

abstract Result typ e used by the AI is extended with a code attribute This attribute contains a Co de

Form fragment which when executed at runtime will compute the concrete value which the abstract result

approximates

Later optimizations merge state and statetransition functions by including the b o dy of each statetransition function

s s in the co de of the state function s that calls it Also linear sequences of state functions are collapsed into one function

x y x

Conceptually it is simpler to consider the state and statetransition functions separated

There is also one distinct exit state generated for programs that have terminating state paths

Chapter Execution Trace Reconstruction

Since design considerations of the reconstruction stage determine the restrictions and conventions of the

Co de Form and thus those parts of the semantic functions that generate the code attribute these have

not b een mentioned during previous chapters Co de generation is orthogonal to the standard or abstract

semantics since it do es not change the execution b ehaviour

This chapter starts o by explaining the Co de Form and the conventions regarding closure and state repre

sentation A formal semantics is given An example follows that exercises the co de generator on a varietyof

tasks The resulting co de will contain many ineciencies A set of interfunction transformations inlining

grouping and tabulating and basic blo ck optimizations copy propagation reference and tupletracingare

explained next All transformations are applied to the co de that was generated in the example

A second example tackles the sp ecic problems of co de generation for Blocks and explains some heuristics

that minimize the costs of resuming a blo cked computation

The two examples cover all ma jor asp ects of co de generation as p erformed during the abstract interpretation

A description of the statefunctions follows these are not generated during AI but from the state graph

that is generated by the abstract scheduler

A concluding section contains a semiformal description of the co de generation tasks that are p erformed by

the semantic functions in the abtract interpreter

The Co de Form

The Co de Form CF has some prop erties similar to the ANormal Form describ ed in Both forms

have manycharacteristics of continuationpassing style CPS but av oid the use of explicit continuations

whenever p ossible In the rst few incarnations of the backend of the compiler a subset of the Lambda

Form was employed in the reconstruction and co de generation phase This made it necessary to distribute

many semantic validitychecks over the whole co de generation phase It was also cumb ersome to express

some concepts notably multipleoutput expressions and the explicit state variables in the context of LF

The intro duction of a new intermediate representation made it p ossible to enco de the semantic restrictions

of the p ostAI program directly into the structure of its data typ e

Dierences b etween CF and LF

When compared to the Lambda Form the main prop erties of Co de Form expressions are

The CF has no expressions to denote typ ecasting Lcastortyp echecking Lcheck

The CF has no sequential statement expressions Lseq

All arguments to function applications and the conditions of conditionals are atomic As a result

all intermediate results are b ound to variables and

all variables are of a known typ e

The intermediate results are b ound by declaration sequences that corresp ond to LetLseq

combinations in LF

This is the most imp ortant dierence to CPS where intermediates are passed on as arguments of continuation functions

Chapter Execution Trace Reconstruction

State is made explicit by passing state variables to those function that access reference values and

by returning a new state from those functions that mo dify it The state variables are singlethreaded

there exists only one living state variableatany p oint of time Once a new state variable has b een

intro duced the preceding one is never again referenced The primitive that implements reference

lo okup takes the current state variable as argument the primitive that implements reference up date

takes the current state variable as argument and generate a new state that is immediately b ound to

a fresh variable

All declarations are lexical

Those dynamic exception declarations that cannot b e mo delled by function sp ecialization are passed

around as part of the state

The Co de Form do es not contain any equivalent for Lambda expressions All function denitions are

installed at the top level ie their free variables must b e part of the top level The evaluation of a

Lambda expression is mo delled by the creation of a tuple that holds the lexical variables needed when

the function will b e applied To apply a function b oth this tuple and the name of the global function

are needed

The denition of atomicity is based on the numb er of reduction steps needed to evaluate an expression If

for a given expression the numb er of reduction steps is guaranteed to b e less than k that expression is called

k reducible Variables and literals are reducible Primitive function calls with n reducible arguments are

considered n reducible Applications are estimated to b e nm reducible where n is the reducibility

of the arguments and m the reducibility of the called function The reducibility of a conditional statement

is the maximum reducibilityofeach of its paths All expressions that contain no function applications have

a nite reducibility If an expression contains an application reducibility cannot in general b e statically

determined since the evaluation may not terminate A nonterminating expression would haveevaluation

time

In ac terms are deemed atomic if they are reducible

Abstract Syntax of CF

The abstract syntax of the Co de Form is as follows The code attribute of each abstract result contains a

Code expression

datatype Code Code of declsDecl list

exprsAtom list

stateStateBehaviour

A Code expression consists of a list of declarations a list of atomic return values and a description of the

state b ehaviour A Code expression is the direct equivalenttoaLFLet expression

datatype StateBehaviour

SBNIX

SBUSE of Var

SBDEF of Var Var

Since it is not known in advance whether a given function mo dies the state the code of all functions is initially generated

with state arguments and state results A later p ostpro cessing optimization removes all unnecessary function parameters

A reduction step is dened as one application of eval

expr

The was chosen b ecause under most function call conventions a call involves at least a jump a lo cal register setup

push a lo cal register restore p op and a return jump

Using this extreme denition syntactical means can b e employed to guarantee that all primitive and function arguments

are atomic The other extreme would b e to consider all expressions of nite reducibility atomic this would only prohibit nested

function applications but allow all other kinds of nested constructs

Again an abstract syntax is identied with its SML data typ e denition

Chapter Execution Trace Reconstruction

The compiler employs auxiliary functions to safely nest comp ose and lo cally optimize code expressions

To simplify these tasks some additional b o okkeeping information is provided to rememb er the name of the

input state variable and the output state variable of each piece of Code A Code expressions either

ignores the current state SBNIX reads it SBUSE or reads and mo dies it SBDEF

datatype Atom

AVar of Var

ALit of SVT

Atomic expressions are either variable lo okups or literal values SVT is the typ e of semantic values which

corresp onds to the domain Obj of the standard semantics cf section

datatype Expr

Atom of Atom

Prim of string Atom list

Select of Atom SVT Code list Code Option

App of Var Atom list

Throw of int Atom list

Catch of int list Code

Func of Var list Code

Expressions are either atomic conditional or function calls Exception handling is mo delled using explicit

catchthrow forms There are two dierenttyp es of applications Prim applications apply primitives iden

tied by strings to argument lists closures are applied using App For the co de of the called closure

App expressions do not b elong to the LF expressions asso ciated with closure ob jects but to the CF Code

expressions asso ciated with the call cache entries CCEs maintained by the abstract interpreter Each CCE

is identied by a tag this tag is referred to bytheinteger argumentofthe App

Finally there are function expressions These corresp ond to Lambda expressions in LF but are restricted in

that they may not contain nonglobal free variables

datatype Decl

Decl of vars Var list exprExpr

A declaration binds the return values of an expression to a list of variables

CF variables are typed to facilitate certain p ostpro cessing optimizations esp ecially variable splittingcomp ound

breakup cf section The typ es form a subset of the basic typ es of the semantics from the co de

generation p oint of view the typ es are generated automatically eachCFvariable is created within a context

in which the abstract value it is to hold is known the ac pro cedure that allo cates fresh variables generates

atyp e annotation out of this abstract ob ject The typ es will not b e mentioned in the further exp osition of

co de generation

Notation

In the following examples an abbreviated notation for Code fragments is used Instead of

Codedeclsd d exprsv v stateSB

n k something

a declaration list is written followed by list of return values the two elements are separated byavertical

line and enlosed in brackets

d

d

n

v v

k

The state b ehaviour is not explicitly noted

This information could also b e extracted bydataow analysis from the declarations The explicit form was chosen b oth

to sp eed up common op erations and to simplify debugging

All tags are implementedasintegers so that unique tags can cheaply b e generated bycounters

Chapter Execution Trace Reconstruction

Closures Tuples Combinators

LF function closures consist of three elements a lexical variable environment a list of typ ed parameters and

a b o dy expression From a given function ie LF Lambda expression anynumb er of dierent closures

can b e instantiated The only dierence b etween these instances is the variable environment parameter list

and b o dy expression are xed It is therefore sensible to separate the runtime representation of closures

into two parts a closure tuple and a combinator

Tuples are datastructures with a xed numb er of comp onents of arbitrary typ e In the following a convenient

notation for comp onent access will b e employed a closure tuple is created using the application of the

mktuple primitive to a list of variable names ie mktuplename name ift is b ound to a tuple

n

tname retrieves the value of name In the actual co de the names are replaced by indices and a

i i

lo okup function lookuptupletuple index accesses the tuple elds

Combinators are functions that do not contain free variables they are pure functions that can only combine

their arguments A combinator can b e transformed into a sequence of machine instructions applying the

combinator corresp onds to calling these instructions as as subroutine To translate a function into a

combinator its parameter list is extended by a formal parameter that will b e b ound to the tuple that

represents the lexical environment ie the closure All o ccurrences of free variables in the b o dy of the

function are then mapp ed to lookuptuple op erations on this tuple parameter

An alternativewaytoconvert functions into combinators is to provide each needed lexical variable as an

explicit parameter instead of bundling them all together in one tuple This is not done in ac b ecause it

would greatly increase the b o okkeeping eort in the AI phase of compilation it is b oth easier to implement

and faster to split the tuples later on when the resident program is created and other optimizations have

already minimized the co de size

TopLevel Bindings

Most programming languages make a distinction b etween toplevel declarations or global variablesand

lo cal declarations Languages like C do not even allow lo cal declarations for some entities functions typ e

declarations Even Scheme makes a distinction b etween toplevel bindings and other bindings

define and set are indistinguishable at the top level

What makes toplevel ob jects interesting is that they can b e assumed to b e xed ie they do not change

their address In pure functional languages toplevel ob jects are also constant ie they do not change their

content In contrast lo cal bindings usually exist in the context of functions their lifetime is restricted to

one invo cation of this function and they haveanewvalue and stack address up on eachnewinvo cation

Toplevel bindings can b e allo cated statically at compile time they need not b e deallo cated Global v ariables

can b e treated as literals in most of the co de since they are b ound to an unchanging value It is therefore

not necessary to include them in closure tuples Program analysis and implementation is simplied to a

great extent if all functions reside at the top level

All CF function declarations generated from call cache entries are lo cated at the top level Also literals that

are to o big to b e implemented as immediate values will b e represented under toplevel variable bindings

Array and listvalued literals fall under this ruling

Semantics of the Co de Form

The formal semantics for the Co de Form is much simpler then its Lambda Form equivalent The most im

p ortant dierence b etween them is that the Co de Form semantics mo dels exceptions by failure continuations

The typ es of the parameter list mayalsochange they are not restricted to b e xed Apart from overloading considerations

typ es can b e considered part from the b o dy they can b e converted into Lchecks

If the combinators are in CPS form applying acombinator corresp ondsto jumping to the start of its instruction sequence

Chapter Execution Trace Reconstruction

This makes it p ossible to sp ecify exceptions in a lo calized fashion in the Lambda Form semantics each

semantic function has to verify that the results of its subfunctions are true values if they are not ie

they are exceptional the semantic function needs to b e ab orted These details were partially hidden by the

dereference rules that were needed to cop e with reference ob jects Since dereferencing is made explicit in

the Co de Form exception checking would surface in the semantics if mo delled in the old way

Each semantic function takes two continuation functions the success continuation is called with the normal

result of the semantic function the failure continuation is called when an exception o ccurs

The second dierence b etween the Co de Form semantics and the Lambda Form semantics is the treatment

of state Co de Form expressions treat the state likeany other data ob ject co ding convention guarantees

that state usage is singlethreaded All computations are done on the semantic set Obj which includes

CF

state values

Obj Obj State Functions

CF CF CF

State tag RefDef

CF

Functions functionVar list Code Env

CF CF

Env Var Obj

CF CF

The Obj and RefDef typ es are b orrowed from the Lambda Form semantics cf c hapter Instead of

Lambda Form closure ob jects function ob jects are used

The semantics of a CF program is determined by the semantic function eval whichisoftyp e

CFprogram

eval Decl list State

CFprogram

The evaluation functions eval eval and eval describ e the semantics of Co de Form

CFexpr CFdecl CFco de

expressions declarations and Code pieces resp ectively They are of typ e

eval Env Cont Obj list State Expr State

CF Fail CF

CFexpr

eval Env Cont Obj list State Code State

CF Fail CF

CFco de

eval Env Cont Env State Decl State

CF Fail CF

CFdecl

All evaluation functions take a lexical environment and twocontinuation functions as context parameters

All failure continuation are of the same typ e

Cont tag Obj list State

fail CF

Success continuations are of typ e x State where x is the result of the evaluation function the success

continuation is called with this result when no exception is raised Expressions and Code pieces evaluate to

ob ject lists while declarations evaluate to environments

Since the exceptions have b een isolated all evaluation functions return or pass on valid ob ject lists instead

of Result s

In the following semantic environments are named E success continuations are denoted K and failure

continuations are named F

Programs

A CF program is a list of declarations one of whichmust dene a function called init

An alternative semantic mo del that was considered employed the notion of explicit stacks instead of continuations stack

p ositions would then b e recorded raising an exception would corresp ond to jumping up the stack Such a mo del reects

actual implementation practice but do es not simplify formal reasoning Likethecontinuationpassingsemantics that is now

employed a stack mo del needs to intro duce lab els ie continuation functions to put on the control stack so the semantics

wouldnt b e any smaller

If they return at all that is The Co de Form cannot guarantee terminationeven though the abstract interpreter must

terminate to generate the Co de Form program

Chapter Execution Trace Reconstruction

eval decl decl

n

CFprogram

eval E Fail Term Appinit closure state

global global global init init

CFexpr

where

E E E

global functs static

E namefunctionparamscodeE

functs global

idecl DeclvarsnameexprFuncparamscode

i

E namevalue

static

idecl DeclvarsnameexprALitvalue

i

Fail args erroruncatched exception

global

Term state state

global

closure E init

init functs

state

init

The global or static values must b e literals such as tables The static values and the functions comprise the

global environment each function has the global environment as its closure The actual program is run

byinvoking the init function with the initial state an empt y reference denition set Up on successful

termination the Term continuation will b e called with the result state

global

Atoms

The denition of eval employs an auxiliary function eval for atomic subexpressions

CFexpr CFatom

eval EKFAtoma Keval Ea

CFexpr CFatom

eval EAVarvar Evar

CFatom

eval EALitlit lit

CFatom

There is no need to pass continuation functions to eval since atomic expressions cannot raise ex

CFatom

ceptions It can b e syntactically guaranteed that that all variables are b ound therefore the variable lo okup

Evar cannot fail

Primitives

The b ehaviour of primitives shall not b e dened here ie appy is left unsp ecied it suces to say

CFprim

that all primitives are pure functions it is not p ossible to access the call cache or mo dify global state using

a primitive The essential primitives without which execution would b e imp ossible are describ ed in the

examples when they o ccur for the rst time

Most of the other primitives have the same semantics as under apply cf chapter

primitivestrict

eval EKFPrimsarg arg

n

CFexpr

Kapply seval Earg eval Earg

n

CFprim CFatom CFatom

Note that there is no order of argumentevaluation implied Atomic expressions cannot generate side eects

therefore the order of their evaluation need not b e sp ecied

Note also that primitives cannot fail there is no p ossibility to raise an exception from within a primitive

This b egs the question of how to mo del arithmetic primitives that can raise exceptions such as division which can

raise exceptions on underow overow or divisionbyzero In the ALDiSP arithmetic standard librarysuch op erations are

implementedby explicit tests Tokeep with the division example each use of the primitive division op erator is preceded by tests

for p ossible underows overows and zeroness of the second argument In anysuch case an exception is raised excplicitly

The reconstruction pro cess will recreate these tests in the Co de Formif the abstract interpreter has not b een able to remove

them as dead co de If the target machine supp orts an op eration that p erforms a division and generates extra exception

ags the backend co de generator cf chapter has to detect the opp ortunity to utilize this op eration to implement the more

general approach This is an instance of an idiom detection technique

Chapter Execution Trace Reconstruction

Conditionals

The Co de Form supp orts only one very general conditional It resembles the Lselect of the Lambda

Form

eval EKFSelectakey code key code code

n n default

CFexpr

if i key key then eval EKFcode

i val i

CFco de

else if code NONE then eval EKFcode

default default

CFco de

else error

where

key eval Ea

val

CFatom

The default is optional so that no dummy co de is needed if the compiler knows that the keys are exhaus

tive

No ordering is implied in the Select form if twokeys are identical either can b e chosen

Function Application

Function applications follow some conventions The rst argument is the functions closure the last argument

is the state variable if the function takes a state argument In b etween are the real arguments If the

function returns an altered state it will b e the last element in the return value list

The closures have the same form as in the Lambda Form semantics a closure consists of a lexical environment

and a Lambda Form expression This expression is ignored instead the integer index that is part of the

App no de is used to lo ok up a Code expression in the call cache

The closure need not b e analyzed by the semantics at all at the Co de Form level all functions are combinators

ie have no state The semantics only binds the arguments to the formal parameters this gives the

environment in which the functions b o dy is evaluated

eval EKFAppnamearg arg

n

CFexpr

let

funcv v code E Ename

n func global

i n val eval Earg

i i

CFatom

in

eval E v val v val KFcode

global n n func

CFco de

Throwing and Catching Exceptions

Throwing an exception is p erformed by calling the failure contin uation with the supplied arguments

eval EKFThrowtagarg arg

n

CFexpr

Ftageval Earg eval Earg

n

CFatom CFatom

The tag identies the exception that was thrown Catching exceptions involves the creation of a new failure

continuation When called bya Throw this function tests the tag that was thrown if it is not part of the

list of tags to b e catched the exception is rethrown

eval EKFCatchtag tag code

k

CFexpr

eval EKFcode

CFco de

where

F tagv v

n

if itag tag then Kv v

i n

else Ftagv v

n

In SML the typ e a Default is dened as NONE SOMEa The semantics is written in a metaSML in which the SOME

can b e dropp ed

In practice this should never happ en

For all practical purp oses the closure typ e can b e remo delled by a tagged tuple ie closure tag tag env

c

CF l

Chapter Execution Trace Reconstruction

Evaluating Declarations

Evaluating a declaration involves evaluating the expression of the declaration with a new success continuation

eval EKFDeclvarsvar var expre

n

CFdecl

eval EKFe

CFexpr

where

K v v

n

var KE var

n

nv v

When called with the results the success continuation will up date the old environment with the new bindings

and call its continuation with this new environment

Evaluating Code expressions

A Code expression is evaluated in two stages rst the declarations are sequentially evaluated resulting in

a nal environment Then using this environment the atoms are evaluated that make up the return values

eval EKFdecl decl atom atom

n k

CFco de

if n then

Keval Eatom eval Eatom

k

CF atom CFatom

else

eval EKFdecl

CFdecl

where

K E

eval EKFdecl decl atom atom

n k

CFco de

This concludes the semantics of Co de Form expressions

Relation b etween Abstract Results and Co de Attributes

This section discusses how and whyeach abstract Result gets its code attribute and what the semantic link

between them is

The approach to co de representation and generation that is taken in the developmentofac is an evolutionary

one Beginning with a standard semantics an abstract semantics is designed and implemented in the form of

an interpreter Co de generation is then added to this interpreter The basic restriction on co de representation

and co de generation is therefore the fact that the abstract interpreter do es not know ab out them

Toachieve this goal ie to work within the b oundaries of the abstract semantics attributes are attached

to the abstract semantic values Attributes must b e orthogonal to the abstract value mo del the evaluation

mo del totally ignores them The attributes contain not only the co de fragments themselves but also all

t information needed to keep the generated co de pieces coheren

During the abstract evaluation of the original LF program abstract Result s are generated It was already

mentioned in section that any domain of abstract values can b e extended with arbitrary orthogonal

attributes In the case of the ac abstract Result s there is such an attribute called code which carries a

value of typ e CodeFor a given abstract Result this attribute represents the co de that has to b e executed

at run time to generate the concrete result A second attribute atom whichhasanAtom as its value is

attached to each abstract Object

The TreatmentofState

The biggest conceptual problem in generating the code attribute is the treatment of state In the standard

semantics most evaluation functions return Result s that consist of an ob ject and a state Often the state

Chapter Execution Trace Reconstruction

will b e identical to the state that was passed to the semantic function as part of the evaluation context

sometimes the state has changed For the purp ose of this discussion a state is a set of reference value

denitions similar to an environmentora tupleofvalues There are only a few p ossible transformations that

can change a state during evaluation a new reference may b e added by a susp ension or promise creation

or an existing reference may b e up dated by forcing a promise Removing references and changing the

status of susp ensions is p erformed by the scheduler

The code attribute is intended to mo del the computation pro cess necessary to pro duce a runtime Result

An alternative approach is to attachacode attribute to each value in our semantics to each Obj As

a ma jor consequence of the p erResult attribution only one co de fragment is generated byeach abstract

semantic function

The current state is treated as a variable that denotes a set of tagged values A reference can b e accessed

via sp ecial primitives deref and update derefstate ref returns the value asso ciated with ref where

ref is a currently dened element of the set of reference tags and state is the current state variable

updatestate ref val creates or up dates a reference value The result of update is a new state

The state is supp osed to b e singlethreaded it is illegal to update a state twice or to deref a state that

has already b een up dated If a Code fragment do es not contain update applications and do es not call

sideeecting functions it is sideeect free

atom a Co de Attribute for Obj s

The atom attribute was intro duced to solve the problem of identifying values In the context of abstract

interpretation it is p ossible to generate values that are equal but not identical For example if a is b ound to

anInteger a and a are b oth anInteger but they are dierent When generating co de it is necessary

to know the expression a given ob ject is b ound to Therefore a second Co de attribute has to b e intro duced

the atom Each Obj has an atom attribute which describ es the atomic expression that will denote the Obj

at runtime The standard and abstract op erations on Obj sdonotchange they ignore the atom

The atom attribute is only useful when seen in the context of a code fragment One can visualize the relation

between atom and code as that of a p ointer to a no de in a directed graph most atoms will b e variable lo okups

which refer to their denitions in the code The code canbeviewed as a graph since the declaration list

of a Code fragment can b e arbitrarily reordered as long as the defuse relationships b etween the denitions

are fullled Since there are no implicit defuse relationships b et ween declarations the declaration list can

b e assumed to b e unordered

Example StraightLineCode

In the following co de generation shall b e traced stepbystep for the following ALDiSP program fragment

let

x delayy

in

y x x

end

This design decision was not taken lightly It to ok me quite some time to recognize that it is the Result s that have state and

must therefore b e annotated with the code attribute not the Obj ects Most published literature annotes the values b ecause

literature do es not describ e partial evaluators that have to cop e with state exceptions and multiple results

There is no practical dierence b etween creating and up dating a reference There is only a xed set of reference tags no

new references can b e generated at runtime The storage for reference denitions is therefore allo cated statically at compile

time

The precision of the abstract equality test can b e improved since ob jects with identical atom must b e identical

Chapter Execution Trace Reconstruction

This example while rather tiny exercises co de generation for literals primitive applications a function

application tuple creation all op erations related to references and the handling of declarations

We assume that y is a variable dened in some outer scop e and b ound to a nonliteral value This value

has an atom attribute which shall b e called atom Let the initial state b e b ound to the CF variable

y

stateinitial

The ALDiSPtoLambda Form transformation cf chapter will have generated an LF expression

let

x delaylambda y

tag

in

y x x

end

Evaluation starts with eval whichcallseval to evaluate the declaration of x eval calls

decl decl

exprLet

eval on the delay application eval calls eval to evaluate delay and its one argument

expr exprs

exprLapp

the Lambda expression eval Litdelay gives rise to the rst Result

expr

delay

The atom attribute of the returned delay ob ject is the CF atom literal ALitdelay or simply delay

eval gives rise to a closure containing the binding of its free variables y and The semantics

expr Lambda

mentions an auxiliary function free that determines what variables are closed over free do es not knowthat

y is a global function and will include it in the closure For co de generation purp oses a dierent function

freecodegen determines the variables b ound in the code expression Since the Lambda has no arguments

the eval application that evaluates the argumenttyp es calls its continuation immediately and creates

exprs

no co de The code asso ciated with the Result is mktupleatom since tuple creation is not atomic a

y

variable tmp is intro duced The full code is

tmp mktupleatom

y

tmp

The Closure ob ject has tmp as its atom attribute Note that there is no connection b etween this co de and

the co de generated earlier for the delay primitive The dierent co de attributes havetobecombined later

on

Now eval nishes and calls its continuation function whichpoints backinto eval Here

exprs

exprLapp

deref is rst applied to the function argument Since delay is not a reference eval calls apply

ob js expr mappable

to do the actual application

apply delay calls test to determine whether the application has to b e mapp ed test

mappable apply apply

decides that delay is a primitive and calls test which returns true since all primitives are

apply primitive

assumed to b e dened on all arguments apply then calls applywhich in turn calls apply

mappable primitive

which lo cates delay in its list of nonstrict primitives and calls apply Finally the semantic

primitivenonstrict

action asso ciated with delay is p erformed a new reference is allo cated in the currentstateanUprom

cf chapters and containing the closure is stored in that reference the tag that addresses the

Without syntactic sugar the transformed expression lo oks likethis

LetLappLprim

LappLprimLvaryLvarx

Lvarx

Ldecltrue x LappLprimdelay

LambdatagLappLprimLitLvary

This is hard to read and adds no relevant information In all following examples a hybrid syntax will therefore b e employed

Asaconvention large semantic functions like eval are group ed into rule sets relating to the kind of expression they

expr

take as their rst argument

eval calls eval indirectly via eval Such small evaluation steps and indirection will often b e omitted in the

expr exprs

decl future discussion

Chapter Execution Trace Reconstruction

reference in the state is returned as Result The tag in the following called tag is a literal and has as

delay

itself as atom

For each primitive function application a corresp onding Code ob ject is created The delay function emits

co de that consists of an update primitive The result of the update is a new state for whicha new

variable has to b e intro duced

state updatestate tag tmp

new initial delay

tmp

Note the missing declaration for tmp which is currently not visible The update reects the fact that the

closure tmp has b een stored in the state it do es not make explicit that it is stored in an Uprom This

information is not needed at the co de generation level Likewise the allo cation of a reference tag is not made

explicit b ecause the tag is not an entity that changes at runtime

apply returns its result via apply apply and apply to the second contin

primitive mappable

primitivenonstrict

uation of eval Thiscontinuation returns to deref which returns to the rst continuation of

ob js

exprLapp

eval which returns to eval

exprs

exprLapp

At this p oint co de fragments are combined for the rst time The necessity for this can b e seen by lo oking

at the typ e of eval itevaluates expressions to results but calls its continuation function with a list

exprs

of ob jects Since results carry a co de attribute which ob jects dont have there is a loss of information in

passing along the ob jects to the continuation When the continuation returns this information has to b e

added again to the nal result otherwise the co de would b e lost

This is done by concatenating the declaration lists of the co de attributes of the Result sthatwere computed

and prexing them to the result that is passed back eval calls itself recursively for each elementin

exprs

the expression list it evaluates each of these calls strips the Obj from one Result and puts the Result s

declaration list back up on return In our example eval has evaluated two expressions of which the

exprs

rst delay has an empty declaration list the second the Lambdacontains one declaration eval

exprs

returns its Result still the reference tag that denotes the Ususp with the co de attribute

tmp mktupleatom

y

state updatestate tag tmp

new initial delay

tmp

which returns to the eval that was called from eval Again the ob ject is stripp ed to eval

exprs decl

exprLapp

from the result and passed along to the rst continuation of eval Thiscontinuation creates a singleton

decl

environmentinwhichthevariable x not the atom of the ob ject to b e b ound but the ALDiSPlevel variable

is b ound to the tmp ob ject eval then calls that continuation that was passed down to it whichisthe

decl

rst continuation eval

exprLet

Wenow start to evaluate the yxx expression by calling eval with it

expr

eval calls eval for its three arguments yxandx eval calls eval which

exprs exprs

exprLapp exprLit

returns a Result that denotes the primitive eval strips the Obj from this Result and recurs The

exprs

next incarnation of eval invokes eval on the yx expression which calls eval with the

exprs exprs

exprLapp

functionargument list yx The second evaluates identical to the rst

When applied to y eval accesses the value of y and generates co de for it This co de is simply y

exprLvar

we assume that the ob ject to which y is b ound has y as its atom The atom attribute of the Obj is therefore

y

Evaluating Lvarx works identical to Lvary and returns the reference tag ob ject that denotes the Ususp

that was constructed earlier on eval calls the continuation of eval which calls deref with

exprs ob js

exprLapp

the function which calls apply which calls applywhich calls apply

mappable primitive

In real life is overloaded for p edagogical purp oses this example sticks to primitives Following the execution trace

through the convoluted hierarchyof apply functions is bad enough

Chapter Execution Trace Reconstruction

Since is a strict primitive deref is called with the values of y and x Since the value of x is refers

ob js

to a Uprom deref calls eval with the closure of the Uprom and the current state Just to b e sure

ob js thunk

eval rst pip es the closure through another instance of deref this is needed to cop e with nested

thunk ob js

references A CF declaration is generated to mo del the state lo okup

tmp derefstate tag

new delay

tmp

Then it calls apply with an empty argument list Without any testing apply calls apply There is no

closure

call cache entry that corresp onds to the closure therefore a new one is intro duced The closures environment

is extended with the empty parameter binding list and eval is applied to the b o dy expression y

expr

Skipping over evaluation steps that would b e similar to the evaluation of the delay the result is an

abstract value with code

tmp lookuptupletmp y

y tuple

tmp tmp

y

tmp state

input

Parameters tmp and state are a newly generated up on each closure application and intro duced into

tuple input

the parameter list at the rst and last p osition resp ectively It is assumed that each function can mo dify the

state therefore a state parameter and return value is always generated Also it is assumed that all function

take a closure argumet If a function turns out to b e a combinator or to not access any of its lexically b ound

variables the tuple parameter is removed by a later optimization as is the state parameterargument when

the function is sideeect free

This Result and its code is memoized in the call cache entry apply generates a new code fragment

closure

that represents the call

tmp state Apptag tmp state

newer cce new

tmp

The tag is the call cache entry tag of the application This is later needed to retrieve the co de It is not

cce

p ossible to inline the co de yet since recursion mightbeinvolved

While apply knows that no state change has happ ened it has to create a new state variable and bind

closure

it The same function might after all change the state when called in another context

apply also intro duces a new variable tmp to hold the result of the application and changes the atom

closure

of the result ob ject to tmp The state is passed into the application and the right side of the application

contains a multiple denition for b oth a fresh state variable and the result Con trol returns from apply

closure

to apply which pip es the result through norm to asserts that it is not an exception

thunk result

norm returns the Result to the deref that had initiated the Uprom evaluation The Uprom is replaced

result ob js

byanEvRef in the current state The state lo okup co de is added at the b eginning and the state up date

co de is added at the end of the Uprom co de

tmp derefstate tag

new delay

tmp state Apptag tmp state

newer cce new

state updatestate tag tmp

newest newer delay

tmp

norm then calls its continuation with the normalized Obj list whose atoms are ytmp That contin

result

uation pro ceeds with applying to the normalized arguments apply and eval return with the code

exprs

tmp atom tmp

y

tmp

Here a small change to the semantics is in order norm returns a Obj State tuple instead of a result ie it strips

result

a Result It is not p ossible to reattach the stripp ed declarations to the atom of the Obj since norm do es not takea

result

continuationBychanging the range of norm to Result this problem disapp ears norm o ccurs three times in the

result result

semantics each of these is easily repaired This change in the semantics do es not change its b ehaviour It is still regrettable

that the semanticshadtobechanged to accomo date the co de generator

Chapter Execution Trace Reconstruction

The second p ending invo cation of deref attaches the dereferenceapplyup dating co de to the result and

ob js

returns the rst p ending invo cation passes the result directly back

This nishes the evalexprLapp of the inner addition Control returns to the outer additions eval

exprs

which encounters the variable x and evaluates it to its reference value Skipping the next few steps evaluation

arrives at the next applyprimitivestrict Again deref is called with two arguments of which the

ob js

second is a reference with x as its atom This time however the reference is b ound to an EvRef The co de

that mo dels the lo okup is generated

tmp derefstate tag

newest delay

tmp

and the Obj that was stored in the state gets tmp as its new atom the old one tmp is still visible in the

current context but the co de generator cant know this deref calls its continuation which nishes the

ob js

application and generates the code fragment

tmp tmp tmp

tmp

Control returns to deref which attaches the deref preamble and returns the code

ob js

tmp derefstate tag

newest delay

tmp tmp tmp

tmp

to the enclosing second continuation of eval This returns through deref which attaches the

ob js

exprLapp

forceapplyup date code generating

tmp derefstate tag

new delay

tmp state Apptag tmp state

newer cce new

state updatestate tag tmp

newest newer delay

tmp atom tmp

y

tmp derefstate tag

newest delay

tmp tmp tmp

tmp

Finallycontrol returns to the contin uation of eval which returns through eval to

exprsLet declLdecl

eval where the missing declaration for tmp is prexed Evaluation nishes with a code attribute

exprs

that is

tmp mktupleatom

y

state updatestate tag tmp

new initial delay

tmp derefstate tag

new delay

tmp state Apptag tmp state

newer cce new

state updatestate tag tmp

newest newer delay

tmp atom tmp

y

tmp derefstate tag

newest delay

tmp tmp tmp

tmp

This concludes the rst example The co de that was generated is somewhat inecient giving ample moti

vation for the optimizations that are presented next

InterFunction Optimizations

The rst group of optimizations that shall b e considered is concerned with optimizing function calls ie

Apps by either mo difying a functions callreturn interface or by outright eliminating the application

Chapter Execution Trace Reconstruction

Inlining

Inlining of CF function applications is an obvious optimization The decision to inline a sp ecic function

application is deferred to the reconstruction phase Too much inlining can lead to co de explosion accompa

nied by an increase in compilation time that is more than linear in the co de size There exists no optimal

inlining strategy since there is no direct spacetime tradeo once a function application has b een inlined

which will have increased the co de size many other optimizations which might reduce the nal co de size

b ecome p ossible The heuristics that decide up on inlining in ac are fairly obvious

All applications to trivial functions are inlined An application is deemed trivial if the inlined co de is

of the same size or smaller than the co de which it replaces ie the function application The size

of a piece of co de is measured by its reducibility

If a function is applied only once it is inlined

If a function is applied only once in a given execution path and if its size is less than it is inlined

If a function is applied in the lexical context of its closure tuples creation and if its size is less than

it should b e inlined This avoids unneccessary tuple creation and lo okup op erations

The mechanism of inlining is trivial given a call cache entry

CallCachetag decl decl expr expr p p

f j k n

and a function application

res res Apptag a a

k f n

Then the application can b e inlined as

p a

p a

n n

decl

decl

j

res expr

res expr

k k

The inlining mechanism do es not substitute an a for each o ccurrence of a p Eliminating the p a decla

i i i i

rations is considered to b e a separate optimization implemented as a p ostpass

As part of the inlining all variable names that are lo cal to the residual co de this includes the parameter

names are mapp ed to new unique names This guarantees that no nameclashes can o ccur when a function

is inlined more than once in the same code expression

Assuming that the App in the example is inlined the co de will b e

That is a function is trivial if its b o dy has a reducibility

The reducibility do es not necessarily corresp ond to the nal assembly co de size For example replacing a variable with a

literal value might largely increase the size of a program if the literal is not atomic eg if it is an array The currentbackend

folds multiple copies of the same literal into on representation so that this sp ecic eect can b e ignored in the reconstruction

phase

The numberisanobviously arbitrary heuristic it can b e changed at anytime

Again an arbitrary heuristic

Chapter Execution Trace Reconstruction

tmp mktupleatom

y

state updatestate tag tmp

new initial delay

tmp derefstate tag

new delay

state state

input new

tmp tmp

tuple

tmp tmp y

y tuple

tmp tmp

y

state state

newer input

tmp tmp

state updatestate tag tmp

newest newer delay

tmp atom tmp

y

tmp derefstate tag

newest delay

tmp tmp tmp

tmp

The renaming of variables is indicated by the ticksymb ol ie tmp has b een renamed into tmp

The example presumes that the transformation whichintro duces the explicit tuple closure parameter cf sec

tion has already b een applied to b oth the example code and the call cache entryThetmp variable

tuple

holds the tuple tmp holds the value from y that is extracted from the tuple

y

Grouping

A problem closely related to inlining is the grouping of functions in the call cache Given a function f that

is called times during the execution of the program howmany instances of the function shall exist in the

nal co de In the call cache there will exist a distinct residual co de attribute for each of these calls It

is often p ossible to share one co de representation for a group of these residuals This do es not improve the

co de quality ie it do es not increase the run time sp eed but it decreases the global co de size Since co de

explosion is one of the gravest concerns of partial evaluation the grouping of residuals is very imp ortant

Grouping can b e much improved if small changes to the residuals are allowed

Again some heuristics are used

If a subset of the applications generate the same mo dulo renaming residual co de this subset can b e

trivially group ed

If there is a set X of applications that dier only in the values of a literal c which has the same basic

typ e in all residuals in X then c can b e replaced bya variable which is added as an extra parameter

to the function interface No eciency is lost since there exists no p ossibility for valuedep endent

optimizations all such optimizations have already b een p erformed by the abstract interpreter

Tabulation and Caching

Functions can b e represented by tables when their domain is known at compile time and they p erform no

side eects The most natural candidates for such a representation are functions of one or two argumen ts

when the arguments are known to b e restricted to a small enumerable typ e Such functions corresp ond to

vectors and matrices Tabulation is only useful if the tabulated function p erform a nontrivial computation

The time needed to compute a functions must b e compared with the size of the nal table and the time for a

lo okup op eration In the case of connected argument sets ranges of integers with no holes cost of lo okup

can b e approximated as one multiplication and one addition for each argument plus a memory lo okup

Heuristics guiding tabulation are

The dierent calls mighthave b een mapp ed together if they have exactly the same arguments but even so simple a dierence

as that b etween fanInt and fanInt might cause the creation of dierententries

Chapter Execution Trace Reconstruction

If the index set has more than elements no tabulation should take place

If the computed function is k reducible with k no tabulation takes place is of course an

arbitrary choice reducibility mightbeweighted for the dierent costs of primitive functions

If the index set is not a numeric typ e a lo okup function has to b e generated that either p erforms

a structural search or employs some kind of hashing This increases the cost of tabulation to a

logarithmic numb er of comparison op erations and LcondLselect no des if timing is considered and

is linear in co de size

If the index set is very small tabulation should always b e done

A function is tabulated by replacing its b o dy with a Select expression Each table entry corresp onds to one

arm of the Select there is no default case This table representation gives the backend the most exibility

in representing the table

Elimination of Parameters and Return Values

Parameter eliminationremoves parameters of a function denition if these are never used in its b o dySuch

parameters may o ccur due to function sp ecialization ie whenever a parameter is only used in a conditional

branchthatwas removed b ecause the condition was known To b e correct all calling sites of the function

must b e identied and the arguments that match the eliminated parameter must b e removed This may

intro duce new dead co de in the calling sites

If a function returns a value that is statically known either b ecause it is a parameter to the function or

b ecause it is a literal said value can b e eliminated from the functions expression list the variables that

are b ound to this parameter at the calling sites have to b e reb ound to the statically kno wn value This

optimization sp ecically addresses state variables that are passed through functions unchanged As a side

eect this may remove the last use of a parameter thus making it eligible for parameter elimination Return

values can also b e removed if they are never used

Variable Splitting

If a function accepts a parameter that is b ound to a comp ound value ie a tuple or the state and this

function do es consume the parameter without passing it on and if only a subset of the elementofthe

comp ound are accessed then it is an optimization to break the comp ound into its parts and transform

the comp ound parameter into a set of part parameters The generalized application of this transformation

amounts to splitting variables that hold comp ound values into groups of variables each of whichholdsa

subvalue of an atomic typ e

This optimization is targetted at functions that access reference values from a state but do not change the

state It also serves to break up closure tuples thus avoiding the mktuple creation If the lifetimes of

comp ound parameters dier breaking them up may optimize their allo cation b ehaviour since a comp ound

has the lifetime of its longestliving comp onent

Comp ound breakup for state variables do es not gain as much from a variablepassing p oint of view since

the backend will implement reference variables in terms of imp erativevariables this is made p ossible by

the singlethreadedness of the state However the function sharing optimization b ecomes feasible

In any case generating such co de is nontrivial and p ossibly not worth the eort ac do es not consider generating co de for

noninteger index sets

The alternativewould b e a direct implementation as a global array literal

Chapter Execution Trace Reconstruction

Basic Optimizations

The co de that was generated for the example suers from obvious ineciencies This section shows some

analysis and transformation steps that will remove most of them Most of these steps are part of the

real reconstruction phase they cannot b e p erformed ere the full program has b een interpreted These

optimizations are called basic b ecause they work on the level of the Code expression ie they do not

consider the interface b etween functions

Variable Propagation

The example co de shows many o ccurrences of variable renamings ie denitions of the form xySucha

denition can b e transformed bychanging every succeeding use of x into a use of y Due to this change x

will b ecome a dead variable and its denition dead co de Variable propagation mayhave to b e iterated

since successive renamings of the same value havetobeconsidered

In the inlined example variable propagation applies to the set state tmp state tmp

input tuple newer

The variable state gives an example for iterated renaming After propagation the co de is

newer

tmp mktupleatom

y

state updatestate tag tmp

new initial delay

tmp derefstate tag

new delay

state state

input new

tmp tmp

tuple

tmp tmp y

y

tmp tmp

y

state state

newer new

tmp tmp

state updatestate tag tmp

newest new delay

tmp atom tmp

y

tmp derefstate tag

newest delay

tmp tmp tmp

tmp

DeadCo de Removal

For eachCodeForm co de fragment the output consists of the set of expressions plus the last state Any

variable binding that do es not contribute to the output is not alive it can b e eliminated as dead co de

The sole exception to this is the state variable which is an considered an implicit output value

The most simple of dataow analyses can b e p erformed to compute the set of living variables if a variable

o ccurs in the output list it is alive If a variable is part of an expression that denes a alivevariable it is

alive itself

For the example in its inlined and forwarded form deadco de analysis shows that the variable set ftmp

tuple

tmp state state g is not aliveany more therefore their dening declaration can b e removed

input newer

resulting in the co de

Basic do es not refer to basic blo cks

Chapter Execution Trace Reconstruction

tmp mktupleatom

y

state updatestate tag tmp

new initial delay

tmp derefstate tag

new delay

tmp tmp y

y

tmp tmp

y

state updatestate tag tmp

newest new delay

tmp atom tmp

y

tmp derefstate tag

newest delay

tmp tmp tmp

tmp

Later transformation steps might create new dead co de and variable renamings these twostepshavetobe

rep eated a few times Since they are cheap to run mostly linear this do es not imp ose any costs

Reference Tracing

A sp ecialpurp ose optimization can b e used to trace updatederef calls a deref can b e removed ie

replaced byanidentity for the up dated atom if the matching previous update is visible

An update can b e removed ie replaced by an identity for the state if there is no following derefand

a following update on the same reference tag is visible

These analyses will remove the rst update and b oth derefs from the example co de whichthenis

tmp mktupleatom

y

state state

new initial

tmp tmp

tmp tmp y

y

tmp tmp

y

state updatestate tag tmp

newest new delay

tmp atom tmp

y

tmp tmp

tmp tmp tmp

tmp

Many new variable identities are intro duced during this step propagation and deadco de elimination leads

to

tmp mktupleatom

y

tmp tmp y

y

tmp tmp

y

state updatestate tag tmp

newest initial delay

tmp atom tmp

y

tmp tmp tmp

tmp

Tuple Tracing

In analogy to reference tracing tuple lo okups can b e removed if the matching mktuple creation is visible

Since most tuples are generated by lamb da applications tuple tracing is really an analysis of static variable

binding times This pro cess removes references to the tuple and may lead to them b ecoming dead Removing

tuple lo okups from the example changes the denition

Matching means referring to the same reference

Chapter Execution Trace Reconstruction

tmp tmp y

y

into

tmp atom

y y

This intro duces another variable renaming it also removes the last use of tmp Propagation and deadco de

removal leads to

tmp atom

y

state updatestate tag tmp

newest initial delay

tmp atom tmp

y

tmp tmp tmp

tmp

A formal framework for reference and tupletracing that go es b eyond mere lo cal scop e optimizations can b e

found in the path semantics of Bloss which is employed in and to gain usage information

allowing safe destructive up dates for arrays or other structures

This concludes the optimizations that are implemented in ac and apply to the example To free the example

from its last unnecessary update a nonlo cal analysis is needed since the reference tag might b e used

delay

somewhere else in the program This is imp ossible in the example since xtowhich the reference was b ound

is not b ound to a susp ension and was never exp orted from its dening scop e but that information is lost

at this stage of reconstruction If x were b ound to a susp ension instead of a promise removing its denition

would change the seman tics since susp ensions mightwakeupeven when there are no references to them

this is not true for promises

Example Blo cking Co de

When a strict primitive encounters a reference argument that p oints to an unevaluated susp ension the

computation blocksaBlock is created and a reference to it is returned The handling of Blocks is one of

the more complex tasks of the reconstruction phase since it involves knowledge ab out the internal structure

of the semantic functions Since the treatmentofBlocksissodierent from the treatment of the other

susp ensions Ususps Wsusps etc an extra example is needed to demonstrate the generation of co de for

Blocks First however an indepth discussion of Blocks is in order

Whats in a Block

A Block is a reference ob ject similar to a p ending susp ensios ie a UsuspWsusp Blocks are simpler

than an ordinary susp ension in that their condition is not an arbitrary closure instead it is xed to

isAvailableref aBlock expresses a directly datadep endency some expression has blo cked b ecause

it needed access to a p ending susp ension Blocksalsointro duce extra complications since they allows no

introsp ection while ordinary susp ensions contain Lambda closures Blockscontain semantic continuations

ie a closure of an interpreter function

The LF semantics is denotational by denition but not in its spirit the idea b ehind a denotational semantics

DS is the asso ciation of languagelevel concept with their intuitive mathematical counterparts For

example numeric ob jects should b e mo delled by real or rational numb ers and functions by continuous

Some other expression contexts also enforce strictness the condition of a Lcond or Lselect is strict as are the typ es in

function argument lists and LcheckLcast no des Also function arguments are strict if they are needed to resolveoverloading

It mighthave simplied the semantics if the transformation phase had intro duced a strictidentity primitiveinto the Lcond

and Lselect no des Such an approach could however not b e used to cop e with the selective strictness needed for function

arguments and parameter typ es

Chapter Execution Trace Reconstruction

functions This is a ne approach to language mo delling since the whole p ower of standard mathematical

notation can b e employed in describing language features but it creates a problem when ecientcodehas

to b e generated In eect a co de generator for the mathematical theory that underlies the denotational

semantics has to b e written Denotational semantics is for all practical purp oses a typ ed lazy functional

programming language Muchofthepower of DS stems from the data typ es it provides rst amongst them

inniteprecision numb ers p ossibly innite sets and functions that need not b e dened constructively

It may b e harder to create a correct DSbased co de generator than it would b e to directly translate the

language which is mo delled by the DS While numb ers and other transparent ob jects do not cause

great problems functions cannot b e directly mapp ed to most machines Amongst other things functions

are opaque semantic ob ject typ e if a functions domain is not nite there exists no guaranteed way

to describ e a function using nite metho ds it is not even p ossible to compare them for equality While

programming language functions are build from discrete comp onents mathematical functions can b e dened

in anynumber of ways

The LF semantics tries to avoid opaque ob jects whenever p ossible Functions are therefore mo delled as

co deenvironment tuples in which the active part apply is implicit and the same for each function

closure

Susp ensions are not mo delled as p ending continuations of the semantics but as user functions The only

exception to this are the Blocks Sections and showtwo alternative strategies of implementing

Block s that avoid semantic continuations and give the reasons why they were not chosen

What exactly is a Block First and foremost the encapsulation of a semantic function continuation There

are ab out o ccurrences of in the semantics most of them are motivated by the p ossibilityofa Block

ie these are the p ossible entry p oints for a Block Just like LF closures semantic closures contain lexical

variable binding The variables are the parameters given to semantic functions and the intermediate

denitions used within the functions A typical example for suchavalue is the partiallydereferenced list

that is b ound in the continuation of deref

ob js

Alternatives to Blo cks CPS

The rst alternative to the current Block implementation is the adoption of continuationpassingstyle CPS

This approach also solves all problems related to exception raising A CPS semantics was considered but

is not used b ecause co de generation for such a semanticsdoesfaceanentirely new problem In a CPS

semantics there exist one explicit continuation function for nearly every nonatomic expression During

execution this leads to the creation of zillions of closures one for eachcontinuation function Indeed

CPS closures represent rstclass stack frames of nonCPS implementations The dierence to ordinary

second class stack frames is that the latter have to b e allo cated and deallo cated in stacked order while

closures can b e kept around for an arbitrary amount of time Therefore closures must b e allo cated on the

heap instead of on a stack

Mainly for this reason SMLNJ which is based up on a CPS execution mo del allo cates ab out word

bytes every instructions executed pp SMLNJ achieves acceptable p erformance only due to

its aggressive garbage collection scheme which requires an amount of memory at least times the size of

programs have to run on sp ecialized DSP the active set When not simulating on a workstation ALDiSP

hardware in realtime suchanenvironment cannot supp ort garbage collection It might b e p ossible to

Of course only one co de generator has to b e written for the DS so the time needed to write a co de generator for the

target language has to b e compared to the time needed to set up the DS of the target language Literature describ es some

compilercompilers that directly try to generate co de from the DS most of these can only generate adequate co de when provided

with a DS that employs to olsp ecic conventions cf

Besides the problems related to co de generation for Block creation and resumption that will b e discussed now there are

consequences of the current Block implementation for the scheduler these are describ ed in section For these reasons

blo cking computations are to b e considered the biggest misfeature of the language

Many CPS runtime systems store closures on the stack and only exp ort them to the heap when they escap e their lexical

scop e Such an optimization is suitable whenever most closures b ehavelike ordinary stack frames

Chapter Execution Trace Reconstruction

eliminate the need for runtime creation of most closures by compiletime allo cationreclamation schemes

this is already done in ac but I am not sure whether all closures intro duced by a CPSsemantics can thus

b e eliminated

Alternatives to Blo cks Syntax Transformations

The second approachtoavoid Blocks consists of a global syntactic transformation any p ossibly blo cking

expression is transformed eg any conditional

if x then y else z end

would b e transformed into

let tmp x

cond

in

if isAvailabletmp then

cond

if tmp then y else z end

cond

else

suspend

if tmp then y else z end

cond

until isAvailabletmp

cond

within ms ms

end

or to reduce the co de size if y or z are huge expressions

let tmp x

cond

tmp delaylambdaif tmp then y else z end

expr cond

in

if isAvailabletmp then dereftmp

cond expr

else

suspend dereftmp

expr

until isAvailabletmp

cond

within ms ms

end

There is also the p ossibility of a direct transformation without an extra isAvailable test

let tmp x

cond

in

suspend

if tmp then y else z end

cond

until isAvailabletmp

cond

within ms ms

end

This is how ever incorrect in that it intro duces extra indeterminism the expression

let x if true then else

in

if isAvailablex then

else

end

evaluates deterministically to under the standard interpretation but could deliver or after the trans

formation into

We assume that a translator would not b e so braindead as to ask isAvailabletrue

Chapter Execution Trace Reconstruction

let tmp true

cond

x suspend

if tmp then else

cond

until isAvailabletmp

cond

within ms ms

in

if isAvailablex then

else

end

The correct transformations if p erformed naivelywould blow up the size of every program by a factor of

or more if done intelligently eg byavoiding rep eated availability tests for the same variable this could

b e reduced to a mere However it still smells of syntactic overkill

There are also some cases that cannot b e transformed correctly bysyntactic means in a prepro cessing phase

namely those that are datadep endent To pick just one example the typ es of arguments to a function must

b e known whenever the function is overloaded since overload resolution works by testing argumenttyp es

Hence some arguments of some functions can blo ck applications When functions are passed around as

arguments it is not p ossible to statically determine whether a variable is b ound to an overloaded function

or not

The Blo cking Example

Since the interior of a Block cannot b e accessed it is not p ossible to directly synthesize a reasonable co de

attribute for it Instead some dummy co de is generated and rened in a p ostpro cessing step The

following example shall demonstrate this This example is not explained in the detail of the rst one the

exp osition concentrates on the problem areas

Consider the ALDiSP program fragment

let

x suspend y until true within ms ms

in

xy

end

which translates to the LF co de

let

x suspendLambday

Lambdatrue

msms

in

xy

end

the rst part up to the x will translate into the declaration fragments

tuple mktupleatom

expr y

tuple mktuple

cond

tmp suspendtuple tuple

susp expr cond ms ms

state updatestate ref tmp

new old susp susp

After evaluating the literal the rst reference to x and the literal eval calls apply to

primitives

exprLapp

apply the apply invokes derefobjs to reduce its arguments to normal form since is strict

primitives

As the rst example has shown these declarations are distributed over many p ending instantiations of eval and

exprs

deref For didactic purp oses we consider them as a whole set of already generated co de pieces ob js

Chapter Execution Trace Reconstruction

Now the reference ref the result of evaluating xisencountered deref blo cks the return value is

susp ob js

a new reference ref b ound to a Blockref f susp ension The function f encapsulated

blo ck susp magic magic

in the Block is a continuation of the semantic function apparatus itself and cannot b e further analyzed It

can howeverbemodelledasifitwere an ordinaryifunknown Lambda expression

When later on the blo cked evaluation thread continues this takes place in an evaluation context similar to

the current one basically it is the same context but with the thencurrent state Any co de that will b e

generated for the continuation will therefore need the currentcontext Co de that mo dels the preservation of

the currentcontext must therefore b e created In analogy to the standard closure creation this is mo delled

byamktuple application that binds the current lexical context In analogy to suspendablock primitive

mo dels the creation of a blo ck ob ject that is stored in the state

tuple mktupleatom

expr y

tuple mktuple

cond

tmp suspendtuple tuple

susp expr cond ms ms

state updatestate ref tmp

new old susp susp

tup mktuple

blk

tmp blockref tup

blk susp blk

state updatestate ref tmp

newer new blk blk

ref

blk

The problem is to determine the set of variables that have to b e captured in the mktupleAt the p oint

where it is generated ie in the semantic functions block and deref no information is available ab out

ob js

the variables captured in the Block The only upp er limit is set by the fact that the captured variables

must come from the currentCodeForm context The set of lexical variables is determined syntactically

by collecting all free variables of the Code expression up to the p oint of the block This list of b ound

variables is not readily accessible at the p oint where the Block is created therefore a hack is employed the

mktuple application is initially left empty whenever a Code ob ject is passed upward and recombined

with other Code ob jects any mktuple that serves a block is patched with the CF variables that b ecome

visible through the recombination

tinues with the evaluation of y followed by the application of Skipping details deref Evaluation con

ob js

will encounter the reference argument that p oints to tmp Thisgives rise to a second Block b ound to a

blk

new reference ref and with co de

blk

tuple mktupleatom

expr y

tuple mktuple

cond

tmp suspendtuple tuple

susp expr cond ms ms

state updatestate ref tmp

new old susp susp

tup mktuple

blk

tmp blockref tup

blk susp blk

state updatestate ref tmp

newer new blk blk

tup mktuple

blk

tmp blockref tup

blk blk blk

state updatestate ref tmp

newest newer blk blk

ref

blk

Evaluation control returns to eval which creates no co de of its own and simply terminates When

exprlet

the enclosing function of which the example is a subexpression terminates the two mktuple creations are up dated to

Chapter Execution Trace Reconstruction

tup mktupletuple tuple tmp

blk expr cond susp

tmp blockref tup

blk susp blk

state updatestate ref tmp

newer new blk blk

tup mktupletuple tuple tmp tup tmp

blk expr cond susp blk blk

When abstract time will haveadvanced by milliseconds the abstract scheduler decides to evaluate the

Ususp b ound to ref This happ ens at the statefunction level ie the evaluation and the following

susp

up date make up a state transformation It is guaranteed that all susp ensions are only evaluated once so the

co de generated for susp ension evaluations and blo ckcontinuations can b e inlined and makes up the entire

state transition function

eval is called with the LF function y and the arguments tuple and state the thencurrent

thunk expr later

state The tuple was found by lo oking up the ref in the state The optimized inlined etc result

expr susp

has as co de

tup lookupstate ref

susp later susp

tmp tup y

y susp

tmp

y

The atom that was asso ciated with the abstract ob ject b ound to y is discarded An update is generated

for the ref due to the inlining it can b e directly app ended to the thunk co de

susp

tup lookupstate ref

susp later susp

tmp tup y

y susp

state updatestate ref tmp

later later susp y

state

later

The result of the state transforming function is sp ecied is the new state

The abstract scheduler can nowevaluate the rst Block Itdoessoby creating a preamble that sets up the

variables caught in the tuple and calling the semantic continuation with the currentstatestate

blo ck later

The continuation incorp orates this state into the context that was active at the blo cking p oint and continues

evaluation of the blo cked semantic function in this case deref of the ref as if nothing had happ ened

ob js susp

This reference is now b ound to the abstract ob ject of the original y the application of can continue The

continuation function terminates with the return from eval and with the co de

exprLapp

tmp lookupstate ref

x later susp

tmp tmp

r x

tmp

r

After the preamble and the up date to ref are added to this the co de of the Block is

blo ck

tup lookupstate ref

blo ck later blk

name tup name for all variables catched in tup

var blo ck var blo ck

tuple tup tuple

expr blo ck expr

tuple tup tuple

cond blo ck cond

tmp tup tmp

susp blo ck susp

tmp lookupstate ref

x later susp

tmp tmp

r x

state updatestate ref tmp

later later blk r

state

later

If the denition of said atom is visible in the scop e where the result of the tuple is used reference and tupletracing will nd it

Chapter Execution Trace Reconstruction

This is the complete co de to b e included in the state transformation function for the rst blo ck

Evaluation of the second Block pro ceeds in the same manner resulting in co de

tup lookupstate ref

blo ck later blk

name tup name for all variables catched in tup

var blo ck var blo ck

tuple tup tuple

expr blo ck expr

tuple tup tuple

cond blo ck cond

tmp tup tmp

susp blo ck susp

tup tup tup

blk blo ck blk

tmp tup tmp

blk blo ck blk

tmp lookupstate ref

res later blk

tmp tmp y

r x

state updatestate ref tmp

latest later blk r

state

latest

The code attributes that are generated for the Block continuations are quite inecient due to the unnecessary

tuple applications It is esp ecially imp ortant to notice that many blo cks will b e directly dep endent

up on other blo cks and thus b e evaluated in sequence When the two statetransition functions that mo del

the blo cksev aluation are inlined deadco de and parameterelimination will take care of most of these

ineciencies

An imp ortant optimization that simplies the generation of co de b oth for the Block tuple creation and for

the tuple unpacking that takes places as preamble to the continuationsco de sequences is the lazy variable

capture After a Blocks transition function has b een created the variables that were needed from the lexical

environment of the Block creating site can b e deduced from the co de since these will b e those variables

that are referenced but undened It is p ossible to patch the original mktuple and the preamble at

this p oint since nowitisnow longer necessary to put a conservative approximation of the current context

into the Block tuple but only those variables actually needed This optimization can greatly decrease the

intermediate co de size

Sp ecication of Co de Generation

To sp ecify the co de generation semantics of the compiler it is necessary to extend the denitions of the

semantic functions eval eval apply etc This eort can b e minimized by employing the fact that

expr decl

only a few clauses of the semantic functions are resp onsible for generating new abstract results and hence

new co de Also whenever the semantics employs a literal result eg true or false the code attribute

is trivially dened byan ALit atomic expression

All functions that isolate Obj sfromResult s and pass on the Obj s to continuation functions must reattach

the original Result s codes declaration list to the code of the Result that is returned by the continuation

function

The only complex co de generation is related to the application of closures and primitives

All other clauses ie those that just insp ect Result s and pass them through unchanged can ignore the code

attribute In the following extensions for the semantic functions deref eval applyand eval

ob j expr decl

are given Most extensions are sp ecied informally

Since the rst transition function pass its results to a state function that directly calls the second transition function

inlining will merge direct sequences of Blocks

Chapter Execution Trace Reconstruction

eval

exprs

This auxiliary function rep eatedly calls eval strips the ob jects from the results and passes the ob ject

exprs

list to a continuation Since it has also stripp ed the code attribute from the results eval is resp onsible

exprs

for app ending the resultsCF declaration lists to the co de of the result that is returned from the con tinuation

If no lazy Block co de generation is used eval is also resp onsible for patching any mktuple applications

exprs

that are used to create blo cks

deref

ob js

The deref function directly analyzes the typ es of ob jects If an ob ject is a reference the referenced value

ob js

is substituted The co de fragment for this case is simply a deref primitive application If the referenced

value is a unevaluated promise the thunk has to b e evaluated and the result value to b e stored

tmp derefstate ref

thunk current

tmp tmp tmp

result combinator thunk

state updatestate reftmp

new current result

tmp

result

The tmp is represent the closure tuple of the promise It has to b e passed explicitly to the promises

thunk

combinator as the rst and sole argument The co de ie the combinator is treated like a literal or a

toplevel denition

block

Most of the b ehaviour of the block function has already b een describ ed in the second example section

Blo ck dumps the initially empty tuple creation co de and the denition of the reference that p oints to the

blo c k

tmp mktuple

tuple

tmp blockref tmp

blo ck p ending tuple

state updatestate ref tmp

new old blo ck blo ck

ref

blo ck

The reference ref is the one that caused the blo ck ref is the one that denotes the blo ck

p ending blo ck

strict and strict

exprs

strict just combine eval and deref strict extracts an ob ject from a result and passes it on

exprs exprs ob js

to a continuation function it therefore has the resp onsibility to reattach the stripp ed CF declarations to

the result that is passed back

eval and norm

thunk result

eval combines norm and deref As already b een mentioned in section norm has b een

thunk result ob js result

mo died to return a true Result typ ed result therefore no co de generation takes place

Lit

The code for a Lit expression is an emplty declaration sequence with an ALit return expression ALitvalue

Chapter Execution Trace Reconstruction

Lexical Lvar s

A lexical Lvar expression has as its co de the atom that was b ound to the value of the lexical variable This

atom has b een set up when the variable was dened

There are three dierent places of origin of lexical LF variables

Lambda parameters

variables b ound with LdeclLpardLseqdLfixdand

variables from an enclosing scop e which are extracted from the closure tuple

Lambda parameters and closure tuple variables are routineley given fresh names in each functions preamble

if only to simplify inlining LF declarations are eectively ignored at the CF level since they are kept track

of by the compiletime environment

Dynamic Lvar s

In general dynamic variables are either passed to the function that encloses the Lvar expression as

an additional parameter hidden in the state variable The generated co de employs a sp ecial primitive

lookupdynamic that accesses the current state

var lookupdynamicstate name

dyn current dyn

var

dyn

When a dynamic variable is b ound to a literal value ie either a true literal or a combinator function

function specialization can take place The dynamic variable essentially disapp ears at the p ossible cost of

an additional function sp ecialization

Lambda

The evaluation of a Lambda expression creates a closure ob ject its runtime equivalent is a closure tuple

This tuple is created using a mktuple application that binds all currently visible variables that will b e used

in the tuples b o dy

tup mktupleatom atom

free freen

tup

Each atom is the atom attached to var theith free variable in the Lambda expression

free i free i

The set of free variables in the b o dy of the Lambda expression that can b e determined statically is a conser

vative approximation it maycontain variables that are never used by the actual execution One useful

optimization of the co de generator is therefore the deferral of actual tuple creation to the p oint where all

applications of the tuple are known then the worstcase variable set can b e deteremined and used to patch

up the tuple creation co de

Lcheck

Lcheck forms reduce to a either a function application or to the creation of a new Lambda expression ie a

tuple if a functiontyp e is checked for If the Lcheck reduces to a function application that generates a

nonabstract value no co de is generated at all if the result is truethe Lcheck is replaced by its argument

expression if the result is false a compiletime error is issued If the result is an abstract value ie

aBool a runtime test may b e generated The current ac do es not generate runtime tests this is correct

since the outcome of an unsuccessfull test is undened

Actually if the closure is never applied at all no variable is used or need b e captured

Chapter Execution Trace Reconstruction

Lcast

Evaluation of an Lcast will lead to an application of one of the userdened cast functions If a cast to

a general typ e is needed and no unique matching typ e can b e found b ecause there are typ e predicates

that return aBool nested runtime conditionals have to b e generated one for each p ossibly applicable cast

function that precedes the rst guaranteed applicable cast function

Lcond

A conditional that evaluates to true or false will b e replaced bythechosen alternative If the condition is

undetermined b oth alternatives are evaluated and a Select no de with two forks is created from the co de

of the condition and the alternatives

code that computes the atom

cond

resultstate Selectatom

cond

ALittrue decls result state

true true true

ALitfalse decls result state

false false false

NONE

result

Lselect

The only Lselect no des that app ear in practical programs are generated in mo dule selection functions In

this context it is an error if no compiletime selection can b e determined Therefore no Select co de must

b e generated since one of the selection cases or the default will b e chosen

Lseq

The evaluation of a sequence is for its sideeects onlyThestrict function implements the evaluation of

and therefore also the co de generation for the expression list the evalLseq merely discards all

but the last result the nested strict applications will prex all the co de generated for the sequence Any

unnecessary co de will have to b e removed by p ostpro cessing optimizations

apply

Most of the subfunctions of apply do not generate co de they just decide whichofthemany apply

xxx

function is to b e used At the b ottom there are three dierent functions that implement the application

apply with a typ e argument that is one of the basic typ es

typ e

apply and

primitive

apply

closure

Only one of the distribution functions can generate co de namely apply While the actual applica

overloaded

tion of apply reduces to an apply since the overloaded function is just a collection of ordinary

overloaded closure

closures the selection of the closure might b e deferred to runtime In this case a nested conditional is

created In the semantics the check function is used to determine the index of the rst matching closure list

Chapter Execution Trace Reconstruction

of an overloaded function This function has to b e mo died up on abstraction since it can only return true

or false in its original form

Some co de has to b e generated for overloaded function applications namely the selection of the closure

that is to b e applied The runtime mo del of an overloaded function is a tup el that contains closures The

primitive overload mo dels the creations of such a tuple and overloadselect mo dels the closure selection

apply

typ e

Co de generation for apply or rather is is trivial since the results must b e either true or falseif

typ e base

the typ e that is tested for is a base typ e The runtime system do es not know data p olymorphism ie every

variable in the compiled CF program is of a known and xed base typ e All p olymorphic values are split up

as a side eect of function sp ecialization

When apply is used to test for a predicate typ e it reduces to a closure application

typ e

apply

primitive

Primitive function applications fall into two categories

the simple primitives are strict and sideeect free examples of these primtives are the arithmetic

primitives and the tuple accessor and creator primitives Co de generation for these primitives is trivial

apply lo oks at the result value if it can b e enco ded as a literal the result code is this

primitivestrict

literal if it is a nonliteral the co de is

result Primprimitivenameatom atom

arg argN

result

the complicated primitives are nonstrict andor have side eects In general co de generation for

these works as in the simple case but some of the sideeecting primitives have sp ecialcase co de

generation rules

mkexc is translated into a Throw expression mkexc takes an arbitrary numb er of argument the

corresp onding Throw has one additional argument atom namely the current state whichisthe

rst argument The result of the Throw is a new state

suspend and delay primitives are translated into sequences of a suspenddelay that creates a

susp ensionpromise ob ject and an update that installs the newly created ob ject in the current

state and creates a new state This has b een shown in the examples

In the current implementation co de generation for the generalized apply is not implemented and an error message

overloaded

is issued when an overloaded function application cannot b e resolved at compile time In practice most of these reect

programming errors anyhow

This is really a pseudoresult since the Throw do es not return but jumps to the last Catch on the call stack There is an

additional problem with Throwwhich is exemplied by the common idiom

t eqb

Sq if t then S ThrowSDivZeroa S

else r divab Sr

q

There is an obvious mismatch here Semantically mkexc and Throw return exception ob jects which can b e treated like

any other b ottomvaluedobjectAt the CF level there is no bottomandvariables are restricted to known base typ es To

guarantee a typ e and aritycorrect CF program each Throw therefore returns an extra dummyvalue that is typ ed accordingly

to the needs of the context These needs b ecome obvious the rst time the ob ject is merged with other nonb ottom ob jects

in a conditional The correct co de is therefore

t eqb

Sq if t then Sdummy ThrowSDivZeroa Sdummy

else r divab Sr q

Chapter Execution Trace Reconstruction

the maparray and maplist primitives are translated into library function calls

the ioop primitive is translated into an update ioop is a general IO op eration that can

b e parameterized to eect input output and test op erations In any of these cases reference

denitions have to b e created or up dated

apply

closure

When a closure is applied a lot of co de has to b e created Each closure application is assumed to create a

new call cache entry CCE Each CCE will either b e inlined or pro jected to a CF function Co de generation

assumes the latter

Co de generation consists of two tasks the atom attributes of the values held in the up dated environment

have to b e mo died and a preamble that implements these mo dications has to b e prep ended to the nal

code of the functions result Topick an example function

c

func fab abc

fxy

The closure f contains one value namely c Up on application of f co de for three bindings xy and cmust

b e created The application takes the form

r Apptag atom atom atom

cce f x y

where tag is the tag that denotes the call cache entry in which the co de for this particular application of

cce

f will b e stored and atom atom and atom are the atom attributes of the variables f x and y

f x y

For the CCE co de four new variables are intro duced tmp tmp tmp and tmp Of these all but tmp

tuple a b c c

are parameters to the function tmp is the parameter to which fs closure tuple is b ound and tmp tmp

tuple a b

are the two ordinary parameters The preambleissuch quite small namely

tmp lookuptupletmp

c tuple

rest of the code

results

The names of the newly intro duced parameters are stored as attributes of the CCE since they will b e needed

when the CCE is later inlined or translated into a function denition

Again no co de has to b e generated at all if the result is a literal value In such a case the call cache entry

is reclaimed

Related Work

This section gives a short comparison with the approaches to residual co de generation and representation

that are presented in the literature Since most published work describ es oline PE systems these shall b e

presented rst

Oline partial evaluation is usually sp ecied as a sourcetosource transformation that is guided by anno

tations generated in a bindingtime analysis BTA phase These annotations typically take the form of a

twolevel syntax The essential foldunfold transformations together with typical calculus iden

conversion form a sucient base for all mo dications encountered in oline PE ie all higher tities like

These parameters do not have the names a and b since inlining would b e complicated by such a renaming Besides

debugging is kept simpler if there are no common names b etween the LF and CF domains

In a twolevel syntax there are twovariants for each original syntactic construct In BTA these variants denote the

staticdynamic distinction cf section

Chapter Execution Trace Reconstruction

level optimizations can b e sp ecied and implemented in terms of these basic transformations Most of the

complexity of oline PE is isolated in the BTA and program representation do esnt b ecome an issue at all

This is indeed one of the strengths of the oline approach the separation of BTA and rewriting makes it

p ossible to simplify b oth phases BTA can b e implemented using simpleminded but still useful data mo dels

eg AI on twop oint domains and the rewriting phase needs nothing more then and conversion

renaming and instantiation

In online systems the program is rewritten while it is executed An online PE must therefore consist of

at least a full interpreter usually even an abstracted version of the standard interpreter

Co de generation in online PE also p oses a challenge to software engineering b ecause of is distributed nature

While the ob ject program is executed either co de fragments are generated or the ob ject program is lo cally

rewritten In b oth cases there is not much information available ab out the whole program esp ecially

ab out the execution of things that are still in the future This is a marked contrast to oline PE where it

is known by syntactical means whether a variable is used once or many times passed upward or downward

or stored in a datastructure For each of these situations a dierent co de generation approachcanbe

employed In online PE nothing is known ab out the future of lo cally computed values and the co de

generation strategy must therefore assume a worstcase context

There are twoessential approaches to co de representation that can b e characterized as b eing based on

program texts versus program graphs A text representation shares all the prop erties of a program in the

source language or its abstract syntax tree equiv alent it ob eys the normal scoping and lifetime rules and it

is implicitly sequentialized Graphbased program representations tend to resemble dataow graphs and are

usually taken to b e demanddriven some graph representations allow explicit cycles others forbid them or

allow only directed acyclic graphs DAGs Graph representations have no natural scoping and sequencing

rules that makes them particularly suited for parallel applications

Berlin employsaDAG representation for his LISPbased PE The graph is then scheduled for

optimized utilization on pip elined architectures Berlin considers typical scientic applications that have

a xed recursion structure dominated bynumeric computations he essentially optimizes inner lo ops

The FUSE system of Weise etal employs a graph representation that allows cycles

FUSE is partly oline and partly online During abstract interpretation a dataprogram graph is created

each data ob ject is attributed with the co de that generates it The graph structure emerges as a natural

consequence of these attributes The reduceresidualize decisions are delayed to the partial evaluation phase

that o ccurs after the abstract interpretation In this phase those applications that fulll certain heuristically

determined qualications are reduced This phase is simplied by the graph representation since scoping can

b e ignored Since source and target language are the same in FUSE scoping has to b e recreated and the

op erations in the program graph have to b e sequentialized This is done by computing a defp oint tree for the

whole graph from which scop e information for each no de is inferred This scop ed graph is then translated

into a text representation a Scheme program Alternatively the scop ed graph can b e translated into C

These tasks are not simplied by the p ossible need for nested and lo cal higherorder function denitions

In Weise etal apply a similar representation to C programs Lo ops and lab els are translated into

Lambda entry p oints Since C do es not provide nested rstorder functions the scoping structure is easier

to reconstruct than in FUSE The main goal in this work is to extract parallelism by removing implicit

sequencing To do this the store is split up and some p ointer analysis is p erformed

Haraldsson was one of the rst to build an online partial ev aluator as an extension to an interpreter In

the REDFUN system he intro duces qtuples which hold the reduced form and information extracted

Oline PE can thus b e characterized as an approach which tries to solve the problem of what to transform not how

Online PE using a standard interpreter is nothing more or less than runtime co de generation RTCG RTCG is feasible

for some sp ecialized applications such as sound and image decompression or BITBLT op erations cf Kepp el et al for

examples It is however hard to justify runtime co de generation as a general strategy for the compilation of whole programs

since the increase in runtime eciency that is achieved by the PE must comp ensate for the memory and time consumption of

the partial evaluator costs that can b e ignored when runtime and compiletime are dierent

Chapter Execution Trace Reconstruction

from it REDFUN is implemented in the context and as an extension of an existing LISP system and

the reduced form is a simple LISP expression the information consists of a sideeect ag true and false

context information and an assignment information The reduced forms are to b e placed in PROGN contexts

hence variable denitions will b e visible to all subsequent reduced forms Inlining is p erformed adho c and

lo cally guided by heuristics Since REDFUN is emb edded in a LISP environment the user can extend and

change the b ehaviour of many parts of the partial evaluation functions by adding prop erties to symb ols Using

such explicit mo dications of the partial evaluator particularly bad sp ecialization and inliningdecisions can

be avoided on an adho c basis and heuristics can b e adapted to sp ecic target programs

Chapter

The BackEnd

This chapter describ es syntax and semantics of the generalized machine language M that is the nal output

of the compiler M is a lowlevel language designed to b e easily translatable to the assembler languages

of current DSP pro cessors The current implementation of ac do es not generate co de for a sp ecic target

architecture instead the user can request the generation of either an M co de program or a p ortable C

program that corresp onds to it

Dierences b etween M and CF

Why is it necessary to haveyet another intermediate form The Co de Form representation is still to o high

level in that it is based on the applicative paradigm The control ow of CF programs is based on tail

recursive function calls instead of jumps lo ops and stackbased subroutine calls All CF data structures

including tuples arrays and the state are rstclass values they can b e passed around as function arguments

and result values On a real machine only atomic values small enough to t into registers have this rst

class status

The M form is dened in terms of a machine mo del that allows easy translation to most currently known DSP

and general purp ose pro cessor architectures Toachieve this goal the abstract target machine is restricted

in a number of a ways

No at address space is assumed ie there is no general p ointer concept

A sp ecial index type is used to access vector elements

A sp ecial lo op construct is provided

There is no unique data size

There is no implicit control state

In the next sections these p oints will b e elucidated

It is also p ossible to generate a dump of the nal call cache contents in Co de Form and a graphical rendering of the state

graph in GraphEd format for an example see g

Chapter The BackEnd

Disjoint Address Space Index Typ es and Hardware Lo ops

Many DSP architectures eg the DSPk family of pro cessors have the capability of addressing twoor

more separate memory banks This makes it p ossible to access more than one memory lo cation in one cycle

Many common DSP op erations eg most lters have a core of multiplyadd lo ops in which one word

from the co ecienttableandoneword from the input stream are multiplied p er iteration step and the

result added to an accumulator Having two or more memory banks available in parallel makes it p ossible

to implement one suchMACmultiplyaccumulate step p er cycle Languages such as C which dep end

up on a at memory mo del with unrestricted p ointers are not easily adapted to sucharchitectures

M supp orts target architectures with separate memory banks by not providing a general p ointer typ e As

far as M is concerned eachvector and each global scalar can b e part of a separate memory bank accessed

by distinct instruction addressing mo des

Since M shuns p ointers there are no aliasproblems all accesses to a given vector can b e found by static

analysis Thus it is simple to distribute the vectors over the memory banks

A related problem is that of index ranges M do es not allow the indexing of arrays by ordinary unrestricted

integers Instead it is p ossible to dene a variable as b eing an index into one or more sp ecic vectors

Such an index can only b e used to access the vectors over which it is declared and a vector can only b e

accessed by an appropriate index

It is allo wed to copy index values into ordinary integer variables of appropriate size and vice versa index

values can also b e incremented and decremented byinteger amounts to give new index values It is thus

p ossible to directly map index variables onto target hardware index registers if suchexist

Many DSP pro cessors supp ort hardware lo ops ie ecient incrementcomparejump combinations some

times there are even advanced features such as mo duloaddressing Such lo op instructions are usually re

stricted to situations in which the numb er of lo ops and the rst index are known in advance and the step

size is a small constant To supp ort such instructions M contains a loop construct that is restricted in such

away that hardware lo ops for most DSPs can b e generated without undue contortions

Data Formats Sp ecial Instructions and the ALDiSP Library

The size of integer and oatingp ointnumb ers is not standardized for all target architectures Esp ecially

on DSPs word sizes such as or bit are not uncommon No ecient co de for suchmachines could

b e generated for an intermediate language that enforced one standard size such as or bit On the

other hand if one machine int of unsp ecied length is used the co de would have no dened semantics

and machines supp orting more than one integer or oatingp oint format could not b e utilized M therefore

allows the sp ecication of arbitrary integer and oating p oint formats

The ALDiSP library is resp onsible for breaking the arithmetic primitives down to those sizes that are sup

ultipleprecision arithmetics with ordinary p orted by the target architecture This approach also unies m

arithmetics

A set of ALDiSP primitives denes the arithmetic capabilities of a machine This wayeven userwritten

library or program functions can inquire ab out machine characteristics Much of the complexityofatypical

backend and runtime supp ort system is thus delegated to the library

The ALDiSP library is required to only generate calls to primitives with arguments that are supp orted by

the hardware this guarantees that the generated M program is semantically sound

If the target machine supp orts sp ecial instructions suchasmultiplyandaccumulate or implements ALDiSP

primitives as combinations of multiple machine instructions eg multiplication as sequence of b o oth steps

Two primitives suce ismachtype asks whether a typ e is supp orted byagiven target architecture and

nearestmachtype gives the targettype that is nearest a given typ e

Chapter The BackEnd

the ALDiSP library can b e rewritten to supp ort those The M language treats the primitive functions as

blackbox pure functions It is assumed that the backend supp orts all instructions generated by the

frontend ie that the frontend knows what primitives are available on the target

No Implicit State

Because M supp orts multiple output op erations there is no need for implicit state in form of status ags

or the like This simplies Mlevel program transformation instructions can b e moved around as long as

all data dep endencies are maintained M primitives are fully functional they eect nothing b eside their

explicit output values This is very imp ortant for instruction scheduling which dep ends on the abilityto

freely move co de around in the program

Semantic Entities of M

M assumes a machine with a main memory consisting of arbitrary many but compiletime constant b ound

disjointvectors of scalar elements Astackofknown maximum size is supp orted Program memory is

separated from data memory There is an unb ound but xed numb er of registers referred to implicitlyBasic

scalar typ es supp orted are

intn for n MaxIntSize where MaxIntSize is a machine sp ecic constant These are the twoscomplement

integers of size n

cardn for n MaxCar dSize which usually coincides with MaxIntSize These are the unsigned integers

of size n

floatn m where n m are drawn from a machine sp ecic set of known oat formats Usually the

formats f g are supp orted An enco ding in IEEE oating p oint format is

assumed

bool is the standard b o olean typ e

indexnames is the typ e of array indices An index is restricted to p ointinto the arrays enumerated in its

declaration There are no p ointers into arrays Index variable may not p oint outside arrays Not all

indices need b e of the same size thus they are not assignmentcompatible An assignment i i is

allowed if the set of arrays i maypoint to is a subset of the set of arrays i maypoint to that is

every valid i is also a valid i

wing basic kinds of instructions There are the follo

Declarations intro duce IO ob jects functions and global variables Within functions they intro duce

stackvariables and temp oraries

Assignments move scalar values b etween lo cations Assignments are instantaneous Multiple assign

ments are simultaneous abba is a correct swap

Primitives can b e called with anynumberofarguments returning anynumb er of arguments

Jumps can movebetween lab els of the same function

The task of instruction scheduling is to nd an ordering for the instructions that minimizes register spilling and optimizes

the use of memory accesses and delay slots this is quite imp ortantifthemachine supp orts parallel moves as the kk

pro cessors do The current ac considers instruction scheduling a machinesp ecic back end issue

Chapter The BackEnd

Cal ls are used to enter another function

a Return exits from a cal led function destroying all stackvariables dened in that function

Both jumps and calls allow argument and result parameter transfer byvalue

Function Denitions

A function is a piece of co de containing at least one return statement A functions denition ends with the

header of the next functions denition The header argument list is emptyFollowing the function header

and preceding the rst executable statement or lab el are declarations of lo cal stackvariables Whenever the

function is entered with call a stack frame containing these variables is allo cated but not initialized The

call mayormay not initialize some of these variables namely those mentioned in the parameter list

A temporary variable is intro duced bya temp declaration The lifetime of the variable is up to its last use It

is illegal to use temp orary variables in recursive co de Temp orary variables are intended to mo del registers

the reconstruction phase tries to honor this intention by preferring temp orary to stackvariables

A function can have arbitrary many labels Lab els have argument lists containing only the names of lo cally

visible stack or temp orary variables

Control ow within a function is done via combinations of the if and jump directives

As examples here are t wo implementations of a wellknown algorithm

func fakn recursive fak

stack int n n has to be on the stack

temp int res the result need not be on the stack

temp bool r holds the comparison result flag

temp int n holds the n value

r n

if r

n n

res call fakn

res resn

return res

else

return

endif

func fakn iterative fak

temp int n input and loop counter

temp int res accumulator and result

temp bool r for the comparison result flag

res

label loop

r n

if r

returnres

else

res resn

n n

jump loop endif

Chapter The BackEnd

It is imp ortant to understand why the nvariable of the fak function must b e of typ e stack instead of

temp temp orary variables are not reentrant After the call fakthevalues of nrandn are therefore

undened

The next example shows the use of vectors A function that computes a vector pro duct is shown

vector int vv all vectors are global

func vtimesv

temp indexvv i

temp int sumprod

sum

loop i

prod vivi

sum prodsum

endloop

return sum

The loop form is a builtin primitive with the three parameters rst index numb er of steps and step

size The variable i should b e of typ e index or integer

It is not p ossible to parameterize a vectormanipulating function like vtimesv since there are no vector

p ointers This may seem wasteful and overly restrictive but has its reason Esp ecially in DSP pro cessors

disjoint memory banks and hardware lo op supp ort are often found A general vectorprod function could

prove hard to realize on suchchips b ecause vector sizes and alignments mayhave to b e hardco ded into the

lo op instructions

By disallowing general p ointers to vectors it is guaranteed that all vector index ranges can b e computed

and veried at assemblytime

Vectors are accessed using a hardwired subscript form Subscripting is not mo delled as a primitive but as

a builtin syntax form since it is easier to expand a subscript op erator that cannot b e implemented as an

addressing mo de into a sequence of access computations than to do the conv erse

Structure of M

The abstract data typ e of M typ es is

datatype Mtype

Mtint of int

Mtfloat of int int

Mtindex of string list

Mtbool

Mtstring

M programs distinguish b etween lo cal and global declarations Functions and vectors can only b e dened

globally A program consists of a series of global declarations the order is not imp ortant

datatype Mprogram

Mprog of Mfunc list Mglob list

Besides functions there are two kinds of global declarations those for scalars arbitrary atomic values and

those for vectors Both kinds of declarations can declare a whole bunchofvalues of the same typ e Instead

of a separate identier typ e the M abstract syntax employs simple strings

The presence of multiple declarations simplies some of the transformation rules that generate the M programs

Using strings instead of the symb ols employed by the rest of the compiler makes it p ossible to separate any assembly

generation backend totally from the rest of the compiler As usual the sp ecication shown here exactly mirrors the current

implementation

Chapter The BackEnd

datatype Mglob

Mscalar of string list Mtype

Mvector of string list int Mtype

A function declaration consists of a list of formal parameters a set of lo cal variable declarations and a

sequence of statements The parameters and lo cal variables have a scop e and lifetime of the statements it

is not p ossible to declare functionlo cal static variables

datatype Mfunc

Mfunc of string Mres list Mdecl list Mstmt list

Each Mdecl corrsp onds to an invo cationlo cal stack or auto variable When a function is called its

statements are executed sequentiallyThetwo kinds of Mdecl dier in their stack b ehaviour when a Mapp

statement function call is executed the Mstck variables are saved on the stack and restored up on return

The Mtemp variables are not saved they are undened after a function call

datatype Mdecl

Mstck of string list Mtype name type

Mtemp of string list Mtype name type

Lexpressions aka lo cations are mo delled by Mres expressions There are only two kinds of Mres expres

sions variable names and vector lo cations A variable name may only b e used in a context in whichitis

dened either via Mdecl or via Mglob A vector lo cation is denoted by the name of the vector and an

arbitrary atomic expression Marg that computes to an index value of appropriate typ e

datatype Mres result lexpr is

Mrid of string identifier

Mrindex of string Marg vector store

Margs denote atomic values Of the six Marg constructors four denote literals the other two denote the

contentofavariable that has to b e visible in the currentcontext and the con tentofavector lo cation

Note that vector access can b e nested

datatype Marg argument atomic expr is

Mint of int integer literal

Mfloat of real fp literal

Mbool of bool boolean literal

Mstring of string string literal

Mid of string content of a variable

Mindex of string Marg content of a vector element

Finally there are statements Mapp and Mreturn implement function call and return Mcall invokes primi

tives Mjump and Mlabel allow nonsequential control ow within a function Mjump can pass arguments to

the Mlabel Mif is the standard if op erator Mloop is a hardwired lo op construct

datatype Mstmt statement is

Mreturn of Marg list return with results

Mapp of string Marg list Mres list call a function Mfunc

Mcall of string Marg list Mres list call a primitive

Mjump of string Marg list goto label

Mlabel of string Mres list name arguments passed

Mif of Marg Mstmt list Mstmt list if arg then else

Mloop of string Marg Marg Marg Mstmt list loopvarinitstepsincr

An invo cation of Mloopvarinitstepsincrstatements corresp onds to

Allowing nested vector addressing suchas MindexaMindexbMidc simplies the grammar and cuts down the

numb er of temp orary variables in the generated Mco de it is trivial to atomize complex addressing mo des when generating

co de for machines that do not provide such addressing mo des

Chapter The BackEnd

var init tmp

while tmp steps do

statements

var var incr

tmp tmp

od

where tmp is an appropriate variable This kind of lo op description was chosen b ecause it is most easily

analyzed and transformed into equivalent forms

Transforming CF into M

The transformations needed to generate M co de from the reconstructed Co de Form program are mostly

trivial

State Globalization One global variable is created for each distinct reference in the state and reference

lo okups and denitions are translated in equivalent accesses to these globals This can b e done without

further analysis b ecause it is known that the state is singlethreaded

Tuple Atomization Nonatomic variables ie tuples are split up into subvariables of atomic typ e

Vector GlobalizationAllVectortyp ed variables are translated into global variables

Of these p oints only the last one is complicated There are actually three transformations involved

Introducing InPlace Primitives Some vectorcreating primitives updatevector mapvectorare

replaced byvectormo difying counterparts updatevectorinplace mapvectorinplace This

can b e done in a correct wayby rst intro ducing copies

v updatingprimitivev

will b e transformed into

v copyvectorv

v updatingprimitiveinplacev

DeadVector Elimination Whenever a vector variable b ecomes unused its last copy op eration can b e

eliminated

v copyvectorv v unused after this

v updatingprimitiveinplacev

can b e replaced by

v updatingprimitiveinplacev

Removing VectorTypedFunction ArgumentsFunctions that takevectors as arguments are copied a

dierent instance is created for every vector they might b e applied to When this transformation has

b een p erformed successfully all vectors are eectively singlethreaded and can b e made global

Finally all o ccurences lookupvector and updatevectorinplace are replaced by their M coun

terparts Mindex and Mrindex and calls to mapvectorinplace are replaced by Mloop invo cations

Much energy can b e sp entinto providing further optimizations to minimize the numb er of scratchvectors

The current ac only implements the most basic heuristic it tries to move all nonmo difying accesses to a

vector b efore the rst mo difying access

In typical DSP co de this should not p ose any co desize problems since such applications usually employ a small number

of xed vectors The current CFtoM translator uses a primitive clonevector that creates a dynamic garbagecollected copy

ofavector for functions that intro duce vectors on the y or within recursive functions The C backend might implementthis

with malloc and reference counting

Chapter The BackEnd

The C backend

The only backend currently implemented for M generates C co de The resultant co de is quite inecient

since C is not esp ecially suited as a lowlevel assembler

There is no p ossibility to get ne control over stack allo cation b ehaviour Thus the temp variables

employed in M cannot b e correctly represented

C functions cannot havemultiple output values It is p ossible to use structures for this purp ose but

those are not held in registers by most compilers

The MtoC translater generates one big switch statement in which lab els and function heads are represented

as cases The stack is held explicitly and the arguments that are passed along by each jump or call are

mo delled via global transfer registers

The Future Automatic BackEnd Generation

A more ambitious approachtomachine co de generation is the automatic generation of backends Languages

like nML give a full instructionlevel registertransfer semantics for a target architecture together

with information ab out the assembly language syntax and bitlevel co de representation In a framework like

CBC such a target description is transformed into a hardwaremo del suited as input language for

a CDFG ControlDataFlow Graph co de generator By providing CDFG output ac can b e retargetted

to arbitrary target arc hitectures It is also p ossible to transform the description into a simulation to ol

which allows targetinstruction level debugging of programs

Chapter

Conclusions and Outlo ok

The goal of this work was to establish a compilation technique able to cop e with rich languages like ALDiSP

whichhave b oth a complex semantics and stringent requirements in regard to space and time eciency but

which do not have to b e compiled with a fast turnaround time since actual software development o ccurs in

an interpreter environment and crosscompilation is the order

ALDiSP is uncompilable with the metho ds used to compile other classes of languages

ThirdGeneration Languages Fortran Pascal Mo dula C C are based on the assumption that

each source language construct can b e mo delled eectively by a small chunk of machine co de The

data typ es provided as primitive and op erations on them are essentially the atomic typ es and op erations

of the target hardware numb ers characters addresses Co de generation for such languages can b e

achieved by mapping each construct to its target co de sequence optimization mostly consists of register

allo cation and instruction scheduling

Functional and Logic Programming Languages Haskell SML are based up on abstract ma

chines and runtime systems that implement ecient garbage collected heaps for many small ob jects

Co de generation consists of creating ecient co de sequences that work as new op erators sup ercom

binators of these abstract machines Typical applications of such languages need to representand

work up on large and complex heterogenous data structures list graphs knowledge bases Dynami

cally typ ed languages LISPScheme are often not compiled down to the level of mac hine data typ es

unboxed values but only to that of the abstract machine b oxed values so that some degree of

interpretation takes place at runtime

SignalFlow Languages Silage Signal essentially describ e arithmetic expressions and can b e compiled

as basic blo cks each signal value can b e represented as a variable and each op eration as a hardware

instruction The only optimizations left to the compiler are register allo cation memory layout and

instruction scheduling

As a language ALDiSP violates the assumptions b ehind the aforementioned compilation metho ds

The control ow is not made explicit but emerges from the interaction of data typ es vectors streams

overloaded functions and generic rules automapping overload resolution blo cking and forcing

ALDiSP has no static typ e system

The target architectures for which ALDiSP is intended do not provide large memories and runtime

garbage collection is out of the question

Chapter Conclusions and Outlook

ALDiSP supp orts data typ es and op erations on them that are not primitivemachine typ es and op era

tions whole arrays higherorder functions

The DSP algorithms to which ALDiSP is typically applied are however simple and highly regular The

idea b ehind ac was therefore to write a compiler that ltered this sp ecic regular structure out of ALDiSP

programs that are comp osed from very generic building blo cks Online partial evaluation promised to provide

a metho d for this

Development History

Early work on ac was guided by a pap er describing Fuse an online partial evaluator for Scheme The

rst attempts at writing ac were based on extensions to a nave ALDiSP interpreter written in Scheme

While Scheme is a ne language for exp erimental programming language extensions and metainterpreters

it lacks in some resp ects Esp ecially the need for static typ e checking motivated the complete rewrite into

SML

In parallel to the compiler a rst complete semantics was written An earlier attempt to write a direct

ALDiSP semantics didnt succeed since the semantics got to o complex Thus the Lambda Form was b orn

In retrosp ect the Lambda Form is stil l to o complex it mighthave b een b etter to transform ALDiSP directly

into what is now the Co de Form At that time I considered such a decomp osition but was afraid that the

intermediate co de size would explo de if the Lambda Form got to o basic

The basic idea of Fuse namely the graphbased representation was mostly lost in ac since ALDiSP features

such as the susp ension state and full higherorder functions are not considered byFuse and cause quite some

problems when a at program is to b e reconstructed from a graph The ma jor conceptual breakthrough

in the reconstruction phase consisted of the idea to separate the code and atom attributes and to use the

abstract state graph as the skeleton of the reconstructed program The Co de Form language changed a lot

over the time it was originally a strict subset of the Lambda Form

The current implementation of ac is not very stable and cannot b e used to compile large programs since

its resource consumption is to o high As a pro ofofconcept it shows that ALDiSP can b e compiled into

ecent co de It is probably imp ossible to sp eed up the compiler by a signicant amount without redesigning

it completely a sup ercombinatorbased compiler that has a p owerful abstract graph reduction machine as

its target cf section combined with a partial evaluator for said machine app ears to b e a viable

approach

Design Alternatives

There are some ma jor and many minor design alternatives that mighthavebeenchosen during the devel

opmentofthe ALDiSP compiler Some of them o ccurred to me to o late others involved to o many unknown

factors

Explicit State Passing

A central part of the current compiler is the abstract scheduler which tries to enumerate all p ossible states

the program under compilation can encounter If this scheduler fails no compilation is p ossible If it succeeds

garbage collection can b e done at compiletime and a xed memorylayout is guaranteed

Many real programs haveavery large state space or no static sc hedule at all ie an innite state space Programs that create dynamic data structures whose form dep ends up on the input fall under this category

Chapter Conclusions and Outlook

form here refers to the comparison function employed by the abstract scheduler that tells whether two

states can b e mapp ed together to have a similar form two data structures must have the same structure

and size

The current compiler do es not treat states as rstclass values and part of the domain hierarchy on which

the abstract interpreter is based As a consequence there are no fully abstract states only states that

may contain references with abstract denitions It mightbeinteresting to remove the abstract scheduler

as a separate entityby transforming the intermediate representation into a form that explicitly passes the

state around

Such an approachwould however need a more sophisticated lo op detection scheme since the abstract inter

preter would nowhave the additional task of creating the state automaton

ContinuationPassing Style

Early in the developmentofthe ALDiSP semantics and compiler I considered a continuationpassing style

CPS intermediate form Such a form go es b eyond explicit state passing in that it also makes control

ow explicit CPS is not based up on the function call stack paradigm of owcontrol but on explicit

continuation functions whichprovide return addresses and stack frames CPS would have simplied the

mo delling of exceptions to some degree I abandoned this approach b ecause I felt uneasy with an intermediate

representation that involves the heavy use of higherorder functions Abstract interpreters esp ecially the

lo op detection heuristics have problems with higherorder function arguments since the similarityof

such is hard to determine It is much easier to insp ect a classical call stack since terms like call depth

are naturally expressed in such a framework whereas they are in a CPS setting structural prop erties of

anonymous functions My exp eriments with CPS representation also generated intermediate programs that

were exceedingly hard to understand a prop erty that is not at all wished for in an exp erimental compiler

Compilation by Graph Reduction

An entirely dierent approach to the whole compilation pro cess is based up on the graph reduction mo del

Graph reduction is based up on an rewriting engine that reduces a program graph into a normal

form The graph is a representation of combinators any functional program can b e translated intoaset

of combinators by a pro cess called bracket abstraction Both program and data have a common form the

result of a program is its normal form Graph reduction is esp ecially suited for lazy languages since there

are no pure data no des the simplest no de is the quote or identity no de that corresp onds to the Lit of

LF A strict primitive function is reducible if all its input arcs are connected to these identity no des This

can b e exploited by lazy languages in which argument passing is done byname in ALDiSP terms every

expression is wrapp ed byadelay Due to this characteristic each access to a value must rst check its

valueness ie whether it has already b een forced or not By providing identity no des this information is

enco ded in the values no de typ e an unevaluated value is a nonidentitynode

per se lazy it do es share some prop erties with lazy languages This is due to the simi While ALDiSP is not

larity of susp ensions and promises ie to the blo cking and forcing The graphreduction corresp ondence

to blo cking is the inabilitytoevaluate a graph while its arguments are not yet in normal form the equivalent

to forcing is the standard action of graph reduction in graph reduction an arguments value is accessed by

forcing its no de The most imp ortant optimization for lazy languages is strictness analysis an argumenttoa

function is strict if its value is accessed in every p ossible evaluation of the function Strict arguments can b e

passed byvalue Finding strict function arguments amounts to nding connected groups of expressions

if any memb er of such a group has to b e evaluated all members havetobeevaluated In ALDiSP strictness

And if they are of the righttyp e all graphreduction systems I knowofwork only on statically welltyp ed programs but this is mostly for sp eedup

Chapter Conclusions and Outlook

analysis can b e used to force promises and more imp ortant to blo ck a function application when one of the

arguments is an unevaluated susp ension

Th alternative approachtoALDiSP compilation would use a new semantics built from quite a dierentsetof

primitives If the abstract machine provides for automatic blo cking and forcing the most onerous parts of

ALDiSP namely the details related to promises and susp ensions would b e delegated to the abstract machine

whichwould also provide for memory allo cation and garbage collection The compiler would consist of a

fairly standard typ e inference system or a primitive abstract interpretation that ignores delays and promises

and tries to nd the typ e of every function application so as to resolve automapping and overloading at

compile time The compiled program would b e a set of typ ecorrect sup ercombinators which could b e

translated onetoone to graph co de The graph reduction machine would corresp ond to one of the standard

architectures eg the Gmachine and its derivates which has to b e extended byan

IO and time manager that corresp onds to from the machines p ersp ective a nondeterministic evaluator

of susp ension no des

Such a graph reduction ALDiSP would havea very p o or p erformance when compared with the output of ac

esp ecially when memory consumption is considered The trickwould b e to p erform a partial evaluation of

the graph reduction engine and the program or parts of it In a graph context the problems of program

representation cf chapter would b e minimized also some optimizations due to partial evaluation could

b e applied even to those programs that cannot b e compiled by ac b ecause they have no static schedule

Redesigning ALDiSP

The exp eriences made during the implementationofthe ALDiSP compiler show ed up some weaknesses in the

language and provided the motivation for the design of a successor language AL AL is designed with

reasonably fast and simple compilation in mind it is describ ed in detail in Here is the list of ma jor

dierences to ALDiSP

AL was codesigned with its semanticswhich therefore is much simpler and smaller than ALDiSPs

AL allows static typingAnadvanced typ e system akin to Haskells type classes provides

most of the freedom available to ALDiSPsnumerical expressions while giving much b etter protection

against unreasonable typ e errors

A sp ecialized signal type is intro duced A signal is a sequence of data ob jects each of whichhas

asso ciated timing information Signals replace ALDiSPs streams and pip es A set of synchronization

primitives on signals replace most uses of the suspend and delay constructs Most of the runtime

complexity asso ciated with promises and susp ensions essentially vanishes promise returning and

susp ension returning b ecome compiletime typ es

AL is sp ecied as a core language with extensions The core language can b e implemented without

any sophisticated compilation techniques most of the extensions can b e realized as prepro cessors that

generate core language programs

Amongst the extensions are the module system and the macroprocessor The macros can b e typ ed

As an extension to the typ e system equational assertions replace the predicate typ es of ALDiSP

Once a working AL compiler exists it mightbeinteresting to write a converter that transforms old ALDiSP

programs into AL

Bibliography

S Abramsky C Hankin eds Abstract Interpretation of Declarative Languages Ellis Norwood Ltd

ACM Proc Third Workshop on the Mathematical Foundations of Programming Languages Semantics

Springer LNCS April

F Allen J Co cke Acatalogue of optimizing transformations in R Rustin ed Design and Opti

mization of Compilers PrenticeHall

A W App el Compiling with ContinuationsCambridge University Press

A W App el T Jim Continuationpassing closurepassing style in pp

G Argo Improving the Three Instruction Machine in

J Armstrong R Virding M Williams Concurrent Programming in Erlang PrenticeHall

JL Armstrong BO Dacker SR Virding MC Williams Implementing a Functional Language for

Highly Paral lel Real Time Applications published in SETSS Florence available via

ftpeuagateeuaericssonsepubeuaerlanginfoimplempsZ

EA Ashcroft Lucid a Nonprocedural Language with Iter ation in CACM No July pp

L Beckman A Haraldson O Oskarsson E Sandewall A Partial Evaluator and its Use as a Program

ming Tool in Articial Intelligence Journal pp

P Bellot Graal A functional programming system with uncurryedcombinators and its reduction

machine in

A Benveniste P Bournai T Gautier P Le Guernic SIGNAL A Data Flow OrientedLanguage for

Signal Processing INRIA Rapp orts de Recherche No INRIA Centre de Rennes IRISA

A Berlin A Compilation Strategy for Numerical Programs Based on Partial EvaluationMITTechnical

Rep ort AITR pp MIT AI Lab

A Berlin D Weise Compiling Scientic Code Using Partial Evaluation in IEEE Computer Vol

No Dec pp pp

V Berzins Semantics of a RealTime Language in Pro c IEEE Symp on RealTime Systems Dec

pp

V Berzins ExecutionofaHighLevel RealTime Language in Pro c IEEE Symp on RealTime Sys

tems Dec pp

BIBLIOGRAPHY

D Bjrner APErshov ND Jones Partial Evaluation and Mixed Computation Pro c of the IFIP

TC Workshop on Partial Evaluation and Mixed Computation Gammel Averns Denmark Oct

NorthHolland

A Bloss Update Analysis and the Ecient Implementation of Functional Aggregates in

A Bloss Path Analysis and the Optimization of NonStrict Functional Languages PhD thesis Yale

University Dept of Computer Science Available as research rep ort YALEUDCSRR

A Bloss P Hudak Path Semantics in

A Bloss P Hudak J Young An optimizing compiler for a modern functional langauge in The

Computer JournalVol No pp

RM Burstall J Darlington Some transformations for developing recursive programs in

RM Burstall J Darlington Atransformation system for developing recursive programs in Journal

of the ACM pp January

L Cardelli The Amber Machine in pp

G Cousineau PL Curien B Robinet Combinators and Functional Programming Springer LNCS

P Caspi N Halbwachs D Pilaud JA Plaice LUSTRE a declarative language for programming

synchronous systems in and as Rep ort L Pro ject SPECTRE Group e Sp eecication et Analyse

des Systemes Lab oratoire de Genie Informatique de Grenoble

W Clinger J Rees eds Revised Report on the Algorithmic Language Scheme in SIGPLAN Notices

Dec and as MIT AI Memo a Sept

W Clinger J Rees eds Revised Report on the Algorithmic Language SchemeUniversity of Oregon

Technical Rep ort CISTR

C Consel O Danvy From Interpreting to Compiling Binding Times in

C Consel SC Kho o Parameterized Partial EvaluationinACM Trans on Programming Languages

and Systems Vol No July pp

J Darlington A semantic approach to automatic program improvement Exp erimental Programming

Rep orts No Scho ol of Articial Intelligence University of Edinburgh

J Darlington RM Burstall A system which automatical ly improves programs in Pro c of the Third

International Joint Conference on Articial Intelligence Stanford Calif

AP Ershov VV Grushetsky An ImplementationOriented Method for Describing Algorithmic Lan

guages in B Gilchrist ed Information Pro cessing Toronto Canada NorthHolland pp

B Robinet R Williams eds ESOP st European SymposiumonProgramming Springer LNCS

EN Jones ed ESOP rdEuropean Symposium on Programming Springer LNCS

B KriegBruc kner ed ESOP th European Symposium on Programming Rennes France Febru

ary Springer LNCS

J Fairbairn S Wray Tim A Simple Lazy Abstract Machine to Execute Supercombinators in pp

BIBLIOGRAPHY

A Fauth M Freericks A Knoll Generation of Hardware Machine Models from Instruction Set De

scriptions in VLSI Signal Pro cessing VI Eggermont etal eds IEEE Signal Pro cessing So ciety

A Fauth The CBC Compiler Generator The Final ReportForschungsb erichte des Fachbereichs In

formatik Nr TU Berlin

C Flanagan A Sabry BF Duba M Felleisen The Essence of Compiling with Continuations in

M Freericks Entwurf und Spezikation einer funktionalen Programmiersprache fur die speziel len Er

fordernisse der digitalen Signalverarbeitung Diplomarb eit TU Berlin

M Freericks A Knoll ALDiSP eine applikative Programmiersprache fur Anwendungen in der digitalen

SignalverarbeitungForschungsb erichte des Fachb ereichs Informatik Nr TU Berlin

M Freericks The nML Machine Description FormalismForschungsb erichte des Fachb ereichs Infor

matik Nr TU Berlin

M Freericks The nML Machine Description Formalism update dVersion in ESPRITI I Pro ject

SPRITE Progress Rep ort for Perio d June Novemb er Rep ort No PR Dec

Editor Patrick Pyp e

M Freericks A Knoll L Do oley The RealTime Programming Language ALDiSP Informal Intro

duction and Formal SemanticsForschungsb erichte des Fachb ereichs Informatik Nr

M Freericks A Fauth A Knoll A Basic Semantics for Computer ArithmeticTechnischer Bericht

Technische Fakultat Universitat Bielefeld

M Freericks A Knoll The AL Signal Processing Language A Successor to ALDiSPTechnischer

Bericht Technische Fakultat Universitat Bielefeld to app ear

Daniel PFriedman Mitchell Wand Christopher T Haynes Essentials of Programming Languages

MIT Press

chitecture NancyFrance JP Jouannaud ed Functional Programming Languages and Computer Ar

Springer LNCS

Proc IFIP Symposium on Functional Programming Languages and Computer ArchitecturePortland

Oregon September

ACM SIGPLANSIGACT IFIP Functional Programming Languages and Computer ArchitectureIm

p erial College London Sept

Y Futamura Partial Evaluation of Computation Process An Approach to a CompilerCompilerin

Systems Computers Controls pp

Y Futamura K Nogi Generalized Partial Computation in pp

T Gautier P Le Guernic O Maes For a New RealTime Methodology IRISA Publication Interne

No Octob er

N Halbwachs D Pilaud F Ouab desselam AC Glory Specifying Programming and Verifying Real

Time Systems using a Synchronous Declarative Language Rep ort L Pro ject SPECTRE Group e

Sp eecication et Analyse des Systemes Lab oratoire de G enie Informatique de Grenoble

A Haraldsson Aprogram manipulation system basedonpartial evaluation PhD thesis Linkoping

UniveritySweden Linkoping Studies in Science and Technology Dissertations

BIBLIOGRAPHY

PG Harrison Function Inversion in pp

P Hudak SP Jones PWadler eds Report on the Programming Language Haskel l version st

March available via ftp from numerous sites

PHudak A semantic model of referencecounting and its abstraction in

J Hughes Analysing Strictness by Abstract Interpretation of Continuations in

J Hughes Backwards Analysis of Functional Programs in pp

Institute of Electrical and Electronics Engineers Inc Draft Standard for the Scheme Programming

Language PD March

TP Jensen T Mogensen A Backwards Analysis for CompileTime Garbage Col lection in

pp

T Johnsson Lambda Lifting Transforming Programs to Recursive Equationsin pp

S L Peyton Jones J Salkind The Spineless Tagless GMachine in

SP Jones and C Clack Finding xpoints in abstract interpr etation in pp

Neil D Jones Carsten K Gomard Peter Sestoft Partial Evaluation and Automatic Program Genera

tionPrenticeHall

D Kepp el S J Eggers R R Henry ACaseforRuntime Code GenerationUniversityofWashington

Department of Computer Science and Engineering UWCSE November

RB Kieburtz The Gmachine A fast graphreduction evaluator in pp

CD Klo os STREAM A Scheme Language for Formal ly Describing Digital Circuits in PARLE

Parallel Architectures and Languages Europ e Vol I I Parallel Languages Eindhoven The Netherlands

June

Alois Knoll Markus Freericks Anapplicative realtime language for DSPprogramming supporting asyn

chronous dataow concepts in Micropro cessing and Microprogramming Vol No August

Pro ceedings Euromicro pp

A Knoll A Schweikard M Freericks Eine datenuorientierte funktionale Programmiersprache fur

die Echtzeitdatenverarbeitung in Prozerechensysteme Pro ceedings Belin Februar Springer

InformatikFachb erichte

Alois Knoll Rup ert C Nieb erle CADiSP A Graphical Compiler for the Programming of DSP in a

Completely Symbolic Way Pro c Int Conf on Acoustics Sp eech and Signal Pro cessing April

EE Kohlb ecker Jr Syntactic Extensions in the Programming Language Lisp PhD thesis Indiana

University August

D Kranz R Kelsey J Rees P Hudak J Philbin and N Adams ORBIT An optimizing compiler for

Scheme Pro c Sigplan Symp on Compiler Construction published as SIGPLAN Notices

pp July

V Kruckemeyer A Knoll Eine imp erative Sprache zur Programmierung digitaler Signalprozessoren

Forschungsb erichte des Fachb ereichs Informatik Nr TU Berlin

L Lamp ort What It Means for a Concurrent Program to Satisfy a Specication Why No One Has

Specied Priority in pp

BIBLIOGRAPHY

P Lee Realistic Compiler GenerationCambridge MA MIT Press

Proc of the Conference on Lisp and Functional Programming Nice France June

F Lohr A Fauth M Freericks SIGHSIM An Environment for Retargetable Instruction Set Simu

lationForschungsb erichte des Fachbereichs Informatik Nr TU Berlin

LA LombardiB Raphael Lisp as the Language for an Incremental Computer in EC Berkeley and

DG Bobrow eds The Programming Language Lisp Its Operation and Applications MIT Press

Cambridge Massachusetts pp

DB Loveman Program improvements by sourcetosourcetransformationsin

J McCarthyPW Abrahams DJ Edwards TP Hart MI Levin LISP Programmers Manual

The Computation Center and Research Lab oratory of Electronics Massachusetts Institute of Technol

ogy MIT Press

J McGraw Paral lel F unctional Programming in Sisal Fictions Facts and Future UCRLLC

June

R McConnell Prototyping of VLSI Components from a Formal Specication IRISA Publication Interne

No Septembre

U Meyer Techniques for Partial Evaluation of Imperative Languages in pp

R Milner M Tofte RHarp er The Denition of StandardML MIT Press

R Milner M Tofte Commentary on StandardML MIT Press

T Mogensen Binding time aspects of partial evaluation PhD thesis DIKU University of Cop enhagen

Denmark March

Jo el Moses The Function of FUNCTION in LISP or Why the FUNARGProblem Should beCalled

the Environment Problem in nd Symp osium on Symb olic and Algebraic Manipulation Los Angeles

published in SIGSAM Bulletin No July pp

Hanne Riis Nielson Flemming Nielson Bounded Fixed Point Iteration ExtendedAbstract in

pp

V Nirkhe W Pugh Partial Evaluation of HighLevel Imperative Programming Languages with Appli

cations in HardRealTime Systems in pp

Alan V Opp enheim Ronald W Schafer DiscreteTime Signal ProcessingPrenticeHall

ACM rdACM Symposium on Principles of Programming Languages POPLAtlanta Georgia

January

ACM th ACM Symposium on Principles of Programming Languages POPL New Orleans Lou

siana January

ACM th ACM Symposium on Principles of Programming Languages POPL Munich Germany

January

ACM th Conf on Principles of Programming Languages POPL Austin Texas January

ACM th Conf on Principles of Programming Languages POPL Albuquerque New Mexico

January

BIBLIOGRAPHY

ACM st Conf on Principles of Programming Languages POPLPortland Oregon January

ACM Programming Language Design and Implementations PLDI Albuquerqe NM June

published as SIGPLAN Notices VolNoJune

H John Reekie Realtime Signal Processing Dataow Visual and Functional ProgrammingPhD

thesis Scho ol of Electrical Engineering UniversityofTechnology Sydney

E Ruf D Weise Opportunities for online partial evaluation Stanford University California April

technical rep ort CSLTR

E Ruf D Weise Preserving Information during Onling Partial Evaluation Stanford University Cal

ifornia April technical rep ort CSLTR

E Ruf Topics in online partial evaluation PhD thesis Stanford University California February

published as technical rep ort CSLTR

PB Schenk E Angel AFORTRAN to FORTRAN optimizing compiler in The Computer Journal

Vol No

D A Schmidt Denotational Semantics Allyn and Bacon Inc

E Scho en The CAOS System Stanford University Rep ort No STANCS

A Schwarte H Hanselmann The Pro gramming Language DSPL PCIM Munc hen

P Sestoft Replacing Function Parameters by Global Variables in pp

EDCSilage Reference Manual Europ ean DevelopmentCenter

Silage Users and Reference Manual prepared byMentor GraphicsEDC June describ es version

Guy L Steele jr Rabbit a compiler for SchemeTech Rep ort AITR MIT Cambridge

Symposium on Partial Evaluation and SemanticsBasedProgram ManipulationYale Univ June

published as SIGPLAN Notices Vol No Sept

VF Turchin The concept of a supercompilerinACM Transactions on Programming Languages and

Systems July pp

VF Turchin The algorithm of generalization in the supercompiler in pp

DA Turner A New Implementation Technique for Applicative LanguagesSoftwarePractice and

Exp erience Vol pp

PWadler Comprehending Monads LFP pp

PWadler J Hughes Pr ojections for Strictness Analysis in

PWadler S Blott How to make adhocpolymoprhism less adhoc in

D Weise Graphs as Intermediate Representations for Partial Evaluation Stanford University Cali

fornia March technical rep ort CSLTR

D Weise R Conyb eare R Ruf S Seligman Automatic online partial evaluation in Symp osium on

Partial Evaluation and SemanticsBased Program Manipulation New Haven Conn June pp

BIBLIOGRAPHY

D Weise E Ruf Computing types during program specialization Stanford University California

August technical rep ort CSLTR

D Weise RF Crew M Ernst B Steensgaard Value DepenceGraphs Representation without Tax

ation Microsoft Research available via ftpresearchmicrosoftcompubpapersvdgps

revised version of a pap er that app eared in pp

W Wulf RK Johnson CB Weinsto ck SO Hobbs CM Geschke The Design of an Optimizing

Compiler American Elsevier

Index

is

abstract

reference counts

CADiSP

CPS

details

implementation sp ecic

DSP

applications of

kernel

of ALDiSP

op en language

why ALDiSP is one

order of execution

piece de resistance

predened

Obj

false

is

true

cast

checkfunctype

closure

delay

is

overload

return

suspend

testfunctype

tupel

unit

static

memory layout

tail recursivity