H/Direct: A Binary Foreign Language Interface for Haskell

Sigbjorn Finne, Daan Leijen, Erik Meijer, Simon Peyton Jones

April

Abstract

H/Direct is a foreign-language interface for the purely functional language Haskell. Rather than rely on host-language type signatures, H/Direct compiles Interface Definition Language (IDL) to Haskell stub code that marshals data across the interface. This approach allows Haskell to call both C and COM, and allows a Haskell component to be wrapped in a C or COM interface. IDL is a complex language and language mappings for IDL are usually described informally. In contrast, we provide a relatively formal and precise definition of the mapping between Haskell and IDL.

This paper has been submitted to the International Conference on Functional Programming (ICFP).

Introduction

A foreign-language interface provides a way for programs written in one language to call, or be called by, programs written in another. Programming languages that do not supply a foreign-language interface die a slow, lingering death; good languages die more slowly than bad ones, but they all die in the end.

In this paper we describe a new foreign-language interface for the functional language Haskell. In contrast to earlier foreign-language interfaces for Haskell, such as Green Card, we describe a design based on a standard Interface Definition Language (IDL). We discuss the reasons for this decision in the Background section.

Our interface provides direct access to libraries written in C (or any other language using C's calling convention), and makes it possible to write Haskell procedures that can be called from C. The same tool also allows us to call COM components directly from Haskell, or to seal up Haskell programs as a COM component. COM is Microsoft's component object model; it offers a language-independent interface standard between software components. The interfaces of these components are written in IDL.

H/Direct generates Haskell stub code from IDL interface descriptions. It is carefully designed to be independent of the particular Haskell implementation. To maintain this independence, H/Direct requires the implementation to support a primitive foreign-language interface mechanism, expressed using a (non-standard) Haskell foreign declaration. H/Direct provides the means to leverage that primitive facility into the full glory of IDL.

Because they cater for a variety of languages, foreign-language interfaces tend to become rich, complex, incomplete, and described only by example. The main contribution of this paper is to provide (part of) a formal description of the interface. This precision encompasses not only the programmer's-eye view of the interface, but also its implementation. The bulk of the paper is taken up with this description.

Background

The basic way in which almost any foreign-language interface works is this. The signature of each foreign-language procedure is expressed in some formal notation. From this signature, stub code is generated that marshals the parameters across the border between the two languages, calls the procedure using the foreign language's calling convention, and then unmarshals the results back across the border. Dealing with the different calling conventions of the two languages is usually the easy bit. The complications come in the parameter marshalling, which transforms data values built by one language into a form that is comprehensible to the other.

A major design decision is the choice of notation in which to describe the signatures of the procedures that are to be called across the interface. There are three main possibilities:

- Use the host language (Haskell, in our case). That is, write a Haskell type signature for the foreign function, and generate the stub code from it. Green Card uses this approach, as does J/Direct, Microsoft's foreign-language interface for Java.

- Use the foreign language (say C). In this case the stub code must be generated from the C prototype for the procedure. SWIG uses this approach.

- Use a separate Interface Definition Language (IDL), designed specifically for the purpose.

We discuss the first two possibilities in "Using the host or foreign language" and the third in "Using an IDL".
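The marshal, call, unmarshal pattern described in the Background can be sketched by hand. The following is only an illustration, not H/Direct output: it uses the `foreign import ccall` form of the later standardised Haskell FFI (as implemented by GHC) rather than the paper's non-standard declaration, and the choice of the C library function `strlen` and the wrapper name `strLength` are assumptions made for the example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Primitive layer: the C signature as the Haskell implementation sees it.
-- strlen comes from the C standard library (string.h).
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hand-written stub: marshal the argument, call, unmarshal the result.
strLength :: String -> IO Int
strLength s =
  withCString s $ \p -> do     -- marshal: copy into a NUL-terminated C buffer
    n <- c_strlen p            -- call using the C calling convention
    return (fromIntegral n)    -- unmarshal: CSize to Int

main :: IO ()
main = strLength "marshal me" >>= print   -- prints 10
```

The calling-convention part is a single `foreign import` line; everything else in the stub is parameter marshalling, which is where the paper places the real difficulty.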

Using the host or foreign language

At first sight the first two options look much more convenient than the third, because the caller is written in one language and the callee in the other, so the interface is conveniently expressed for at least one of them. Here, for example, is how J/Direct allows Java to make foreign-language calls:

    class ShowMsgBox {
      public static void main(String args[]) {
        MessageBox(0, "Hello", "Java Messagebox", 0);
      }

      /** @dll.import("USER32") */
      private static native
        int MessageBox(int hwndOwner, String text,
                       String title, int fuStyle);
    }

The dll.import directive tells the compiler that the Java MessageBox method will link to the native Windows USER32.DLL. The parameter marshaling (for example of the strings) is generated based on the Java type signature for MessageBox.

The fatal flaw is that it is invariably impossible (in general) to generate adequate stub code based solely on the type signature of a procedure in one language or the other. There are three kinds of difficulties.

- First, some practically important languages, notably C, have a type system that is too weak to express the necessary distinctions. For example:

    - The stub code generator must know the mode of each parameter ([in], [in,out], or [out]) because each mode demands different marshaling code.

    - Some pointers have a significant NULL value while others do not. Some pointers point to values that can (and sometimes should) be copied across the border, while others refer to mutable locations whose contents must not be copied.

    - There may be important inter-relationships between the parameters. For example, one parameter might point to an array of values, while another gives the number of elements in the array. The marshaling code needs to know about such dependencies.

- On the other hand, it may not even be enough to give the signature in a language with an expressive type system, such as Haskell. The trouble is that the type signature still says too little about the foreign procedure's type signature. For example, is the result of a Haskell procedure returned as the result of the foreign procedure, or via an [out] parameter of that procedure? In the case of J/Direct, when a record is passed as an argument, Java's type signature is not enough to specify the layout of the record, because Java does not specify the layout of the fields of an object, and the garbage collector can move the object around in memory.

- The signature of a foreign procedure may say too little about allocation responsibilities. For example, if the caller passes a data structure to the callee (such as a string), can the latter assume that the structure will still be available after the call? Does the caller or callee allocate space to hold the results?

In an earlier paper we described Green Card, whose basic approach was to use Haskell as the language in which to give the type signatures for foreign procedures. To deal with the issues described above we provided ways of augmenting the Haskell type signature, to allow the programmer to customise the stub code that would be generated. However, Green Card grew larger and larger, and we realised that what began as a modest design was turning into a full-scale language.

Using an IDL

Of course, we are not the first to encounter these difficulties. The standard solution is to use a separate Interface Definition Language (IDL) to describe the signatures of procedures that are to be called across the border. IDLs are rich and complicated, for precisely the reasons described above, but they are at least somewhat standardised and come with useful tools. We focus on the IDL used to describe COM interfaces, which is closely based on DCE IDL. Another popular IDL dialect is the one defined by OMG as part of the CORBA specification, and we intend to provide support for this, using an existing translation from OMG to DCE IDL.

Like COM, but unlike CORBA (footnote 1), we take the view that the IDL for a foreign procedure defines a language-independent, binary interface to the foreign procedure, a sort of lingua franca. The interface thus defined is supposed to be complete: it covers calling convention, data format, and allocation rules. It may be necessary to generate stub code on both sides of the border, to marshal parameters into the IDL-mandated format, and then on into the format demanded by the foreign procedure. But these two chunks of marshaling code can be generated separately, each by a tool specialised to its host language. By design, however, IDL's binary conventions are more or less identical to C's, so marshaling on the C side is hardly ever necessary.

Here, for example, is the IDL describing the interface to a function foo:

    int foo( [out] long* l,
             [string, in] char* s,
             [in, out] double* d );

The parts in square brackets are called attributes. In this case they describe the mode of each parameter, but there is a rich set of further attributes that give further (and often essential) information about the type of the parameters. For example, the string attribute tells that the parameter s points to a null-terminated array of characters, rather than pointing to a single character.

Footnote 1: CORBA does not define a binary interface. Rather, each ORB vendor provides a language binding for a number of supported languages. This language binding essentially provides the marshaling required to an ORB-specific common calling convention. If you want to use a language that the ORB vendor does not support, you are out of luck.
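The effect of the mode attributes on marshalling can be made concrete with a hand-written stub for foo. This is a sketch under stated assumptions, not generated H/Direct code: it uses GHC's Foreign libraries, and because no C implementation of foo is given in the text, `primFoo` is a stand-in written in Haskell over raw pointers whose behaviour (store the string length in l, double d, return 0) is invented purely so the example runs.

```haskell
module Main (main) where

import Foreign.C.String (CString, peekCString, withCString)
import Foreign.C.Types (CDouble, CInt, CLong)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Utils (with)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Hypothetical stand-in for the C procedure foo (behaviour invented):
-- *l = length of s, *d = 2 * *d, result = 0.
primFoo :: Ptr CLong -> CString -> Ptr CDouble -> IO CInt
primFoo l s d = do
  str <- peekCString s
  poke l (fromIntegral (length str))
  x <- peek d
  poke d (2 * x)
  return 0

-- [in] and [in,out] parameters become arguments;
-- [out], [in,out] and the procedure result become results.
foo :: String -> Double -> IO (Int, Double, Int)
foo s d =
  alloca $ \pl ->                    -- [out]: caller allocates, nothing copied in
  withCString s $ \ps ->             -- [string, in]: copied in, NUL-terminated
  with (realToFrac d) $ \pd -> do    -- [in, out]: copied in ...
    r  <- primFoo pl ps pd
    l' <- peek pl                    -- [out]: copied out
    d' <- peek pd                    -- ... [in, out]: and copied out again
    return (fromIntegral l', realToFrac d', fromIntegral r)

main :: IO ()
main = foo "hello" 1.5 >>= print     -- prints (5,3.0,0)
```

None of the three copying decisions could be read off the bare C prototype `int foo(long*, char*, double*)`; they come entirely from the attributes.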

Overview

The big picture is given by the accompanying figure. The interface between Haskell and the foreign language is specified in IDL. This IDL specification is read by H/Direct, which then produces Haskell and C (footnote 2) source files containing Haskell and C stub code.

    [Figure: The big picture. Diagram relating M.idl, H/Direct, M.hs, M.c, HDirect.hs and the Application.]

H/Direct can generate stub code that allows Haskell to call C, or C to call Haskell. It can also generate stub code that allows Haskell to create and invoke COM components, and that allows COM components to be written in Haskell. Much of the work in all four cases concerns the marshaling of data between C and Haskell, and that is what we concentrate on in this paper.

Since H/Direct generates Haskell source code, how does it express the actual foreign-language call (or entry, for the inverse case)? We have extended Haskell with a foreign declaration that asks the Haskell implementation to generate code for a foreign-language call (or entry). The foreign declaration deals with the most primitive layer of marshaling, which is necessarily implementation dependent. H/Direct generates all the implementation-independent marshaling.

To make all this concrete, suppose we have the following IDL interface specification:

    typedef struct { int x,y; } Point;
    void Move( [in,out,ref] Point* p );

If asked to generate stub code to enable Haskell to call function Move, H/Direct will generate the following (Haskell) code:

    data Point = Point { x,y :: Int }

    marshalPoint :: Point -> IO (Ptr Point)
    marshalPoint = ...

    unmarshalPoint :: Ptr Point -> IO Point
    unmarshalPoint = ...

    move :: Point -> IO Point
    move p =
      do a <- marshalPoint p
         primMove a
         r <- unmarshalPoint a
         hdFree
         return r

    foreign import stdcall "Move"
      primMove :: Ptr Point -> IO ()

This code illustrates the following features:

- For each IDL declaration, H/Direct generates one or more Haskell declarations.

- From the IDL procedure declaration Move, H/Direct generates a Haskell function move whose signature is intended to be what the user would expect. In particular, the Haskell type signature is expressed using high-level types; that is, Haskell equivalents of the IDL types. For example, the signature for move uses the Haskell record type Point. The translation for a procedure declaration is discussed in the section "Procedure declarations".

- The body of the procedure marshals the parameters into their low-level types, before calling the low-level Haskell function primMove. The latter is defined using a foreign declaration; the Haskell implementation generates code for the call to the C procedure Move. The section "Mapping for types" specifies the high-level and low-level type corresponding to each IDL type.

- A low-level type is still a perfectly first-class Haskell type, but it has the property that it can trivially be marshalled across the border. There is a fixed set of primitive low-level types, including Int, Float, Char and so on. Addr is a low-level type that holds a raw machine address. The type constructor Ptr is just a synonym for Addr:

      type Ptr a = Addr
      addPtr :: Ptr a -> Int -> Ptr b

  The type argument to Ptr is used simply to allow H/Direct to document its output somewhat, by giving the high-level type that was marshalled into that Addr. The section "Marshalling" describes how high-level types are marshalled to and from their low-level equivalents.

- From an IDL typedef declaration, H/Direct generates a corresponding Haskell type declaration together with some marshalling functions. In general, a marshalling function transforms a high-level Haskell value (in this case Point) into a low-level Haskell value (in this case Ptr Point). These marshalling functions are in the IO monad because, as we shall see, they often work imperatively by allocating some memory and explicitly filling it in, so as to construct a memory layout that matches the interface specification. The translations for typedef declarations are discussed in the section "Type declarations".

- The function hdFree :: IO () simply releases all the memory allocated by the marshalling functions.
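For readers who want to run the Move example, here is a minimal sketch in present-day Haskell. It is not the code H/Direct emits: it assumes GHC's Foreign libraries (where `Ptr`, `plusPtr` and `mallocBytes` play the roles of the paper's `Ptr`, `addPtr` and `hdAlloc`), it assumes the C struct is two adjacent `int` fields with no padding, it frees the single block with `free` where the paper's `hdFree` releases a whole marshalling area, and `primMove` is a hypothetical Haskell stand-in for the C procedure (it just increments x).

```haskell
module Main (main) where

import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr, castPtr, plusPtr)
import Foreign.Storable (peek, poke, sizeOf)

data Point = Point { x, y :: Int } deriving (Eq, Show)

-- Size of one C int; assumed layout is struct { int x, y; } without padding.
intSize :: Int
intSize = sizeOf (0 :: CInt)

-- High-level Point to low-level Ptr Point: allocate, then fill in the fields.
marshalPoint :: Point -> IO (Ptr Point)
marshalPoint (Point px py) = do
  p <- mallocBytes (2 * intSize)
  poke (castPtr p) (fromIntegral px :: CInt)
  poke (castPtr p `plusPtr` intSize) (fromIntegral py :: CInt)
  return p

-- Low-level Ptr Point back to a high-level Point: read the two fields.
unmarshalPoint :: Ptr Point -> IO Point
unmarshalPoint p = do
  px <- peek (castPtr p) :: IO CInt
  py <- peek (castPtr p `plusPtr` intSize) :: IO CInt
  return (Point (fromIntegral px) (fromIntegral py))

-- Hypothetical stand-in for the foreign procedure Move: increments x in place.
primMove :: Ptr Point -> IO ()
primMove p = do
  let fieldX = castPtr p :: Ptr CInt
  vx <- peek fieldX
  poke fieldX (vx + 1)

-- Same shape as the generated stub: marshal, call, unmarshal, free.
move :: Point -> IO Point
move pt = do
  a <- marshalPoint pt
  primMove a
  r <- unmarshalPoint a
  free a
  return r

main :: IO ()
main = move (Point 1 2) >>= print
```

The user-facing function `move` never mentions pointers; the `Ptr Point` only appears in the low-level layer, with the type argument documenting what the address holds.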

Footnote 2: For the sake of definiteness we concentrate on C as the foreign language in this paper.

So much for our example. The difficulty is that IDL is a complex language, so it is not always straightforward to guess the Haskell type that will correspond to a particular IDL type, nor to generate correct marshalling code. The former is important to the programmer, the latter only to H/Direct itself. Our goal in this paper is to give a systematic translation of IDL to Haskell stub code.

To simplify translation we assume that the IDL source is brought into a standard form; that is, we factor the translation into a translation of full IDL to a core subset, and a translation from core IDL to Haskell. In particular, we assume that [out] parameters always have an explicit pointer type, the pointer default is manifested in all pointer types, and all enumerations have value fields. The details are unimportant.

IDL is a large language, and space precludes giving a complete translation here. We do not even give a syntax for IDL, relying on the left-hand sides of the translation rules to specify the syntax we treat. However, the framework we give here is sufficient to treat the whole language, and our implementation does so.

Procedure declarations

The translation function D[[ ]] maps an IDL declaration into one or more Haskell declarations. We begin with IDL procedure declarations. To start with, we concentrate on allowing Haskell to call C; we discuss other variants later. Here is the translation rule for procedure declarations:

    D[[ t_res f([in] t_in m, [out] t_out, [in,out] t_inout n) ]]
      =
        N[[f]] :: T[[t_in]] -> T[[t_inout]]
               -> IO (T[[t_out]], T[[t_inout]], T[[t_res]])
        N[[f]] m n =
          do { a <- M[[t_in]] m
             ; b <- O[[t_out]]
             ; c <- M[[t_inout]] n
             ; r <- primN[[f]] a b c
             ; x <- U[[t_out]] b
             ; y <- U[[t_inout]] c
             ; z <- U[[t_res]] r
             ; hdFree
             ; return (x,y,z)
             }

        foreign import stdcall primN[[f]]
          :: B[[t_in]] -> B[[t_out]] -> B[[t_inout]]
          -> IO B[[t_res]]

Despite our claim of formality, the fully formal version of this rule has an inconvenient number of subscripts. Instead, we illustrate by giving one parameter of each mode ([in], [out], and [in,out]); more complex cases are handled exactly analogously. The translation produces a Haskell function that takes one argument for each IDL [in] or [in,out] parameter, and returns one result for each IDL [out] or [in,out] parameter, plus one result for the IDL result (if any). In general, foreign functions can perform side effects, so the result type is in the IO monad. We are considering adding a (non-standard) attribute [pure] that declares the procedure to have no side effects; in this case the Haskell procedure can simply return a tuple rather than an IO type.

The generic translation for procedure declarations uses several auxiliary translation schemes:

- The translation scheme T[[t]] gives the high-level Haskell type corresponding to the IDL type t.

- The translation scheme N[[n]] does the name mangling required to translate IDL identifiers to valid Haskell identifiers, for instance to account for the fact that Haskell function names must begin with a lower-case letter.

- The translation scheme B[[t]] gives the low-level Haskell type corresponding to the IDL type t.

- The translation scheme M[[t]] :: T[[t]] -> IO B[[t]] generates Haskell code that marshals a value of IDL type t from its high-level type T[[t]] to its low-level form B[[t]]. This is used to marshal all the in-parameters of the procedure ([in] and [in,out]).

- The translation scheme U[[t]] :: B[[t]] -> IO T[[t]] generates Haskell code that unmarshals a value of IDL type t. This is used to unmarshal all the out-parameters of the procedure, and its result (if any). M[[ ]] and U[[ ]] are mutual inverses (up to memory allocation).

- In addition, for [out] parameters the caller is required to allocate a location to hold the result. O[[t]] :: IO (Ptr B[[t]]) is Haskell code that allocates enough space to contain a value of IDL type t.

We will define these functions in detail in the section "Marshalling", but first we deal with type declarations.

Mapping for types

Next, we turn our attention to the translations T[[ ]] and B[[ ]], which translate IDL types to Haskell types and are given in the figure "Type translations". The syntax of the IDL types we treat is as follows:

    t    ::= b               basic type
          |  n               type names
          |  [attr] t*       pointer type

    attr ::= unique | ref | ptr
          |  string | size_is(e)

    Figure: IDL type syntax

Translating base types, which have direct Haskell analogues, is easy. The high-level and low-level type translations coincide, except that the high-level representation of IDL's 8 bit characters is Haskell's 16 bit Char type. To give a more precise mapping we have extended Haskell with new base types (Word8, Word16 and so on). Similarly, IDL type names are translated to the (Haskell-mangled) name of the corresponding Haskell type.

Matters start to get murkier when we meet pointers. Since a pointer is always passed to and from C as a machine address, the low-level translation of all pointer types is simply a raw machine address:

    B[[ [attr] t* ]] = Ptr T[[t]]

    B[[short]]              = Int16
    B[[unsigned short]]     = Word16
    B[[float]]              = Float
    B[[double]]             = Double
    B[[char]]               = Word8
    B[[wchar]]              = Char
    B[[boolean]]            = Bool
    B[[void]]               = ()
    B[[ [attr] t* ]]        = Ptr T[[t]]

    T[[char]]               = Char
    T[[b]]                  = B[[b]]
    T[[n]]                  = N[[n]]
    T[[ [ref] t* ]]         = T[[t]]
    T[[ [unique] t* ]]      = Maybe T[[t]]
    T[[ [ptr] t* ]]         = Ptr T[[t]]
    T[[ [string] char* ]]   = String
    T[[ [size_is(v)] t* ]]  = [T[[t]]]

    Figure: Type translations

Recall that Ptr t is just an abbreviation for Addr, but the Ptr form is somewhat more informative.

In contrast, the high-level translation of pointers depends on what type of pointer is concerned. IDL has no fewer than five kinds of pointer, distinguished by their attributes. We treat them one at a time; refer in each case to the figure "Type translations".

- A value of IDL type [ref] t* is the unique pointer, or indirection, to a value of type t. Since pointers are implicit in Haskell, the corresponding high-level Haskell type is just T[[t]].

- The IDL type [unique] t* is exactly the same as [ref] t*, except that the pointer can be NULL. The natural way to represent this possibility in Haskell is using the Maybe type. The latter is a standard Haskell type defined like this:

      data Maybe a = Nothing | Just a

- An IDL value of type [ptr] t* is the address of a value that might be shared, and might contain cycles. It is far from clear how such a thing should be marshalled, so we adopt a simple convention:

      T[[ [ptr] t* ]] = Ptr T[[t]]

  That is, [ptr] values are not moved across the border at all. Instead they are represented by a value of type Ptr T[[t]], a raw machine address. This is often useful. For a start, some libraries implement an abstract data type, in which the client is expected to manipulate only pointers to the values. Similarly, COM interface pointers should be treated simply as addresses. Finally, some operating system procedures (notably those concerned with windows) return such huge structures that a client might want to marshal them back selectively.

- A value of type [string] char* is the address of a null-terminated sequence of characters. (Contrast [ref] char*, which is the address of a single character.) The corresponding Haskell type is, of course, String. The string attribute applies to the following array types: char, byte, unsigned short, unsigned long, structs with byte (only) fields and, in Microsoft-only IDL, wchar.

- Sometimes a procedure takes a parameter that is a pointer to an array of values, where another parameter of the procedure gives the size of the array. For example:

      void DrawPolygon(
          [in, size_is(nPoints)] Point* points,
          [in] int nPoints );

  The size_is(nPoints) attribute tells that the second parameter, nPoints, gives the size of the array. (This is quite like the string case, except that the size of the array is given separately, whereas strings have a sentinel at the end.) We translate arrays to Haskell lists.

While each of these variants has a reasonable rationale, we have found the plethora of IDL pointer types to be a rich source of confusion. The translations in the figure "Type translations" look innocuous enough, but we have found them extremely helpful in clarifying and formalising just exactly what the translation of an IDL type should be.

Even if the translations are not quite right (whatever that means), we now have a language in which to discuss variants. For example, it may eventually turn out that the IDL [ptr] attribute is conventionally used for subtly different purposes than the ones we suggest above. If so, the translations can readily be changed, and the changes explained to programmers in a precise way.

Marshalling

In the translation of the IDL type signature for a procedure (section "Procedure declarations") we invoked marshalling functions M and U for each of the types involved. Now that we have defined the high- and low-level translations of each type, the marshalling code is relatively easy to define. In this section we define these marshalling functions. Lack of space precludes us from giving complete details, so we will concentrate mostly on marshalling basic types.

Marshalling a structured value consists, as we shall see, of two steps: allocate some memory in the parameter-marshalling area to hold the value, and then actually marshal the Haskell value into that memory. The translations are much more elegant if we define auxiliary schemes, W and R, that perform this by-reference marshalling. We also need a number of functions to manipulate the parameter-marshalling area. More precisely:

- W[[t]] :: Ptr T[[t]] -> T[[t]] -> IO () marshals its second argument into the memory location(s) pointed to by its first argument; the latter is a raw machine address.

- R[[t]] :: Ptr T[[t]] -> IO T[[t]] unmarshals a value of IDL type t out of the memory location(s) pointed to by its argument. W and R are mutually inverse (up to memory allocation).

- S[[t]] :: Int is the number of bytes occupied by an IDL value of type t. The function O, mentioned in the section "Procedure declarations", is defined thus:

      O[[ [attr] t* ]] = hdAlloc S[[t]]

- hdAlloc :: Int -> IO (Ptr a) allocates the specified number of bytes in the parameter-marshalling area, returning a pointer to the allocated area.

- hdWriteb :: Ptr T[[b]] -> T[[b]] -> IO (), where b is a basic type, marshals a value of IDL type b into the specified memory location(s).

- hdReadb :: Ptr T[[b]] -> IO T[[b]], where b is a basic type, unmarshals a value of IDL type b.

- hdFree :: IO () frees the whole parameter-marshalling area.

With these definitions in mind, the figure "The marshalling schemes" gives the marshalling schemes. We omit the schemes for size_is because they are tiresomely complicated. Apart from that, the translations are easy to read:

- For basic types, there is no marshalling to do, except that we must convert between the 16 bit Haskell Char and 8 bit IDL char types.

- Marshalling a typedef'd type can be done by invoking its marshalling function.

- Marshalling a [ref] pointer is done by allocating some memory with hdAlloc, and then marshalling the value into it with W. Unmarshalling is similar, except that there is no allocation step; we just invoke R.

- Dealing with [unique] pointers is similar, except that we have to take account of the possibility of a NULL value.

    M[[t]] :: T[[t]] -> IO B[[t]]

    M[[char]]             = marshallChar
    M[[b]]                = return
    M[[n]]                = marshalln
    M[[ [ref] t* ]] x     = do { px <- hdAlloc S[[t]]
                               ; W[[t]] px x
                               ; return px }
    M[[ [unique] t* ]] x  = case x of
                              Nothing -> return nullPtr
                              Just y  -> M[[ [ref] t* ]] y
    M[[ [ptr] t* ]]       = return
    M[[ [string] t* ]]    = marshallString

    W[[t]] :: Ptr T[[t]] -> T[[t]] -> IO ()

    W[[b]]                = hdWriteb
    W[[ [attr] t* ]] p x  = do { a <- M[[ [attr] t* ]] x
                               ; hdWriteAddr p a }

    U[[t]] :: B[[t]] -> IO T[[t]]

    U[[char]]             = unmarshallChar
    U[[b]]                = return
    U[[n]]                = unmarshalln
    U[[ [ref] t* ]]       = R[[t]]
    U[[ [unique] t* ]] p  = if p == nullPtr then
                              return Nothing
                            else
                              do { x <- R[[t]] p
                                 ; return (Just x) }
    U[[ [ptr] t* ]]       = return
    U[[ [string] t* ]]    = unmarshallString

    R[[t]] :: Ptr T[[t]] -> IO T[[t]]

    R[[b]]                = hdReadb
    R[[ [attr] t* ]] p    = do { a <- hdReadAddr p
                               ; U[[ [attr] t* ]] a }

    Figure: The marshalling schemes

Again, it is very helpful to have a precise language in which to discuss these translations. Though they look simple, we can attest that it is very easy to get confused by pointers to pointers to things, and we have far greater confidence in our implementation as a result of writing the translations formally.

Type declarations

On top of the primitive base types, IDL supports the definition of a number of constructed types. For example:

    typedef int trip[3];
    typedef struct TagPoint { int x,y; } Point;
    typedef enum { Red, Blue, Green } RGB;
    typedef union floats switch (int ftype) {
      case 0: float f;
      case 1: double d;
    } Floats;

which declares array, record, enumeration, and union (or sum) types respectively. The figure "IDL constructed type syntax" shows the syntax of IDL's constructed types.

    t ::= t[e]                                      array type
       |  enum { tag_1 = v_1, ..., tag_n = v_n }     enumeration
       |  struct tag { t_1 f_1; ...; t_n f_n; }      record type
       |  union tag_1 switch (b tag_2) {
            case v_1: t_1 f_1; ...; case v_n: t_n f_n; }
                                                    union type

    Figure: IDL constructed type syntax

The translation provides rules for converting IDL constructed types into corresponding Haskell representations. To ease the task of defining this type mapping, we assume that each constructed type appears as part of an IDL type declaration. In general, a type declaration has the following form:

    typedef t name;

declaring name to be a synonym for the type t, which is either a base type or one of the above constructed types. A type declaration for an IDL type t gives rise to the definition of the following Haskell declarations:

- A Haskell type declaration for the Haskell type N[[name]], such that T[[name]] = N[[name]].

- marshallN[[name]] :: T[[name]] -> IO B[[t]], which implements the M scheme for converting from the Haskell representation T[[t]] to the IDL type t.

- unmarshallN[[name]] :: B[[t]] -> IO T[[name]], which implements the dual U scheme for unmarshalling.

- marshallN[[name]]At :: Ptr B[[t]] -> T[[name]] -> IO (), for performing by-reference marshalling of the constructed type.

- unmarshallN[[name]]At :: Ptr B[[t]] -> IO T[[name]], which implements the R scheme for unmarshalling a constructed type by-reference.

- sizeofN[[name]] :: Int, a constant holding the size of the external representation of the type (in 8 bit bytes).

The general rules for converting type declarations into Haskell types are presented in the figure "Translating declarations". Here is what they generate when applied:

- In the case of a type declaration for a base type, this merely defines a type synonym. For example

      typedef int year;

  is translated into the type synonym

      type Year = Int

  plus marshalling functions for Year.

- For a record type such as Point, the declaration

      typedef struct TagPoint { int x,y; } Point;

  generates a single-constructor Haskell data type:

      data Point = TagPoint { x :: Int, y :: Int }

  In addition to this, the D scheme generates a collection of marshalling functions, including marshallPoint:

      marshallPoint :: Point -> IO (Ptr Point)
      marshallPoint (TagPoint x y) =
        do ptr <- hdAlloc sizeofPoint
           let ptr1 = addPtr ptr 0
           marshallintAt ptr1 x
           let ptr2 = addPtr ptr1 sizeofint
           marshallintAt ptr2 y
           return ptr

  It marshals a Point by allocating enough memory to hold the external representation of the point. The size of the record type is computed as follows:

      sizeofPoint :: Int
      sizeofPoint = structSize [sizeofint, sizeofint]

  where structSize is a platform-specific function that computes the size of a struct, given the field sizes (footnote 3). Point's two fields are marshalled into the external representation of Point by calling the by-reference marshaller for the basic type Int, supplying a pointer that has been appropriately offset.

- For the union type example given at the start of this section, the following Haskell type is generated:

      data Floats = F Float | D Double

  together with actions for marshalling between the algebraic type and a union (omitting the type signatures for the by-reference marshallers):

      marshallFloats   :: Floats -> IO (Ptr Floats)
      unmarshallFloats :: Ptr Floats -> IO Floats

  The external representation of a union is normally a struct containing the discriminant and enough room to accommodate the largest member of the union. In the case of Floats, the external representation must be large enough to contain an int and a double.

- Enumerations have a direct Haskell equivalent as algebraic data types with nullary constructors. For example, the RGB declaration

Footnote 3: Similarly, a function that returns the offsets at which to marshal each field is also provided. Due to lack of space, marshallPoint makes the simplifying assumption that structures contain no internal padding.

    D[[ typedef t name; ]]  =
      type N[[name]] = T[[t]]
      marshallN[[name]]      = marshallT[[t]]
      marshallN[[name]]At    = marshallT[[t]]At
      unmarshallN[[name]]    = unmarshallT[[t]]
      unmarshallN[[name]]At  = unmarshallT[[t]]At
      sizeofN[[name]]        = S[[t]]

    D[[ typedef t name[dim]; ]]  =
      type N[[name]] = [T[[t]]]
      marshallN[[name]]      = marshallArray dim marshallT[[t]]At
      marshallN[[name]]At    = marshallArrayAt dim marshallT[[t]]At
      unmarshallN[[name]]    = unmarshallArray dim unmarshallT[[t]]At
      unmarshallN[[name]]At  = unmarshallArrayAt dim unmarshallT[[t]]At
      sizeofN[[name]]        = dim * S[[t]]

    D[[ typedef struct tag { t_i field_i; } name; ]]  =
      data N[[name]] = N[[tag]] { N[[field_i]] :: T[[t_i]] }
      marshallN[[name]] rec = do
        ptr <- hdAlloc S[[name]]
        marshallN[[name]]At ptr rec
        return ptr
      marshallN[[name]]At ptr (N[[tag]] { N[[field_i]] }) = do
        let ptr_1     = addPtr ptr 0
        let ptr_(i+1) = addPtr ptr_i S[[t_i]]
        W[[t_i]] ptr_i field_i
        return ()
      unmarshallN[[name]] = unmarshallN[[name]]At
      unmarshallN[[name]]At ptr = do
        let ptr_1     = addPtr ptr 0
        let ptr_(i+1) = addPtr ptr_i S[[t_i]]
        N[[field_i]] <- R[[t_i]] ptr_i
        return (N[[tag]] N[[field_i]])
      sizeofN[[name]] = structSize [S[[field_i]]]

    D[[ typedef enum { alt_i = value_i } name; ]]  =
      data N[[name]] = N[[alt_i]]
      marshallN[[name]] x =
        case x of { N[[alt_i]] -> N[[value_i]] }
      unmarshallN[[name]] x =
        case x of { N[[value_i]] -> return N[[alt_i]] }
      unmarshallN[[name]]At ptr = do
        v <- hdReadInt ptr
        unmarshallN[[name]] v
      sizeofN[[name]] = sizeofint

    Figure: Translating declarations (lines indexed by i are repeated for each field or alternative)
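The struct rule in the figure steps through the fields by adding field sizes, and footnote 3 notes that the Point example assumes no internal padding. As a sketch of what a padding-aware layout function could look like, the code below computes field offsets and a total size from (size, alignment) pairs. This is an assumption-laden illustration, not the paper's structSize: the paper's function takes field sizes only and is platform specific, whereas here the alignment rule (each field starts at a multiple of its own alignment, the total is rounded up to the largest alignment) and the sizes 4 and 8 used for int and double are typical-platform assumptions. The name `fieldOffsets` is hypothetical.

```haskell
module Main (main) where

-- (size, alignment) of one field, in bytes.
type Field = (Int, Int)

-- Round an offset up to the next multiple of an alignment.
alignUp :: Int -> Int -> Int
alignUp a off = ((off + a - 1) `div` a) * a

-- Offset of every field, padding so that each field starts at a
-- multiple of its own alignment.
fieldOffsets :: [Field] -> [Int]
fieldOffsets = go 0
  where
    go _ [] = []
    go off ((size, align) : rest) =
      let start = alignUp align off
      in start : go (start + size) rest

-- Total size: end of the last field, rounded up to the largest alignment.
structSize :: [Field] -> Int
structSize [] = 0
structSize fs =
  let (size, _) = last fs
      end = last (fieldOffsets fs) + size
  in alignUp (maximum (map snd fs)) end

main :: IO ()
main = do
  print (fieldOffsets [(4, 4), (4, 4)])   -- two ints: [0,4]
  print (structSize   [(4, 4), (4, 4)])   -- 8
  print (fieldOffsets [(4, 4), (8, 8)])   -- int then double: [0,8]
  print (structSize   [(4, 4), (8, 8)])   -- 16
```

With such a function, a by-reference marshaller would use the computed offsets in place of the running sum of sizes, so the two-int Point case is unchanged while mixed-size records get the padding a C compiler would typically insert.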

    typedef enum { red, green, blue } RGB;

is translated into the Haskell type

    data RGB = Red | Green | Blue

with concrete representation B[[RGB]] = Int. The marshalling actions simply map between the nullary constructors and Int:

    marshallRGB :: RGB -> IO Int
    marshallRGB nm =
      return (case nm of { Red -> 0; Green -> 1; Blue -> 2 })

    unmarshallRGB :: Int -> IO RGB
    unmarshallRGB v =
      case v of
        0 -> return Red
        1 -> return Green
        2 -> return Blue
        _ -> ioError (userError "unmarshallRGB")

The inverse mapping

Once marshalling and unmarshalling functions are defined for each data type, it is not hard to reverse the mapping and build code that allows C to call Haskell. The translation for a typedef remains unchanged, but the translation for an IDL procedure declaration is reversed. Since the procedure is being implemented in Haskell, its in-parameters are unmarshalled, the Haskell procedure is called, and its results are marshalled and returned to the caller. We omit the details, but the translation rule can be expressed just as we did for the forward direction. For example, the Move IDL declaration used there would be compiled to the following Haskell code:

    foreign export stdcall "Move"
      primMove :: Ptr Point -> IO ()

    primMove a =
      do p <- unmarshallPoint a
         q <- move p
         marshallPointAt a q
         return ()

    move :: Point -> IO Point
    move = error "Not yet implemented"

The foreign export declaration asks the Haskell compiler to make Move externally callable with a stdcall interface. primMove does the marshalling before calling move, which should be provided by the programmer.

We are also interested in allowing Haskell programs to create and invoke COM objects, and in allowing a Haskell program to be sealed up inside a COM object. This too is a straightforward extension. There are a couple of wrinkles, however.

COM methods conventionally return a value of type HRESULT, which is used to signal exceptional conditions. HDirect knows about HRESULT, and reflects its exceptional values into exceptions in Haskell's IO monad.

COM methods are invoked indirectly through a vector table. To support this, the Haskell foreign declaration has to be extended to allow indirect calls. For example, the Haskell-to-COM side looks like this:

    foreign import stdcall
      dynamic primFoo :: Addr -> ...

The keyword dynamic replaces the static name of the foreign function, and the address of the function is instead passed as the first argument to primFoo. The foreign export case is similar.

Lastly, there are several design choices concerning what the programmer has to write to implement a COM object. Does she write a collection of functions that take the object state as their first argument? Or does she write a single function that returns a record of all the methods of the object?

Status and conclusions

HDirect is now our fourth attempt at a foreign-language interface for Haskell. The first was ccall, a limited and low-level extension roughly equivalent to foreign import. The second was Green Card, which gradually turned into a domain-specific language. The third was a precursor to HDirect, Red Card, which was specifically aimed at interfacing Haskell to COM objects. HDirect embodies the lessons we have learned: strive for implementation independence; avoid inventing new languages; the customer is always right.

We do not claim great originality for these observations. What is new in this paper is a much more precise description of the mapping between Haskell and IDL than is usually given. This precision has exposed details of the mapping that would otherwise quite likely have been mis-implemented. Indeed, the specification of how pointers are translated exposed a bug in our current implementation of HDirect. It also allows us automatically to support nested structures and other relatively complicated types without great difficulty. These aspects often go unimplemented in other foreign-language interfaces.

We are well advanced on an implementation of HDirect. We can parse and type-check the whole of Microsoft IDL, and can generate stubs that allow Haskell to call C and COM. We have not yet implemented the reverse mapping, but we expect to do so in the next few months.

Acknowledgements

We thank Conal Elliott for playing the vital role of Friendly Customer; much of our motivation derives from his desired applications. We thank EPSRC and Microsoft for their support, both of equipment and manpower. Erik Meijer would like to thank the PacSoft group at the Oregon Graduate Institute for their hospitality during the final phases of writing this paper.

References

D. Beazley. SWIG and automated C/C++ scripting extensions. Dr. Dobb's Journal, February.

Sigbjorn Finne et al. A primitive foreign function interface for Haskell. In preparation; preliminary specification available from http://www.dcs.gla.ac.uk/~sof/primitive.ps.gz, March.

Simon L. Peyton Jones and Philip Wadler. Imperative functional programming. In POPL.

Simon Peyton Jones, Erik Meijer, and Daan Leijen. Scripting COM components from Haskell. In Proceedings of ICSR.

Simon Peyton Jones, Thomas Nordin, and Alastair Reid. Green Card: a foreign-language interface for Haskell. In Proc. Haskell Workshop.

Daan Leijen. Red Card: Interfacing Haskell with COM. Master's thesis, University of Amsterdam.

X/Open Company Ltd. X/Open Preliminary Specification. X/Open DCE: Remote Procedure Call.

Microsoft. SDK for Java. http://www.microsoft.com/java.

Microsoft Press. Developing for Microsoft Agent.

Dale Rogerson. Inside COM. Microsoft Press.

Jon Siegel. CORBA Fundamentals and Programming. John Wiley & Sons.

A. Vogel, B. Gray, and K. Duddy. Understanding any IDL, lesson one: DCE and CORBA. In Proceedings of SDNE.

A. Vogel and B. Grey. Translating DCE IDL in OMG IDL and vice versa. Technical Report, CRC for Distributed Systems Technology, Brisbane.