Scheme in The Real World: A Case Study

Michael Bridgen, Noel Welsh, Matthias Radestock LShift Hoxton Point 6 Rufus Street London, N1 6PE, UK {mikeb, noel, matthias}@lshift.net

ABSTRACT that we have in-house experience with; Java has plenty of ready- Scheme is the core technology in the back-end of the New Media made libraries available for standard tasks, like database connec- Knowledge web site developed by LShift. This case study exam- tivity and logging; Java application servers are well understood by ines how we integrated Scheme into a commercial web develop- support staff. Using Java gives us a wide choice of tools; on the ment environment, and how we used unique features of Scheme to other hand, most of these tools are more complicated than we re- simplify and speed up development. ally need. We are effectively trading some extra configuration and glue-code work for a widely familiar, pre-fabricated platform. 1. INTRODUCTION New Media Knowledge (NMK, [1]) is a business resource for indi- 2.1 Presentation viduals and companies working in interactive digital media. NMK The presentation layer uses Cocoon and XSLT. We chose XSLT has recently acquired a grant to redevelop its web site. Lateral (a as we have found it is usually close enough to HTML to be easily design company) and LShift combined to deliver the work, with modified by designers. Additionally, we’ve found XSLT signifi- Lateral creating the interface and LShift doing the back-end work. cantly faster and more enjoyable to develop with than other Java- This case study details the technology LShift used to implement the based web presentation layers (e.g. JSP [8], FreeMarker [9]). website. In particular, we examine how we used Scheme to manage complex user interaction via the World Wide Web. Cocoon is used to map URLs to actions. Cocoon uses a configu- ration document describing ’pipelines’ that pair request criteria (a 2. OVERVIEW pattern for the URL, for example) with the procedure to generate The main functions of the NMK web site are: the response through a sequence of actions and transformations. In the case of NMK, the most common steps are Scheme actions and XSL transformations. The Cocoon pipelines are kept very short • Provide a calendar of events that NMK run, and a means for at one or two stages maximum. This is a deliberate design deci- users to register for them; and sion; we have found programming with long Cocoon pipelines to be tedious and error prone: XSLT has poor abstraction facilities • Support a user community with contributed content. and long chains of XSLT transformations require maintaining too many complex contracts. In the NMK system, the input to XSLT is always XML representing a data object and the output is always a The website is implemented as a standard three-tier system (see view of that object, usually HTML. figure 1). The main technologies used in each tier are: There can be performance problems with XSLT but this is not ex- • presentation: Cocoon [2], Castor [3] and XSLT [4] pected to be an issue with the load the NMK website experiences. Cocoon provides a fairly extensive infrastructure for caching in- • business logic: SISC [19], JavaBeans [5], Object-Relational puts, stylesheets and results, and component pooling, which help Bridge (OJB) [6] in any case.

• database: SQL Server [7] The majority of the business logic is written in Scheme and runs in the SISC Java-based Scheme interpreter. This case study con- centrates on the business logic, which is where the most interesting The basis of the system is Java. This choice was a function of a action takes place. number of factors: There are a number of tools available for Java 2.2 From database to XML There are two tasks we have to take care of before we can write the business logic:

• Getting data out of the database; and

• Transforming output into XML so the XSLT transformations can work on it. the to load Each The figuration, for Ja relational only The us, OJB Serv ments the collection dard ate Castor mises with T Castor same grate to ticulars 3. limit Holding ransforming v map add go a Ja though Ja underlying Apache er choice tools Ja and a a also our v task Scheme v SCHEME Ja this on [12], aBeans JDBC fe aBeans. v fields generates requires Ja license v of aBeans. SCHEME WORKFLOW w e aBeans functionality sa v data v mapping tak is that the aBeans in classes). b this erything operations v of as ut Object-Relational tedious e es that dri f with Scheme types, a data that data. once do well SQL causes database. v a care OJB a v our v Most we Ja er are ailable). the bit to request userinput are the into Ja v as completed [11]. together All to Serv aBean populate of of b W receive userinput database pro IN v not of job . ut we store andretrieveJavaBeans compatible a complications of the b code, do e filling the XML usiness and vides other up-front thereafter the er manipulate for reflected CONTEXT need. W manually code source simpler is e Through Cocoon. configuration it is us. could, is from tables. code a arbitrary collections is k Bridge we Scheme. logic to JDO done necessary cont. store configuration with only marshal cont. retrieve is in code . deals the OJB-specific in Luckily is as in and It [10] the using the (confusingly visible, OJB insulated principle, interact database. requires we (NMK Figure from Before database solely e of use interf is v the will to entually (e.g., the , to dependent there discuss XML Ja of as 1: get already with and ace; see Castor discussing a with from v it use to Ov bit API Castor aBeans the , so Castor are later generates “OJB”) some use Schema a we ervie transform of manipulating ho this an the OJB small as ready-made frame had initial objects . y the w decided COCOON allo details w there to to database does compro- we the correct object- is a of API gener XML. ws w docu- stan- SQL used inte- con- ork. par the not the are for JavaBean us of to to - - components ( Ja In ( ( ;; is is ar ;; mechanism: SISC When is ( Ja as pre tion functions 3.1 this with b b define method-one bean bean e e guments, and define v v equi stored equi ( a a normal SISC, vious a a g n n of with eneric-java-pr classes submit form methods a . . in OJB has no ’( ’( fers v v we g s list The alent alent method-one field1 field1 e the e

w CASTOR a into calls. Ja t Ja t to a call generic generic F F benefits call of v not v comprehensi NMK i can i call retrieve rows aBeans a to to obj e e the SISC ) symbols, a field2 are l l store and methods So it value just d d the the to XML be bean ) on 1 1 bean. procedure website. functions incorporated ocedur ( ( instantiated Ja Ja we the the v ) is ) . v v . a with F . slightly a a l g . can object So u or this )) those are code code e v e e t e F a ) eign ’ e Ja | ; dispatched i ar xploit. list in methodOne for e v and symbols gument. l dif a and SISC’ d within of a F 2 ferent, Function successi oreign Ja symbols ( methods XSLT v In ) s a . object . method short, SISC’ This are on DB . though Function to HTML | v ; )) the e interpreted and on means objects s system. if runtime generic ... those Interface as an a v we Interf object alue, that returned classes will The as procedures type we ace that is accessor see inte can called called (FFI).

of USER AGENT v from later alue gra- all do some things in Scheme with Java methods that are much harder application variables, like database connections, and deployment- to accomplish in Java alone. For access control, for instance, we specific file locations. All that is needed are a few wrappers to make implement a class that has a static method for each of (predicate× it convenient in Scheme: class), where the predicates are like (define initial-ctx (make )) s t a t i boolean i s E d i t o r ( A r t i c l e o , User u ) { . . . } (define local-ctx (lookup initial-ctx (->jstring "java:comp/env"))) { } s t a t i c boolean i s E d i t o r ( Meeting o , User u ) . . . (define (lookup-context-var name) (->string (lookup local-ctx (->jstring name)))) If we declare isEditor as a generic procedure, then using it with different objects will ’just work’. In contrast, in the Java code that (define ∗SECRET-KEY∗ (lookup-context-var "secretKey")) prepares content for indexing there are overloaded methods for cre- ating index entries, but we have to do the dispatching ourselves. Java overloading uses the static type of the arguments, so to encode 3.2 Transactions and the Back Button this pattern in Java requires a chain of instanceof tests1. A core part of the NMK website is presenting data to the user and allowing them to edit it. The editing cycle is basically structured p u b l i c Document createDocument ( A r t i c l e o ) { as edit ↔ preview → commit. Within each cycle we want multiple . . . browsers and the back button to work as expected. For a full de- } scription of the problem and general solution, see [20], and more p u b l i c Document createDocument ( Meeting o ) { . . . recently [18]. There are a few issues here to get this all working } correctly. p u b l i c Document createDocument ( User o ) { . . . } 3.2.1 Scheme, interrupted Naturally we use so we can structure the program p u b l i c Document createDocument ( I n d e x a b l e o ) { code clearly despite the inverted control of HTTP. We hide the con- return tinuations behind a channel abstraction. The channel API boils ( o i n s t a n c e o f A r t i c l e ) ? down to: createDocument ( ( A r t i c l e ) o ) : ( o i n s t a n c e o f Meeting ) ? ;; channel × symbol × bean → symbol × input createDocument ( ( Meeting ) o ) : ;; ( o i n s t a n c e o f User ) ? createDocument ( ( User ) o ) : ;; Go get some data from the user n u l l ; (channel-call channel action data) } ;; Send some data to the user (never returns) The generic procedure system in SISC is similar in feeling to that (channel-send channel action data) in Dylan [13] and CLOS [14]. The function channel-call provides the main abstraction for com- SISC also supports defining classes that implement Java interfaces, municating with the user. It stuffs away the in the by using the Java reflection API to create dynamic proxies. This is user’s HTTP session, and hands the (probably incomplete) Jav- especially useful for creating event listeners, as these usually end aBean to Cocoon along with a stylesheet to use (the action). up being anonymous classes. We can use a proxy with a closure as Cocoon serialises the JavaBean and runs it through the stylesheet. a listener to achieve the same effect; for example, using the Swing Usually the stylesheet produces a form, populated with data from Timer class to schedule re-indexing: the XML-ised JavaBean; submitting the form picks up the contin- (define-java-proxy (timer-task fn) uation from the HTTP session and takes up where it left off. This () corresponds to the ’shortcut’ in figure 1. We number the continua- (define (action-performed proxy event) tions per user and use the number as part of the URL for subsequent (fn event))) requests, so the user can branch at or revisit any point. This is an effective way of cloning similar content, for instance. (start (java-new (->jint 3600000) Scheme continuations get stored in the user session of the J2EE (timer-task [15] application server. Persistence is accomplished by simply (lambda (proxy event) storing the serialised continuations in a database; many applica- (re-index))))) tion servers deal with session migration by serialising the sessions and restoring them on the migration target. This means SISC web 3.1.1 SISC and J2EE applications can be made resilient and load-balanced by simply configuring the appropriate session persistence/migration features To provide a persistent environment for the interpreter, we use the of the application server. We have run experiments with Tomcat SISC servlet. The SISC servlet, as a standard feature, runs a REPL [16] involving restarting of the application server in the middle of on a nominated port. Being able to run arbitrary code in the same a workflow, and it all works beautifully. context as the application is extremely handy for debugging and incrementally developing code in a running system. Using continuations in this way we are effectively righting the in- verted interaction that HTTP tries to force on us. The usual model The J2EE application server includes an application context that is to wait for the user to ask for something, figure out what they can be configured at deployment time. This is useful for storing were trying to accomplish and provide an appropriate response. In 1or a modified virtual machine - see [17] our righted model, when we need data from the user, we just ask for it - at least from the Scheme programmer’s point of view. The fol- (define (meeting-clone meeting) lowing pseudo-code shows the advantages of this system - here we (let ([meeting (clone meeting)]) display some data to the user and save it following their approval: (meeting ’(meeting-file-as-reference) (let-values (((action input) (channel-call channel ’show data))) (clone (meeting ’(meeting-file-as-reference)))) (save data)) meeting))

We do not need to do any parsing of URLs or form parameters to 3.3 Code structure figure out what state the computation is at. We can divide the interesting Scheme code into three types of func- tion: There are some implications of storing the continuation in the user session that have to be worked around: • action - these do the interaction with the user, and dispatch to other actions or return depending on the result; and • The continuation is only available while the user session lasts. We want continuations to expire at some point, and • population - these take the user input and fill in the fields of the typical session lifetime of around 30 minutes is reason- a JavaBean. able. We keep workflow steps short to avoid the situation in which the user’s session times out while they are filling in a • workflow - these fulfill the contract with Cocoon, usually by form. calling an action; • When a user session is serialised, the first continuation is ex- pensive as the closure captures all the lexical environments 3.3.1 Actions: There and Back Again up the call-chain minus the top one; subsequent continua- Actions perform interaction with the user. All actions have the tions to be serialised are relatively cheap, as they will share same type: references. SISC only serialises the names of top-level defi- nitions, so there is no chance of inadvertently serialising the bean × environment × channel → bean entire state of the application. This enables us to compose actions as simply as making a function • User sessions are isolated - we must avoid mutating call. A simple example is: application-level objects, so that users cannot cannot have different application states captured in their stored continua- (define (meeting-preview meeting env channel) tions. (with-input-from-channel (channel ’preview meeting => input) [ok meeting] 3.2.2 SISC and Cocoon [edit (meeting-edit meeting env channel)] The Scheme code is invoked by Cocoon, using a small class [cancel #f])) (SchemeAction) that implements one of the Cocoon plug-in inter- faces. SchemeAction uses a SISC interpreter to evaluate a specified function. In the Cocoon configuration, for example, we declare a We have defined a macro, with-input-from-channel, that encapsu- workflow for adding an article: lates the common pattern of using channel-call - get some data from the user, then dispatch to another action or return based on < > map:match p a t t e r n =” a r t i c l e / add / ∗ ” the button they pushed. Notice the call to meeting-edit, which hides an interaction with the user. We use this facility a lot to go on ’excursions’: reusable, atomic interactions such as uploading a file. We can make these interactions as complex as we need and control will return to where we left off. In effect we have function call semantics just like in a The pattern attribute maps a URL starting with article/add to normal program. workflow. Whatever appears where the wildcard is, is passed to Scheme as a ’continuation number’ (kont)so it can retrieve the 3.3.2 Population continuation from the user’s session. src parameter corresponds Filling a JavaBean with data from the database or a web form is to the Scheme procedure article-add-start. a very common operation. For the former we use OJB and a few wrappers in Scheme for querying, updating and deleting objects. 3.2.3 Avoid Mutation All database actions are done at the beginning or end of the work- Continuations make supporting multiple browsers and the back but- flow, to keep it transactional. ton easy, but we must be careful how we manage state. If we mutate values that could be captured by a continuation the changes will (define (article-add-start env channel input) persist when the user moves backward. However our infrastructure (let ([article (make )]) relies on getting and setting properties in JavaBeans, an inherently (article ’(group) (group-info-by-id (input-parameter input ’id))) stateful operation. Hence we always clone any JavaBeans we are (let ([article (article-add article env channel)]) using, and only update values in the clones. (if article (begin This also applies to collections of dependent objects. Unfortu- (create-object article) nately, Java cloning operations are shallow so we have to clone (channel-send channel ’confirm-add article)) collections ourselves: (channel-send channel ’cancel jnull))))) Here, the call to article-add populates the JavaBean, asking the (define (string->jtime s) user for more information in the process; once it has a result, it (if (string-null? s) either saves the JavaBean to the database and sends a confirmation (java-null ) message to the user, or sends a cancel message. (let ([res (make )] [c (get-instance )]) We can use some features of Scheme and the SISC FFI to help us (c ’(time) (parse time-format (->jstring s))) out with populating JavaBeans from user input. There are three (for-each (lambda (set-field get-field) main steps: (res (list set-field) (->jshort (->number < > • Extract data from the request parameters (jget c ( Calendar get-field)))))) ’(hour minute second) • Convert and validate the data if necessary ’(|HOUR OF DAY| |MINUTE| |SECOND|)) (set-utc res) • Populate the JavaBean with converted data res)))

A few simple utilities suffice: 3.3.3 Error handling and reporting ;; (string → any) × (list-of symbol) × input We distinguish between application errors and workflow errors. ;; → (list-of any) The former are run-time errors, and percolate through to the nor- ;; mal error mechanisms through SISC to Java to be reported by Co- ;; Given a list of field names, returns a list of the inputs with coon. The latter are errors in the usage of the application; e.g., a ;; those names, converted by the conversion function. user trying to edit something they are not permitted to edit. Work- (define (input-parameters/conversion conversion fields input) flow errors are wrapped in a JavaBean and sent through the normal (map Cocoon channel. We use Java resource bundles to localise the error (lambda (field) messages, and the resource file to use is stored in an application (conversion (input-parameter input field))) variable so it can be set at deployment time. fields)) ;; bean × (list-of symbol) × (list-of (union any #f))) → bean 3.3.4 Don’t Repeat Yourself ;; A side-effect of using Scheme alongside other languages is that it ;; Populate the given fields of the bean with the given converted is easy to end up with duplicated procedures. ;; values (define (fill-object obj fields values) Our strategy for avoiding duplication with Java was to write the (for-each (lambda (field value) common part in Java and refer to it in Scheme. For example, Co- (if value coon is configured to map URL patterns corresponding to work- (obj field value))) flows to Scheme actions, and other URLs to Java classes that query (map list fields) for JavaBeans directly. This replicates some query-building re- values) quirements across Scheme and Java; we have minimised this by obj) putting common query fragments into static methods of a utility class. We use a class QueryHelper to get query criteria in Java and With these two utilities all we need to supply is the list of fields for Scheme: each JavaBean, and the conversion functions to apply. C r i t e r i a c r i t = (define (meeting-edit-fill meeting username input) QueryHelper . makeDateAndTagCrit er ia ( (define text-fields t h i s . d a t e F i e l d , t h i s . t a g F i e l d , ’(title abstract recurrence venue-title venue-description t h i s . date , speakers sponsor-introduction |reportURL|)) t h i s . t a g ) ; (define markup-fields ’(body)) OjbQueryImpl query = (define time-fields ’(start-time finish-time)) new OjbQueryImpl ( e x t e n t C l a s s , c r i t ) ; . . . return query ; (let ([meeting (meeting-clone meeting)]) (meeting ’(modified) (make )) (define (object-by-date-and-tag (fill-meeting meeting ->jstring text-fields input) type (fill-meeting meeting string->markup markup-fields input) date-field (fill-meeting meeting (safe-conversion string->jtime) tag-field time-fields date input) tag) . . . (object-by-criteria (meeting ’(modified) (make )) type meeting)) (make-date-and-tag-criteria Converting between input strings and XML Schema/Castor, Java (->jstring date-field) and JDBC types occasionally proves troublesome. In particular, ex- (->jstring tag-field) pressing times (of the day) in each has its own peculiarities, leading (to-date (string->jdate date)) to some circuitous functions (->jstring tag)))) Another place that code can be duplicated is in the symmetry be- (let ([proc (deep-assq (list (user-role user) tween presenting actions and restricting actions; we do not want (article-status article) to give the user choices that they cannot follow through on. In the action) NMK system, the presentation is done with XSLT while the re- preview-actions)]) striction is computed in Scheme. Luckily, our channel abstraction (and proc lets us pass parameters to the XSLT, so we can calculate both in (proc article env channel))) Scheme. Further, we can do it using the same, inlined data struc- ture. 4. CONVENTIONS (define (article-preview-actions edit-func preview-func) Applying consistent conventions throughout the code base makes (define (article-change-status-helper status) it much easier to understand the code. Some of the conventions we . . . ) used include: (define (article-edit-helper) . . . ) A noun-verb naming scheme, and standard verbs (add, edit, re- (let∗ ([administrator-and-editor-pending-or-wip move). The code uses consistent names like meeting-edit and ‘((approve . ,(article-change-status-helper ’live)) article-preview, and the URLs to access these functions are simply (comment . ,(article-change-status-helper ’wip)) /meeting//edit and /article//preview re- (later . ,(article-change-status-helper ’pending)) spectively. In addition to the readability benefits this provides a (reject . ,(article-change-status-helper ’hidden)) weak form of namespaces, as we do not use the SISC module sys- (edit . ,(article-edit-helper)) tem. (cancel . ,(lambda args #f)))] [administrator-and-editor-actions A common pattern for initiating actions. Each workflow begins as a ‘((pending . ,administrator-and-editor-pending-or-wip) function named -start which initialises the JavaBean and (wip . ,administrator-and-editor-pending-or-wip) then calls the actual action (named using our noun-verb scheme). (live We leverage this convention to reduce the integration with Cocoon (edit . ,(article-edit-helper)) to supplying a list of action names, which we process with a macro. (approve . ,(article-change-status-helper ’live)) (save . ,(article-change-status-helper ’wip)) 5. CONCLUSION (remove . ,(article-change-status-helper ’hidden)) Our experience with Scheme has been very positive and the NMK (pull . ,(article-change-status-helper ’pending)) back-end is known as one of the cleanest and best designed projects (cancel . ,(lambda args #f))))] at LShift. We put our success down to two points: [user-pending-or-wip ‘((save . ,(article-change-status-helper ’wip)) Using existing Java frameworks where possible. We do not write (submit . ,(article-change-status-helper ’pending)) any code to deal with the drudge work of connecting to a database (edit . ,(article-edit-helper)) or generating HTML. Interfacing with Java is occasionally painful (cancel . ,(lambda args #f)))]) (using the date/time API, for example) but in general it is more than ‘((administrator . ,administrator-and-editor-actions) worth it. Additionally we can express the user interface as XSLT, (editor . ,administrator-and-editor-actions) which web designers are already familiar with, by integrating with (author (pending . ,user-pending-or-wip) Cocoon. (wip . ,user-pending-or-wip)))) The features of the Scheme language. Continuations allow us to Beyond all the quasi-quoting, it is simply a structure with the al- work around HTTP’s inverted control, and write code in a natural lowed actions per user type and the resulting function to call. With style. Higher-order functions and the SISC FFI allow us to write this data structure in hand, we can ask what things is the user al- very expressive, compact code. Writing functions to fit some stan- lowed to do: dard contracts means we can compose units of user interaction into complex workflows. (define (deep-assq keys alist) (and (pair? alist) ( ([key-car (car keys)] 6. REFERENCES let [1] New Media Knowledge Website, http://www.nmk.co.uk/. [key-cdr (cdr keys)]) (cond [(assq key-car alist) [2] Cocoon website, http://xml.apache.org/cocoon2/. => (lambda (val) (if (null? key-cdr) [3] Castor website, http://castor.exolab.org/. (cdr val) (deep-assq key-cdr (cdr val))))] [4] XSLT Recommendation, http://www.w3.org/TR/xslt. [ #f])))) else [5] The JavaBeans specification is available from http://java.sun.com/products/ejb/docs.html. (define (available-actions user) (let ([actions [6] OJB website, http://db.apache.org/ojb/. (deep-assq (list (user-role user) (article-status article)) preview-actions)]) [7] Microsoft SQL Server product information, (and actions (map car actions)))) http://www.microsoft.com/sql/. [8] The JSP specification is available from ... and dispatch to the appropriate function: http://java.sun.com/products/jsp/download.html#specs. [9] Freemarker website, http://freemarker.sourceforge.net/. [10] The JDO specification is available from http://www.jcp.org/aboutJava/communityprocess/first/jsr012/. [11] The JDBC specification is available from http://java.sun.com/products/jdbc/download.html. [12] Information about XML Schema can be found at http://www.w3.org/XML/Schema. [13] Detail at http://www.gwydiondylan.org/drm/drm 48.htm#HEADING48- 0. [14] Detail at http://www.lisp.org/table/references.htm#ansi. [15] Information on Java 2 Enterprise Edition at http://java.sun.com/j2ee/docs.html. [16] Tomcat website, http://jakarta.apache.org/tomcat/index.html. [17] C. Dutchyn. Multi-dispatch in the java virtual machine: Design and implementation, 2001. [18] P. Graunke, R. Findler, S. Krishnamurthi, and M. Felleisen. Modeling web interactions, 2003. [19] S. G. Miller. SISC Scheme Interpreter website, http://sisc.sourceforge.net. [20] C. Queinnec. The influence of browsers on evaluators or, continuations to program Web servers. ACM SIGPLAN Notices, 35(9):23–33, 2000.