Templates and friends

Course " Language Engineering"

University of Koblenz-Landau Department of Computer Science Ralf Lämmel Software Languages Team

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 1 What’s a template?

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 2 “Hello World” with StringTemplate in

String template with one parameter import org.stringtemplate.v4.*; ... ST hello = new ST("Hello, "); hello.add("name", "World"); System.out.println(hello.render());

[http://www.antlr.org/wiki/display/ST4/Introduction, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 3 “Hello World” with StringTemplate in #

using Antlr4.StringTemplate;

Template hello = new Template("Hello, "); hello.Add("name", "World"); Console.Out.WriteLine(hello.Render());

[http://www.antlr.org/wiki/display/ST4/Introduction, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 4 “Hello World” with StringTemplate in Python

import stringtemplate4 hello = stringtemplate4.ST("Hello, ") hello["name"] = "World" print str(hello.render())

[http://www.antlr.org/wiki/display/ST4/Introduction, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 5 Quick summary

Exercised concepts A template is a declarative, Literal text parametrized, and executable Place holder for text description of a text generator. Other concepts Conditionals, loops over template “context” Expression holes for computing output ... Find a proper definition that fits into the context of this lecture.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 6 Motivation: templates in

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 7 Web programming with PHP

HelloWorld WebApp HelloHello World!World!

';'; ??>

HTML with embedded PHP Resulting HTML executed on the web server rendered in the web browser

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 8 Web programming with Rails

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 9 Web programming with Rails

101companies Ruby on Rails Web App

<%= notice %>

Name: <%= @company.name %>

Departments:

<% @company.departments.each do |department| %>

<%= link_to department.name, department_path(department) %>

<% end %>
<%= link_to 'Edit', edit_company_path(@company) %> | <%= link_to 'Back', companies_path %>
[https://github.com/101companies/101repo/blob/master/contributions/rubyonrails/companies/app/views/companies/show.html.erb]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 10 Web programming with Struts/JSP

...

Use of code (not shown here) and tag libraries (see prefix s). ... https://github.com/101companies/101repo/blob/master/contributions/strutsAnnotation/src/main/webapp/WEB-INF/content/company-detail.jsp

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 11 “Model to Text” with JET

<%@ jet package="generator.website" class="Simple2HTML" imports ="java.util.Iterator org.eclipse.emf.common.util.EList datamodel.website.*;" %> <% Webpage website = (Webpage) argument; %> <%=website.getTitle()%> Similar to JSP. Works on top of EMF.

List of Articles Names

<%EList
list = website.getArticles(); for (Iterator
iterator = list.iterator(); iterator.hasNext();) { Article article = iterator.next();%> <%=article.getName()%>
Generative (MDE) approach such that <%}%> a designated class is generated from the template.

http://www.vogella.com/articles/EclipseJET/article.html

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 12 “Model to Text” with JET import generator.website.Simple2HTML; Import generated class. import java.io.BufferedWriter; import java.io.FileWriter; import java.io.IOException; import java.util.Iterator; import datamodel.website.MyWeb; import datamodel.website.Webpage; Import EMF-based model classes. public class SimpleWebTest { private static String TARGETDIR = "./output/";

public static void main(String[] args) { EMFModelLoad loader = new EMFModelLoad(); MyWeb myWeb = loader.load(); createPages(myWeb); }

private static void createPages(MyWeb myWeb) { ... } http://www.vogella.com/articles/EclipseJET/article.html

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 13 “Model to Text” with JET

... private static void createPages(MyWeb myWeb) { Simple2HTML htmlpage = new Simple2HTML(); FileWriter output; BufferedWriter writer; for (Iterator iterator = myWeb.getPages().iterator(); iterator .hasNext();) { Webpage page = iterator.next(); System.out.println("Creating " + page.getName()); try { output = new FileWriter(TARGETDIR + page.getName() + ".html"); writer = new BufferedWriter(output); writer.write(htmlpage.generate(page)); writer.close(); } catch (IOException e) { e.printStackTrace(); Actual template processing. } } } }

http://www.vogella.com/articles/EclipseJET/article.html

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 14 Basic concepts

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 15 @

A template processor (also known as a template engine, template parser or template language) is software or a software component that is designed to combine one or more templates with a to produce one or more result documents.

[http://en.wikipedia.org/wiki/Template_processor, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 16 Template engine @ Wikipedia

A (web) template engine is software that is designed to process web templates and content information to produce output web documents. It runs in the context of a template system.

[http://en.wikipedia.org/wiki/Template_engine_(web), 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 17 Actual template engines @ Wikipedia

• Genshi • Mako (template A (templating engine) • language) • Microsoft • ASP.NET H ASP.NET Razor • • TinyButStrong C • view engine • Toupl • CakePHP • Handlebars • Mustache (template system) (template system) • (template • CCAPS engine) J N • CheetahTemplate V • CodeCharge • JavaServer • Nevow • Text Template Studio Pages O Transformation • CTPP • Jinja (template • Open Power Toolkit engine) Template • VlibTemplate • JSP Weaver P • Dylan Server W Pages K • PHPRunner • Web template E • Kid (templating S system language) • ERuby • • WebMacro F L T • FreeMarker • Lithium (PHP • Template Attribute G framework) Language M • Template Toolkit

[http://en.wikipedia.org/wiki/Category:Template_engines, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 18 Template system @ Wikipedia

A is composed of: • A template engine: the primary processing element of the system; • Content resource: any of various kinds of input data streams, such as from a relational , XMLfiles, LDAP directory, and other kinds of local or networked data; • Template resource: web templates specified according to a template language;

[http://en.wikipedia.org/wiki/Web_template_system, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 19 Template @ Wikipedia

Wikipedia does not define a general concept of template. It does define the notion of a web template:

“A web template is a tool used to separate content from presentation in , and for mass-production of web documents. It is a basic component of a web template system.Web templates can be used to set up any type of website. In its simplest sense, a web template operates similarly to a form letter for use in setting up a website.”

[http://en.wikipedia.org/wiki/Web_template, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 20 Template and friends

Macros Compile-time meta-programming Run-time meta-programming Staged computation Multi-stage programming

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 21 @ Wikipedia

A macro (from the Greek μακρό for "big" or "far") in computer science is a rule or patternthat specifies how a certain input sequence (often a sequence of characters) should be mapped to a replacement input sequence (also often a sequence of characters) according to a defined procedure. The mapping process that instantiates (transforms) a macro use into a specific sequence is known as macro expansion. A facility for writing macros may be provided as part of a software application or as a part of a .

[...]

Macro systems that work at the level of abstract syntax trees are called syntactic macros and preserve the lexical structure of the original program.

[http://en.wikipedia.org/wiki/Macro_(computer_science), 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 22 These definitions are not very general (abstract) and not helpful to demarcate templates and related concepts. Become a Wikipedia contributor!

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 23 Relevant expressiveness

Arguments (context) of templates. Text (hopefully syntax in the target language) Conditionals over context of template. Loops over context of templates. Declarations, statements, expressions in host language. ...

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 24 Motivation (cont’d)

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 25 Generating Java with StringTemplate for Java

StringTemplate uses ANTLR. ANTLR uses StringTemplate.

[http://www.antlr.org/wiki/display/ST4/Introduction, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 26 Macros in Scheme

An extended “if” with keywords for “then” and “else”. ;;define my-if as a macro (define-syntax my-if (lambda (x) ;;establish that "then" and "else" are keywords (syntax-case x (then else) ( ;;pattern to match (my-if condition then yes-result else no-result) ;;transformer (syntax (if condition yes-result no-result)) ) )))

[http://www.ibm.com/developerworks/linux/library/l-metaprog2/index.html]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 27 A macro is pretty much like a template except that it is integrated into /compile time of language processing and emits code (ASTs) for the same language.

Find a proper definition that fits into the context of this lecture.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 28 often have complex enabling Boolean equations over these 2 aspects, which encode configuration dependencies, and Many programming language environments provide complex nesting to allow code sharing between “” facilities, allowing considerable automated configurations. “editing” of a source file “before” it is seen by the language The term “preprocessor” comes from the fact that these processor. Preprocessors with conditionals are often used facilities have traditionally been grafted onto languages as as compile-time software configuration management tools, an afterthought, and consequently implemented as a with preprocessor “variables” naming configuration “preprocessing” pass to perform the “editing”, and aspects. These aspects can be Boolean (enabling or occurring before the compilation step (most Unix systems disabling a feature) or parametricConfiguration (e.g. a integer specifying managementand many “C” compilers). For efficiency reasons, a buffersize), usually with a parameter value that by preprocessing is often now integrated into the compile step, convention disables the feature (e.g., buffersize == but the name remains. 0). with the The preprocessor provides, via special “preprocessor #ifndef Unix directives” easily found in the program text, several types #define Unix 1 of facilities that change the apparent source file: #endif • importing fixed blocks of text (C #include … char filename[132]; files) #if SDOS • inserting parameterized blocks of text (C macro short int result[256]; invocations, COBOL copybooks) int status; This code uses preprocessor #endif • selecting or deselecting existing blocks of text … (including other preprocessor directives) based on #if (SDOS|Unix)&!DEMOS somevariables configuration to dealvariable with or expression (e.g., C append_to(&filename,&extension); #if, #elsif, #else, #endif directives) #endif variability. Depending on the #if SDOS • defining blocks of text for later inclusion (C macro syscall(OS_openfile,&filename, #definevariables) set at compile OS_readonly,&status,&result,); • marking certain identifiers used as configuration if (status!=OS_no_error) (preprocessing)variables as defined time, (C different with empty syscall(OS_exit_with_error, #define &status,OS_dummy, source-codemacro) or undefined parts are selected. &status,&result); • testing the definedness status of such #elsif Unix configuration variables (C #ifdef, #ifndef, if (errno=fopen(&filename,readmode)) ) { exit(errno); } defined, undefined #else // DEMOS • combining two or more language lexemes to … create a new lexeme. This is almost always used #endif to construct a “gensym” identifier (C macro body [Ira D. Baxter, Michael Mehlich: “Preprocessor Conditional Removaloperation by Simple ## Partial). Evaluation”] Figure 1: with preprocessor directives © 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sleC, C++ and COBOL85 all provide a preprocessor29 Figure 1 shows typical configured source code. The facility defined by the language standard, based on the preprocessor variables Unix, DEMOS and SDOS are aspects language lexemes. Users of languages without native controlling which calls are made. By preprocessing facilities often fall back on the C configuration, we mean all source code defining or preprocessor (if the lexeme structures are close enough, e.g. dependent on an aspect; for SDOS, this is the conditional FORTRAN), custom preprocessors (e.g., Generic declaration of a result buffer, the filename extension- PreProcessor [6]) or on ad hoc solutions modeled after appending code, and the “then” branch of the file opening those for C, often based on a string processing tool like logic. Note that configurations often overlap because of . shared logic, such as appending the extension. Ada83 (and Ada95) were specifically designed without The configuration space managed can be enormous. N a preprocessor on the grounds that such editing can be done N Boolean configuration variables can provide for up to 2 by simply choosing, outside the scope of the language, possibly different instantiated configuration; the which source components to use. However, this artifice operating system has roughly 1000 configuration variables forces large grain source components that replicate [7]. The preprocessor conditionals found in large systems considerable content, creating the very maintenance

!"#$%%&'()*+#,+-.%+/').-.+0#"1'()+2#(,%"%($%+3(+4%5%"*%+/()'(%%"'()+6024/789:+

8;<=>?;9@8@;AB8C+D9

Find a proper definition that fits into the context of this lecture.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 30 XML transformation XML literals as means of embedding XML in VB9.

Expression holes VB9

Transformation language and data language are amalgamated on the XSLT grounds of both using XML.

[http://blogs.msdn.com/b/vbteam/archive/2008/02/21/vb-xml-cookbook-recipe-1--transformations-using-xml-literals-replacing-xsl-for-each-doug-rothaus.aspx]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 31 We begin to touch upon the area of language embedding (to be studied more later). In the VB9 case, VB and XML are “syntactically amalgamated”.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 32 Questions

How aware of the target language should templates be? What formal guarantees to provide? (Think of type safety.) How to serve the special case of host = target language? (“macros”) How does this discuss generalize to compile-time ? ... to run-time metaprogramming?

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 33 Language-aware templates

StringTemplate, XPand, MOF M2T, and others are not language-aware. They basically produce any text.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 34 \v), which stand for arbitrary legal C source satisfying the have defined only a “preprocessor” language, treating the syntactic category lvalue. The left hand quoted string is regions between preprocessor directives as uninterpreted the rule LHS, and the right hand quoted string is the RHS. text, but it is more economical to amortize our The occurrence of the same syntax variable multiple times reengineering investments by operating on a full C in the LHS requires that the same exact sequence occur in grammar. Such a full grammar enables other reengineering both places; this is how the rule is constrained to match activities to be implemented, such as source reformatting, identical source and target. The occurrence of the syntax clone detection, refactoring, etc. Figure 12 shows a variable in the RHS requires that the changed program subgrammar for preprocessor expressions. include what was matched for that variable on the . LHS We wish to parse “C” source before preprocessor Finally, the if condition in this rule is a language-dependent expansion occurs. However, C preprocessor directives can decision procedure that makes sure that the program occur between any pair of tokens. One can code a grammar fragment matched by \v contains no side effects, which for that, but it is extremely unwieldy and therefore would make this transformation incorrect. impractical. Before rule use, a typical rewriting engine first parses Instead, we allow preprocessor directives in the the rule according to its rule language, and then parses the grammar only at places where they commonly occur in real quoted pattern in the language to be transformed (here programs. Figures 10 and 11 show subgrammars, which specified by the default domain phrase as the “C” allow directives only at file scope and in lists of statements, language), to construct pattern trees. At transformation respectively. Allowing directives in other places in the time, the rewrite engine matches the LHS pattern tree grammar is straightforward: replicate the preprocessor against portions of the program, and replaces matched trees conditional subgrammar and instantiate for that place type. by the corresponding RHS tree if the condition is satisfied. In the actual grammar we allowed them in locals lists, and The Figure 2 rule has the effect shown in Figure 3. inside expressions and several other places. before: (*Z)[a>>2]=(*Z)[a>>2]+1; 7 Parsing Unpreproccessed C after: (*Z)[a>>2]++; There are a number of other complications that occur trying to parse C this way. We have a special preprocessor Figure 3 that delays expanding directives if it can and collects information from include files without expanding them in Typically a transformation system will have a large place. “Different” values of a #defined variable may number of rules, and a large number of possible places in a reach the end of a preprocessor conditional. Macro calls program to apply them. It is beyond the scope of this paper often ambiguously look like function calls and have to describe how the transformation system chooses whichNon-syntactic possibly use to be of resolved C preprocessing later. Discussion of how all this is rules and where to apply them. The simple notion that all handled is beyond the scope of this paper. rules possessed are applied leaf-upwards to the entire parse tree for a file is adequate for this paper, and supported directly as one mode of operation of DMS. if (. . .) { . . . DMS captures comments and number literal formats (radix, leading zeros, etc) while lexing and attaches them to #if (. . .) the trees while parsing, and is later able to reproduce the } else { source program in its original- or transformed-form from the trees with all the essential formatting information. #endif Further discussion of the DMS prettyprint capability is . . . outside the scope of this paper. Comments attached to undisturbed trees, and to the roots of rewritten trees are }; preserved, and so comments are not generally lost or Figure 4: Badly nested preprocessor directives and changed by rewriting. (One can always build special rules [Ira D. Baxter, Michael Mehlich: “Preprocessor Conditional Removallanguage by Simple Partial conditionalsEvaluation”] that change this behavior). © 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 35 With sundry additional help in place, DMS parses 6 A Simplified C grammar typically 85% of unpreprocessed C source files in large, In this section we briefly sketch a simplified version of real source systems directly, producing ASTs containing the “C” grammar we use, because the rewrite rules are preprocessor conditionals. The unparsable balance tends to defined in terms of grammar tokens. Arguably, we could use preprocessor directives in unstructured ways. Figure 4

!"#$%%&'()*+#,+-.%+/').-.+0#"1'()+2#(,%"%($%+3(+4%5%"*%+/()'(%%"'()+6024/789:+

8;<=>?;9@8@;AB8C+D9

A Formal Way from Text to Code Templates 111 Collection of templates Parameterized template definitions

tcoll = tmpl ∗ tmpl = define tn targ ∗ tstmt ∗ enddef ! " ! " targ = ttype tvn Template arguments are typed. tstmt = text texpr expand tn texpr ∗ | ! " | ! " if texpr tstmt ∗ else tstmt ∗ endif Template statements in this order: | ! " ! " ! " for tvn in texpr tstmt ∗ endfor actual text, expressions for walking | ! " ! " tn template names context, template expansion, text text phrases conditional and loop controlled by texpr expressions Arbitrary text can be context. ttype variable types generated tvn variable names at this point . Fig. 2. Syntactical domains of TL ⊥ [Guido Wachsmuth: “A Formal Way from Text to Code Templates”] the various syntactical© 2012 domainsRalf Lämmel, Software of aLanguage language. Engineering http://softlang.wikidot.com/course:sle A dynamic semantics is described 36 in an operational style by judgements for the execution of language instances. We follow some notational conventions in this paper: For semantical domains, we use standard domain constructors, namely products (“ ”), list types (“∗”), × and function domains (“ fin ”). We restrict ourselves to pointwise definition of functions, i.e. these functions→ are only defined for a finite subset of the domain X in f : X fin Y .Formostx X we have that f(x)= .Theentirelyundefined function→ is denoted by “ ”.∈ In inference rules, we reuse⊥ domain names as stems of meta-variables. Tuples⊥ and sequences are constructed via ,..., .Inorderto concatenate several lists to anewone,weusethenotation % ·: ...:· & .Updateof afunctionf for a point x to return y is denoted by f[x y].% Function· · & application is highlighted using the notation f @ x. '→

3ACoreTextTemplateLanguage

In this section, we provide formal semantics of the core concepts in template languages for text generation. For this purpose, we introduce TL ,afunctional language similar to StringTemplate, XPand, and even more to MOF⊥ M2T.

Syntax. Figure 2 shows a grammar of the template language TL .Webriefly ⊥ read its rules: A template collection tcoll consists of a list tmpl∗ of templates. A template declares a list targ∗ of arguments and a list tstmt∗ of template state- ments. Template statements include simple text, expression evaluation, template expansion, conditional statement, and iteration. Template languages typically comprise an expression language for navigating the source model. This sublanguage highly depends on the technological space of the source model. In the grammarware space, we need to navigate syntax trees. In the modelware space, we need to navigate object-oriented models. For this reason, we do not specify an expression sublanguage in detail. Instead, we specify several requirements for its syntax and semantics. Syntactically, we assume domains for expressions (texpr), types (ttype), and variables (tvn). Static semantics of templates 112 G. Wachsmuth

Domains τ = ttype (Expression types) T = tvn fin τ (Variable type table) → Θ = tn fin τ ∗ (Template type table) → Principal judgements tcoll (Well-typedness of template collections) " Θ tmpl : Θ (Extraction of type context) " Θ tmpl (Well-typedness of templates) " Θ,T tstmt (Well-typedness of template statements) " T texpr : τ (Well-typedness of template expressions) " L τ : τ (Decomposition of list types) " B τ (Booleaness of types) "

Fig.[Guido Wachsmuth: 3. Model “A Formal of TL Way fromstatic Text to Code semantics Templates”] ⊥ © 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 37

Static semantics. The domains involved in the static semantics and the prin- cipal judgements are shown in Fig. 3. For the expression sublanguage, we require adomainoftypes(τ), a domain for variable environments (T )tostorevariable types, and a judgement of well-typedness of expressions under a given environ- ment. Furthermore, we require types for list-like data structures. We assume the judgment τ : τ ! to hold for a list type τ iff the list elements are of type τ !. !L The judgement B τ is expected to hold iff τ is boolean-like, that is, elements of this type can be! mapped to boolean values. As for the core concepts of TL ,wedefineadomainfortypetables(Θ) to store the argument types of templat⊥ es. Judgements concern the extraction of a type table for a given template collection, the overall well-typedness of a template collection, the well-typedness of a template definition for a given type table, and the well-typedness of a statement for a given type table and variable environment. Figure 4 lists the inference rules for the judgements. Since most of them are straightforward, we do not discuss all these rules in detail. Let us focus on the well-typedness judgement for statements. Θ,T tstmt is meant to hold if the statement tstmt is well-typed in the context of Θ !and T .TheparameterΘ covers the template types whereas T corresponds to an environment mapping variables for template arguments or iterations to types. The axiom [text]encodesthewell-typednessofanytext.Therule[exp]defines well-typedness for expression evaluation: An expression evaluation is well-typed if its expression has a type in the context of the current variable environment. The premises of rule [expand]saythatatemplateexpansioniswell-typed if (1) a table look-up for tn delivers the types of the template arguments and (2) the actual parameter types are subsumed by the formal ones. The rule [for]defineswell-typednessofiterationasfollows:(1)Thetypeτ of the iteration expression is determined under the current variable environment T .Thistypeneedstobealisttypewithacorrespondingtypeτ ! for the list A Formal Way from Text to Code Templates 115

Well-typedness of template collections tcoll !

tmpl : Θ1 Θn 1 tmpl : Θn ⊥! 1 ∧ ··· ∧ − ! n Θn tmpl Θn tmpl ∧ ! 1 ∧ ··· ∧ ! n [collection] tmpl ,...,tmpl !$ 1 n %

Extraction of type context Θ tmpl : Θ !

targ = ttype tvn1 targ = ttype tvnn 1 1 ∧ ··· ∧ n n Θ @ tn = Θ" = Θ[tn ttype ,...,ttype ] ∧ ⊥∧ &→$ 1 n % [extract] Θ define tn targ ,...,targ ... enddef : Θ" !( $ 1 %) ( )

Well-typedness of templates Θ tmpl !

targ = ttype tvn1 targ = ttype tvnn 1 1 ∧ ··· ∧ n n T = [tvn1 ttype ,...,tvnn ttype ] ∧ ⊥ &→ 1 &→ n Θ,T tstmt 1 Θ,T tstmt m ∧ ! ∧ ··· ∧ ! [template] Θ define tn targ ,...,targ !( $ 1 n %) tstmt 1,...,tstmt m $ % enddef ( )

[Guido Wachsmuth: “A Formal Way from Text to Code Templates”] Well-typedness of statements Θ,T tstmt © 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle ! 38

Θ,T text [text] ! T texpr : τ ! [expr] Θ,T texpr !( )

(1) Θ @ tn = τ1,...,τn $ % (2) T texpr : τ1 T texpr : τn ∧ ! 1 ∧ ··· ∧ ! n [expand] Θ,T expand tn texpr ,...,texpr !( $ 1 n %)

(1) T texpr : τ L τ : τ " ! ∧! (2) T " = T [tvn τ "] ∧ &→ (3) Θ,T" tstmt 1 Θ,T" tstmt n ∧ ! ∧ ··· ∧ ! [for] Θ,T for tvn in texpr tstmt 1,...,tstmt n endfor !( )$ %( )

(1) T texpr : τ B τ ! ∧! (2) Θ,T tstmt 1 Θ,T tstmt m ∧ ! ∧ ··· ∧ ! [if] Θ,T if texpr !( ) tstmt 1,...,tstmt n $ % else ( ) tstmt n+1,...,tstmt m $ % endif ( )

Fig. 6. Rules of TL dynamic semantics ⊥ A Formal Way from Text to Code Templates 115

Well-typedness of template collections tcoll !

tmpl : Θ1 Θn 1 tmpl : Θn ⊥! 1 ∧ ··· ∧ − ! n Θn tmpl Θn tmpl ∧ ! 1 ∧ ··· ∧ ! n [collection] tmpl ,...,tmpl !$ 1 n %

Extraction of type context Θ tmpl : Θ !

targ = ttype tvn1 targ = ttype tvnn 1 1 ∧ ··· ∧ n n Θ @ tn = Θ" = Θ[tn ttype ,...,ttype ] ∧ ⊥∧ &→$ 1 n % [extract] Θ define tn targ ,...,targ ... enddef : Θ" !( $ 1 %) ( )

Well-typedness of templates Θ tmpl !

targ = ttype tvn1 targ = ttype tvnn 1 1 ∧ ··· ∧ n n T = [tvn1 ttype ,...,tvnn ttype ] ∧ ⊥ &→ 1 &→ n Θ,T tstmt 1 Θ,T tstmt m ∧ ! ∧ ··· ∧ ! [template] Θ define tn targ ,...,targ !( $ 1 n %) tstmt 1,...,tstmt m $ % enddef ( )

Well-typedness of statements Θ,T tstmt ! Basically, check that all templates are Θ,T text [text] ! correctly invoked and that conditionals T texpr : τ ! and loops use a boolean condition and [expr] Θ,T texpr !( ) loops loop over a list.

(1) Θ @ tn = τ1,...,τn $ % (2) T texpr : τ1 T texpr : τn ∧ ! 1 ∧ ··· ∧ ! n [expand] Θ,T expand tn texpr ,...,texpr !( $ 1 n %)

(1) T texpr : τ L τ : τ " ! ∧! (2) T " = T [tvn τ "] ∧ &→ (3) Θ,T" tstmt 1 Θ,T" tstmt n ∧ ! ∧ ··· ∧ ! [for] Θ,T for tvn in texpr tstmt 1,...,tstmt n endfor !( )$ %( )

(1) T texpr : τ B τ ! ∧! (2) Θ,T tstmt 1 Θ,T tstmt m ∧ ! ∧ ··· ∧ ! [if] Θ,T if texpr !( ) tstmt 1,...,tstmt n $ % else ( ) tstmt n+1,...,tstmt m $ % endif ( ) [Guido Wachsmuth: “A Formal Way from Text to Code Templates”] © 2012 RalfFig. Lämmel, 6. SoftwareRules Language of TL Engineeringdynamic http://softlang.wikidot.com/course:sle semantics 39 ⊥ Dynamic semantics of templates 114 G. Wachsmuth

Domains ν (Expression values) β = true false (Boolean values) | ϕ = text (Text phrases) ψ = ϕ∗ T = tvn fin ν (Variable environment) → Θ = tn fin ( tvn ∗ tstmt ∗ ) (Template code table) → × Principal judgements tcoll Θ (Extraction of code table) # ⇒ Θ tmpl Θ Generate text # ⇒ T,Θ tstmt ψ (Execution of template statements) # ⇒ T texpr ν (Evaluation of template expressions) # ⇒ ν ϕ (Text conversion of values) # ⇒ L ν ν∗ (Decomposition of lists) # ⇒ B ν β (Boolean conversion of values) # ⇒

[GuidoFig. Wachsmuth: 5. Model “A Formal of TL Way fromdynamic Text to Code semantics Templates”] ⊥ © 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 40 elements. (2) A new variable environment T ! is retrieved by updating the type of the iteration variable tvn to τ !.(3)Allstatementsintheiterationneedtobe well-typed under the new environment. The rule [if]forconditionalstatementsreadssimilarly:(1)Thetypeτ of the condition expression is determined. Thistypeneedstobeboolean-like.(2)All statements in a conditional statement need to be well-typed in the current context.

Dynamic semantics. While the static semantics is concerned with well-typed- ness judgements, the dynamic semantics provides judgements about text gener- ation. Figure 5 shows the model of the dynamic semantics. We use a style that emphasises the similarities of the models of static and dynamic semantics. While Θ covers the template types in the static case, it models the template code in the dynamic case. Similar variations apply to the principal judgements. That is, code table extraction corresponds to type table extraction, text generation out of template statements to well-typedness of template statements, and text gen- eration out of template expressions to type assignment for template expressions. For the expression sublanguage, we require a domain of values (ν), a domain for variable environments (T )tostorevariablevalues,andajudgementforeval- uating expressions under a given environment. The structure of this judgement implies side-condition free evaluation. This ensures model view separation [13]. Furthermore, we assume the judgment ν ν ,...,ν to hold for a list-like !L ⇒# 1 n $ data structure ν iff ν1,...,νn are the elements of this structure. The judgement B ν β is expected to hold iff ν can be converted into the boolean value β. Finally,! ⇒ we require a judgement ν ϕ for the conversion of values to text. Figure 6 lists the inference rules! for⇒ the judgements. Again, we do not dis- cuss all these rules in detail. Let us focus on the text generation judgement for Code table extraction tcoll Θ ￿ ⇒

tmpl Θ1 Θn 1 tmpl Θn ⊥￿ 1 ⇒ ∧ ··· ∧ − ￿ n ⇒ [collection] tmpl ,...,tmpl Θ ￿￿ 1 n ￿⇒

Code table extraction Θ tmpl Θ ￿ ⇒

targ = ttype tvn1 targ = ttype tvnn 1 1 ∧ ··· ∧ n n Θ￿ = Θ[tn tvn1,...,tvnn , tstmt 1,...,tstmt m ] ∧ ￿→ ￿ ￿ ￿ ￿ ￿￿ [extract] Θ define tn targ ,...,targ Θ￿ ￿￿ ￿ 1 n ￿￿ ⇒ tstmt 1,...,tstmt m ￿ ￿ enddef ￿ ￿

Statement evaluation T,Θ tstmt ψ ￿ ⇒

T,Θ text text [text] ￿ ⇒ T texpr ν ν ϕ ￿ ⇒ ∧￿⇒ [expr] T,Θ texpr ϕ ￿￿ ￿⇒

(1) Θ @ tn = tvn1,...,tvnn , tstmt 1,...,tstmt m ￿￿ [Guido Wachsmuth:￿ ￿ “A Formal Way from Text to Code￿￿ Templates”] (2) T texpr ν1 T texpr νn ∧ ￿ 1 ⇒ ∧ ··· ∧ ￿ n ⇒ (3) T ￿ = [tvn1© 2012ν 1Ralf,..., Lämmel,tvn Softwaren Languageνn] Engineering http://softlang.wikidot.com/course:sle 41 ∧ ⊥ ￿→ ￿→ (4) T ￿, Θ tstmt 1 ψ1 T ￿, Θ tstmt m ψm ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ (5) ψ = ψ1 : ...: ψm ∧ ￿ ￿ [expand] T,Θ expand tn texpr ,...,texpr ψ ￿￿ ￿ 1 n ￿￿⇒

(1) T texpr ν L ν ν1,...,νm ￿ ⇒ ∧￿ ⇒￿ ￿ (2) T1 = T [tvn ν1] Tm = T [tvn νm] ∧ ￿→ ∧ ··· ∧ ￿→ (3) T1, Θ tstmt 1 ψ1,1 T1, Θ tstmt n ψ1,n ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ ∧ ··· Tm, Θ tstmt 1 ψm,1 Tm, Θ tstmt n ψm,n ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ (4) ψ = ψ1,1 : ...: ψ1,n : ...: ψm,1 : ...: ψm,n ∧ ￿ ￿ [for] T,Θ for tvn in texpr tstmt 1,...,tstmt n endfor ψ ￿￿ ￿￿ ￿￿ ￿⇒

(1) T texpr ν B ν true ￿ ⇒ ∧￿ ⇒ (2) T,Θ tstmt 1 ψ1 T,Θ tstmt n ψn ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ (3) ψ = ψ1 : ...: ψn ∧ ￿ ￿ [then] T,Θ if texpr tstmt 1,...,tstmt n else ... endif ψ ￿￿ ￿￿ ￿￿ ￿ ￿ ￿⇒

(1) T texpr ν B ν false ￿ ⇒ ∧￿ ⇒ (2) T,Θ tstmt 1 ψ1 T,Θ tstmt n ψn ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ (3) ψ = ψ1 : ...: ψn ∧ ￿ ￿ [else] T,Θ if texpr ... else tstmt 1,...,tstmt n endif ψ ￿￿ ￿ ￿ ￿￿ ￿￿ ￿⇒

Fig. 6. Rules of TL dynamic semantics. ⊥ Code table extraction tcoll Θ ￿ ⇒

tmpl Θ1 Θn 1 tmpl Θn ⊥￿ 1 ⇒ ∧ ··· ∧ − ￿ n ⇒ [collection] tmpl ,...,tmpl Θ ￿￿ 1 n ￿⇒

Code table extraction Θ tmpl Θ ￿ ⇒

targ = ttype tvn1 targ = ttype tvnn 1 1 ∧ ··· ∧ n n Θ￿ = Θ[tn tvn1,...,tvnn , tstmt 1,...,tstmt m ] ∧ ￿→ ￿ ￿ ￿ ￿ ￿￿ [extract] Θ define tn targ ,...,targ Θ￿ ￿￿ ￿ 1 n ￿￿ ⇒ tstmt 1,...,tstmt m ￿ ￿ enddef ￿ ￿

Statement evaluation T,Θ tstmt ψ ￿ ⇒

T,Θ text text [text] ￿ ⇒ T texpr ν ν ϕ ￿ ⇒ ∧￿⇒ [expr] T,Θ texpr ϕ ￿￿ ￿⇒

(1) Θ @ tn = tvn1,...,tvnn , tstmt 1,...,tstmt m ￿￿ ￿ ￿ ￿￿ (2) T texpr ν1 T texpr νn ∧ ￿ 1 ⇒ ∧ ··· ∧ ￿ n ⇒ (3) T ￿ = [tvn1 ν1,...,tvnn νn] ∧ ⊥ ￿→ ￿→ (4) T ￿, Θ tstmt 1 ψ1 T ￿, Θ tstmt m ψm ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ (5) ψ = ψ1 : ...: ψm ∧ ￿ ￿ [expand] T,Θ expand tn texpr ,...,texpr ψ ￿￿ ￿ 1 n ￿￿⇒

(1) T texpr ν L ν ν1,...,νm ￿ ⇒ ∧￿ ⇒￿ ￿ (2) T1 = T [tvn ν1] Tm = T [tvn νm] ∧ ￿→ ∧ ··· ∧ ￿→ (3) T1, Θ tstmt 1 ψ1,1 T1, Θ tstmt n ψ1,n ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ ∧ ··· Tm, Θ tstmt 1 ψm,1 Tm, Θ tstmt n ψm,n ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ (4) ψ = ψ1,1 : ...: ψ1,n : ...: ψm,1 : ...: ψm,n ∧ ￿ ￿ [for] T,Θ for tvn in texpr tstmt 1,...,tstmt n endfor ψ ￿￿ ￿￿ ￿￿ ￿⇒

(1) T texpr ν B ν true ￿ ⇒ ∧￿ ⇒ (2) T,Θ tstmt 1 ψ1 T,Θ tstmt n ψn ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ (3) ψ = ψ1 : ...: ψn ∧ ￿ ￿ [then] T,Θ if texpr tstmt 1,...,tstmt n else ... endif ψ ￿￿ ￿￿ ￿￿ ￿ ￿ ￿⇒

(1) T texpr ν B ν false ￿ ⇒ ∧￿ ⇒ (2) T,Θ tstmt 1 ψ1 T,Θ tstmt n ψn ∧ ￿ ⇒ ∧ ··· ∧ ￿ ⇒ (3) ψ = ψ1 : ...: ψn ∧ ￿ ￿ [else] T,Θ if texpr ... else tstmt 1,...,tstmt n endif ψ ￿￿ ￿ ￿ ￿￿ ￿￿ ￿⇒

[GuidoFig. Wachsmuth: 6. Rules “A Formal of TL Waydynamic from Text to semantics.Code Templates”] ⊥ © 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 42 Language-aware template languages

Make sure that only syntactically correct code is produced.

Thus, templates have result types: nonterminals of target language’s grammar.

Grammar transformations come to the rescue: amalgamate template and target syntax by enhancing target language’s grammar to include templates.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 43 Informal enhancement description

For each nonterminal of the target language’s grammar: Enable template expansion with such a result type. Enable conditionals with such a result type. Enable template definitions with such a result type. Enable nonterminal to generate text as statement. Enable the nonterminal as “type”.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 44 Informal enhancement description (cont’d)

Some special cases: Handle morphemes. Degenerate case of normal nonterminals. Handle */+ symbols. Extract (“fold”) new nonterminal for symbol. Enable loops with such a result type.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 45 Grammar adaptation for adding template support to the grammar of a language

introduce texpr ttype tvn tn introduce targ = ttype tvn Introduce template grammar symbols foreach nt : N include nt = expand tn texpr ∗ ￿ ￿ include nt = if texpr ∗ nt else nt endif ￿ ￿ ￿ ￿ ￿ ￿ include tmpl = define nt tn targ ∗ nt enddef ￿ ￿ ￿ ￿ ￿ ￿ include tstmt = nt include dn = nt ￿ ￿ endfor Instrument nonterminals foreach nt : M fold ntM = nt include ntM = texpr ￿ ￿ Special cases for morphemes. include tstmt = ntM include dn = nt ￿ ￿ endfor foreach nt : L fold ntL = nt include ntL = for tvn in texpr nt endfor ￿ ￿ ￿ ￿ Another rule include tstmt = ntL endfor from the generic introduce tcoll = tmpl ∗ Instrument nonterminals

grammar [Guido Wachsmuth: “A Formal Way from Text to Code Templates”] Fig. 7. Generic adaptation script for the syntactical enhancement of a target language. © 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 46

each nonterminal except morphem classes. We include rules for template expan- sion and for a conditional statement. Additionally, we include a rule for tem- plates. Third, we enhance the definition of each morphem class. We fold each morphem class to a fresh nonterminal and include a rule for expression evalu- ation. Fourth, we enhance the definition of nonterminals occurring in a Kleene closure. We fold each of these domains to a fresh nonterminal and include a rule for iteration. Finally, we introduce template collections. The result of the adaptation is a grammar for a template language particu- larly concerned with the target language. In this template language, templates are associated with a particular syntactical domain of the target language. Fig- ure 8 gives an example. The upper part of the figure shows the grammar of a simple programming language PL: A program prog consists of a list of state- ments pstmt ∗. A statement is either a variable declaration, an assignment, a while loop, or a conditional statement. An expression pexpr is either an integer number, a character string, a variable, a sum, a difference, or a string concate- nation. The lower part of Fig. 8 shows the resulting grammar for the template language TLPL.

Static semantics. During the syntactical enhancement, two helper domains tstmt and dn are constructed. This allows us to use a generic model of the static semantics. The model differs only slightly from the model for TL . Figure 9 high- lights the modifications. In Θ, we keep the syntactical domain of⊥ the template in An operator suite for grammar adaptation 110 G. Wachsmuth

adaptation = step∗ step = introduce rrule include rule fold rule | | foreach vn : yielder do step∗ endfor | rrule = rule nt ∗ | rule = nt = elem ∗ elem = nt vn kw vn elem ∗ | | | ! " | yielder = N M L | | nt nonterminals vn variable names kw keywords

[Guido Wachsmuth: “A Formal Way from Text to Code Templates”]

© 2012 Ralf Lämmel,Fig. Software 1. LanguageAn Engineering operator http://softlang.wikidot.com/course:sle suite for grammar adaptation47

2Preliminaries

Grammar adaptation. In this paper, we employ formal grammar transforma- tions for the stepwise adaptation of grammars [9]. This approach offers several benefits: First, the derivation of the template language is formally defined by the application of well-defined adaptation operators. Second, we prevent accidental modifications of the target language since preservation properties of adaptation operators are well understood [9]. Third, the adaptation is generic and can be applied to any target language. Finally, existing tool support automates the derivation process [11]. We rely on a very restricted operator suite for this paper. Figure 1 shows its concrete syntax: First, we can introduce anewruleforafreshnonterminal.We use the same operator to introduce nonterminals without a defining rule (e.g. for morphem classes). Second, we can include an additional rule for a nonterminal. If the nonterminal is yet undefined, include operates as introduce.Third,we can fold aphrase.Thisintroducesafreshnonterminaldefinedbythephrase.All occurrences of the phrase are replaced by the nonterminal. All these operators are purely constructive, i.e. they extend the language defined by the grammar [9]. Finally, we can execute operators foreach nonterminal in a set. There are three yielders of such sets: N yields all nonterminals except morphem classes, M yields all morphem classes, and L yields all nonterminals to which a Kleene closure is applied. Inside a foreach block, a variable provides access to the current nonterminal. yields a nonterminal’s name as a keyword. !·" Natural Semantics. In this paper, we use Natural Semantics [10] for the static and dynamic semantics of template languages. Typically, template languages are functional languages [6]. Describing functional languages by Natural Semantics is well understood. For example, Natural Semantics was used for both the static and dynamic semantics in the formal specification of Standard ML [12]. The general idea of a semantics definition in Natural Semantics is to provide axioms and inference rules for judgements over syntactical and semantical domains.A static semantics is typically described in terms of well-typedness judgements for Summary of the method

Definition of generic template language. Syntax, static and dynamic semantics Amalgamation with target language’s grammar. Use grammar transformation systematically. Additional amalgamation with semantics (omitted).

[Guido Wachsmuth: “A Formal Way from Text to Code Templates”]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 48 Multi-stage programming

Compute source code any time.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 49 Motivation to MSP

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 50 Why Multi-stage Programming?

Reasons are purely economic: •Problem 1:Abstraction mechanisms (functions, objects, classes…) have runtime cost –Result 1: You don’t use them –Result 2: That costs you productivity •Idea: Use generative programming (GPCE) •Problem 2:Writing good generative programs is hard

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 51 Why program generation is hard •First, there is the issue of hygiene (as in macros) • More generally, what can we statically guarantee about what’s generated? •Important: “Statically” = by looking at the generator? Build Combine Syntactic Type Approach correctness? correctness? “f (x,y)” F X Reject “f (x,)” Reject “7 (8)” String 3 3 2 2

Datatype 2 3 3 2

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 52 Multi-stage programming (MSP) •Provide abstraction mechanisms like: polymorphism, higher-order functions, exceptions, … •Provide constructs that allow “generation” •Withoutdamaging the static typing discipline

Build Combine Syntactic Type Approach correctness? correctness? “f (x,y)” F X Reject “f (x,)” Reject “7 (8)” String 3 3 2 2

Datatype … 3 3 2

MSP 3 3 3 3

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 53 The abstract view

I2

I1 P O

Batch Programming Ex.:

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 54 The abstract view

I2

I1 P1 P2 O

Multi-Stage Programming (MSP) Ex.: Compiler

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 55 Staging the exponentiation function

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 56 Small example: Exponentiation

x0 = 1 Minimal example of a generic program. x2n+1 = x*x2n Such a program is x2n = (xn)2 ! almost never used in practice. x=2 Why?

n=5 P xn=32

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 57 Small example: Exponentiation

0 5 x = 1 P2(x) = x x2n+1 = x*x2n = x*x4 2n n x = (x )2 2 2 ! = x*(x ) = x*((x1)2)2 x Just 4 ops! = x*((x*x0)2)2

5 2 2 5 P1 P2 x = x*((x*1) )

??? No recursion!

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 58 A Naïve MSP method

I2 1. Write traditional, single-stage program I1 P O

I2 2. Add staging annotations: brackets, escapes, I1 P1 P2 O and run

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 59 MetaOCaml [GPCE ’03] •Builds on the OCamlsystem – Very successful, widely used, statically typed language – Compiled (both byte-code and native code) •Has competitive performance (even for numerical apps) •Allows the collection of credible performance numbers –Substantial existing code base •Realized by a source-to-source translation –Based on an elegant formal model –Has a clear underlying cost model (parse tree)

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 60 Small example: Exponentiation x

Single stage: n P xn let rec exp(n:int, x:real):real = if n = 0 then 1.0 else if even (n) then sqr (exp(n div 2,x)) else x * (exp(n - 1,x));;

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 61 Small example: Exponentiation x

n Two stage: n P1 P2 x let rec exp(n:int, x:): = if n = 0 Only things in then <1.0> green left for second stage else if even (n) then else <~x * ~(exp(n - 1,x))>;;

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 62 Small example: Exponentiation

To use the staged exp we simply write: ! ~(exp(5, ))> ;; The result is: fn x => x*(sqr(sqr(x*1.0)))

x*((x*1)2)2

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 63 Timing in MetaOCaml (Careful!)

# #use "mex/3.ml";; (* loads “power” *) .Timing . . Timing in MetaOCaml in MetaOCaml (Careful!) (Careful!) __ Unstaged ______131072x avg= 2.400000E-03 msec __ Generating ______9192x avg= 5.670000E-02 msec # #use "mex/3.ml";;# #use "mex/3.ml";; (* loads “power” (* loads *) “power” *) __ Compiling ______512x avg= 2.000000E+00 msec ...... __ Stage 2 ______8388608x avg= 3.000000E-04 msec __ Unstaged______Unstaged ______131072x avg= 1310 2.400000E-0372x avg= 2.400000E-03 msec msec - : unit = () __ Generating__ Generating______9192x avg= 915.670000E-0292x avg= 5.670000E-02 msec msec __ Compiling______Compiling ______512x avg= 2.000000E+00512x avg= 2.000000E+00 msec msec The__ Stage timenew 2 ______Stagefunction 2 ______makes 8388608x life avg= 83886 easier 3.000000E-0408x avg= 3.000000E-04 msec msec - : unit =- (): unit = () From the above data we can compute:

The“Speedup” timenewThe timenew function = (2.4E-3) makesfunction / (3.0E-4) life makes easier life= 8easier x From the [Walidabove Taha: Tutorials at GPCEdata 2004 & GTTSE 2005]we can compute: From the above© 2012 Ralf Lämmel, data Software Language we Engineering can http://softlang.wikidot.com/course:sle compute: 64 “Speedup”“Speedup” = (2.4E-3) = /(2.4E-3) (3.0E-4) / (3.0E-4) = 8 x = 8 x MSP details

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 65 Multi-stage programming (MSP)

Three staging annotations: Construct Example Result Brackets: a = <2*4> a=<2*4> Escape: b = <9+~a> b=<9+2*4> Run: c = !b c=17

Typing rules (simplified): Term Type Term Type X T X ~X T X !X T

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 66 Related techniques for “staging” There is a number of ways for looking at MSP: •Statically typed macros – But not limited to compile-time •Partial evaluation (Mix, Schism, Similix, Tempo) –But programmer has full control over what is specialized –(in some cases, programmer has too much control) •High-level program generation (SDRR, Genvoca, P++) •Runtime code generation (Fabius, DyC, Dynamo, `C) Common goal: Alter order of evaluation to reduce cost of a computation

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 67 Where can staging be used?

Lots of applications, including: •Operating systems [Pu et al] •Data base systems [Batory et al] •Satellite software [Czarnezki et al] •Domain specific languages [Consel et al] •Graphics software [Mogensen] No shortage in applications. Rather, in methodology

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 68 Typical speedups (MetaOCaml)

I2 Program Speedup

T(P)/T(P2) I1 P O Exp 5…8 x Interp(fib 15) 4 x

I2 Interp’(fib 15) 44 x RewriteCPS 13 x Chebyshev 3 x I1 P1 P2 O

What kinds of programs can we stage? Generic programs that pay an unnecessary cost for this generality.

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 69 “Delays lead to dangerous ends”

n P ?! Be careful with recursion/loops… 1 let rec exp(n:int, x:):= ;

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 70 Basic inlining using MSP

Instead of getting x * (sqr (sqr (x))> It would be nice if we can get let y=x*x in x*(y*y)> To do this, we need to consider the definition of sqr, (and change the staged exp function slightly)

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 71 Small example: Exponentiation x

n Two stage: n P1 P2 x let rec exp(n:int, x:):= if n = 0 Can we get rid then <1.0> of this sqr operation? else if even (n) then else <~x * ~(exp(n - 1,x))>;

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 72 In-lining using MSP

Original definition let sqr x = x * x Naïve staged version let sqr x = <~x * ~x> Note change in type: Before: int->int After: -> Must therefore change exp

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 73 Small example: Exponentiation x

n Two stage: n P1 P2 x let rec exp(n:int, x:): = Now our main if n = 0 function has then <1.0> even less green else if even (n) then sqr (exp(n div 2,x)) else <~x * ~(exp(n - 1,x))>;

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 74 Basic inlining using MSP

Problem: Using: let sqr x = <~x * ~x> we get code explosion: We generate x * ((x * x) * (x * x))> Solution (from PE literature!) let sqr x = Now we get (basically): let y=x*x in x * (y*y)>

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 75 Partial Application vs. Partial Evaluation

“Curried application” is former, but NOT latter. For example: F x y z = x+y+z Is same as: F x y = fun z -> x+y+z Now F 1 2 evaluates to fun z -> 1+2+z but NOT: fun z -> 3+z. Last step requires reduction under O (fun z) F’ x y = ~(lift(x+y))+z>

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 76 Capture is a non-problem in MSP

Given the following definitions let t(y)= x + ~y> let p = ~(t ) 10> We do NOT get p = (fun x-> x+x)10> But rather

p = (fun x2-> x2+x1)10> With MSP, we don’t need to worry about names

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 77 Why is type safety challenging?

Type system should reject “bad” terms: •Basics: ~5, run 5 •Levels: •Closedness: )in …> o In the last case x is treated as having a code type, and run is used during “specialization”

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 78 Staging a simple interpreter

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 79 Domain Specific Languages (DSLs)

Examples •Lex, yacc, VHDL, LaTeX, Excel, Tkl/tk, etc Benefits •Abstract away unnecessary details •Improve –Performance – Reliability, maintainability … = productivity A functional MSP language can be particularly useful for implementing DSLs

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 80 Syntax in (Meta)OCaml type exp = Int of int | Var of string | App of string * exp | Add of exp * exp | Sub of exp * exp | Mul of exp * exp | Div of exp * exp | Ifz of exp3 type def = Declaration of string * string * exp type prog = Program of def list * exp

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 81 Syntax in (Meta)OCaml

Represent let rec f x = if x=0 then 1 else x*(f(x-1)) in f 10 As let termFact = Program ([Declaration ("f","x", Ifz(Var "x", Int 1,Mul(Var"x",(App ("f", Sub(Var "x",Int 1))))))], App ("f", Int 10))

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 82 The natural benchmark

Note that we can type in what we want to represent directly into OCaml and run it: let rec f x = if x=0 then 1 else x*(f(x-1)) in f 10 This is NOT a generic solution. But IT IS a natural benchmark for what constitutes good performance.

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 83 env, fenv Interpreter Runtime environments: e P i (* env : string -> int fenv : string -> (int -> int) *) exception Yikes let env0 = fun x -> raise Yikes let fenv0 = env0 let ext env x v = fun y -> if x=y then v else env y

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 84 env, fenv Interpreter A safe interpreter for just expressions: e P i let rec eval3 e env fenv = match e with Int i -> Some i ... | Div (e1,e2) -> (match (eval3 e1 env fenv, eval3 e2 env fenv) with (Some x, Some y)-> if y=0 then None else Some (x/y) | _ -> None) ...

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 85 env, fenv Interpreter Same for programs/declarations p P i let rec peval3 p env fenv= match p with Program ([],e) -> eval3 e env fenv |Program (Declaration (s1,s2,e1)::tl,e) -> let rec f x = eval4 e1 (ext env s2 x) (ext fenv s1 f) in peval3 (Program(tl,e)) env ...

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 86 env, fenv Interpreter •The interpreter is generic p P i –It will evaluate any program at runtime (even ones provided by the user) •The interpreter is simple and maintainable –It is essentially a specification of the semantics of the language –(Actually, it’s almost a denotationalsemantics) •It runs 21x slower than our benchmark –Costly generic solutions are impractical, forcing us to resort to ad hoc solutions…

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 87 env, fenv Staged Interpreter

We can traverse the program early: e P1 P2 i let rec eval4 e env fenv = match e with Int i -> ... | Div (e1,e2) -> <(match (~(eval3 e1 env fenv), ~(eval3 e2 env fenv)) with (Some x, Some y)-> if y=0 then None else Some (x/y) | _ -> None) ... >

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 88 env, fenv Staged Interpreter

Same for programs: p P1 P2 i let rec peval4 p env fenv= match p with Program ([],e) -> eval4 e env fenv |Program (Declaration (s1,s2,e1)::tl,e) -> ) (ext fenv s1 )) in ~(peval4 (Program(tl,e)) env ... >

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 89 env, fenv Staged Interpreter

•Staging helps p P1 P2 i –We get code 4x faster than the original interpreter – But that means it’s 5x slower than our benchmark… •We would like to generate the following code: •But what does it actually generate?

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 90 env, fenv Staged Interpreter

•Generated code: p P1 P2 i if x = 0 then Some 1 ... | None -> None in match Some 10 with Some x -> f x | None -> None> •What happened?

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 91 env, fenv Staged Interpreter

•Generated code: p P1 P2 i if x = 0 then Some 1 ... | None -> None in match Some 10 with Some x -> f x | None -> None> •We are tagging and untagging…

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 92 env, fenv Staged Interpreter

p P1 P2 i

If there is ONE thing you want to take from this tutorial: Avoid the temptation to find ways to post- process the code you generate. It can be arbitrarily better (in terms of performance and quality) to avoid generating bad code in the first place.

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 93 env, fenv Staged Interpreter

The source of the problem: e P1 P2 i let rec eval4 e env fenv = match e with Int i -> ... | Div (e1,e2) -> <(match (~(eval3 e1 env fenv), ~(eval3 e2 env fenv)) with (Some x, Some y)-> if y=0 then None else Some (x/y) | _ -> None) ... >

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 94 env, fenv Interpreter (in CPS) let rec eval5 e env fenv k = e P i match e with Int i -> k (Some i) ... | Div (e1,e2) -> eval5 e1 env fenv (fun r -> eval5 e2 env fenv (fun s -> match (r,s) with (Some x, Some y) -> if y=0 then k None else k (Some (x/y)) | _ -> k None)) ...

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 95 env, fenv Staged Interpreter let rec eval6 e env fenv k = e P1 P2 i match e with Int i -> k (Some ) ... | Div (e1,e2) -> eval6 e1 env fenv (fun r -> eval6 e2 env fenv (fun s -> match (r,s) with (Some x, Some y) -> ))> | _ -> k None))

[Walid Taha: Tutorials at GPCE 2004 & GTTSE 2005]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 96 Summary of MSP

Staging can lead to substantial speedups. Staging requires perhaps only minor annotations. Programmer can see when things happen. Programmer can understand why things are faster. Staging is closely related to templates, macros, partial evaluation, generative programming, and run-time code generation. Staging is based on sound and simple reasoning principles.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 97 Further reading on MSP

Multi-stage Programming Homepage http://www.cs.rice.edu/~taha/MSP/ The Converge programming language http://convergepl.org/ Tiark Rompf, Martin Odersky: “Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs”. http://dx.doi.org/10.1145/1942788.1868314

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 98 Hygienic macros

Avoid name capture of kinds. (Only discussed in passing this time.)

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 99 The hygiene problem in C with preprocessing

#define INCI(i) {int a=0; ++i;} int main(void) { int a = 0, b = 0; INCI(a); INCI(b); printf("a is now %d, b is now %d\n", a, b); return 0; } int main(void) { int a = 0, b = 0; {int a=0; ++a;}; {int a=0; ++b;}; printf("a is now %d, b is now %d\n", a, b); return 0; } a is now 1, b is now 1 a is now 0, b is now 1

[http://en.wikipedia.org/wiki/Hygienic_macro, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 100 The hygiene problem with Common Lisp macros

(defmacro my-unless (condition &body body) `(if (not ,condition) (progn ,@body))) (flet ((not (x) x)) Local redefinition (my-unless t (format t "This should not be printed!"))) of “not”.

“Redefining standard functions and operators, globally or locally, invokes undefined behavior according to ANSI Common Lisp. Such usage can be diagnosed by the implementation as erroneous.”

[http://en.wikipedia.org/wiki/Hygienic_macro, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 101 The hygiene problem with Common Lisp macros

(defmacro my-unless (condition &body body) `(if (user-defined-operator ,condition) (progn User-defined as ,@body))) opposed to built-in.

(flet ((user-defined-operator (x) x)) (my-unless t (format t "This should not be printed!")))

“The Common Lisp solution to this problem is to use packages. The my-unless macro can reside in its own package, where user- defined-operator is a private symbol in that package. The symbol user-defined-operator occurring in the user code will then be a different symbol, unrelated to the one used in the macro.”

[http://en.wikipedia.org/wiki/Hygienic_macro, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 102 Solution strategies

Try to use unique names for the macro parameters. Generate unique symbols as they are needed. Generally disallow shadowing of names. Generally disallow redefinitions of built-in functions. Generally disallow all redefinitions globally. Protect names by making them package-private. Resolve clashes by hygienic transformation. ...

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 103 Hygienic macros in Scheme

“Meanwhile, languages such as Scheme that use hygienic macros prevent accidental capture and ensure referential transparency automatically as part of the macro expansion process. [...] For example, (define-syntax my-unless the following Scheme implementation of my-unless (syntax-rules () will have the desired behavior:” [(_ condition body) (if (not condition) body (void))]))

(let ([not (lambda (x) x)]) (my-unless #t (displayln "This should not be printed!")))

[http://en.wikipedia.org/wiki/Hygienic_macro, 10 December 2012]

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 104 Further reading on hygienic macros

E. E. Kohlbecker, M. Wand: "Macro-by-example: Deriving syntactic transformations from their specifications". http://dx.doi.org/10.1145/41625.41632 Eugene Kohlbecker, Daniel P. Friedman, Matthias Felleisen, Bruce Duba: "Hygienic macro expansion". http://dx.doi.org/10.1145/319838.319859 David Herman, Mitchell Wand: "A theory of hygienic macros". http://dl.acm.org/citation.cfm?id=1792878.1792884 Steven E. Ganz, Amr Sabry, Walid Taha: "Macros as multi-stage computations: type-safe, generative, binding macros in MacroML". http://dx.doi.org/10.1145/507546.507646 Walid Taha, Patricia Johann: "Staged notational definitions". http://dl.acm.org/citation.cfm?id=954186.954192

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 105 Concluding remarks

Template and macros systems are ubiquitous in programming. Such systems clearly interact with syntax, types, and semantics. “Safety” is provided only by some systems to some extent. “Methodology” is still maturing. Language engineers need to use, tame, and develop such systems. Also, think of reverse/re-engineering systems that use templates et al.

© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle 106

Name Total Salary Cut salaries Show details
Cut