
A Compiler-Compiler for DSL Embedding

Amir Shaikhha, EPFL, Switzerland, amir.shaikhha@epfl.ch
Vojin Jovanovic, Oracle Labs, vojin.jovanovic@oracle.com
Christoph Koch, EPFL, Switzerland, christoph.koch@epfl.ch

Abstract

In this paper, we present a framework to generate compilers for embedded domain-specific languages (EDSLs). This framework provides facilities to automatically generate the boilerplate code required for building DSL compilers on top of extensible optimizing compilers. We evaluate the practicality of our framework by demonstrating several use-cases successfully built with it.

CCS Concepts • Software and its engineering → Software performance; Compilers;

Keywords Domain-Specific Languages, Compiler-Compiler, Language Embedding

1 Introduction

Everything that happens once can never happen again. But everything that happens twice will surely happen a third time.
– Paulo Coelho, The Alchemist

Domain-specific languages (DSLs) have gained enormous success in providing productivity and performance simultaneously. The former is achieved through their concise syntax, while the latter is achieved by using specialization and compilation techniques. These two significantly improve DSL users' programming experience.

Building DSL compilers is a time-consuming and tedious task requiring much boilerplate code related to non-creative aspects of building a compiler, such as the definition of intermediate representation (IR) nodes and repetitive transformations [2]. There are many extensible optimizing compilers that help DSL developers by providing the required infrastructure for building compiler-based DSLs. However, the existing optimizing compilation frameworks suffer from a steep learning curve, which hinders their adoption by DSL developers who lack compiler expertise. In addition, if the API of the underlying extensible optimizing compiler changes, the DSL developer would need to globally refactor the code base of the DSL compiler.

The key contribution of this paper is to use a generative approach to help DSL developers with the process of building a DSL compiler. Instead of asking the DSL developer to provide the boilerplate code snippets required for building a DSL compiler, we present a framework which automatically generates them.

More specifically, we present Alchemy, a language workbench [9, 12] for generating compilers for embedded DSLs (EDSLs) [14] in the Scala programming language. DSL developers define a DSL as a normal library in Scala. This plain Scala implementation can be used directly, without worrying about the performance aspects (which are handled separately by the DSL compiler).

Alchemy provides a customizable set of annotations for encoding the domain knowledge in the optimizing compilation frameworks. A DSL developer annotates the DSL library, from which Alchemy generates a DSL compiler that is built on top of an extensible optimizing compiler. As opposed to the existing compiler-compilers and language workbenches, Alchemy does not need a new meta-language for defining a DSL; instead, Alchemy uses the reflection capabilities of Scala to treat the plain Scala code of the DSL library as the language specification.

A compiler expert can customize the behavior of the predefined set of annotations based on the features provided by a particular optimizing compiler. Furthermore, the compiler expert can extend the set of annotations with additional ones for encoding various domain knowledge in an optimizing compiler.

This paper is organized as follows. In Section 2 we review the background material and related work. Then, in Section 3 we give a high-level overview of the Alchemy framework. In Section 4 we present the process of generating a DSL compiler in more detail. Section 5 presents several use cases built with the Alchemy framework. Finally, Section 6 concludes.

2 Background & Related Work

In this section, we present the background and related work to better understand the design decisions behind Alchemy.

2.1 Compiler-Compiler

A compiler-compiler (or a meta-compiler) is a program that generates a compiler from the specification of a programming language. This specification is usually expressed in a declarative language, called a meta-language. Yacc [17] is a compiler-compiler for generating parsers specified using a declarative language. There are numerous systems for defining new languages, referred to as language workbenches [9, 12], such as Stratego/Spoofax [19], SugarJ [7], Sugar* [8], KHEPERA [10], and MPS [16].

2.2 Domain-Specific Languages

DSLs are programming languages tailored for a specific domain. There are many successful examples of systems using DSLs in various domains, such as SQL in data management, Spiral [28] for generating digital signal processing kernels, and Halide [29] for image processing. The software development process can also be improved by using DSLs, referred to as language-oriented programming [40]. Cedalion [23] is a language workbench developed for language-oriented programming.

There are two kinds of DSLs: 1) external DSLs, which have a stand-alone compiler, and 2) embedded DSLs [14] (EDSLs), which are embedded in another general-purpose programming language, called a host language. Various EDSLs have been successfully implemented in different host languages, such as Haskell [1, 14, 24] or Scala [21, 25, 30, 33]. The main advantage of EDSLs is reusing the existing infrastructure of the host language, such as the parser, the type checker, and the IDEs, among others.

There are two ways to define an EDSL. The first approach is by defining it as a plain library in the host language, referred to as shallowly embedding it in the host language. A shallow EDSL reuses both the frontend and backend components of the host language compiler. However, the opportunities for domain-specific optimizations are left unexploited. In other words, the library-based implementation of the EDSL in the host language serves as an interpreter. The second approach is deeply embedding the DSL in the host language. A deep EDSL only uses the frontend of the host language and requires the DSL developer to implement a backend for the EDSL. This way, the DSL developer can leverage domain-specific opportunities for optimizations and can target different backends through code generation.
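To make the distinction concrete, the following standalone sketch (ours, not part of the paper's running example) shows the same two-operation arithmetic fragment embedded first shallowly, as a plain library whose operations compute values directly, and then deeply, as an IR that a DSL backend can inspect, optimize, and compile.

object ShallowArith {
  // Shallow embedding: a plain library; the host compiler is both
  // frontend and backend, and every call computes a value immediately.
  def lit(x: Double): Double = x
  def add(a: Double, b: Double): Double = a + b
}

object DeepArith {
  // Deep embedding: calls build an IR that a DSL backend can
  // analyze, optimize, and compile or interpret later.
  sealed trait Exp
  case class Lit(x: Double) extends Exp
  case class Add(a: Exp, b: Exp) extends Exp

  def eval(e: Exp): Double = e match {
    case Lit(x)    => x
    case Add(a, b) => eval(a) + eval(b)
  }
}

object EmbeddingDemo extends App {
  println(ShallowArith.add(ShallowArith.lit(1), ShallowArith.lit(2)))        // 3.0
  println(DeepArith.eval(DeepArith.Add(DeepArith.Lit(1), DeepArith.Lit(2)))) // 3.0
}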
2.3 Extensible Optimizing Compilers

There are many extensible optimizing compilers which provide facilities for defining optimizations and code generation for new languages. Such frameworks can significantly simplify the development of the backend component of the compiler for a new programming language.

Stratego/Spoofax [19] uses strategy-based term-rewrite systems for defining domain-specific optimizations for DSLs. Stratego uses an approach similar to quasi-quotation [39] to hide the expression terms from the user. For the same purpose, Alchemy uses annotations for specifying a subset of optimizations specified by the compiler expert. One can use quasi-quotes [26, 34] for implementing domain-specific optimizations in concrete syntax (rather than abstract syntax), similar to Stratego.

2.4 What is Alchemy?

Alchemy is a compiler-compiler, designed for EDSLs that use Scala as their host language. Alchemy uses the Scala language itself as its meta-language; it takes an annotated library as the implementation of a shallow EDSL and produces the required boilerplate code for defining a backend for this EDSL using a particular extensible optimizing compiler. In other words, Alchemy converts an interpreter for a language (a shallow EDSL) to a compiler (a deep EDSL).

Truffle [15] provides a DSL for defining self-optimizing AST interpreters, using the Graal [41] optimizing compiler as the backend. This system mainly focuses on providing just-in-time compilation for dynamically typed languages such as JavaScript and R, by annotating AST nodes. In contrast, Alchemy uses annotations on the library itself and generates the AST nodes based on the strategy defined by the compiler expert.

Forge [38] is an embedded DSL in Scala for specifying other DSLs. Forge is used by the Delite [21] and LMS [30, 31] compilation frameworks. This approach requires DSL developers to learn a new specification language before implementing DSLs. In contrast, Alchemy developers write a DSL specification using plain Scala code. Then, domain-specific knowledge is encoded using simple Alchemy annotations.

Yin-Yang [18] uses Scala macros for automatically converting shallow EDSLs to the corresponding deep EDSLs. Thus, it completely removes the need for providing the definition of a deep DSL library from the DSL developer. However, contrary to our work, the compiler-compilation of Yin-Yang is specific to the LMS [30] compilation framework. Also, Yin-Yang does not generate any code related to optimizations of the DSL library. We have identified the task of automatically generating the optimizations to be not only a crucial requirement for DSL developers but also one that is significantly more complicated than the one handled by Yin-Yang.

3 Overview

Figure 1 shows the overall design of the Alchemy framework.

Figure 1. Overall design of Alchemy. (Diagram: the DSL developer writes an annotated shallow EDSL; the Alchemy compiler-compiler, extended by compiler-expert plugins for Kiama, LMS, and SC, generates the IR nodes, expression lifting, and the deep EDSL of the resulting DSL compiler.)

Alchemy is implemented as a compiler plugin for the Scala programming language.¹ After parsing and type checking the library-based implementation of an EDSL, Alchemy uses the type-checked Scala AST to generate an appropriate DSL compiler. The generated DSL compiler follows the API provided by an extensible optimizing compiler to implement the transformations and code generation needed for that DSL.

¹ We decided to implement Alchemy as a compiler plugin rather than using the macro system of Scala, due to the restrictions imposed by def macros and macro annotations.

There are two different types of users for Alchemy. The first type is a DSL developer, who is the end-user of the Alchemy framework for defining a new DSL together with a set of domain-specific optimizations specified by a set of annotations. A DSL developer is a domain expert, without too much expertise in compilers.

The second type of users is a compiler expert, who is not necessarily knowledgeable in various domains; instead, she is an expert in building optimizing compilers. In particular, a compiler expert has detailed knowledge about the internals of a specific extensible optimizing compiler. Thus, she can use the API provided by the Alchemy framework to specify how the definition of an annotated Scala library is converted into the boilerplate code required for a DSL compiler built on top of an extensible optimizing compiler. Furthermore, she can extend the set of existing annotations provided by Alchemy, for encoding the domain knowledge to be used by an optimizing compiler.

case class ShallowDSL(types: List[ShallowType])
case class ShallowType(tpe: Type,
    methods: List[ShallowMethod]) {
  def annotations: List[Annotation]
  def reflectType: Option[Type]
}
case class ShallowMethod(sym: MethodSymbol,
    body: Option[Tree]) {
  def annotations: List[Annotation]
  def paramAnnots: List[(Int, Annotation)]
}

trait AlchemyCompiler {
  type DSLContext
  def liftType(t: Type)(implicit ctx: DSLContext): Type
  def liftExp(e: Tree)(implicit ctx: DSLContext): Tree
  def compileDSL(dsl: ShallowDSL)
    (implicit ctx: DSLContext): Tree
  def compileType(t: ShallowType)
    (implicit ctx: DSLContext): Tree
  def compileMethod(m: ShallowMethod)
    (implicit ctx: DSLContext): Tree
}

Figure 2. The API of Alchemy for compiler experts.
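As an illustration of the kind of strategy a compiler expert plugs in behind this API, the following minimal sketch emits the source of an IR-node case class from a simplified method description. It deliberately uses plain strings and hypothetical stand-in types (SimpleMethod, SimpleParam) instead of Alchemy's reflection-based ShallowMethod, so it only approximates what a real compileMethod implementation would do.

// Hypothetical stand-ins for Alchemy's reflection-based descriptions.
case class SimpleParam(name: String)
case class SimpleMethod(owner: String, name: String, params: List[SimpleParam])

object CaseClassStrategy {
  // Default naming scheme described in Section 4.3: the name of the
  // owner followed by the (capitalized) name of the method.
  def irNodeName(m: SimpleMethod): String = m.owner + m.name.capitalize

  // Emit the source text of one IR-node case class for a DSL method.
  def compileMethod(m: SimpleMethod): String = {
    val fields = m.params.map(p => s"${p.name}: Exp").mkString(", ")
    s"case class ${irNodeName(m)}($fields) extends Exp"
  }
}

object CaseClassStrategyDemo extends App {
  val add = SimpleMethod("Complex", "add", List(SimpleParam("c1"), SimpleParam("c2")))
  println(CaseClassStrategy.compileMethod(add))
  // prints: case class ComplexAdd(c1: Exp, c2: Exp) extends Exp
}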

4 Compiler-Compilation

In this section, we give more details on the process of generating a DSL compiler. First, we present the annotations defined by the Alchemy framework. Then, we show the process of gathering the DSL information from an annotated library. Afterwards, through an example we give more details on the process of generating a DSL compiler based on the gathered DSL information. Then, we show how Alchemy uses the implementation body of the annotated library for building DSL compilers. Finally, we show the process of generating EDSL compilers using a well-known embedding technique through our running example.

4.1 Alchemy Annotations

Deep Types. DSL developers use the @deep annotation for specifying the types for which they are interested in generating a corresponding deep embedding. In other words, this annotation should be used for the types that are actually participating in the definition of a DSL, rather than helper classes which are used for debugging, profiling, and logging purposes.

Reflected Types. The @reflect annotation is used for annotating the classes the source code of which the DSL developers have no access to. This annotation is used in Alchemy for a) annotating the methods of the Scala core libraries, such as HashMap, ArrayBuffer, etc., which are frequently used, as well as for b) providing alternative implementations for the DSL library and the Scala core library.

User-Defined Annotations. Alchemy allows compiler experts to define their custom annotations, together with the behavior of the target DSL compiler for the annotated method. A compiler expert extends the API exposed by Alchemy to implement the desired behavior (cf. Figure 2).

4.2 Gathering DSL Information

The Alchemy framework inspects the Scala AST of the given annotated library after the type checking phase of the Scala compiler. Based on the typed Scala AST, Alchemy produces the information about the shallow version of the EDSL by building ShallowMethod, ShallowType, and ShallowDSL objects, corresponding to the DSL methods, DSL types, and the whole DSL, respectively.

A ShallowMethod instance has the symbol of the DSL method (the sym parameter) and the AST of its body, if available. Also, this instance returns the list of annotations that the DSL developer has used for the method (annotations) and its parameters (paramAnnots).

A ShallowType instance contains the information of the DSL type (the tpe parameter) and the list of its methods. In addition, this instance has the list of annotations used for the type (annotations) and the type it reflects (reflectType) in the case where it is annotated with @reflect.

Finally, a ShallowDSL instance has the information of all DSL types that are annotated with @deep. Next, we show how this information is used to build a compiler for a simple DSL.

4.3 Generating an EDSL Compiler

Let us consider a DSL for working with complex numbers as our running example. For this DSL, we generate a DSL compiler using a simple form of expression terms as the intermediate representation, which is used by compilation frameworks such as Kiama [37].
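The following standalone snippet approximates the kind of information gathered in this phase using Scala's runtime reflection; Alchemy itself performs the equivalent inspection on the typed AST inside a compiler plugin, so this is only an illustration of the gathered data, not the actual mechanism.

import scala.reflect.runtime.universe._

// A cut-down DSL class, standing in for an annotated library class.
class Complex(val re: Double, val im: Double) {
  def add(c2: Complex): Complex = new Complex(re + c2.re, im + c2.im)
}

object GatherDemo extends App {
  val tpe = typeOf[Complex]
  // One entry per method: its symbol, signature, and annotations,
  // roughly the data a ShallowMethod instance would carry.
  tpe.decls.collect { case m: MethodSymbol if !m.isConstructor => m }.foreach { m =>
    println(s"${m.name}: ${m.typeSignature}  annotations: ${m.annotations}")
  }
}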
@deep
class Complex(val re: Double, val im: Double)
object Complex {
  def add(c1: Complex, c2: Complex): Complex =
    new Complex(c1.re + c2.re, c1.im + c2.im)
  def sub(c1: Complex, c2: Complex): Complex =
    new Complex(c1.re - c2.re, c1.im - c2.im)
  def zero(): Complex =
    new Complex(0, 0)
}

Figure 3. The annotated Complex DSL implementation.

// Predefined by a compiler expert
trait Exp
case class DoubleConstant(v: Double) extends Exp
// Automatically generated by Alchemy
case class ComplexNew(re: Exp, im: Exp) extends Exp
case class ComplexAdd(c1: Exp, c2: Exp) extends Exp
case class ComplexSub(c1: Exp, c2: Exp) extends Exp
case class ComplexZero() extends Exp

Figure 4. The generated IR nodes for the Complex DSL.

@deep
class Complex(val re: Double, val im: Double) {
  @name("ComplexAdd")
  def +(c2: Complex): Complex =
    new Complex(this.re + c2.re, this.im + c2.im)
  @name("ComplexSub")
  def -(c2: Complex): Complex =
    new Complex(this.re - c2.re, this.im - c2.im)
}
object Complex {
  def zero(): Complex =
    new Complex(0, 0)
}

Figure 5. The second version of the annotated Complex DSL implementation.

// Shallow expression
new Complex(2, 3) - Complex.zero()
// Lifted expression
ComplexSub(
  ComplexNew(
    DoubleConstant(2), DoubleConstant(3)
  ),
  ComplexZero()
)

Figure 6. An example expression and its lifted version in the Complex DSL.

Figure 3 shows the implementation of this EDSL as an annotated Scala library. This implementation can be used as a normal Scala library to benefit from all the tool-chains provided for Scala, such as debugging tools and IDEs.

Figure 4 shows the definition of the IR nodes generated by Alchemy. The IR nodes are algebraic data types (ADTs), each specifying a different construct of the Complex DSL. For each method of the Complex companion object, Alchemy generates a case class with a default naming scheme in which the name of the object is followed by the name of the method. For example, the method add of the Complex object is converted to the ComplexAdd case class. As another example, the constructor of the Complex class is converted to the ComplexNew case class. Each case class has the same number of arguments as the corresponding shallow method.

The methods of a class are converted in a similar manner. The key difference is that the generated case class has an additional argument corresponding to the this object. For example, the method + of the Complex class is converted to a case class with two parameters, where the first parameter corresponds to the this object of the Complex class, and the second parameter corresponds to the input parameter of the + method.

As explained before, Alchemy allows a compiler expert to define user-defined annotations. In Figure 5, the @name annotation is used for overriding the default naming scheme provided by Alchemy. For example, the + method is converted to the ComplexAdd case class.

4.4 Lifting the Implementation

As Figure 2 shows, Alchemy also provides two methods for lifting the expression and the type of the implementation body of DSL library methods. These two methods are useful for defining sugar constructs for a DSL (i.e., DSL constructs that do not have an actual node in the compiler; instead, they are defined in terms of other constructs of the DSL). An example of such a construct can be found in Section 4.5.

In addition, by providing several reflected versions (cf. Section 4.1) for a particular type, each one with a different implementation, Alchemy can generate several transformations for those DSL constructs. This removes the need to implement a DSL IR transformer, which manipulates the IR defined in the underlying optimizing compiler.

To specify the way that expressions should be transformed, compiler experts can implement a Scala AST to Scala AST transformation (cf. the liftExp method in Figure 2). Note that implementing Scala AST to Scala AST transformations from scratch can be a tedious and time-consuming task. Alternatively, if the target optimizing compiler uses the tagless final [3] or polymorphic embedding [13] approaches, one can use frameworks such as Yin-Yang [18], which already provide the translation required for these approaches. Next, we show a DSL compiler generated based on the polymorphic embedding approach.

4.5 Generating a Polymorphic EDSL Compiler

Let us consider the third version of the Complex DSL, shown in Figure 7. This version has an additional construct for negating a complex number, specified by the unary_- method.
@deep
class Complex(val re: Double, val im: Double) {
  @name("ComplexAdd")
  def +(c2: Complex): Complex =
    new Complex(this.re + c2.re, this.im + c2.im)
  @name("ComplexNeg")
  def unary_-(): Complex =
    new Complex(-this.re, -this.im)
  @sugar
  def -(c2: Complex): Complex =
    this + (-c2)
}
object Complex {
  def zero(): Complex =
    new Complex(0, 0)
}

Figure 7. The third version of the annotated Complex DSL implementation.

trait ComplexOps {
  type Rep[T]
  def doubleConst(d: Double): Rep[Double]
  def complexAdd(self: Rep[Complex], c2: Rep[Complex]): Rep[Complex]
  def complexNeg(self: Rep[Complex]): Rep[Complex]
  def complexSub(self: Rep[Complex], c2: Rep[Complex]): Rep[Complex]
  def complexZero(): Rep[Complex]
  def complexNew(re: Rep[Double], im: Rep[Double]): Rep[Complex]
}

Figure 8. The generated polymorphic embedding interface for the Complex DSL.

// Predefined by a compiler expert
trait Exp[T]
case class DoubleConstant(v: Double) extends Exp[Double]
// Automatically generated by Alchemy
case class ComplexNew(re: Exp[Double],
  im: Exp[Double]) extends Exp[Complex]
case class ComplexAdd(self: Exp[Complex],
  c2: Exp[Complex]) extends Exp[Complex]
case class ComplexNeg(self: Exp[Complex]) extends Exp[Complex]
case class ComplexZero() extends Exp[Complex]

trait ComplexExp extends ComplexOps {
  type Rep[T] = Exp[T]
  def doubleConst(d: Double): Rep[Double] =
    DoubleConstant(d)
  def complexAdd(self: Rep[Complex], c2: Rep[Complex]): Rep[Complex] =
    ComplexAdd(self, c2)
  def complexNeg(self: Rep[Complex]): Rep[Complex] =
    ComplexNeg(self)
  def complexSub(self: Rep[Complex], c2: Rep[Complex]): Rep[Complex] =
    complexAdd(self, complexNeg(c2))
  def complexZero(): Rep[Complex] =
    ComplexZero()
  def complexNew(re: Rep[Double], im: Rep[Double]): Rep[Complex] =
    ComplexNew(re, im)
}

Figure 9. The generated IR node definitions and deep embedding interface for the Complex DSL.

Subtracting two complex numbers is a syntactic sugar (annotated with @sugar) for adding the first complex number to the negation of the second complex number.

Polymorphic embedding [13] (or tagless final [3]) is an approach for implementing EDSLs where every DSL construct is converted into a function (rather than an ADT) and the interpretation of these functions is left abstract. Thus, it is possible to provide such abstract interpretations with different instances, such as actual evaluation, compilation, and partial evaluation [3, 13].

Figure 8 shows the polymorphic embedding interface generated by Alchemy for the third version of the Complex DSL. The type member Rep[T] is an abstract type representation for different interpretations of Complex DSL programs.

Figure 9 shows the generated deep embedding interface for the polymorphic embedding of the Complex DSL. Instead of using ADTs for defining IR nodes, this time we use generalized algebraic data types (GADTs). The invocation of each DSL construct method results in the creation of the corresponding node. As the subtraction of two complex numbers is a syntactic sugar, no corresponding IR node is created for it. Instead, the complexSub method results in the invocation of the complexAdd and complexNeg methods, which is generated using the liftExp method of Alchemy.

Figure 10 shows the lifted expression of the example of Figure 6. In this case, instead of converting expressions to their ADT definition, Alchemy converts them to their corresponding DSL method definition in polymorphic embedding. In addition, this figure shows the generated IR nodes for this program, in which the subtraction construct is desugared into the addition and negation nodes. Note that the negation of zero and the addition with zero can be further simplified by providing yet another optimized interface implementation in polymorphic embedding. Examples of such simplifications are given later in Section 5.4.

// Lifted expression in polymorphic embedding
complexSub(
  complexNew(
    doubleConst(2), doubleConst(3)
  ),
  complexZero()
)
// Generated IR nodes
ComplexAdd(
  ComplexNew(
    DoubleConstant(2), DoubleConstant(3)
  ),
  ComplexNeg(
    ComplexZero()
  )
)

Figure 10. Polymorphic embedding version of the example in Figure 6, and the generated IR nodes.

Up to now, we have used simple expression terms for the definition of IR nodes. Alchemy can easily generate other types of IR nodes such as A-Normal Form [11], where the children of a node are either constant values or variable accesses. This means that all non-trivial sub-expressions are let-bound, which helps in applying optimizations such as common-subexpression elimination (CSE) and dead-code elimination (DCE). Such normalized types of IR nodes are used in various optimizing compilers such as Graal [41], LMS [30], Squid [26], and SC. We will see more detailed examples of SC in the next section.
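To make the different interpretations of the polymorphic embedding concrete, the following sketch gives one possible evaluation instance of the ComplexOps interface of Figure 8 (assuming the Complex class of Figure 7 is in scope): Rep[T] is instantiated to T itself, so running a DSL program simply computes its value instead of building IR nodes.

trait ComplexEval extends ComplexOps {
  type Rep[T] = T   // programs are evaluated directly, no IR is built
  def doubleConst(d: Double): Double = d
  def complexAdd(self: Complex, c2: Complex): Complex =
    new Complex(self.re + c2.re, self.im + c2.im)
  def complexNeg(self: Complex): Complex =
    new Complex(-self.re, -self.im)
  def complexSub(self: Complex, c2: Complex): Complex =
    complexAdd(self, complexNeg(c2))
  def complexZero(): Complex = new Complex(0, 0)
  def complexNew(re: Double, im: Double): Complex = new Complex(re, im)
}

object ComplexEvalDemo extends ComplexEval with App {
  // The same program as in Figure 10, run under the evaluation interpretation.
  val r = complexSub(complexNew(doubleConst(2), doubleConst(3)), complexZero())
  println((r.re, r.im)) // (2.0, 3.0)
}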

Figure 11. Overall design of SC used with Alchemy. (Diagram: the DSL developer's annotated shallow EDSL is fed to the SC plugin of Alchemy, which produces the deep EDSL and domain-specific optimizations consumed by the SC compilation pipeline.)

5 Use Cases and Evaluation

In this section, we present the use cases built on top of Alchemy. We provide an extended set of annotations and show their usage. Finally, we evaluate the productivity of the DSL developer.

5.1 SC

SC (the Systems Compiler) is a compilation framework for building compilation-based systems in the Scala programming language. Different system component libraries can be considered as different DSLs, for which system developers extend SC to build DSL compilers. To hide the internal implementation details of the compiler, Alchemy provides an interface between the system component libraries and the SC optimizing compiler itself. Figure 11 shows the overall design of Alchemy and SC, which operates as follows.

The system developer (who is actually a DSL developer) uses the SC plugin of Alchemy to create a DSL compiler. Many systems optimizations are automatically converted by Alchemy to functions that manipulate the IR of the compiler. The system developer uses a set of annotations provided by the compiler expert of the SC framework to specify the IR transformations. To provide more advanced domain-specific optimizations that cannot be encoded by annotations, as well as additional compilation phases, the system developer uses the transformation API provided by SC.

SC converts the systems code to a graph-like intermediate representation (IR). As SC follows the polymorphic embedding approach [13] for deeply embedding DSLs, SC uses Yin-Yang [18],² which applies several transformations (e.g., language virtualization [4]) in order to convert the plain Scala code into the corresponding IR.

² We note that Yin-Yang, in contrast to our work, handles only the conversion from plain Scala code to IR, without providing any functionality related to code optimization of the systems library.

We have used SC to build two different compilation-based query engines: a) an analytical query processing engine [35, 36], and b) a transactional query processing engine [6].

From the perspective of the abstraction level of a program, the transformations are classified into two categories. First, optimizing transformations transform a program into another program on the same level of abstraction. Second, lowering transformations convert a program into one on a lower abstraction level. SC provides a set of built-in transformations out-of-the-box. These mainly consist of generic compiler optimizations such as common-subexpression elimination (CSE), dead-code elimination (DCE), partial evaluation (PE), etc.

The last phase in the SC compiler is code generation, where the compiler generates the code based on the desired target language. Observe that since each lowering transformation brings the program closer to the final target code, this provides the excellent property that code generation (e.g., C code generation) in the end basically becomes a trivial and naïve stringification of the lowest-level representation.

For converting from host to target languages, SC can make use of the same infrastructure. To do this conversion, a DSL developer only has to express the constructs and the necessary data-structure API of the target language as a library inside the host language. Then, there is no need for the DSL developer to manually provide code generation for the target language using internal compilers, as is the case with most existing solutions. In contrast, Alchemy automatically generates the transformation phases needed to convert from host language IR nodes to target language IR nodes (e.g., from Scala to C).

An important side-effect of our design is that since the plain Scala code of a system does not require any specific syntax, type or IR-related information from SC, this code can be compiled directly using the normal Scala compiler. In this case, the Scala compiler will ignore all Alchemy annotations and interpret the code of the system using plain Scala. Alchemy can thus be seen as a system for converting a system interpreter (which executes the systems code unoptimized) into the corresponding system compiler along with its optimizations.

Next, we briefly provide more details about the two categories of transformations that SC supports.
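The following toy sketch (ours; it is not SC's actual IR) illustrates why code generation reduces to naive stringification once lowering has produced target-level nodes: each low-level node corresponds one-to-one to a line of C.

sealed trait CNode
case class CVarDecl(tpe: String, name: String, init: String) extends CNode
case class CWhile(cond: String, body: List[CNode]) extends CNode
case class CCall(fun: String, args: List[String]) extends CNode

object CStringify {
  // Each target-level node maps directly onto C text; no further
  // analysis or rewriting is needed at this stage.
  def emit(n: CNode, indent: String = ""): String = n match {
    case CVarDecl(tpe, name, init) => s"$indent$tpe $name = $init;"
    case CCall(fun, args)          => s"$indent$fun(${args.mkString(", ")});"
    case CWhile(cond, body) =>
      s"${indent}while ($cond) {\n" +
        body.map(b => emit(b, indent + "  ")).mkString("\n") +
        s"\n$indent}"
  }
}

object CStringifyDemo extends App {
  val prog = List(
    CVarDecl("int", "i", "0"),
    CWhile("i < n", List(
      CCall("consume", List("table[i]")),
      CCall("increment", List("&i")))))
  println(prog.map(n => CStringify.emit(n)).mkString("\n"))
}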
class MyTransformer extends RuleBasedTransformer {
  analysis += rule { case Pattern =>
    // gather necessary information for analysis
  }
  rewrite += rule { case Pattern =>
    /* use analysis information while generating
       the appropriate transformed node */
  }
  rewrite += remove { case Pattern =>
    /* use analysis information to remove node */
  }
}

Figure 12. Offline Transformation API of SC.

@deep
@inline
abstract class Operator[A] {
  def init(): Unit
}

@deep
@inline
class ScanOp[A](table: Array[A]) extends Operator[A] {
  var i = 0
  def init() = {
    while (i < table.length) {
      child.consume(table(i))
      i += 1
    }
  }
}

@deep
@inline
class HashJoinOp[A, B, C](val leftParent: Operator[A],
    val rightParent: Operator[B], ...) extends
    Operator[CompositeRecord[A, B]] {
  @inline var mode = 0
  @inline
  def init() = {
    mode = 0
    // phase 1: leftParent will call this.consume
    leftParent.init()
    mode = 1
    // phase 2: rightParent will call this.consume
    rightParent.init()
    mode = 2
  }
  @inline
  def consume(tuple: Record): Unit = {
    if (mode == 0) {
      /* phase 1 -- elided code for left side of join */
    } else if (mode == 1) {
      /* phase 2 -- elided code for right side of join */
    }
  }
}

Figure 13. Inline annotations of two operators in our analytical query engine.

5.2 SC Transformations

SC classifies the transformations into two categories, which we present in more detail next, while also highlighting differences from previous work in each class.

Online transformations are applied while the IR nodes are generated. Every construct of a DSL is mapped to a method invocation, which in turn results in the generation of an IR node [3, 13]. By overriding the behavior of that method, an online transformation can result in the generation of a different (set of) IR node(s) than the original IR node. Even though a large set of optimizations (such as constant folding, common subexpression elimination, and others) can be expressed using online transformations, some optimizations need to be preceded by an analysis over the whole program.

For a restricted set of control-flow constructs, namely structured loops, it is possible to use the Speculative Rewriting [22] approach in order to combine the data-flow analysis with an online transformation, thus bypassing the need for a separate analysis pass. However, we have observed that there exists an important class of transformations in which the corresponding analysis cannot be combined with the transformation phase. This class of optimizations, which cannot be handled by existing extensible optimizing compilers, is presented next.

Offline transformations need whole-program analysis before applying any transformation. Figure 12 shows the SC offline transformation API. The analysis construct specifies the information that should be collected during the analysis phase of a transformation. The rewrite construct specifies the transformation rules based on the information gathered during the analysis phase. Finally, the remove construct removes the pattern specified in its body.

The Alchemy annotation processor takes care of converting the Scala annotations of the systems library, which express optimizations, into IR transformers which manipulate the intermediate representation of SC. This is explained in more detail in Section 5.4.

5.3 SC Annotations

In this section, we present in more detail the different categories of annotations implemented for SC.

Side-Effects. These are annotations that guide the effect system of the optimizing compiler. For example, a method annotated with @pure denotes that this method does not cause any side effects and the expressions that call this method can be moved freely throughout the program. In addition, Alchemy provides more fine-grained effect annotations that keep track of read and write mutations of objects. More precisely, if a method is annotated with read or write annotations, then there exists a mutation effect over the specific object (i.e., this) of that particular class. Similarly, an annotated argument may include read or write effects over that argument.
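A sketch of how such effect annotations could be placed on a reflected mirror of a mutable collection is shown below; @pure is the only annotation named explicitly in the text, so the @read/@write spellings and the stub annotation definitions are assumptions made here to keep the example self-contained.

import scala.annotation.StaticAnnotation

// Stub annotation classes so that the sketch compiles on its own;
// in Alchemy these are provided by the framework, and only @pure is
// named explicitly in the text (@read/@write are assumed spellings).
class pure extends StaticAnnotation
class read extends StaticAnnotation
class write extends StaticAnnotation
class reflect[T] extends StaticAnnotation

@reflect[scala.collection.mutable.ArrayBuffer[_]]
class AnnotatedArrayBuffer[T] {
  @pure  def size: Int = ???              // no side effects; can be moved freely
  @read  def apply(i: Int): T = ???       // read effect over this
  @write def append(elem: T): Unit = ???  // write effect over this
}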

Inline. The @inline annotation guides the inlining decisions of the compiler. This annotation can be applied to methods, to whole classes, as well as to class fields, with different semantics in each case. Methods annotated with the @inline annotation specify that every invocation of that method should be inlined by the compiler. For classes, the @inline annotation removes the abstraction of the specific class during compilation time. In essence, this means that the methods of an inlined class are implicitly annotated with the inline annotation and are subsequently inlined. This makes inlined classes in Alchemy semantically similar to value classes [32]. Finally, a mutable field of a class can also be annotated with @inline, which means that all the usages of this field are partially evaluated during compilation time.

Figure 13 shows the scan and hash-join operators annotated with the @inline annotation. In this example, all methods of the HashJoinOp class are automatically inlined, as the HashJoinOp class is marked with the @inline annotation. Furthermore, the mutable field mode is partially evaluated at compilation time and, as a result, the corresponding branch in the consume method is also partially evaluated at compilation time. More concretely, both leftParent.init and rightParent.init invoke the consume method of the HashJoinOp class. However, the former inlines the code in the phase 1 block whereas the latter inlines the phase 2 code block. This is possible as mode is evaluated during compilation time and, thus, there is no need to generate any code for it and the corresponding if condition checks. We have found that there are multiple examples where such if conditions can be safely removed in our analytical query engine (e.g., in the case of configuration variables whose values are known in advance at startup time).

Algebraic Structure. These are annotations for specifying the common algebraic rules that occur frequently for different use cases. For example, @monoid specifies a binary operation of a type that has a monoid structure. In the case of natural numbers, @monoid(0) over the + operator represents that a+0=0+a=a. The annotation processor generates several constant folding optimizations which benefit from such algebraic structure and significantly improve the performance of systems that use them.

Furthermore, the @commutative annotation specifies that the order of the operands of a binary operation can be changed without affecting the result. This property is useful for applying constant folding on cases in which static arguments and dynamic arguments are mixed in an arbitrary order, thus hindering the constant folding process. For example, in the expression 1 + a + 2, constant folding cannot be performed without specifying that the commutativity property of addition on natural numbers is applicable in this case. However, if we push the static terms to the left side of the expression while we generate the nodes, we generate the IR which represents the expression 1 + 2 + a instead of the previous expression. Then, it becomes possible to apply constant folding and get the expression 3 + a.

@deep
@reflect[Int]
class AnnotatedInt {
  @commutative @monoid(0)
  @pure
  def +(x: Int): Int
}

Figure 14. Alchemy annotations of the Int class. The AnnotatedInt class is a mirror class for the original Int Scala class.

5.4 Generating Transformation Passes

As discussed in Section 5.2, these transformation passes are classified into two categories: online and offline transformations. In this section, we demonstrate how Alchemy generates online and offline transformation passes.

Generating Online Transformations. In general, Alchemy uses node generation (online transformation) in order to implement the appropriate rewrite rules for most annotations. As we discussed in Section 4.5, every construct of a DSL is mapped to a method invocation, which in turn results in the generation of an IR node [3, 13].

For example, in the case of addition on natural numbers, the default behavior for the method int_plus is shown in the IntExp trait of Figure 15. This method generates the IntPlus IR node, which is also automatically generated by Alchemy. However, when this method is annotated with the @monoid and @commutative annotations, this results in the generation of an online transformation. More specifically, the annotated method automatically generates the code shown in the IntExpOpt trait of the same figure. First, as the method is pure, SC checks if both arguments are statically known. This is achieved by checking if the expressions are of Constant type or not. In this case, SC performs partial evaluation by computing the result through the addition of the arguments. Second, if only one of the arguments is statically known and it is equal to 0, the monoid property of this operator returns the dynamic operand. Third, if only one of the arguments is statically known (but it is not zero), then the static argument is pushed as the left operand, as we know that this operator is commutative. Finally, if none of the previous cases is true, then the default behavior is used and the original IntPlus IR node is generated.

Alchemy also generates an online transformation out of the @inline annotation. For methods with this annotation, instead of generating the corresponding node, Alchemy generates the nodes for the body of that method. In the special case of dynamic dispatch, the concrete type of the object is looked up and, based on its value, Alchemy invokes the appropriate method.

For example, the annotated code for the scanning operator of the analytical query engine, shown in Figure 13, generates the compiler code shown in Figure 16. There the scanOpInit method represents the corresponding method which is invoked in order to generate an appropriate IR. As is the case with integer addition, the default behavior of this method, which results in creating the ScanOpInit IR node, is shown in the ScanOpExp trait. The rest of the code presents the implementation of the @inline annotation for this operator, which results in inlining the body of this method while generating the IR node. The method scanOpInit is automatically generated by Alchemy, which generates the body of the init method. As described earlier, all method invocations lead to the generation of the corresponding IR nodes. For example, __whileDo results in creating an IR node for a while loop. Finally, for inlining the init method of the Operator class, we need to handle dynamic dispatch, as we described earlier. We do so by redirecting to the appropriate method based on the type of the caller object. An alternative design is to use multi-stage programming for encoding the fact that the objects of the Operator class are staged away. This is achieved by generating the deep embedding interface of all operator classes as partially static. With a similar design, one can support staging for other libraries implemented using design patterns that require abstraction overheads, such as generic programming [20, 42].
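The following standalone sketch replays these rewrites on a toy expression type. The first four rules mirror the ones generated in Figure 15 (partial evaluation, the monoid identity, and pushing static operands to the left); the final re-association rule is an extra step, added here only to complete the 1 + a + 2 to 3 + a example from the text.

object FoldDemo extends App {
  // A toy IR, separate from the paper's Exp types.
  sealed trait E
  case class Const(v: Int) extends E
  case class Sym(name: String) extends E
  case class Plus(a: E, b: E) extends E

  def plus(a: E, b: E): E = (a, b) match {
    case (Const(x), Const(y))          => Const(x + y)          // partial evaluation
    case (Const(0), d)                 => d                     // monoid identity
    case (d, Const(0))                 => d
    case (d, c @ Const(_))             => plus(c, d)            // commutativity: statics to the left
    case (Const(x), Plus(Const(y), d)) => plus(Const(x + y), d) // extra re-association step
    case _                             => Plus(a, b)
  }

  println(plus(plus(Const(1), Sym("a")), Const(2))) // Plus(Const(3),Sym(a)), i.e. 3 + a
}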
trait Base {
  type Rep[T]
}
trait BaseExp extends Base {
  type Rep[T] = Exp[T]
}

trait IntOps extends Base {
  def int_plus(a: Rep[Int], b: Rep[Int]): Rep[Int]
}

trait IntExp extends IntOps with BaseExp {
  // default IR generation
  def int_plus(a: Exp[Int], b: Exp[Int]): Exp[Int] =
    IntPlus(a, b)
}

trait IntExpOpt extends IntExp {
  // optimized IR generation
  override def int_plus(a: Exp[Int], b: Exp[Int]): Exp[Int] = (a, b) match {
    case (Constant(aStatic), Constant(bStatic)) =>
      Constant(aStatic + bStatic)
    case (Constant(0), bDynamic) => bDynamic
    case (aDynamic, Constant(0)) => aDynamic
    case (aDynamic, Constant(bStatic)) => int_plus(b, a)
    case (_, _) => super.int_plus(a, b)
  }
}

Figure 15. The generated online transformation by Alchemy for addition on Int.

trait OperatorOps extends Base {
  def operatorInit[A: Type](self: Rep[Operator[A]]): Rep[Unit]
}
trait ScanOpOps extends OperatorOps {
  def scanOpInit[A: Type](self: Rep[ScanOp[A]]): Rep[Unit]
}
trait OperatorExp extends OperatorOps with BaseExp {
  def operatorInit[A: Type](self: Exp[Operator[A]]) =
    OperatorInit(self)
}
trait ScanOpExp extends ScanOpOps with OperatorExp {
  // the default behavior of scanOp.init operation
  def scanOpInit[A: Type](self: Exp[ScanOp[A]]) =
    ScanOpInit(self)
}
trait ScanOpInline extends ScanOpExp {
  // the inlined behavior of scanOp.init operation
  override def scanOpInit[A: Type](self: Exp[ScanOp[A]]) =
    __whileDo(self.i < (self.table.length), {
      self.child.consume(self.table.apply(self.i))
      self.i = self.i + unit(1)
    })
  // handling of dynamic dispatch for operator.init
  override def operatorInit[A: Type](self: Exp[Operator[A]]) =
    self.tpe match {
      case ScanOpType(_) =>
        scanOpInit(self.asInstanceOf[Exp[ScanOp[A]]])
      // the rest of the operator types...
    }
}

Figure 16. The generated online transformations by Alchemy for the scan operator of the analytical query engine.

Generating Offline Transformations. The generated transformations are not limited to online transformations. Alchemy also generates offline transformation passes. Figure 17 shows the implementation of three different transformations for the Seq class,³ in plain Scala code. The first implementation uses a linked list for storing the elements of the sequence. The second implementation stores the elements in an array data structure.⁴ Finally, the third implementation uses a g_list data structure, provided by GLib. The transformation generated from this class can be used for using data structures provided by GLib in the generated C code.

³ By a Seq, we mean a collection where the order of its elements does not matter.
⁴ This implementation assumes that the number of the elements in the collection does not exceed MAX_BUCKETS. In cases where this assumption does not hold, one has to make the corresponding field mutable, and add an additional check while inserting an element.

These implementations can be used for debugging the correctness of the transformers. For using them in the DSL compiler, Alchemy generates offline transformations based on the SC API (cf. Figure 12).

@offline
@reflect[Seq[_]]
class SeqLinkedList[T] {
  var head: Cont[T] = null
  def +=(elem: T) =
    head = Cont(elem, head)
  def foreach(f: T => Unit) = {
    var current = head
    while (current != null) {
      f(current.elem)
      current = current.next
    }
  }
}

@offline
@reflect[Seq[_]]
class SeqArray[T: Manifest] {
  val array = new Array[T](MAX_BUCKETS)
  var size: Int = 0
  def +=(elem: T) = {
    array(size) = elem
    size += 1
  }
  def foreach(f: T => Unit) =
    for (i <- 0 until size) {
      val elem = array(i)
      f(elem)
    }
}

@offline
@reflect[Seq[_]]
class SeqGlib[T] {
  var gHead: Pointer[GList[T]] = null
  def +=(x: T) =
    gHead = g_list_append(gHead, &(x))
  def foreach(f: T => Unit) = {
    var current = gHead
    while (current != NULL) {
      f(*(current.data))
      current = g_list_next(current)
    }
  }
}

Figure 17. Different transformations for the Scala Seq class. The transformations are written using plain Scala code.
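The linked-list variant in Figure 17 refers to a Cont cell that the figure elides. A minimal self-contained version, with the Alchemy annotations dropped so that it runs as plain Scala (which is exactly how such implementations can be debugged), might look as follows; Cont is our own guess at the elided helper, and the class is renamed to avoid clashing with the figure.

case class Cont[T](elem: T, next: Cont[T])   // assumed shape of the elided cell

class PlainSeqLinkedList[T] {
  var head: Cont[T] = null
  def +=(elem: T): Unit =
    head = Cont(elem, head)
  def foreach(f: T => Unit): Unit = {
    var current = head
    while (current != null) {
      f(current.elem)
      current = current.next
    }
  }
}

object PlainSeqLinkedListDemo extends App {
  val s = new PlainSeqLinkedList[Int]
  s += 1; s += 2; s += 3
  s.foreach(println)  // prints 3, 2, 1: element order is not preserved, as footnote 3 allows
}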

class SeqArrayTransformer extends RuleBasedTransformer {
  rewrite += rule { case SeqNew[T]() =>
    val _maxSize = ("maxSize", true, unit(0))
    val _array = ("array", false, __newArray[T](MAX_BUCKETS))
    record[Seq[T]](_maxSize, _array)
  }
  rewrite += rule { case SetPlusEq[T](self, elem) =>
    self.array.update(self.maxSize, elem)
    self.maxSize_=(self.maxSize.+(unit(1)))
  }
  // Provides access to the fields of the
  // generated record for Seq
  implicit class SeqArrayOps[T](self: Rep[Seq[T]]) {
    def maxSize_=(x: Rep[Int]): Rep[Unit] =
      fieldSetter(self, "maxSize", x)
    def maxSize: Rep[Int] =
      fieldGetter[Int](self, "maxSize")
    def array: Rep[Array[T]] =
      field[Array[T]](self, "array")
  }
}

Figure 18. The generated offline transformations by Alchemy for Seq based on arrays.

Figure 18 shows the generated offline transformation for the implementation of the Seq data structure using an array. This transformation lowers the objects of a Seq into records with two fields: 1) the underlying array, and 2) the current size of the collection. The nodes corresponding to each method of this data structure are then rewritten to the IR nodes of the implementation body provided in the reflected type.

Many offline transformations require inspecting the generated IR nodes to check their applicability. In some of these cases, compiler experts can provide annotations to generate the required analysis passes. However, in many cases, the analysis requires more features than the ones provided by the existing annotations. Implementing such analysis passes can be facilitated by using quasi-quotations [26, 27, 34]. More details about the implementation of quasi-quotations and their usages are beyond the scope of this paper.

The aforementioned design provides several advantages over previous work. First, the Alchemy annotation processor uses Scala annotations. This means that there is no need to provide specific infrastructure for an external DSL, as opposed to the approach of Stratego/Spoofax [19]. Second, developers can annotate the source code with appropriate annotations, without the need to port it into another DSL, as opposed to the approach taken in Forge [38]. In other words, developers use the signatures of classes and methods as the meta-data needed for specifying the DSL constructs, whereas in a system like Forge the DSL developer must use Forge DSL constructs to specify the constructs of the DSL. Third, as we aim to give systems developers the ability to write their systems in plain Scala code, we designed Alchemy so that developers can place the annotations on the systems code itself, whereas an approach like Truffle [15] focuses on self-optimizing AST interpreters. Thus, the latter annotates the AST nodes of the language itself.

5.5 Productivity Evaluation

We use Alchemy to automatically generate the compiler interface for a subset of the standard Scala library and two database engines: 1) an analytical query engine [35, 36], and 2) a transactional query engine [6]. Table 1 compares the number of LoCs⁵ of the library classes with the generated compiler interfaces. We make the following observations.

⁵ We used CLOC [5] to compare the number of LoCs.

First, for the Scala standard library classes, the LoCs of the reflected classes are mentioned in the table. These classes provide the method signatures of their original classes and are annotated with appropriate effect and algebraic structure annotations (Section 5.3). However, in most cases, developers do not need to provide the implementation of the methods of these classes. As a result, the compiler interfaces of the Scala standard library classes can be generated with only tens of LoCs. The exception is the reflected classes responsible for generating offline transformations (e.g., Seq Transformation and HashMap Transformation), where the developer provides the implementation into which every method should be transformed.

Furthermore, observe that the Int class contains more LoCs than many other standard library classes. This is because each operation of this class encodes different combinations of arguments in its methods with other numeric classes (e.g., Int with Double, Int with Float, and so on). Furthermore, the generated compiler interface of this class is also longer than expected. This is because the generated compiler code contains the constant-folding optimization (Section 5.3), which is encoded by Alchemy annotations. In addition, for the query operators of the analytical query engine, the generated compiler interface encodes all online partial evaluation processes annotated using the @inline annotation. This results in the partial evaluation of mutable fields, function inlining, and virtual dispatch removal.

Type                          Library    Compiler
Analytical Query Engine
  Query Operators             541        3456
  Monadic Interface           156        407
  File Manager                254        291
  Aux. Classes                100        749
Transactional Query Engine
  In-Memory Storage           45         294
  Indexing Data-Structures    69         394
  Aux. Classes                58         364
Scala Library
  Boolean                     18         255
  Int                         85         970
  Seq                         39         334
  Seq Trans.                  176        329
  Array                       39         306
  ArrayBuffer                 52         453
  HashMap                     32         259
  HashMap Trans.              162        305
  C GLib                      181        729
  Other Classes               936        7007
Total                         2943       16902

Table 1. The comparison of LoCs of the (reflected) classes of the Scala standard library and a preliminary implementation of two query engines, together with the corresponding automatically generated compilation interface.

6 Conclusions

In this paper, we have presented Alchemy, a compiler generator for DSLs embedded in Scala. Alchemy automatically generates the boilerplate code necessary for building a DSL compiler using the infrastructure provided by existing extensible optimizing compilers. Furthermore, Alchemy provides an extensible set of annotations for encoding domain-specific knowledge in different optimizing compilers. Finally, we have shown how to extend the Alchemy annotations to generate the boilerplate code required for building two different query compilers on top of an extensible optimizing compiler.

References

[1] Emil Axelsson, Koen Claessen, Gergely Dévai, Zoltán Horváth, Karin Keijzer, Bo Lyckegård, Anders Persson, Mary Sheeran, Josef Svenningsson, and András Vajda. 2010. Feldspar: A domain specific language for digital signal processing algorithms. In IEEE/ACM International Conference on Formal Methods and Models for Codesign (MEMOCODE). IEEE, 169–178.
[2] Erwin Book, Dewey Val Shorre, and Steven J. Sherman. 1970. The CWIC/360 System, a Compiler for Writing and Implementing Compilers. SIGPLAN Notices 5, 6 (June 1970), 11–29.
[3] Jacques Carette, Oleg Kiselyov, and Chung-chieh Shan. 2009. Finally tagless, partially evaluated: Tagless staged interpreters for simpler typed languages. Journal of Functional Programming 19, 05 (2009), 509–543.
[4] Hassan Chafi, Zach DeVito, Adriaan Moors, Tiark Rompf, Arvind Sujeeth, Pat Hanrahan, Martin Odersky, and Kunle Olukotun. 2010. Language Virtualization for Heterogeneous Parallel Computing. SIGPLAN Notices 45, 10 (Oct. 2010), 835–847.
[5] Al Danial. 2018. cloc: Count lines of code. http://cloc.sourceforge.net/. (2018). Accessed: 2018-07-03.
[6] Mohammad Dashti, Sachin Basil John, Thierry Coppey, Amir Shaikhha, Vojin Jovanovic, and Christoph Koch. 2018. Compiling Database Application Programs. CoRR abs/1807.09887 (2018). arXiv:1807.09887 http://arxiv.org/abs/1807.09887
[7] Sebastian Erdweg, Tillmann Rendel, Christian Kästner, and Klaus Ostermann. 2011. SugarJ: Library-based Syntactic Language Extensibility. In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA'11). ACM, New York, NY, USA, 391–406.
[8] Sebastian Erdweg and Felix Rieger. 2013. A Framework for Extensible Languages. In Proceedings of the 12th International Conference on Generative Programming: Concepts & Experiences (GPCE'13). ACM, New York, NY, USA, 3–12.
[9] Sebastian Erdweg, Tijs van der Storm, Markus Völter, Meinte Boersma, Remi Bosman, William R. Cook, Albert Gerritsen, Angelo Hulshout, Steven Kelly, Alex Loh, et al. 2013. The state of the art in language workbenches. In International Conference on Software Language Engineering. Springer, 197–217.
[10] Rickard E. Faith, Lars S. Nyland, and Jan F. Prins. 1997. KHEPERA: A System for Rapid Implementation of Domain Specific Languages. In Proceedings of the 1997 Conference on Domain-Specific Languages (DSL'97), Vol. 97. USENIX Association, Berkeley, CA, USA, 19–19.
[11] Cormac Flanagan, Amr Sabry, Bruce F. Duba, and Matthias Felleisen. 1993. The Essence of Compiling with Continuations. In ACM SIGPLAN Notices, Vol. 28. ACM, 237–247.
[12] Martin Fowler. 2005. Language workbenches: The killer-app for domain specific languages. http://martinfowler.com/articles/languageWorkbench.html. (2005). Accessed: 2018-07-03.

[13] Christian Hofer, Klaus Ostermann, Tillmann Rendel, and Adriaan Moors. 2008. Polymorphic embedding of DSLs. In Proceedings of the 7th International Conference on Generative Programming and Component Engineering (GPCE'08). ACM, 137–148.
[14] Paul Hudak. 1996. Building Domain-specific Embedded Languages. ACM Comput. Surv. 28, 4es, Article 196 (Dec. 1996).
[15] Christian Humer, Christian Wimmer, Christian Wirth, Andreas Wöß, and Thomas Würthinger. 2014. A Domain-specific Language for Building Self-Optimizing AST Interpreters. In Proceedings of the 2014 International Conference on Generative Programming: Concepts and Experiences (GPCE 2014). ACM, New York, NY, USA, 123–132.
[16] JetBrains. 2018. Meta Programming System. http://www.jetbrains.com/mps. (2018). Accessed: 2018-07-03.
[17] Stephen C. Johnson et al. 1975. Yacc: Yet another compiler-compiler. Vol. 32. Bell Laboratories, Murray Hill, NJ.
[18] Vojin Jovanović, Amir Shaikhha, Sandro Stucki, Vladimir Nikolaev, Christoph Koch, and Martin Odersky. 2014. Yin-Yang: Concealing the Deep Embedding of DSLs. In Proceedings of the 2014 International Conference on Generative Programming: Concepts and Experiences (GPCE 2014). ACM, 73–82.
[19] Lennart C. L. Kats and Eelco Visser. 2010. The Spoofax language workbench: rules for declarative specification of languages and IDEs. In ACM SIGPLAN Notices, Vol. 45. ACM, 444–463.
[20] Ralf Lämmel and Simon Peyton Jones. 2005. Scrap Your Boilerplate with Class: Extensible Generic Functions. In Proceedings of the Tenth ACM SIGPLAN International Conference on Functional Programming (ICFP'05). ACM, New York, NY, USA, 204–215.
[21] HyoukJoong Lee, Kevin J. Brown, Arvind K. Sujeeth, Hassan Chafi, Tiark Rompf, Martin Odersky, and Kunle Olukotun. 2011. Implementing Domain-Specific Languages for Heterogeneous Parallel Computing. IEEE Micro 31, 5 (Sept. 2011), 42–53.
[22] Sorin Lerner, David Grove, and Craig Chambers. 2002. Composing dataflow analyses and transformations. In ACM SIGPLAN Notices, Vol. 37. ACM, 270–282.
[23] David H. Lorenz and Boaz Rosenan. 2011. Cedalion: A Language for Language Oriented Programming. In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA'11). ACM, New York, NY, USA, 733–752.
[24] Shayan Najd, Sam Lindley, Josef Svenningsson, and Philip Wadler. 2016. Everything Old is New Again: Quoted Domain-specific Languages. In Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM 2016). ACM, New York, NY, USA, 25–36.
[25] Georg Ofenbeck, Tiark Rompf, Alen Stojanov, Martin Odersky, and Markus Püschel. 2013. Spiral in Scala: Towards the Systematic Construction of Generators for Performance Libraries. In Proceedings of the 12th International Conference on Generative Programming: Concepts & Experiences (GPCE'13). ACM, New York, NY, USA, 125–134.
[26] Lionel Parreaux, Amir Shaikhha, and Christoph E. Koch. 2017. Quoted Staged Rewriting: A Practical Approach to Library-defined Optimizations. In Proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE 2017). ACM, New York, NY, USA, 131–145.
[27] Lionel Parreaux, Antoine Voizard, Amir Shaikhha, and Christoph E. Koch. 2017. Unifying Analytic and Statically-typed Quasiquotes. Proc. ACM Program. Lang. 2, POPL, Article 13 (Dec. 2017), 33 pages.
[28] Markus Püschel, José M. F. Moura, Jeremy R. Johnson, David Padua, Manuela M. Veloso, Bryan W. Singer, Jianxin Xiong, Franz Franchetti, Aca Gacic, Yevgen Voronenko, Kang Chen, Robert W. Johnson, and Nicholas Rizzolo. 2005. SPIRAL: Code generation for DSP transforms. Proc. IEEE 93, 2 (2005), 232–275.
[29] Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. 2013. Halide: A Language and Compiler for Optimizing Parallelism, Locality, and Recomputation in Image Processing Pipelines. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'13). ACM, New York, NY, USA, 519–530.
[30] Tiark Rompf and Martin Odersky. 2010. Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs. In Generative Programming and Component Engineering (GPCE'10). ACM, New York, NY, USA, 127–136.
[31] Tiark Rompf, Arvind K. Sujeeth, Nada Amin, Kevin J. Brown, Vojin Jovanovic, HyoukJoong Lee, Manohar Jonnalagedda, Kunle Olukotun, and Martin Odersky. 2013. Optimizing Data Structures in High-level Programs: New Directions for Extensible Compilers Based on Staging. In Proceedings of the 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'13). ACM, New York, NY, USA, 497–510.
[32] John Rose. 2014. Value Types and Struct Tearing. (2014). https://blogs.oracle.com/jrose/entry/value_types_and_struct_tearing.
[33] Maximilian Scherr and Shigeru Chiba. 2015. Almost First-class Language Embedding: Taming Staged Embedded DSLs. In Proceedings of the 2015 ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE 2015). ACM, New York, NY, USA, 21–30.
[34] Denys Shabalin, Eugene Burmako, and Martin Odersky. 2013. Quasiquotes for Scala. Technical Report. EPFL.
[35] Amir Shaikhha, Yannis Klonatos, and Christoph Koch. 2018. Building Efficient Query Engines in a High-Level Language. ACM Transactions on Database Systems 43, 1, Article 4 (April 2018), 45 pages.
[36] Amir Shaikhha, Yannis Klonatos, Lionel Parreaux, Lewis Brown, Mohammad Dashti, and Christoph Koch. 2016. How to Architect a Query Compiler. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD'16). ACM, New York, NY, USA, 1907–1922.
[37] Anthony M. Sloane. 2011. Lightweight Language Processing in Kiama. In Generative and Transformational Techniques in Software Engineering III. Lecture Notes in Computer Science, Vol. 6491. 408–425.
[38] Arvind K. Sujeeth, Austin Gibbons, Kevin J. Brown, HyoukJoong Lee, Tiark Rompf, Martin Odersky, and Kunle Olukotun. 2013. Forge: Generating a High Performance DSL Implementation from a Declarative Specification. In GPCE'13. ACM, New York, NY, USA, 145–154.
[39] Eelco Visser. 2002. Meta-programming with Concrete Object Syntax. In Proceedings of the 1st ACM SIGPLAN/SIGSOFT Conference on Generative Programming and Component Engineering (GPCE'02). Springer-Verlag, London, UK, 299–315.
[40] Martin P. Ward. 1994. Language-oriented programming. Software - Concepts and Tools 15, 4 (1994), 147–161.
[41] Thomas Würthinger. 2011. Extending the Graal Compiler to Optimize Libraries. In Companion to the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications. ACM, 41–42.
[42] Jeremy Yallop. 2017. Staged Generic Programming. Proceedings of the ACM on Programming Languages 1, ICFP (2017), Article 29.
