
First-Class Support for Resugaring in Rascal

Master’s Thesis

Wouter Nederhof

Supervisor: dr. Tijs van der Storm Centrum voor Wiskunde en Informatica

Faculty of Science
University of Amsterdam
The Netherlands

January 5, 2016

Abstract

In this Master’s thesis, we aimed to investigate how capable resugaring techniques are for real-world use. As part of our research, we created a prototype based on a literature study and improved this artefact using observations from different case studies.

We started our research by studying two papers by J. Pombrio and S. Krishnamurthi on resugaring to build our initial prototype[1][2]. During our research, we observed that the techniques described in “Lifting Evaluation Sequences through Syntactic Sugar” were neither efficient nor expressive enough for our case studies. The techniques described in the paper “Hygienic resugaring of compositional desugaring” performed sufficiently well, but were too constrained for resugaring terms.

Following the literature study, we integrated the techniques from these papers into Rascal. During the implementation, we met two notable problems. First, we observed that patterns containing ellipses can break symmetry. As a solution to this problem, we devised a technique in which the lengths of ellipses are fixed after a transformation. Second, we had to work around a constraint in the techniques of the latter paper, which required transformation functions to consume and produce patterns. We found this to be a major restriction, since Rascal did not have a pattern data type. This meant that we had to devise new techniques to obtain a similar level of expressivity.

Although our prototype was based on techniques by J. Pombrio and S. Krishnamurthi, adapting and extending their techniques to Rascal ultimately led to a significantly different design. In order to support our design decisions, we sketched how we could formally address most of them.

Finally, we demonstrated that our finished prototype was capable of desugaring and resugaring multiple different cases. Some of these cases could not be addressed using the original techniques and required a change in our design. We showed that our prototype was able to desugar and resugar different categories of syntactic sugar in ES6. Additionally, we were able to build an evaluation stepper that desugars a functional programming language into the Lambda Calculus, steps through the code and resugars intermediate results, including Church numerals. Lastly, we demonstrated that our prototype is capable of desugaring and resugaring terms of over 50,000 characters in size in just over a second, and showed how the different techniques perform using a performance benchmark.

Contents

1 Introduction
1.1 Context
1.2 Motivation
1.3 Research Questions
1.4 Research Method
1.5 Contributions
1.6 Related Work
1.7 Outline

2 Background
2.1 Concrete and Abstract Syntax
2.2 Pattern Matching and Substitution
2.3 Rascal
2.3.1 Static Typing
2.3.2 Syntax Declaration and ADTs
2.3.3 Functions
2.3.4 Annotations
2.4 Lambda Calculus

3 Desugaring and Resugaring
3.1 Desugaring
3.2 Benefits of Desugaring
3.3 The Need for Resugaring

4 Resugaring Techniques
4.1 Resugaring: Lifting Evaluation Sequences through Syntactic Sugar
4.2 Hygienic resugaring of compositional desugaring

5 Resugaring in Rascal
5.1 Desugaring and Resugaring
5.2 Sugar Function Declarations
5.2.1 Intermediate Sugar Function Declarations
5.2.2 Compositional Sugar Function Declarations
5.2.3 Custom Sugar Function Declarations
5.3 Fixing the Lengths of Ellipses
5.4 When-conditions
5.5 Resugaring Fallback Mechanism
5.6 Break-out functions
5.7 Usage Examples
5.7.1 Basic Usage
5.7.2 Intermediate Versus Compositional Desugaring
5.7.3 Desugaring Using Different Types
5.7.4 When-Conditions

6 Implementation and Observations
6.1 Initial Design and Implementation
6.1.1 Evaluation
6.1.2 Retrospect
6.2 Compositional Desugaring
6.2.1 Compositional Desugaring Transformation
6.2.2 Limitations
6.2.3 Desugaring Intermediate Core Terms
6.3 Matching Ambiguity
6.4 Sugar Contexts
6.5 Beyond Substitution-Based Desugaring
6.6 Fallback Resugaring Functions
6.7 Annotations
6.8 Node Identity

7 Formal Justification Sketch
7.1 Properties
7.2 Outline
7.3 Formal Definition
7.4 Emulation
7.5 Abstraction
7.6 Coverage

8 Evaluation
8.1 Expressiveness Evaluation: Fun to Lambda Calculus
8.1.1 Syntax Definition
8.1.2 Church Numerals
8.1.3 Matching-And-Substitution-Based Transformations
8.1.4 Evaluator
8.1.5 Usage Examples
8.1.6 Conclusion
8.2 Expressiveness Evaluation: ES6 to ES5
8.2.1 Arrows to Functions
8.2.2 Binary and Octal Numbers to Decimal Numbers
8.2.3 Rest and Keyword Parameters
8.2.4 Conclusion
8.3 Performance Benchmark
8.3.1 Setup
8.3.2 Results and Conclusion
8.4 Performance Evaluation Case Study
8.4.1 Banana Algebra Fun Language
8.4.2 Setup
8.4.3 Results and Conclusion

9 Discussion
9.1 Research Questions
9.1.1 Efficiency
9.1.2 Expressivity
9.1.3 Integration
9.2 Prototype Limitations
9.2.1 Layouts
9.2.2 Fixing Ellipses’ Lengths
9.2.3 When-Conditions
9.2.4 Typing-Related Resugaring Failure
9.2.5 Layering Syntactic Sugar
9.3 Recommendations
9.4 Threats to Validity
9.4.1 Performance
9.4.2 Formal Justification
9.4.3 Expressiveness

10 Conclusion

11 References

A Prototype Implementation
A.1 Altered Files
A.1.1 Operations on patterns and terms
A.1.2 Environments
A.1.3 Desugaring and Resugaring
A.1.4 Bootstrapping and Syntax
A.1.5 Semantics
A.2 Installation Instructions

B Unit Tests
B.1 Emulation
B.2 Fixing Ellipses Lengths
B.3 Node Identity
B.4 Output Type
B.5 When Conditions
B.6 Automatic Test Generation
B.7 Duplicates

C Banana Algebra Transpiler

D Case Studies
D.1 ES6 to ES5 Resugaring
D.1.1 Arrow Functions
D.1.2 Literals
D.1.3 Parameters
D.2 Performance Benchmark
D.3 Banana Fun Language Performance Benchmark
D.3.1 Syntax
D.4 Fun to Lambda Calculus
D.4.1 Syntax
D.4.2 Sugar
D.4.3 CustomSugar
D.4.4 LambdaData
D.4.5 Evaluator

Chapter 1

Introduction

Krishnamurthi makes a call to arms for more research on the topic of desugaring, an essential tool to reduce the size of programming language implementations[3]. What is desugaring, and why is it important that desugaring gains more attention?

Desugaring is the process of eliminating “syntactic sugar”. Syntactic sugar is syntax that is used to make a programming language easier for humans to read and write without increasing the language’s functionality[4]. Desugaring transforms constructs containing syntactic sugar into semantically equivalent constructs fundamental to a language processor. The main advantage of this technique is that a language processor can remain relatively small, because it only needs to contain semantics for the desugared code. However, since the original program is transformed into another representation, a language processor may unintentionally produce output that is foreign to what the user has typed.

This problem becomes apparent during debugging. When a user debugs a program, the user is often interested in certain aspects of the state of that program at specific moments during its execution. In a functional programming language, for example, the user may be interested in the reduction steps taken by the evaluator. If the initial program was desugared, then these reduction steps may look very different from what the user has typed. This makes it harder for the user to debug a program after it is desugared.

To address this problem, J. Pombrio and S. Krishnamurthi introduced a technique called “resugaring”[2][1]. Resugaring is the act of reconstructing the surface level representation (what the user typed) from the core level representation (what the language processor uses). This way, desugared terms can be represented in the language of the user.

In this Master’s thesis, we investigate whether the resugaring techniques found in the literature are sufficiently capable of handling real-world problems, and what the requirements and obstacles are for integrating practical support for resugaring into a readily existing meta-programming language.

1.1 Context

The scope of our research is limited to the resugaring techniques found in two papers by J. Pombrio and S. Krishnamurthi, as we are unaware of other formalizations. For addressing real-world problems, we focus on expressiveness, performance and language-related challenges. We do not address hygiene1[5] in full detail. Furthermore, we restrict our research to the use of resugaring techniques within a programming language, as altering an existing language processor to produce trackable2 desugared terms is already discussed[1]. We use Rascal as our subject of study to research the challenges associated with implementing resugaring into a readily existing programming language.

1.2 Motivation

We believe that resugaring is a promising technique to address the problem that desugared terms are often foreign to the user (we discuss other techniques in the Related Work). It is also a novel technique, as it was first formally addressed in 2014[1]. We study resugaring techniques for real-world applications because the papers discussing these techniques do not provide much evidence to show that their techniques are sufficiently capable of addressing practical real-world problems. Instead, they are more theoretical in nature.

Furthermore, we study how resugaring can be integrated into a readily existing language because we are unaware of any programming language that integrates language-level support for resugaring. Although both papers by J. Pombrio and S. Krishnamurthi present a prototype of their resugaring techniques, both are essentially very small domain-specific languages instead of a fundamental part of a larger programming language. We believe that integrating resugaring into a larger meta-programming environment opens up many doors, because it allows these techniques to interplay with other tools for constructing programming languages.

This makes Rascal a perfect fit for our study. Rascal is a mature meta-programming workbench that attempts to integrate all the tools necessary for analyzing and constructing programming languages[6][7][8]. By using Rascal as our subject of study, we uncover the practical challenges related to resugaring instead of just the theoretical ones.

1.3 Research Questions

Central to our study are the following research questions:
• Are the resugaring techniques found throughout literature expressive enough for practical applications?
• Are the resugaring techniques found throughout literature efficient enough for practical applications?

1 Preventing the accidental capture of variables or other identifiers.
2 By trackable we mean that evaluated terms contain relevant information about their origin.

• What are the challenges related to integrating expressive and efficient support for resugaring into a readily existing meta-programming language?

1.4 Research Method

Our approach to studying the practical challenges related to resugaring was to build a prototype based on a literature study. We tested our prototype against different use cases to determine whether it met our goals, and gradually improved it until it did. When our prototype was complete, we evaluated it using the cases we had used to develop it. Finally, we sketched a number of proofs to support our design. We based our answers to the research questions on our observations and the results of the evaluation.

We used the following case studies to build and evaluate our prototype:

Performance
1. Desugaring and resugaring a functional programming language to and from the Lambda Calculus, respectively
2. Performance benchmarking to measure the differences between the techniques we used

Expressiveness
1. Adding support for resugaring to three different cases of Rmonia’s ES6 to ES5 desugaring mechanisms
2. Construction of an evaluation stepper with support for resugaring for another functional programming language that desugars into the Lambda Calculus

1.5 Contributions

The main contributions of this thesis are:
• An analysis regarding the practical use of different desugaring and resugaring techniques
• An analysis of the engineering challenges regarding the integration of resugaring into a readily existing programming language
• A design for extending Rascal with first-class support for resugaring, supported by a proof sketch of its correctness
• A fully functional prototype for resugaring in Rascal

1.6 Related Work

Our work is based on two papers on resugaring by J. Pombrio and S. Krishnamurthi[1][2]. The basis of our work is to a large extent an evaluation and extension of their techniques in a practical context. We discuss their techniques in more detail in Chapter 4.

One of the key elements of resugaring is tracking the origin of terms. This is similar to origin tracking as described by A. van Deursen, et al.[9]. Origin tracking refers to relating a transformed term’s subterms to the respective subterms prior to the transformation. The paper by van Deursen, et al. presents a formalization of origin tracking for use in a term rewriting system during evaluation. However, while their work is focused on tracking origins during evaluation, resugaring is focused on the relation between evaluated terms and syntactic sugar.

Another technique to relate source code that originates from a transformation to its initial term is source mapping. Source maps are simply maps from locations in transformed source code to their respective locations prior to the transformation. Mozilla Firefox’s Javascript Debugger, for instance, allows the use of source maps to relate minified or transpiled Javascript code to its original source3. This is essentially a subset of the problem we study, because in their case, terms are only related to other terms, whereas resugaring attempts to reconstruct these terms.

In other cases, such as in the Lambda Calculus Evaluator from Michael I. Schwartzberg4, terms are simply translated to another representation without origin tracking. The benefit of this technique is that reconstruction is relatively simple and straightforward: simply parse the core term and return the semantically equivalent surface term. However, since terms are not tracked, there is no way of telling whether the output term reflects the input language of the user.

The techniques from J. Pombrio and S. Krishnamurthi are essentially well-behaved lenses. A lens is a bidirectional transformation used to create a “view” of an object that can be updated, thereby consistently updating the original representation as well. A lens l consists of two functions, get and put[10]:

$$l{\nearrow}(C) \subseteq A \qquad \text{(Get)}$$
$$l{\searrow}(A \times C) \subseteq C \qquad \text{(Put)}$$

A lens is called well-behaved when it adheres to the Lens Laws:

$$l{\searrow}(l{\nearrow}c,\ c) \sqsubseteq c \quad \text{for all } c \in C \qquad \text{(GetPut)}$$
$$l{\nearrow}(l{\searrow}(a, c)) \sqsubseteq a \quad \text{for all } (a, c) \in A \times C \qquad \text{(PutGet)}$$

Here, C is the source, A is the target (the view), and $f(a) \sqsubseteq b$ is defined as $f(a) = \bot \lor f(a) = b$ (where $\bot$ means that there is no valid value). Essentially, the first law states that whenever the unmodified view of a source c is put back into c, the result is c itself or is invalid. The second law states that whenever a value a is put into the view, getting the view from the result yields that same value a or is invalid.

Throughout this thesis we name different benefits of resugaring regarding debugging. Among the techniques that could potentially benefit the most from resugaring is tracing. Tracing is simply producing information about the state of a program at different moments in its execution. It is sometimes called “printf debugging”, named after the C function that is often used to trace. Essentially, printing an evaluation sequence is a form of tracing (which

3 See: https://developer.mozilla.org/en-US/docs/Tools/Debugger/How_to/Use_a_source_map
4 See: http://cs.au.dk/~mis/lambda.pdf

we use in one of our case studies). Similar to tracing is postmortem debugging[11], which is essentially debugging a program after it has crashed. As we will see in one of the examples in Chapter 3, postmortem debugging may also benefit from our techniques.

Resugaring is one of several research topics on desugaring. As we noticed during the evaluation of our prototype, desugaring may lead to an enormous overhead in terms of the amount of produced code. Krishnamurthi named two techniques to address this problem: “Shrinking output in a semantics-preserving way”[3] and “Shrinking output by altering semantics”[3]. The first technique is straightforward: simply replace bloated and/or inefficient terms with semantically equivalent, smaller and more performant terms. The second technique is similar to the first, but ignores some very specific edge cases such as the potential use of Javascript’s eval function and Java’s reflection mechanisms. The benefit of ignoring these cases is that there is much more room for deflating terms, but this comes at the price of possible semantic inconsistencies. These inconsistencies, however, may easily be avoided when the user is aware of these transformations. A case study by Junsong Li, et al. demonstrates this technique in practice[12].

1.7 Outline

This thesis has the following structure:
• In Chapter 2, we discuss concepts that are relevant for understanding this thesis.
• In Chapter 3, we explain desugaring and resugaring in more detail.
• In Chapter 4, we summarize the different resugaring techniques found in literature.
• In Chapter 5, we discuss the design of our prototype from a user perspective.
• In Chapter 6, we discuss how our prototype was built and what we observed during this process.
• In Chapter 7, we present a proof sketch of our design’s correctness.
• In Chapter 8, we proceed to evaluate our prototype.
• In Chapter 9, we discuss our findings and observations.
• Finally, we conclude this thesis in Chapter 10.

Chapter 2

Background

In this chapter, we discuss concepts that are relevant to understanding this thesis.

2.1 Concrete and Abstract Syntax

Throughout this thesis, we occasionally refer to concrete and abstract syntaxes. By concrete syntax we mean syntax containing information about the textual representation (e.g. whitespace and layout). By abstract syntax we mean syntax that does not contain this information. We follow Rascal’s syntax for denoting terms. We denote concrete terms by wrapping ‘ and ‘ around the textual representation for untyped terms and (Type)‘ and ‘ for typed terms. For abstract terms, we denote arrays using [ and ], nodes or function calls using n(p1, ..., pn) and maps using {key1 → value1, ..., keyn → valuen}.

2.2 Pattern Matching and Substitution

Every desugaring and resugaring technique discussed in this thesis is based on pattern matching and substitution. Pattern matching is the act of strictly matching a term against a pattern, producing a map of pattern variable names to the respective subterms. Substitution in this context is the act of replacing the variables in patterns by values, thereby producing a term.

Throughout this thesis, we use patterns that are similar to terms, but contain pattern variables that may be bound to respective (sub)terms after matching a term. We use untyped pattern variables that match arbitrary terms, typed pattern variables that match equally-typed terms and (typed) ellipses that match zero or more terms.

We follow Rascal’s syntax for denoting patterns. We denote concrete patterns using (Type)‘ and ‘, untyped pattern variables in abstract patterns using italics, and constants in abstract patterns using double quotes (""). We denote constants in concrete patterns using italics,

since we do not consider untyped pattern variables in concrete syntaxes. We use the asterisk token (*) to denote ellipses allowing zero or more terms, and the plus token (+) to denote ellipses allowing one or more terms. Typed variables in concrete patterns are written using ‘<Type name>‘ and in abstract patterns using Type name. Finally, we denote concrete ellipses with a delimiter as {Type "constant"} followed by either a + or a *.

We now provide some examples of patterns to illustrate how we denote patterns throughout this thesis and how they can be matched against terms.
• (Exp)‘<Exp e1> + <Exp e2>‘ matches the term ‘1 + 1‘ if ‘1‘ is of the type Exp, after which both e1 and e2 are bound to 1.
• (Exp)‘<Exp e1> + <Exp e2>‘ does not match the term ‘1 − 1‘, because the minus token is different from the plus token in the pattern.
• ["a", "a", v] is an abstract pattern containing an array of two constants "a" and a pattern variable v. This pattern matches the term ["a", "a", "a"], where v is bound to "a", but does not match the term ["a", "a", "a", "a"].
• ["a", v*, "a"] matches the term ["a", "a", "a"], where v is bound to ["a"], and also matches the term ["a", "a", "a", "a"], where v is bound to ["a", "a"].
• ‘function({Var ", "}+ var)‘ matches the term ‘function(arg)‘ (if arg is of the type Var), matches the term ‘function(arg1, arg2)‘, but does not match the term ‘function()‘.

Finally, we denote pattern matching using t/P, meaning that the term t is matched against a pattern P. We denote substitution as Pσ, in which the pattern variables in P are replaced by the values bound to them in the variable map σ (a variable map is a map from variable names to variable values).
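To make this notation concrete, the following minimal Python sketch shows one possible way to implement matching (t/P) and substitution (Pσ) for abstract patterns with pattern variables and ellipses. It is an illustration only and not part of the Rascal prototype; the names Var, Many, match and substitute are hypothetical, and types and concrete syntax are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Hypothetical untyped pattern variable: matches exactly one term."""
    name: str


@dataclass(frozen=True)
class Many:
    """Hypothetical ellipsis: min_len=0 models `*`, min_len=1 models `+`."""
    name: str
    min_len: int = 0


def match(term, pattern, env=None):
    """Match term against pattern (t/P); return a variable map or None."""
    env = dict(env or {})
    if isinstance(pattern, Var):
        if pattern.name in env and env[pattern.name] != term:
            return None  # the variable is already bound to a different term
        env[pattern.name] = term
        return env
    if isinstance(pattern, list):
        return match_list(term, pattern, env) if isinstance(term, list) else None
    return env if term == pattern else None  # constants must be equal


def match_list(terms, patterns, env):
    """Match a list of terms element by element, backtracking over ellipses."""
    if not patterns:
        return env if not terms else None
    head, rest = patterns[0], patterns[1:]
    if isinstance(head, Many):
        # Try every admissible length for the ellipsis, shortest first.
        for n in range(head.min_len, len(terms) + 1):
            if head.name in env and env[head.name] != terms[:n]:
                continue
            result = match_list(terms[n:], rest, {**env, head.name: terms[:n]})
            if result is not None:
                return result
        return None
    if not terms:
        return None
    env = match(terms[0], head, env)
    return None if env is None else match_list(terms[1:], rest, env)


def substitute(pattern, env):
    """Replace the pattern variables in pattern by their values in env (Pσ)."""
    if isinstance(pattern, Var):
        return env[pattern.name]
    if isinstance(pattern, list):
        out = []
        for p in pattern:
            if isinstance(p, Many):
                out.extend(env[p.name])  # an ellipsis is spliced into the list
            else:
                out.append(substitute(p, env))
        return out
    return pattern
```

With these definitions, match(["a", "a", "a"], ["a", Many("v"), "a"]) binds v to ["a"], mirroring the fourth example above, and substituting the resulting variable map back into the same pattern reproduces the original term.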

2.3 Rascal

Throughout this thesis we provide examples using Rascal, and explain which mechanisms we altered to support resugaring. The most important facets of Rascal we discuss are Rascal’s typing system, syntax declarations, function declarations and annotations. As such, we provide a brief overview of the topics we discuss throughout this thesis, and illustrate how they apply to Rascal using simple examples.

2.3.1 Static Typing

Rascal is designed to be statically typed. However, this is currently not enforced, and types are checked dynamically.

2.3.2 Syntax Declaration and ADTs

Rascal allows the user to define concrete syntax declarations and abstract datatypes (ADTs). Concrete syntaxes are defined using a syntax declaration. For example:

syntax Exp = plus: Num n1 "+" Num n2;

Lexical tokens are defined using a lexical declaration. For example:

lexical Num = [0-9]+;

These two simple definitions allow the user to write a term that adds two numbers, for example (Exp)‘10 + 2‘. Abstract data types are defined using the data declaration. For example:

data Exp = plus(Exp, Exp) | num(int n);

This abstract data type allows for multiple numbers to be added, e.g. plus(plus(num(1), num(1)), num(2)).

2.3.3 Functions

The simplest way in Rascal to define a function is using an expression function declaration. For example:

int add(int m, int n) = m + n;

We can also add so-called when-conditions to expression functions, which are conditions that have to be met prior to the function’s execution. For example:

int add(int m, int n) = m + n
    when m > 0, n > 0;

Another way to define a function is by using a statement block. For example:

int add(int m, int n) {
    if (m > 0 && n > 0) {
        return m + n;
    }
    fail;
}

This function is semantically equivalent to the function in the previous example. A function’s signature may contain patterns that have to be matched prior to calling a func- tion. For example (note the plus and num constructors in the function’s signature):

int add(plus(num(int m), num(int n))) {
    if (m > 0 && n > 0) {
        return m + n;
    }
    fail;
}

Here we use the previously defined ADT of plus. However, we can do the same for concrete patterns. For example:

// toInt is provided by Rascal's String module (import String;)
int add((Exp)`<Num n1> + <Num n2>`) {
    int m = toInt("<n1>");
    int n = toInt("<n2>");
    if (m > 0 && n > 0) {
        return m + n;
    }
    fail;
}

When-conditions may also contain matching conditions. For example:

int add(Exp p) = m + n
    when plus(num(int m), num(int n)) := p;

Here, := is the matching operator, which returns true if a match is successful and false otherwise (using the pattern on the left-hand side and an expression producing a term on the right-hand side). If a match is successful, it binds the variables found through matching in the environment.

2.3.4 Annotations

Annotations are essentially “transparent” datatypes that can be attached to a term. By transparent we mean that programs in Rascal are oblivious to annotations unless specifically targeted. Furthermore, annotations are completely ignored during pattern matching. Prior to annotating a term, the annotation attached to a type needs to be declared using the anno keyword. For example:

data Exp = plus(Exp, Exp) | num(int n);
anno str Exp @ label;

We can now attach the label annotation to terms of the type Exp. Annotations can be set on a value using variable[@key = value], and can be retrieved from a term using variable@key. For example:

Exp p = plus(num(1), num(1));
p = p[@label = "1 + 1"];
println(p@label);

This example prints the string "1 + 1". Note that we have to assign the annotated value back to p, because all data in Rascal is immutable and setting an annotation produces a new value.

2.4 Lambda Calculus

Throughout this thesis, we provide examples in terms of the Lambda Calculus. Furthermore, we evaluate our design using two artefacts that desugar and resugar into the Lambda Calculus.

The Lambda Calculus is a calculus introduced by Alonzo Church in the 1930s. Using the Lambda Calculus, he proved in 1936 that there is no general solution for the Entscheidungsproblem (the decision problem)[13][14]1.

The Lambda Calculus consists of expressions that are composed of Lambda functions, applications and variables. Lambda functions consist of a number of parameters and a Lambda expression, and are written as ‘λp1...pn.E‘, where E is the function’s internal expression and p1...pn are its parameters (which are variables).

Applications are written as ‘E1 E2‘, in which E1 is applied to the argument E2. An application of a Lambda function (i.e. a reducible expression or redex) means that every occurrence of the leftmost parameter in the Lambda function’s expression can be substituted by E2, after which that parameter is removed from the list of parameters. This process is called β-reduction. For example, the Lambda expression (λx.x)y becomes y after β-reduction, since every x is substituted by y. However, (λxy.x)x becomes λy.x, since (λxy.x)x is essentially a shorter way of writing (λx.λy.x)x and (λx.(λy.x))x. (Note that variables are always bound to their closest enclosing Lambda function.)

Note that if we naively β-reduce (λxy.x)y, we get λy.y, in which the substituted y is wrongly bound to the parameter y of the Lambda function. This is called variable capture. To avoid variable capture, the Lambda Calculus relies on capture-avoiding substitution, in which substitution may only occur if no variable capture can occur. To allow the reduction of a redex that cannot be reduced due to the constraints of capture-avoiding substitution, a process called α-renaming may be used to rename the parameters of enclosing Lambda functions that would capture a variable. In the example above, we can rename the parameter y to z, giving us (λxz.x)y, which can then safely be reduced to λz.y. Renaming a parameter requires that all variables bound to it in the function’s expression are renamed as well. For example, λx.x can be changed to the α-equivalent expression λy.y, where two expressions are called α-equivalent when they can be made equal using α-renaming.

Finally, note that the Lambda Calculus only contains variables, functions and applications, and as such does not contain numbers, booleans, nodes, etc. However, data can be encoded into the Lambda Calculus using Church encoding. For example, whole numbers can be represented using “Church numerals”, in which λfx.f^n x (f applied n times to x) represents the whole number n.
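The reduction rules and the Church numerals described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not code from the prototype: terms are encoded as tuples ("var", x), ("lam", x, body) and ("app", f, a), multi-parameter functions are written in curried form, and all function names (free_vars, fresh, subst, beta, church, unchurch) as well as the fresh names z0, z1, ... are hypothetical.

```python
from itertools import count


def free_vars(t):
    """Return the set of free variable names of a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def fresh(avoid):
    """Return the first name z0, z1, ... that does not occur in avoid."""
    for i in count():
        name = f"z{i}"
        if name not in avoid:
            return name


def subst(t, x, s):
    """Capture-avoiding substitution: replace free occurrences of x in t by s."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    param, body = t[1], t[2]
    if param == x:
        return t  # x is shadowed by this parameter; nothing to replace
    if param in free_vars(s):
        # alpha-rename the parameter so that it cannot capture a variable of s
        new = fresh(free_vars(s) | free_vars(body) | {x})
        body = subst(body, param, ("var", new))
        param = new
    return ("lam", param, subst(body, x, s))


def beta(t):
    """Reduce a top-level redex (λx.body) arg; other terms are returned as-is."""
    if t[0] == "app" and t[1][0] == "lam":
        _, (_, x, body), arg = t
        return subst(body, x, arg)
    return t


def church(n):
    """Church numeral for n as a native Python closure: λf.λx.f^n x."""
    return lambda f: lambda x: x if n == 0 else f(church(n - 1)(f)(x))


def unchurch(c):
    """Decode a Church numeral by counting how often it applies its function."""
    return c(lambda k: k + 1)(0)
```

Reducing (λx.λy.x)y with beta renames the inner parameter before substituting, which gives a term of the shape λz0.y instead of the captured λy.y, and unchurch(church(3)) recovers the number 3.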

1 This was independently proven in the same year by Alan Turing using the so-called Turing Machine, which turned out to be the foundation of modern computers[15].

Chapter 3

Desugaring and Resugaring

In the Introduction, we explained why we need desugaring and resugaring. In this chapter, we address this matter in more detail using different examples. The goal of this chapter is to provide a better insight into the problem domain.

3.1 Desugaring

Desugaring is essentially the process of transforming a term in the surface level representation to a term in the core level representation. By surface level representation we mean the representation of semantics in terms of the language that the user typed. By core level representation we mean the representation of semantics used by a language processor. Syntactic sugar can be used to extend a language or to build a language on top of another language. For example, we can add syntactic sugar to Javascript to allow for an unless-statement.

unless (i == 0) {
    console.log("i is not 0");
}

If the Javascript processor does not contain semantics for processing unless-statements, then we can use desugaring to transform this term into a semantically equivalent term that our Javascript processor does understand.

if (!(i == 0)) {
    console.log("i is not 0");
}

Similarly, if our hypothetical Javascript evaluator is unable to process for-loops, then we could use desugaring to rewrite for-loops (shown first in the following block) into while-loops (shown second):

for (var i = 0; i < 5; i++) {
    doSomethingWith(i);
}

var i = 0;
while (i < 5) {
    doSomethingWith(i);
    i++;
}
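The two rewrites above can be sketched as a single bottom-up transformation over a small abstract syntax tree. The following Python sketch is an illustration under our own assumptions: statements are encoded as tuples whose first element is a hypothetical tag ("unless", "if", "not", "for", "while", "block", "call"), and conditions and simple statements are kept as plain strings. It also ignores details such as continue-statements, for which the for-to-while rewrite shown here would not preserve the semantics.

```python
def desugar(node):
    """Rewrite sugar nodes bottom-up; all other nodes are rebuilt unchanged."""
    if not isinstance(node, tuple):
        return node  # leaves (plain strings) are already core terms
    tag, *args = node
    args = [desugar(arg) for arg in args]  # desugar the children first
    if tag == "unless":
        cond, body = args
        return ("if", ("not", cond), body)
    if tag == "for":
        init, cond, update, body = args
        return ("block", init, ("while", cond, ("block", body, update)))
    return (tag, *args)


# Hypothetical encodings of the two surface programs shown above.
unless_stmt = ("unless", "i == 0", ("call", "console.log", '"i is not 0"'))
for_stmt = ("for", "var i = 0", "i < 5", "i++", ("call", "doSomethingWith", "i"))

core_unless = desugar(unless_stmt)
core_for = desugar(for_stmt)
```

Because the children are desugared before their parent, an unless-statement nested inside a for-loop is rewritten as well, so the evaluator only ever sees if-statements, while-loops and blocks.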

Less trivially, CoffeeScript may be considered syntactic sugar for Javascript. In the following example, we first see code written in CoffeeScript, followed by the semantically equivalent “transpiled”1 code.

class Car
  start: ->
    alert "Vroomm!!"

var Car;

Car = (function() {
  function Car() {}

  Car.prototype.start = function() {
    return alert("Vroomm!!");
  };

  return Car;

})();

// ---
// generated by coffee-script 1.9.2

3.2 Benefits of Desugaring

If we look back at the previous examples, we can notice a pattern. Whenever a language processor does not contain semantics for constructs that are expressible using other constructs, we can use desugaring to implement support for them. This means that desugaring can sometimes be used instead of semantics code, ultimately leading to higher maintainability and lower production costs[16]. Note that we claim that desugaring thus leads to less code than adding semantics to a language processor. Although this claim may not be true in every case, consider the following example: extending a programming language to support octal numbers. If we allow for a language to use octal numbers besides decimal numbers without utilizing desugaring, this means that we have to add semantics for every operation in a similar way as decimal numbers. As such, we have to implement semantics for addition, subtraction, multiplication, division, etc. However, if we would implement octal numbers using desugaring instead, we would only have to implement a term rewriting system[17] from octal numbers to decimal numbers. Now, consider an even simpler example. Say we want to add support for ‘T‘ and ‘F‘ to be used in a language next to ‘true‘ and ‘false‘. If we would implement semantics for these constructs without utilizing desugaring, we would have to add support for these constructs

1Transpilation is the process of transforming source code from one language into another.

16 for the and-, or-, xor-, conditional branching, and many other operations. However, if we would use desugaring instead, it could be as simple as implementing the transformations T → true and F → false.
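The T/F example can be made concrete with a minimal Python sketch (Python is used here only for illustration; it is not the Rascal prototype). It assumes a hypothetical term representation in which a literal is a string and a compound term is a tuple (operator, operand, ...); the names RULES and desugar are ours.

```python
# Hypothetical term representation: a literal is a string, a compound term
# is a tuple (operator, operand, ...).
RULES = {"T": "true", "F": "false"}  # the rewrite rules T -> true and F -> false


def desugar(term):
    """Rewrite every T/F literal; operators themselves are left untouched."""
    if isinstance(term, tuple):
        op, *args = term
        return (op, *[desugar(a) for a in args])
    return RULES.get(term, term)


print(desugar(("and", "T", ("or", "F", "true"))))
```

The existing semantics for and, or, and the other operations never see `T` or `F`, which is why no per-operation support is needed.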

3.3 The Need for Resugaring

Until now, we have only demonstrated what desugaring is and why it can be useful. We will now provide different scenarios to illustrate why resugaring is needed to utilize desugaring effectively.

When we extend a programming language using desugaring, the terms based on desugaring are transformed into another representation. For example, if we extend JavaScript with the `unless (condition)` statement, then this term can be transformed into `if (!(condition))`. As such, if we try to run the program:

    unless (x) {
      console.log("y");
    }

The interpreter will throw an error message in terms of the transformed programming language:

    /home/wouter/master-thesis/example-desu/test.js:1
    (function (exports, require, module, __filename, __dirname) { if (!(x)) {
                                                                        ^

    ReferenceError: x is not defined
        at Object.<anonymous> (/home/wouter/master-thesis/example-desu/test.js:1:69)
        at Module._compile (module.js:434:26)
        at Object.Module._extensions..js (module.js:452:10)
        at Module.load (module.js:355:32)
        at Function.Module._load (module.js:310:12)
        at Function.Module.runMain (module.js:475:10)
        at startup (node.js:117:18)
        at node.js:951:3

Notice that the error refers to `if(!(x))`, while this is not what the user has typed. Now, in this case, the user may simply mentally relate `if(!(x))` to `unless(x)`, as it is obvious where this term came from. Alternatively, we could use source maps (as discussed in the Related Work) to solve this problem.

However, consider the following example. We use Michael I. Schwartzbach's functional programming language to Lambda Calculus evaluator² to evaluate the term let x = 5 in if (true) x else 0 (the \ represents a Lambda function).

    echo "let x = 5 in if (true) x else 0" | java Lambda -compile | java Lambda -evaluate -trace
    (\x.(\xy.x)(\a.xa)(\a.(\fx.x)a))(\fx.f(f(f(f(fx)))))
    ->
    (\xy.x)(\a.(\fx.f(f(f(f(fx)))))a)(\a.(\fx.x)a)
    ->
    (\ya.(\fx.f(f(f(f(fx)))))a)(\a.(\fx.x)a)
    ->
    \a.(\fx.f(f(f(f(fx)))))a
    ->
    \ax.a(a(a(a(ax))))
    ->
    \ax.a(a(a(a(ax))))

² From: http://cs.au.dk/ mis/lambda.jar

In this case, we get a very different representation than what the user has typed. It is much more difficult for the user to mentally relate the output terms to the input than in the previous example. The user probably wanted to see the following evaluation sequence instead:

    let x = 5 in if (true) x else 0
    if (true) 5 else 0
    5

Note that in this case, we cannot use a source map to relate the terms in the evaluation sequence above to the input terms. The x in the original term is replaced by 5 in the second term of the evaluation sequence. Therefore, no subterm of let x = 5 in if (true) x else 0 is equivalent to if (true) 5 else 0. As such, we need a different solution to address this problem. Our options are:

1. Instead of using desugaring, implement semantics for each of the expressions that could not be evaluated;
2. Naively transform desugared terms to the surface-level representation;
3. Reconstruct the surface-level representation from the core-level representation using origin data.

Implementing semantics for the new constructs requires a lot of effort to implement and maintain, as we illustrated in the previous section.

As for the second option, let us illustrate why it will not work using a simple example: transforming decimal numbers into their Lambda Calculus encoding. If we simply transform terms of the form λfx.fⁿx to n, then this works just fine if a user provides the input `4`. However, if the user types `λfx.f(f(f(fx)))` (which is the equivalent desugared term of `4`), then the output still yields 4, which is again foreign to what the user has typed.

This leaves us with the third option: reconstructing the surface-level representation from the core-level representation using origin data. Note that this is resugaring. Using resugaring, we only need to specify the desugaring transformations under certain constraints (as we will demonstrate throughout this thesis). We do not need to add any semantics and we eliminate the problem that terms are foreign to what the user has typed.
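The problem with the second option can be reproduced with a small Python sketch (for illustration only; it is not the Rascal prototype). It assumes Church numerals are written as strings in the notation used above, with \ for λ; the function names church and naive_resugar are hypothetical.

```python
def church(n):
    """Desugar the decimal literal n into its Church-numeral string."""
    body = "x"
    for i in range(n):
        body = "fx" if i == 0 else "f(" + body + ")"
    return "\\fx." + body


def naive_resugar(term):
    """Option 2: map every Church numeral back to a decimal literal,
    regardless of where the term came from."""
    prefix = "\\fx."
    if term.startswith(prefix):
        n = term[len(prefix):].count("f")
        if church(n) == term:
            return str(n)
    return term


# The user typed `4`: resugaring gives back what was typed.
print(naive_resugar(church(4)))
# The user typed the Lambda term itself: the output is foreign to the input.
print(naive_resugar("\\fx.f(f(f(fx)))"))
```

Both calls yield the same surface term because the naive transformation has no origin data to distinguish the two inputs.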

Chapter 4

Resugaring Techniques

In this chapter, we provide a short summary of the two resugaring techniques by J. Pombrio and S. Krishnamurthi, as they form the basis for our research and prototype.

4.1 Resugaring: Lifting Evaluation Sequences through Syntactic Sugar

Central to J. Pombrio and S. Krishnamurthi's approach in their first paper on resugaring are three properties to which their techniques adhere:

1. Emulation: Each term in the generated surface evaluation sequence desugars into the core term which it is meant to represent. - [1]
2. Abstraction: Code introduced by desugaring is never revealed in the surface evaluation sequence, and code originating from the original input program is never hidden by resugaring. - [1]
3. Coverage: Resugaring is attempted on every core step, and as few core steps are skipped as possible. - [1]

The techniques described in this paper are based on pattern matching and substitution, and operate on a list of rules rs of the form P1 → P2. The essence of their desugaring technique is that each subterm of a term T (traversed top-down) is recursively replaced by its expansion until it can no longer be expanded. This is illustrated by Figure 4.1.

Expansion is the operation of trying to apply a rule in the transformation rule list rs to the provided term. If rs contains rules P1 → P2 for which P1 matches this term, the first such rule is selected and its P1 is matched with the term. This induces an environment σ in which the variables in P1 are bound to the corresponding subterms of the matched term. For example, if the term 5 + 2 is matched against the pattern v1 + v2, this induces the environment σ = {v1 → 5, v2 → 2}.
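Matching and expansion as described above can be sketched in Python as follows. This is a simplified illustration under our own assumptions, not the algorithm of [1]: terms are nested tuples, pattern variables are instances of a hypothetical Var class, and tags are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A pattern variable such as v1."""
    name: str


def match(pattern, term, env=None):
    """Return the environment sigma induced by the match, or None on failure."""
    env = {} if env is None else env
    if isinstance(pattern, Var):
        if pattern.name in env and env[pattern.name] != term:
            return None  # the same variable may not bind two different terms
        env[pattern.name] = term
        return env
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and len(pattern) == len(term)):
        for p, t in zip(pattern, term):
            if match(p, t, env) is None:
                return None
        return env
    return env if pattern == term else None


def substitute(pattern, env):
    """Apply sigma to a pattern, producing a term."""
    if isinstance(pattern, Var):
        return env[pattern.name]
    if isinstance(pattern, tuple):
        return tuple(substitute(p, env) for p in pattern)
    return pattern


def expand(rules, term):
    """Apply the first rule P1 -> P2 whose P1 matches; None if no rule applies."""
    for p1, p2 in rules:
        sigma = match(p1, term)
        if sigma is not None:
            return substitute(p2, sigma)
    return None


v1, v2 = Var("v1"), Var("v2")
print(match(("+", v1, v2), ("+", 5, 2)))  # the environment for 5 + 2
print(expand([(("+", v1, v2), ("plus", v1, v2))], ("+", 5, 2)))
```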

[Figure 4.1 shows six snapshots of the term tree, in order:
  1. f(g, h): desugaring f.
  2. f′(g, h): expanded f → f′, desugaring f′.
  3. f″(g, h): expanded f′ → f″, desugaring f″.
  4. f″(g, h): expansion of f″ failed, desugaring g.
  5. f″(g, h): expansion of g failed, desugaring h.
  6. f″(g, h): expansion of h failed, desugaring successful.]

Figure 4.1: Illustration of the traversal of desugaring f(g, h) with transformation rules rs = [f → f′, f′ → f″]. With "failed" we mean that a term cannot be expanded.

After matching a pattern P1 in rs to a term, the induced environment σ is applied to P2. This way, the variables in P2 are substituted by the terms bound in σ, thereby producing the desugared term. Expansion then returns a term in which every subterm is tagged by an indicator that this term is the result of desugaring (the so-called Body-tag). The root of this term is tagged (using the so-called Head-tag) with the rule index and the original term. By storing the rule index, the algorithm ensures that it eventually resugars using the right transformation rule.

The process of resugaring (see Figure 4.2) consists of traversing the input term bottom-up. Each term that can be replaced by its unexpanded form is unexpanded. Unexpansion, then, is the process of applying the rule rs_index (where index is stored in the Head-tag) to the term in reverse. The variable map found by matching the term with P2 is thus used for substituting the variables in P1 to produce the resugared term. In some cases, P2 contains a strict subset of the variables found in P1. In this case, variables that are present in P1 but not in P2 are found by matching the original term with P1, and adding the missing variables to the variable map used for resugaring. When the result does not contain any tags, it is considered to be successfully resugared, since no terms from the desugared term are present in the resugared term.

As such, the paper describes how an evaluation stepper can be obtained. The evaluation stepper desugars the original input term and, while this term can be reduced, attempts to resugar the reduced term. If resugaring of the reduced term is possible, the result is emitted.
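The role of the rule index stored in the Head-tag can be illustrated with a deliberately simplified Python sketch for the rules rs = [f → f′, f′ → f″]. It assumes that rules rewrite only the head symbol of a term, that a surface term is a pair (head, children), and that the Head-tags of a node are kept as a list of rule indices; Body-tags and the stored original term are left out. All names are hypothetical.

```python
RULES = [("f", "f'"), ("f'", "f''")]  # rs = [f -> f', f' -> f'']


def desugar(term):
    """Top-down: expand the head until no rule applies, recording each
    rule index as a (simplified) Head-tag, then desugar the children."""
    head, children = term
    tags = []
    while True:
        for index, (lhs, rhs) in enumerate(RULES):
            if head == lhs:
                head = rhs
                tags.append(index)
                break
        else:
            break  # expansion failed: no rule matches this head
    return (head, tags, [desugar(c) for c in children])


def resugar(core):
    """Bottom-up: resugar the children, then apply the tagged rules in reverse."""
    head, tags, children = core
    children = [resugar(c) for c in children]
    for index in reversed(tags):
        lhs, rhs = RULES[index]
        if head != rhs:
            raise ValueError("unexpansion failed")
        head = lhs
    return (head, children)


surface = ("f", [("g", []), ("h", [])])
core = desugar(surface)
print(core)
print(resugar(core))
```

Because each tag names the rule that produced the node, unexpansion never has to guess which rule to invert.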

[Figure 4.2 shows six snapshots of the term tree, in order:
  1. f″(g, h): resugaring h.
  2. f″(g, h): unexpansion of h failed, resugaring g.
  3. f″(g, h): unexpansion of g failed, resugaring f″.
  4. f′(g, h): f″ unexpanded to f′, resugaring f′.
  5. f(g, h): f′ unexpanded to f, resugaring f.
  6. f(g, h): unexpansion of f failed, resugaring successful.]

Figure 4.2: Illustration of the traversal of resugaring f″(g, h) with transformation rules rs = [f → f′, f′ → f″]. With "failed" we mean that a term cannot be unexpanded.

4.2 Hygienic resugaring of compositional desugaring

In the second paper of J. Pombrio and S. Krishnamurthi, "Hygienic resugaring of compositional desugaring", the approach to desugaring and resugaring terms is different. First of all, most of the properties on which this paper is based differ from the properties in the first paper. Most notably, the properties in the first paper are more general, whereas the properties in the second paper are more specific to the techniques themselves.

The first property (Emulation) is similar to the first property of the first paper, stating:

    Every surface term desugars to (a term isomorphic to) the core term it purports to represent. - [2]

Here, isomorphic refers to a morphism (essentially a structure-preserving map of objects to other objects as defined in category theory) that is invertible (meaning that it is possible to "undo" the transformation).

The second property (Abstraction) states:

    If a term is shown in the reconstructed surface evaluation sequence, then each non-atomic part of it originated from the original program and has honest tags. (Assuming that evaluation does not modify tags.) - [2]

Terms have honest tags if each tagged subterm can be unexpanded (i.e. unexpansion does not fail).

The third and fourth properties are more formal in nature. The third property (Hygiene) states that if two terms are α-equivalent, then they are also α-equivalent after desugaring and after resugaring. Finally, the fourth property (Coverage) is basically a formalization of the Coverage property found in the previous paper.

Another major difference is that this technique allows for desugaring using Turing-complete functions. These functions consume and produce a pattern. The consumed pattern C is the pattern matched during expansion, and the produced pattern C′ is the pattern representing the desugared term.

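[Figure 4.3 shows a tree of pattern matches. At the root, the surface term `(5 + 6) + 12` is matched and desugared to plus(plus(5, 6), 12). At the leaves, `(5 + 6)` is desugared to plus(5, 6) and `12` is desugared to 12.]

Figure 4.3: Illustration of desugaring using the techniques from "Hygienic resugaring of compositional desugaring".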

After each expansion, the consumed and produced patterns are stored in a tag (C → C′), and the pattern variables in the produced pattern are substituted by the desugaring of the respective subterm they match. Whenever a term cannot be expanded, the subterms of that term are desugared instead. As such, desugaring proceeds in a top-down fashion (see Figure 4.3).

Resugaring also proceeds in a top-down fashion (see Figure 4.4). Each term that is tagged with a tag C → C′ is unexpanded by matching the term against C′, resugaring the bound variables and substituting the result into C. When a term cannot be unexpanded, the subterms of that term are resugared instead.

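[Figure 4.4 shows the reverse tree. At the root, plus(x, y) := plus(plus(5, 6), 12) is resugared to `(5 + 6) + 12`. At the leaves, plus(x, y) := plus(5, 6) is resugared to `(5 + 6)` and int i := 12 is resugared to `12`.]

Figure 4.4: Illustration of resugaring using the techniques from "Hygienic resugaring of compositional desugaring".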

Furthermore, the techniques described in this paper are hygienic. By this we mean that when a term is desugared, the term is first resolved. This means that the abstract syntax tree (AST) representation is transformed into an abstract syntax directed acyclic graph (ASD) representation. An ASD is similar to an AST. However, in contrast to an AST, every variable unambiguously refers to the location where the variable is declared and every node in the ASD has identity. Since each node has identity, there is no "node capture" (accidentally matching the wrong node) during pattern matching.

During resugaring, the ASD "unresolves" into an AST representation. Whenever there is the possibility of variable capture, the algorithm renames the variable. Resolution and unresolution are based on Romeo's binding algebra[18] and a set of scoping rules. As hygiene is outside of the scope of this thesis, we do not go into depth about this topic.

Chapter 5

Resugaring in Rascal

In this chapter, we present our design for extending Rascal with support for resugaring. More information about the implementation can be found in Appendix A. The unit tests for our prototype are included in Appendix B.

5.1 Desugaring and Resugaring

We extended Rascal to allow for desugaring using the desugar-all command. This command takes a function name f and an expression e and traverses the term t found after interpreting e in a top-down fashion. Whenever a term is expandable using f, it applies f to this term and stops traversing the current path. As such, f then becomes responsible for desugaring the current term and all of its subterms.

Resugaring in Rascal works in a similar way. The resugar-all command traverses the term t found after interpreting e in a top-down fashion. Whenever a term is annotated with an unexpansion function, this function is called on the current term. Whenever this function fails, resugaring stops and the original input term t is returned. Whenever a term is successfully resugared, resugaring stops traversing its current path, again making the called function responsible for resugaring the current term and all of its subterms.

We used named functions for desugaring to allow for different syntactic sugar declarations throughout a program. For example, a typechecker may require a different set of transformations than a compiler. As such, a typechecker may call desugar-all with one sugar function, whereas a compiler may call it with another.
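The top-down traversal of desugar-all can be approximated by the following Python sketch. It is an illustration under our own assumptions rather than the Rascal implementation: terms are nested tuples (head, child, ...), a sugar function returns None when it cannot expand a term, and annotations are omitted. The function unless_sugar is a hypothetical example of a sugar function.

```python
def desugar_all(f, term):
    """Apply f where it expands a term; otherwise continue with the subterms."""
    expanded = f(term)
    if expanded is not None:
        return expanded  # stop here: f is responsible for the subterms as well
    if isinstance(term, tuple):
        head, *children = term
        return (head, *[desugar_all(f, c) for c in children])
    return term


def unless_sugar(term):
    """Expand unless(cond, body) into if(not(cond), body)."""
    if isinstance(term, tuple) and term[0] == "unless":
        _, cond, body = term
        # The sugar function itself decides how its subterms are desugared.
        return ("if", ("not", desugar_all(unless_sugar, cond)),
                desugar_all(unless_sugar, body))
    return None


program = ("block", ("unless", "x", ("unless", "y", "z")))
print(desugar_all(unless_sugar, program))
```

Passing the sugar function as an argument mirrors the idea that a typechecker and a compiler can each traverse the same term with their own set of transformations.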

5.2 Sugar Function Declarations

We allow users to define three different types of syntactic sugar transformations, depending on the required strategy:

1. Intermediate sugar function declarations;
2. Compositional sugar function declarations;
3. Custom sugar function declarations.

The first two types of sugar function declarations use a syntactical notation similar to that of Rascal's expression function declarations, as we will see in the following sections. The third type of sugar function declaration is simply a Rascal function.

5.2.1 Intermediate Sugar Function Declarations

Intermediate sugar functions are functions similar to J. Pombrio and S. Krishnamurthi's first technique, in which each term is repeatedly expanded. Intermediate sugar functions can be declared as follows¹:

    CorePatternType func(SurfaceExpPat) * ⇒ CoreExpPat;

Here, func is the name of the function and CorePatternType represents the type of the output of the desugaring transformation. SurfaceExpPat and CoreExpPat are patterns (which can be matched) that also act as expressions (which can be evaluated to a term). Note that this is the basic notation for declaring intermediate sugar functions. We discuss extensions of this notation later in this chapter.

When this function is called during desugaring, the current term is matched against SurfaceExpPat. If the term successfully matches this pattern, this induces an environment in which the variables of the pattern are bound to the respective subterms. The variables in CoreExpPat are then substituted by their values in the environment, creating the desugared (intermediate) core term². This term is annotated with a function that allows for inverting the transformation. Finally, the desugar-all command is called on the output term, which will try to expand the output term once more or proceed with the subterms if expansion is not possible.

5.2.2 Compositional Sugar Function Declarations

Compositional sugar functions transform the input term in a similar way as J. Pombrio and S. Krishnamurthi’s second technique. Compositional sugar functions can be declared as follows:

    CorePatternType func(SurfaceExpPat) ⇒ CoreExpPat;

Note that the only difference between the basic notation of this declaration and that of intermediate sugar function declarations is the absence of the asterisk before the ⇒-sign.

When compositional sugar functions are used, SurfaceExpPat is matched against the input term, after which the variables in SurfaceExpPat are bound in the environment. It then proceeds to desugar the variables in CoreExpPat using desugar-all. Once these variables are desugared, the variables in CoreExpPat are substituted by their desugared values, yielding a term. This term is finally annotated with a function that allows for inverting the transformation.

¹ Implementation detail: We also support different levels of module visibility, similar to expression functions.
² In some cases, CoreExpPat contains a subset of the variables used in SurfaceExpPat. When that happens, the original values of the variables in SurfaceExpPat are used to fill in these gaps.

5.2.3 Custom Sugar Function Declarations

In some cases, the previously defined sugar functions simply do not suffice. As such, it is important to allow users to tailor their own strategies. Therefore, we allow Rascal functions to be used for desugaring as well, provided that users are aware that they themselves are responsible for adhering to the properties central to resugaring. Nevertheless, we do prove in Chapter 7 that custom functions are capable of performing consistent and correct transformations.

Custom sugar functions should return a term annotated with an inverse function. The annotation for this function is @__resugarFunction. This function accepts a term as its first and only argument. For example:

    resugarable Exp;

    ...

    LambdaData sugar(original:(Exp)`a`) {
      // Identity of the desugared node, checked again during resugaring.
      str nodeId = arbitraryIdentifier();
      return (Exp)`b`[@__resugarFunction =
        (Exp) (desugared:(Exp)`b`) {
          if (desugared@__nodeId != nodeId) fail;
          return (Exp)`a` <<< original;
        }][@__nodeId = nodeId];
    }

This function accepts an expression containing `a` and returns `b`. When it is resugared, it accepts `b` and returns `a`. It is difficult to understand what is going on in this code by simply reading it, so let us take a closer look.

First of all, we use the "resugarable Exp" declaration. This declaration allows Exp to use the annotations that are used for desugaring and resugaring: __nodeId and __resugarFunction. Second, we manually set and check the values of the __nodeId annotations. This annotation is used during resugaring to check whether the identities of the (sub)terms match. Finally, we use the <<< operator. This is an operator we added to Rascal to match a term (on the right-hand side) against a pattern (on the left-hand side) and substitute the pattern's variables with variables in the environment. This way, all terms that are not bound to a pattern variable remain unchanged, thereby ensuring that no information is lost during resugaring.³
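The contract of a custom sugar function (return the desugared term together with an inverse function and a node identity) can be mirrored in a short Python sketch. This is only an analogy under our own assumptions: the Term class and its fields are hypothetical stand-ins for Rascal terms and annotations, and the <<< operator is not modelled.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Term:
    value: str
    node_id: Optional[str] = None  # stands in for the node identity annotation
    resugar_function: Optional[Callable[["Term"], "Term"]] = None


def sugar(original: Term) -> Optional[Term]:
    """Desugar `a` into `b` and attach the inverse transformation."""
    if original.value != "a":
        return None  # this sugar function does not apply
    node_id = str(uuid.uuid4())

    def inverse(desugared: Term) -> Term:
        # Only resugar the very node that this call produced.
        if desugared.value != "b" or desugared.node_id != node_id:
            raise ValueError("cannot resugar: identities do not match")
        return Term("a")

    return Term("b", node_id=node_id, resugar_function=inverse)


core = sugar(Term("a"))
print(core.value)                         # the desugared term
print(core.resugar_function(core).value)  # the resugared term
```

The identity check is what prevents an unrelated `b` that merely looks the same from being resugared into `a`.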

5.3 Fixing the Lengths of Ellipses

Recall that ellipses are variables consisting of zero or more terms. As we observed during the evaluation of resugaring ES5 to ES6, ellipses may cause ambiguity problems during resugaring (we discuss this problem in more detail in the following chapter). As such, we present a mechanism to constrain ellipses to a fixed length. This mechanism can be used by prepending @fixedLength{name} to a sugar function declaration. This fixes the length of the named ellipsis after the expansion of a term. For example:

    @fixedLength{bef}
    @fixedLength{rest}
    Function functionSugar((Function)`function (<{Param ","}* bef>, = , <{Param ","}* rest>) { }`)
      * ⇒ (Function)`function(<{Param ","}* bef>, , <{Param ","}* rest>) {
          '
          '
          '}`
      when Statement initBody := defaultParameter( pr, defVal, size((Params)`<{Param ","}* bef>`) );

This example comes from the ES6 to ES5 case study (see Chapter 8.2). In this case, the lengths of both ellipses bef and rest are fixed. Note, however, that (in this case) it is sufficient to fix the length of only one of the ellipses, because the length of the other ellipsis can then be derived from the fixed one.
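One way in which such an ambiguity can arise is sketched below in Python. The sketch assumes a core pattern of the shape [bef..., pr, rest...] that is matched against a plain parameter list during resugaring; the function splits and its bef_len parameter (modelling @fixedLength{bef}) are hypothetical and only serve the illustration.

```python
def splits(params, bef_len=None):
    """All ways to match the pattern [*bef, pr, *rest] against params.

    bef_len models a fixed ellipsis length: when given, only matches in
    which bef has exactly that many elements are allowed.
    """
    lengths = range(len(params)) if bef_len is None else [bef_len]
    return [(params[:i], params[i], params[i + 1:])
            for i in lengths if 0 <= i < len(params)]


params = ["a", "b", "c"]
for bef, pr, rest in splits(params):
    print(bef, pr, rest)          # three candidate matches: ambiguous
print(splits(params, bef_len=1))  # a single match once the length is fixed
```

Once the length of bef is known, the length of rest follows from the length of the parameter list, which is why fixing one ellipsis suffices here.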

5.4 When-conditions

The techniques described in J. Pombrio and S. Krishnamurthi's second paper[2] assume that we have full control over patterns. Unfortunately, this is not the case in Rascal. To overcome this limitation, we allow for so-called when-conditions, similar to when-conditions in Rascal's expression functions. This allows us to gain a similar level of expressiveness without breaking abstraction. We discuss this design decision in more detail in the following chapter.

when-conditions allow users to conditionalize a desugaring transformation. For example, we can define the transformation from a concrete addition to its abstract form as:

    ExpData func((Exp)` + `) ⇒ plus(ExpData e1, ExpData e2)
      when transformToAST;

In this case, we allow this function to be called if and only if transformToAST is set to true.

There are other interesting cases that we can use when-conditions for as well. For instance, we can use when-conditions to inject variables into the core pattern/expression using the matching operator:

    @ensureUnchanged{tmp}
    Exp func((Statement)`swap , `)
      ⇒ (Statement)`{ = e1; e1 = e2; e2 = ; }`
      when Exp tmp := uniqueName();

In this case, tmp in the statement on the right-hand side of the sugar function is set to a variable generated by the uniqueName() function. Note that we use the @ensureUnchanged-tag to ensure that tmp cannot be altered after desugaring (we discuss this requirement in more detail in Chapter 9.2.3).

³ Implementation detail: While Rascal also has a field-update operation in which a single field of a term can be updated, this operation is not able to traverse through a term's subterms. This makes it difficult to adjust complex terms. Furthermore, syntaxes containing terms that can be addressed using field-update must be labeled, requiring the user to alter their syntax definitions or look up these labels every time they need to perform a field-update. These practical issues do not apply to the substitution operator, since the user can simply specify the pattern containing the variables that need to be substituted. However, the substitution operator can be removed without any further consequence, by simply removing the substitution operator from src/org/rascalmpl/library/lang.rascal.syntax/Rascal.rsc and bootstrapping the syntax (i.e. regenerating the parser).

5.5 Resugaring Fallback Mechanism

Some terms cannot be resugared in the way the author intended. For example, if an interpreter accepts the term `2 + 2`, and desugars and executes this term using the Lambda Calculus, the interpreter may reduce this term to `λfx.f(f(f(fx)))`. Since the original term consisted of `2` and `+` tokens, this result cannot be resugared to `4`, as `4` cannot be expressed using terms in the input. Therefore, we added support for a Turing-complete fallback mechanism for handling such cases.

The mechanism is similar to exception handling mechanisms found in languages such as Java. In a sugar function declaration, the user may declare a throws-clause with the names of the "exceptions" it throws. These exceptions are automatically thrown whenever a term cannot be resugared. For example:

    Exp func((Exp)`plus(2, 2)`) throws MaybeInteger ⇒ ...

Now say we have a sugar function that desugars 4 into the Lambda encoding equivalent.

    Exp func((Exp)`4`) ⇒ (Lambda)`\fx.f(f(f(fx)))`

By simply adding "catch MaybeInteger", we allow transformations that catch the exception to process the term in reverse. As such, the following declaration is able to catch a term with a Lambda encoding semantically equivalent to 4, and resugar that term into `4`:

    Exp func((Exp)`4`) catch MaybeInteger
      ⇒ (Lambda)`\fx.f(f(f(fx)))`

Whenever this is insufficient, for example if we want arbitrary Lambda-encoded numbers to be resugared, we can also use custom fallback functions.

    Exp func("MaybeInteger", Lambda l) = convertLambdaToExp(l);

Here, the first argument is the name of the exception and the second argument is the input term. convertLambdaToExp represents the function to convert a Lambda encoding to the surface representation.
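The control flow of the fallback mechanism resembles the following Python sketch. It is an analogy under our own assumptions, not the Rascal implementation: ResugarFailure, FALLBACKS and resugar_with_fallback are hypothetical names, Church numerals are plain strings, and convert_lambda_to_exp is one possible reading of what convertLambdaToExp could do for well-formed numerals.

```python
class ResugarFailure(Exception):
    """Raised when a term cannot be resugared; carries the exception name."""

    def __init__(self, name, term):
        super().__init__(name)
        self.name = name
        self.term = term


def convert_lambda_to_exp(term):
    # Assumes a well-formed Church numeral such as \fx.f(fx): every f except
    # the one in the binder \fx. is one application.
    if not term.startswith("\\fx."):
        raise ValueError("not a Church numeral")
    return str(term.count("f") - 1)


# Custom fallback functions, selected by exception name.
FALLBACKS = {"MaybeInteger": convert_lambda_to_exp}


def resugar_plus(core):
    # Models a sugar function declared with `throws MaybeInteger`: the reduced
    # term no longer matches the core pattern, so resugaring fails.
    raise ResugarFailure("MaybeInteger", core)


def resugar_with_fallback(resugar, core):
    try:
        return resugar(core)
    except ResugarFailure as failure:
        handler = FALLBACKS.get(failure.name)
        if handler is None:
            raise
        return handler(failure.term)


print(resugar_with_fallback(resugar_plus, "\\fx.f(f(f(fx)))"))
```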

5.6 Break-out functions

Different sets of syntactic sugar transformations may be relevant to different parts of a program. Therefore, we allow sugar functions to "break out" to other functions.

Both intermediate and compositional sugar functions allow for break-out functions. For intermediate sugar functions, we can specify the break-out function as follows:

    CorePatternType func(SurfaceExpPat -> Breakout) * ⇒ CoreExpPat;

After an expansion is successfully performed using an intermediate sugar function with a break-out function, the function specified in the Breakout-argument is used for subsequent desugarings (relative to the expanded term). For example:

    Exp sugar((Exp)` + ` -> numberSugar) * ⇒ plus(e1, e2)

In this example, we use numberSugar to desugar the term found after expansion.

Break-out functions for compositional sugar functions are specified in a different way. Instead of specifying one function for subsequent desugarings, the user may specify a break-out function for each of the pattern variables that are desugared.

    CorePatternType func(SurfaceExpPat | v1 -> f1, v2 -> f2, ... ) ⇒ CoreExpPat;

Here, v1 and v2 refer to the respective pattern variables, and f1 and f2 refer to the respective functions. For example:

    Exp sugar((Exp)` + ` | e1 -> numberSugar1, e2 -> numberSugar2) ⇒ plus(e1, e2)

In this example, e1 and e2 are desugared using numberSugar1 and numberSugar2 respectively.

5.7 Usage Examples

Now that we have specified our design, we demonstrate how our prototype can be used in practice using simple examples. The purpose of this section is to gain some familiarity with the syntax of our prototype before we discuss its evaluation in later chapters. Furthermore, we begin to illustrate the differences between intermediate and compositional sugar functions, as it is important to understand why we use both. We discuss their differences in more detail in the following chapter.

5.7.1 Basic Usage

As we discussed earlier, the notation for sugar function declarations is similar to that of expression function declarations. The main difference is that, instead of using an "="-sign between the declaration and the output expression, we use either ⇒ or * ⇒ for a compositional or an intermediate sugar function declaration, respectively. For example, say we want to desugar i++ to i = i + 1. We can write this as the combination of a syntax definition and a compositional sugar function:

    syntax Num = [0-9]+;
    syntax Const = [a-zA-Z]+;

    syntax Exp = Const c "++"
               | Const c "=" Const c "+" Num n;

    Exp sugar((Exp)`++`) ⇒ (Exp)` = + 1`;

However, we can also use an intermediate sugar function instead, by replacing the last line with:

    Exp sugar((Exp)`++`) * ⇒ (Exp)` = + 1`;

Using either of these declarations, we can now desugar (Exp)`i++` using desugar-all:

    rascal> term = desugar-all

Which results in:

    Exp: `i = i + 1`

Finally, we can simply resugar this term using resugar-all:

    rascal> resugar-all

Which results in:

    Exp: `i++`

5.7.2 Intermediate Versus Compositional Desugaring

The desugaring strategies used by intermediate and compositional sugar functions are fundamentally different. To illustrate this difference, consider the following example:

    syntax Exp = "a" | "b" | "c";

Using compositional desugaring, if we define the transformations a → b and b → c using:

    Exp sugar((Exp)`a`) ⇒ (Exp)`b`;
    Exp sugar((Exp)`b`) ⇒ (Exp)`c`;

We get:

    rascal> desugar-all
    Exp: `b`

While if we use intermediate desugaring:

    Exp sugar((Exp)`a`) * ⇒ (Exp)`b`;
    Exp sugar((Exp)`b`) * ⇒ (Exp)`c`;

We get:

    rascal> desugar-all
    Exp: `c`

The difference in the output is caused by the fact that intermediate desugaring repeats the process of desugaring, whereas compositional desugaring simply proceeds to the variables in the pattern. As we will discuss later throughout this thesis, compositional desugaring is generally faster than intermediate desugaring. Therefore, it is sensible to use compositional desugaring whenever possible. In many cases, compositional sugar functions can be rewritten to yield the same output as intermediate sugar functions. For example, we can simply rewrite the rules above to:

    Exp sugar((Exp)`a`) ⇒ (Exp)`c`;
    Exp sugar((Exp)`b`) ⇒ (Exp)`c`;

The outcome of these transformations is equivalent to that of the intermediate sugar functions described above. In some cases, however, it is not possible to use compositional desugaring, as we will discuss in Chapter 6 and demonstrate in Chapter 8.2.
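The difference between the two strategies for the rules a → b and b → c can be replayed with a minimal Python sketch. It assumes atomic terms represented as strings and a rule table represented as a dictionary; the function names are hypothetical and annotations are omitted.

```python
RULES = {"a": "b", "b": "c"}            # a -> b and b -> c
REWRITTEN_RULES = {"a": "c", "b": "c"}  # the rewritten compositional rules


def desugar_compositional(term, rules=RULES):
    """Expand once; only pattern variables (none here) are desugared further."""
    return rules.get(term, term)


def desugar_intermediate(term, rules=RULES):
    """Expand, then try to expand the result again until no rule applies."""
    while term in rules:
        term = rules[term]
    return term


print(desugar_compositional("a"))                   # b
print(desugar_intermediate("a"))                    # c
print(desugar_compositional("a", REWRITTEN_RULES))  # c
```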

5.7.3 Desugaring Using Different Types

Another use case of compositional sugar functions is that we can desugar terms into terms of a different type. In fact, we can even desugar a concrete term into an abstract representation. For example, given the declarations:

    syntax HelloWho = "World" | "Rascal";
    syntax Exp = "Hello, " HelloWho g;

    data AHelloWho = aWorld() | aRascal();
    data AExp = aHello(AHelloWho who);

    AExp sugar((HelloWho)`Rascal!`) ⇒ aRascal();
    AExp sugar((HelloWho)`World!`) ⇒ aWorld();
    AExp sugar((Exp)`Hello, `) ⇒ aHello(AHelloWho h);

We can now desugar this into an abstract representation:

    rascal>desugar-all
    AExp: aHello(aWorld()[
      @__nodeId="0ac9a2a3-37de-4e6c-9024-53ff54d26aa2",
      @__resugarFunction=func(
        value(),
        [adt(
          "AHelloWho",
          [])],
        [],
        {},origin=|project://UsageExamples/src/Usage.rsc|(243,6,<10,33>,<10,39>))
    ])[
      @__nodeId="280e62ca-3b60-4f09-8a48-a4a15b096ea9",
      @__resugarFunction=func(
        value(),
        [adt(
          "AExp",
          [])],
        [],
        {},origin=|project://UsageExamples/src/Usage.rsc|(301,11,<11,48>,<11,59>))
    ]

Notice that all (sub)terms are annotated with the two annotations we described earlier. If we remove these annotations, we can see that the resulting transformation is equivalent to the output transformation in our definition:

    rascal>delAnnotationsRec(desugar-all)
    AExp: aHello(aWorld())

We can successfully resugar this term to the surface-level representation:

    rascal> resugar-all>
    Exp: `Hello, World!`

Unfortunately, however, this will not work using intermediate desugarings due to typing issues we discuss in Chapter 6.

5.7.4 When-Conditions

In the following example, we desugar ‘Hello, {Name ”,”}*! ‘ into an AST representation. This requires the concrete names to be translated into strings. We can do this using when- conditions, which are called before desugaring:

1 syntax Exp = "Hello," {Name ","}* names "!"; 2 data DExp = hello(list[str] names); 3 4 @ensureUnchanged{outNames} 5 DExp sugar((Exp)‘Hello,<{Name ","}* names>!‘) ⇒ hello(outNames) 6 when outNames := ["" | n ← names];

Here, we collect the concrete names and translate them into a list of strings using:

1 when outNames := ["" | n ← names];

As such, if we desugar (and remove the annotations of) the concrete expression: ‘Hello, World, Rascal!‘, we get:

1 rascal>delAnnotationsRec(desugar-all) 2 DExp: hello(["World","Rascal"])

Whereas resugaring gives us:

    rascal>resugar-all>
    Exp: ‘Hello, World, Rascal!‘

Note that the variables in the when-conditions need to be constant. For example, suppose we change "Rascal" in the AST representation using the substitution operator:

    rascal> s = "Readers";
    rascal>delAnnotationsRec(resugar-all>)
    DExp: hello(["World","Readers"])

The term is not resugared. Let us break down what we just executed:

1. We set s to "Readers";
2. We desugared the term (Exp)‘Hello, World, Rascal!‘ using desugar-all;
3. We then substituted the second element of the list in the desugared term by s using the substitution operator;
4. We then resugared this term (which failed and therefore yielded the original term);
5. Finally, we removed the annotations of the failed resugaring.

If we change s back to "Rascal", however, the term successfully resugars:

    rascal>s = "Rascal"
    rascal>resugar-all>
    Exp: ‘Hello, World, Rascal!‘

Chapter 6

Implementation and Observations

In this chapter, we explain what we observed during the construction of our prototype and what decisions and trade-offs we had to make for our design. The goal of this chapter is to provide a better understanding of why we made certain design decisions.

6.1 Initial Design and Implementation

Our initial prototype was based on the paper "Resugaring: Lifting Evaluation Sequences through Syntactic Sugar"[1]. We implemented its desugaring and resugaring techniques in Rascal using Rascal’s internal pattern matching algorithm. For substitution, we wrote our own algorithm that traverses a match result, i.e. a pattern matched against a term, and replaces every (sub)pattern that is a variable with the value bound to that variable in the environment.

We extended the syntax of Rascal to support the definition of resugaring transformation rules. Since Rascal did not have a pattern datatype available to the user, we used overloaded functions to represent transformation rules. Note that this meant that a user could add transformation rules that are incompatible with the three properties defined in the paper. However, this turned out to be quite useful, as we will illustrate later in this chapter.

For our initial implementation, we ignored Body-tags (tags indicating whether or not a term originates from desugaring), since a prototype without them was sufficient to evaluate whether or not this algorithm would be sufficiently expressive and efficient for our goals. We decided to use Rascal’s annotations to represent Head-tags (tags indicating how desugaring took place), since annotations are "transparent" datatypes. By this we mean that programs are generally oblivious to annotations unless they are specifically targeted, and techniques such as pattern matching ignore annotations completely.1

1 Implementation detail: we could not use "keyword parameters" instead of annotations, since concrete syntaxes use annotations and keyword parameters cannot be used in conjunction with annotations, which would mean that we could not use concrete syntaxes. Rascal is currently transitioning away from annotations entirely, but this problem is present in the current version. When this transition is done, it should be fairly easy to replace our use of annotations by keyword parameters.

6.1.1 Evaluation

To test our initial implementation, we ported the Fun programming language by Jacob Andersen and Claus Brabrand to Rascal. This language was among several case studies designed to demonstrate the capabilities of the Banana Algebra.

The Banana Algebra is an algebra that was designed to extend programming languages syntactically using constructive catamorphisms[19]. Central to the Banana Algebra is that different levels of abstraction can be composed to form a language that transforms into another representation.

As such, the Fun language was built by stacking different layers of syntactic sugar (abstractions) on top of each other. Since optimization of composed terms was not part of their design, this led to a programming language with a huge output transformation. For instance, the expression ‘1 + 1‘ desugared into a semantically equivalent Lambda Calculus application of over 6416 characters.

When we evaluated our prototype using this programming language, we found that just desugaring the expression ‘1 + 1‘ took more than 2 minutes. Although it may not be very common that desugaring leads to term expansions over 2138 times the size of the original term, larger programs with smaller expansions would obviously incur similar performance issues. Clearly, our implementation had not met the efficiency goal. Even worse: resugaring failed!

6.1.2 Retrospect

Performance

In an attempt to improve the performance, we searched our programs for performance bottlenecks, but found very few blocking ones. We then studied the techniques of the paper and found the bottleneck: for each subterm in the input term, an expansion is attempted; if the expansion succeeds, the original term is replaced by the desugared term, and desugaring is then applied again on the entire desugared tree. Thus, the problem seemed to be related to the number of pattern matches and/or the performance of pattern matching. As for the performance of pattern matching in Rascal, Rascal already had a mechanism in place to ignore patterns that would definitely not match specific terms. While it may have been possible to improve the performance, no easy gains seemed to be available here.

Resugaring Failure

We also found that the failure to resugar was related to typing. To illustrate what went wrong, consider the following example. Say we have a term T = (Exp)‘1+1‘. Now, say we have the following transformation rules, which transform addition and the number ‘1‘ to their respective Lambda Calculus representations:

1. (Exp)‘+‘ → (Lambda)‘(λmnfx.mf(nfx))()()‘
2. (Exp)‘1‘ → (Lambda)‘λfx.fx‘

If we desugar this term, we get the following (intermediate) results:

1. (Exp)‘1 + 1‘
2. (Lambda)‘(λmnfx.mf(nfx))(1)(1)‘
3. (Lambda)‘(λmnfx.mf(nfx))(λfx.fx)(1)‘
4. (Lambda)‘(λmnfx.mf(nfx))(λfx.fx)(λfx.fx)‘

Now, if we resugar this exact same term, we should get:

1. (Lambda)‘(λmnfx.mf(nfx))(λfx.fx)(λfx.fx)‘
2. (Lambda)‘(λmnfx.mf(nfx))(λfx.fx)(1)‘
3. (Lambda)‘(λmnfx.mf(nfx))(1)(1)‘
4. (Exp)‘1 + 1‘

However, the fourth step fails, since the right-hand side of the first rule matches:

(Lambda)‘(λmnfx.mf(nfx))()()‘

While the term is of the form:

(Lambda)‘(λmnfx.mf(nfx))()()‘

This situation was caused by the fact that no types were checked during the synthesis of the terms (note that we get a similar case in the second step of desugaring). Types are checked, however, during pattern matching. We identified two straightforward ways to solve this problem:

1. "Break" the pattern matching mechanism to accept terms during resugaring that have an invalid type; type-check after resugaring is done.
2. Add type checking to value synthesis.

The problem with the first solution was that we would get ambiguous match results. To illustrate this, say we have the pattern ‘{Type1 ","}+ ts1, {Type2 ","}+ ts2‘, which accepts one or more Type1 values followed by one or more Type2 values, separated by commas. Now say we match the term ‘1, 1, 2, 2‘ to this pattern during resugaring, where 1 is of Type1 and 2 is of Type2. When this term matches the pattern using our "broken" mechanism, this would simultaneously induce the environments:

• { ts1 = [(Type1)‘1‘, (Type1)‘1‘, (Type2)‘2‘], ts2 = [(Type2)‘2‘] }
• { ts1 = [(Type1)‘1‘, (Type1)‘1‘], ts2 = [(Type2)‘2‘, (Type2)‘2‘] }
• { ts1 = [(Type1)‘1‘], ts2 = [(Type1)‘1‘, (Type2)‘2‘, (Type2)‘2‘] }

which would thus be ambiguous.

This left us with the second solution. However, this solution would come at a very high price: we would be restricted to transformations for which the output type had to be equivalent to the input type. Furthermore, the second option would not fix the performance issues that seemed to be related to this technique. As such, this was not an option at all.
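The ambiguity of the first solution can be reproduced outside Rascal. The following Python sketch is only an illustration and not part of the prototype: the (type, value) tuple encoding and the function name `splits` are assumptions made for this example. It enumerates every way to split the term ‘1, 1, 2, 2‘ into two non-empty lists, with and without the type check.

```python
from typing import List, Tuple

Term = Tuple[str, str]  # (type name, value); a hypothetical encoding


def splits(terms: List[Term], check_types: bool) -> List[Tuple[List[Term], List[Term]]]:
    """Return every environment (ts1, ts2) for a pattern of the shape
    '{Type1 ","}+ ts1, {Type2 ","}+ ts2' over the given terms."""
    results = []
    for i in range(1, len(terms)):  # both lists must be non-empty
        ts1, ts2 = terms[:i], terms[i:]
        if check_types:
            well_typed = (all(ty == "Type1" for ty, _ in ts1)
                          and all(ty == "Type2" for ty, _ in ts2))
            if not well_typed:
                continue
        results.append((ts1, ts2))
    return results


term = [("Type1", "1"), ("Type1", "1"), ("Type2", "2"), ("Type2", "2")]
untyped = splits(term, check_types=False)  # the "broken" matcher
typed = splits(term, check_types=True)     # the regular, type-checking matcher
```

Under these assumptions, the untyped matcher yields the three environments listed above, whereas the type-checking matcher leaves only the middle one.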

6.2 Compositional Desugaring

To fix the problems we described earlier, we proceeded to implement the techniques described in the second paper on resugaring by J. Pombrio and S. Krishnamurthi[2]. The techniques described in this paper seemed promising, as they allowed for the use of Turing-complete expansion functions and because desugaring proceeded in a compositional fashion (which would induce a better time complexity). However, there was a problem that led us to use their techniques in a different way than originally prescribed: the Turing-complete expansion functions had to use a "pattern" datatype. Unfortunately, Rascal did not have a pattern datatype available to its users, which meant that we could not simply integrate their techniques.

We were essentially left with three options. First, we could break abstraction and integrate this datatype into Rascal (which was highly infeasible). Second, we could tailor our own pattern language and tools using abstract datatypes or syntax definitions (which would not integrate well with the rest of Rascal but would not break abstraction). Finally, we could simulate the pattern datatype using functions, in a similar way as in our initial implementation.

Parameters of functions allow for pattern matching in Rascal. As such, this allowed us to accept terms that adhere to a certain pattern. However, as the patterns declared in the parameters were constant, they could not be generated or transformed in the way terms can, which could be a serious limitation. In order to overcome this limitation, we had to integrate additional functionality in other ways, as we will illustrate later in this chapter. We chose this option as it seemed to be the most elegant and viable one.

6.2.1 Compositional Desugaring Transformation

We altered our design to support expansion and unexpansion through the use of Rascal functions, similar to how we integrated the previous techniques2. In contrast to the expansion and unexpansion mechanisms described in the papers, however, our functions would call desugaring and resugaring on their subterms themselves, instead of leaving this to the underlying algorithm. This way, we could integrate both (and other) techniques in the same environment in the form of individual transformations, generalize our desugaring and resugaring algorithms and eventually research their differences. Furthermore, we would gain the benefit of having a single interface for different techniques which could seamlessly interact with each other.

Using compositional desugaring, we believed that we could fix the issues related to the previous techniques. Our theory was that this technique could greatly reduce the number of pattern matches required to desugar and resugar terms and would allow for transformations in which the output has a different type than the input. As such, we re-evaluated the same program using compositional desugaring, and found that the term that took multiple minutes to desugar would desugar and successfully resugar in less than 150 milliseconds.

2 From here onwards, we refer to the first transformation technique as "intermediate desugaring" and the second transformation technique as "compositional desugaring".

6.2.2 Limitations

Since this technique desugared terms in a compositional fashion, this required that terms would immediately transform into their core-level representation. As such, we had to rewrite transformation rules to produce terms in the core language. For example:

    letrec a = b in c ⇒ let a = Y(b) in c
    let a = b in c ⇒ (λa.c) b
    Y(a) ⇒ (λxy.y(xxy))(λxy.y(xxy))a

Had to be rewritten to:

    letrec a = b in c ⇒ (λa.c) ((λxy.y(xxy))(λxy.y(xxy))b)
    let a = b in c ⇒ (λa.c) b
    Y(a) ⇒ (λxy.y(xxy))(λxy.y(xxy))a

Some terms, however, could not immediately be transformed into their core-level representation using a simple "generalized" compositional transformation rule. We found this to be the case for transformation rules containing ellipses, since this would require an unbounded number of transformation rules. To illustrate this, consider desugaring keyword parameters in a JavaScript-like language, a case similar to the keyword parameter conversion in M. Heimensen and T. van der Storm’s ES6 to ES5 transpiler[20]. For brevity, we simply denote pattern variables using a $-prefix:

    ‘function($x*, $y = $a, $z*) { $body }‘
    ⇒
    ‘function($x*, $y, $z*) { $y = $y == null ? $a:$y; $body }‘

For instance, when we would desugar:

    function(x = 1, y = 2, z = 3) {
    }

This would result in the intermediate3 term:

    function(x, y = 2, z = 3) {
      x = null(x) ? 1 :x;
    }

While what we wanted was:

    function(x, y, z) {
      x = null(x) ? 1 :x;
      y = null(y) ? 2 :y;
      z = null(z) ? 3 :z;
    }

3 By intermediate we mean that the term is not yet fully translated into the core-level representation; hence the name "intermediate desugaring", which we named after its purpose.
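The difference between a single expansion and a full desugaring can be sketched in a few lines of Python. This is a hedged illustration rather than the prototype's code: the (parameters, body) encoding and the names `expand_once` and `expand_fully` are assumptions made for this example, and the order of the generated statements depends on which parameter the sketch picks first, which is not meant to mirror the prototype.

```python
from typing import List, Optional, Tuple

Param = Tuple[str, Optional[str]]          # (name, default expression or None)
Function = Tuple[List[Param], List[str]]   # (parameters, body statements)


def expand_once(fn: Function) -> Optional[Function]:
    """Apply the rule once: strip the default of the first parameter that has one."""
    params, body = fn
    for i, (name, default) in enumerate(params):
        if default is not None:
            stmt = f"{name} = {name} == null ? {default} : {name};"
            new_params = params[:i] + [(name, None)] + params[i + 1:]
            return new_params, [stmt] + body
    return None  # the pattern does not match: no parameter has a default


def expand_fully(fn: Function) -> Function:
    """Re-apply the rule until it no longer matches."""
    while (nxt := expand_once(fn)) is not None:
        fn = nxt
    return fn


source: Function = ([("x", "1"), ("y", "2"), ("z", "3")], [])
one_step = expand_once(source)   # still contains defaults for y and z
core = expand_fully(source)      # no defaults left
```

A single application leaves the defaults of y and z in place, which corresponds to the intermediate term above; only repeated application reaches a term without defaults.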

6.2.3 Desugaring Intermediate Core Terms

We identified three different solutions for desugaring intermediate core terms using compositional desugaring:

1. After desugaring, repeat the desugaring process.
2. After desugaring, check if the term re-matches the original pattern. If this is the case, repeat desugaring.
3. Combine compositional desugaring with intermediate desugaring.

The problem with the first approach was that this would induce the same time complexity and typing problem as the original algorithm: when a term could not be desugared, every subterm would be desugared until it could not be desugared anymore. The problem with the second approach was that a term may have to be transformed in a different way after re-desugaring.

The third option, however, would induce a more favorable outcome. Since intermediate desugaring immediately re-desugars a term after an expansion, this meant that if this succeeding desugaring is then captured by a compositional desugaring, this would give us the advantage of desugaring without the performance and typing issues related to our original approach. To illustrate this, let us look back at our previous example, in which we tried to desugar:

    function($x*, $y = $a, $z*) { $body }
    ⇒
    function($x*, $y, $z*) { $y = $y == null ? $a:$y; $body }

Using both techniques, we can now translate this rule into two different rules. The first rule uses intermediate desugaring to ”normalize” the pattern (we denote intermediate desugarings using ⇒∗):

    function($x*, $y = $a, $z*) { $body }
    ⇒∗
    function($x*, $y, $z*) { $y = $y == null ? $a:$y; $body }

The second rule now uses compositional desugaring (denoted with ⇒) to quickly desugar the term and proceed with the core pattern’s variables. Note that this transformation is a so-called ”identity transformation”, and is used to quickly jump over the terms that do not require desugaring.4

    function($x*, $y, $z*) { $body }
    ⇒
    function($x*, $y, $z*) { $body }

4 Identity transformations are not a necessity: by default, an expansion is attempted on every subterm. However, compositional identity transformations may be used to increase performance: the larger the output term, the greater the performance increase.

6.3 Matching Ambiguity

Another problem we faced was that transformations with ellipses sometimes led to an incorrect resugaring. For example, desugaring and resugaring the term:

    ‘function(x, y = 2, z) { }‘

Would yield:

    ‘function(x = 2, y, z) { }‘

This problem was unrelated to the techniques we used for resugaring. Instead, this problem was caused by "matching ambiguity", since the ellipses were flattened after being transformed. In this context, "flattened" means that subterms originating from ellipses were transformed without preserving information about the original structure of the ellipses. As a result, the variables $x, $y and $z were ambiguously matched to different overlapping subterms:

• { x → [], y → x, z → [y, z], a → 2 }
• { x → [x], y → y, z → [z], a → 2 }
• { x → [x, y], y → z, z → [], a → 2 }

We identified four possible solutions to address this problem:

1. Optionally track the origin of non-ellipses in a pattern with more than one ellipsis;
2. Optionally track the origin of all terms in ellipses;
3. Optionally fix the lengths of ellipses after desugaring;
4. Restrict our implementation to allow only one ellipsis in a pattern.

As for the fourth option, it must be noted that we sought a generally correct solution. Although the first option could have been an appropriate solution for the ambiguity problem for converting ES6 to ES5 as described in Chapter 8.2, this solution will not work in the following example:

    x*, y* ⇒ x* y*

Here, the matching algorithm on the core representation will still fail, since there are only ellipses present. The second alternative would work in this case, but then it would not be possible to add a term to an ellipsis after desugaring, since this term would remain untracked.

We eventually chose the third option, in which we track the lengths of ellipses. This technique does not incur the problems that the other techniques suffer from. Furthermore, this solution does not require the core language to take variable tracking into account.
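The effect of fixing ellipsis lengths can be illustrated with a small Python sketch. It is not taken from the prototype: the function `match_params` and its `fixed` argument are hypothetical names, and the parameter list is reduced to plain strings. The sketch matches the core pattern function($x*, $y, $z*) against three parameters, first without and then with the lengths recorded at desugaring time.

```python
from typing import Dict, List, Optional, Tuple


def match_params(params: List[str],
                 fixed: Optional[Tuple[int, int]] = None) -> List[Dict[str, object]]:
    """Match 'function($x*, $y, $z*)' against a parameter list.

    `fixed` optionally holds the lengths of $x* and $z* recorded when the
    term was desugared (an assumption of this sketch)."""
    envs = []
    for i in range(len(params)):
        x, y, z = params[:i], params[i], params[i + 1:]
        if fixed is not None and (len(x), len(z)) != fixed:
            continue  # this split contradicts the recorded ellipsis lengths
        envs.append({"x": x, "y": y, "z": z})
    return envs


core_params = ["x", "y", "z"]
ambiguous = match_params(core_params)            # three candidate environments
unique = match_params(core_params, fixed=(1, 1))  # lengths of $x* and $z* were both 1
```

With the lengths fixed, only the environment that corresponds to the original surface term remains, which is the property the third option relies on.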

6.4 Sugar Contexts

The desugaring mechanisms described earlier do not account for different sugar contexts. By this we mean that different desugaring transformations may be appropriate for different situations. One instance in which we found this to be necessary was when we added support for resugaring to M. Heimensen’s and T. van der Storm’s ES6 to ES5 desugaring application[20], in which syntactic sugar of the form:

    function (...) { }

was transformed into:

    (function(_this,_arguments) {
      return function() {

      };
    })(this, undefined)

Since the terms _this and _arguments were used as the first two parameters of the function in the desugared term, the this and arguments occurrences in the body had to be replaced by _this and _arguments respectively. However, this and arguments occurrences in function bodies inside the body of this function no longer had to be replaced.

Since sugar transformations were named functions, this already meant that a single Rascal file could contain multiple desugaring mechanisms. Therefore, we came up with a simple solution for this problem, which involved allowing sugar functions to "break out" to other functions. This meant that for intermediate desugarings, the name of a function may be provided that is called after a single iteration of desugaring. Analogous to the respective desugaring strategy, for compositional desugarings this meant that we would allow the user to specify break-out functions for each of the core pattern’s variables.

By adding support for different "sugar contexts", we were able to solve the problem in the ES6 to ES5 case study by breaking out to another sugar function when a function’s body had to be desugared:

    Expression arrowSugar((Expression)‘() =\> { }‘ | body -> arrowSugarInner)
      ⇒ (Expression)‘(function(_this,_arguments) {
        ’ return function() {
        ’
        ’ };
        ’})(this, undefined)‘;

    // Break-out functions during body desugaring.
    Expression arrowSugarInner((Expression)‘this‘) ⇒ (Expression)‘_this‘;

    Expression arrowSugarInner((Expression)‘arguments‘) ⇒ (Expression)‘_arguments‘;

    Expression arrowSugarInner((Statement)‘‘ | f -> arrowSugar)
      ⇒ (Expression)‘‘;

    Expression arrowSugarInner((Expression)‘‘ | f -> arrowSugar)
      ⇒ (Expression)‘‘;
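The break-out mechanism can be mimicked with two mutually recursive functions. The Python sketch below is an illustration under simplifying assumptions: terms are nested tuples, the node names ("arrow", "function", "block", "id") are invented for this example, and arrow functions nested inside an arrow body are not covered. It is not the prototype's implementation.

```python
def arrow_sugar(term):
    """Outer context: desugar arrow functions; `this` and `arguments` stay as they are."""
    if not isinstance(term, tuple):
        return term
    if term[0] == "arrow":
        body = arrow_sugar_inner(term[1])  # break out to the inner context
        inner = ("function", "", body)
        outer = ("function", "_this, _arguments", ("return", inner))
        return ("call", outer, ("this",), ("undefined",))
    return (term[0],) + tuple(arrow_sugar(t) for t in term[1:])


def arrow_sugar_inner(term):
    """Inner context: rename `this`/`arguments` until a nested function is reached."""
    if term == ("this",):
        return ("id", "_this")
    if term == ("arguments",):
        return ("id", "_arguments")
    if isinstance(term, tuple) and term[0] == "function":
        return arrow_sugar(term)  # break back out: nested functions keep their own `this`
    if isinstance(term, tuple):
        return (term[0],) + tuple(arrow_sugar_inner(t) for t in term[1:])
    return term


# An arrow function whose body uses `this` directly and inside a nested function.
nested = ("function", "", ("block", ("this",)))
source = ("arrow", ("block", ("this",), nested))
core = arrow_sugar(source)
desugared_body = core[1][2][1][2]  # body of the returned inner function
```

In this sketch, the direct occurrence of this in the arrow body is renamed, while the occurrence inside the nested function is left untouched because that function is desugared in the outer context again.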

6.5 Beyond Substitution-Based Desugaring

During another case study of M. Heimensen and T. van der Storm’s ES6 to ES5 desugaring application[20], we observed the need for transformations other than simple substitution-based transformations. ES6 allows the use of varargs-parameters in a function declaration of the form function(var1,...,varn, ...rest) {}. To transform this into ES5, a for-loop was introduced that accumulated the arguments after the n-th input variable. In order to determine where this for-loop had to start, the desugared term thus needed to contain the starting index of this for-loop. To solve this specific case, we identified two solutions:

1. Allow for "custom" Rascal functions, shifting the responsibility of correctness to the user;
2. Allow the use of when-conditions for sugar function declarations, similar to expression function declarations.

when-conditions in Rascal are conditions that expression functions can use to allow a function to be executed if, and only if, the conditions are met. However, since pattern matching is allowed in when-conditions, this also meant that a pattern could be matched to a specific value, which would bind the variables in the pattern in the environment. As such, not only could we make sugar function declarations more expressive by allowing conditionality, but this also allowed external variables to be "injected" into the surface or core pattern/expression, without affecting the "correctness" properties of the sugar functions5. Since the injections happen prior to the execution of a sugar function during desugaring, this technique essentially simulates dynamic pattern generation.

Nevertheless, as we saw in the previous chapter, we decided to support both techniques. Although sugar function declarations with optional when-conditions are sufficient for most use cases, in some cases it may still be useful for users to use normal Rascal functions. We especially found this to be necessary when we were prototyping new desugaring techniques and when we wanted to generate debugging information during a transformation. When the user is capable of using custom functions correctly, they may be of great value.
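How a when-condition can both guard a rule and inject a new binding may be sketched as follows. This Python fragment is a loose model, not the prototype: `apply_rule` and the three helper functions are hypothetical, the dictionary encoding of the function declaration is an assumption, and the generated loop text is only a plausible shape for the ES5 output, which the case study does not spell out here.

```python
from typing import Callable, Dict, Optional

Env = Dict[str, object]


def apply_rule(term: Env,
               match: Callable[[Env], Optional[Env]],
               when: Callable[[Env], Optional[Env]],
               build: Callable[[Env], str]) -> Optional[str]:
    """Run one sugar rule: match, evaluate the when-condition, then build the core term."""
    env = match(term)
    if env is None:
        return None                 # the surface pattern does not match
    extra = when(env)
    if extra is None:
        return None                 # the when-condition failed: rule not applicable
    return build({**env, **extra})  # injected bindings are visible to the core pattern


def match_varargs(term: Env) -> Optional[Env]:
    if term.get("rest") is None:
        return None
    return {"params": term["params"], "rest": term["rest"]}


def when_start(env: Env) -> Optional[Env]:
    # Plays the role of a condition such as 'when start := size(params)' (hypothetical).
    return {"start": len(env["params"])}


def build_loop(env: Env) -> str:
    return (f"for (var i = {env['start']}; i < arguments.length; i++) "
            f"{env['rest']}.push(arguments[i]);")


loop = apply_rule({"params": ["a", "b"], "rest": "rest"},
                  match_varargs, when_start, build_loop)
skipped = apply_rule({"params": ["a"], "rest": None},
                     match_varargs, when_start, build_loop)
```

The starting index of the loop is not part of the surface pattern; it only becomes available to the core side because the condition binds it before the rule body is built.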

5 This only holds when the user appropriately prepends sugar functions using the @fixedLength tags, as we discuss in Chapter 9.

6.6 Fallback Resugaring Functions

Another challenge we faced was that a term that could not be resugared may still represent a value that the sugar author intended to be resugared. For example, if the user desugars 2 + 2 into the Lambda Calculus, then the output after evaluation may not be resugared, while the user may reasonably expect 4. We found this to be extremely important when we were building a stepper for a functional programming language that desugars into the Lambda Calculus. To illustrate this, let us take a closer look at the following example:

    plus(2, 2)
    ⇒desugaring   (λmnfx.mf(nfx))(λfx.f(fx))(λfx.f(fx))
    ⇒β−reduction  (λnfx.(λfx.f(fx))f(nfx))(λfx.f(fx))
    ⇒β−reduction  λfx.(λfx.f(fx))f((λfx.f(fx))fx)
    ⇒β−reduction  λfx.(λx.f(fx))((λfx.f(fx))fx)
    ⇒β−reduction  λfx.f(f((λfx.f(fx))fx))
    ⇒β−reduction  λfx.f(f((λx.f(fx))x))
    ⇒β−reduction  λfx.f(f(f(fx)))

Note that the final result could only be resugared if we allowed for some kind of ”fallback” mechanism. The resugaring in J. Pombrio and S. Krishnamurthi’s first paper[1] on resugaring allowed for terms in the core level representation that were not resugared to be visible in the surface level representation if the sugar author intended so (even though this would break their abstraction property). Their solution, however, was fairly limited: although terms originating from the core language may be represented in the surface level representation, no resugaring could take place, leaving the user with the core-level representation of the term. We introduce a more elaborate technique, which we call the ”fallback resugaring mechanism”. This technique uses a mechanism analogous to try-catch-blocks, in which a sugar function throws an exception which is then ”caught” by (calling) a function. We designed two ways in which ”resugar exceptions” can be caught. First, we allow custom functions to catch a resugar exception by declaring the function with the name of the exception as its first argument and the offending term as its second argument. Second, we allow sugar function declarations to use throws and catch keywords in their declaration, that behave in a similar way as normal try-catch blocks.
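The try-catch analogy can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the tuple encoding of lambda terms, the `ResugarError` exception, and the fallback `church_to_int`, which reads a Church numeral of the form λf.λx.f(...f(x)) back as a number. It does not reproduce the prototype's syntax for throws and catch.

```python
class ResugarError(Exception):
    """Raised when no sugar rule can resugar a core term."""


def resugar_strict(term):
    # Stand-in for ordinary resugaring: this sketch has no rule for evaluated terms.
    raise ResugarError(term)


def church_to_int(term):
    """Fallback: read a Church numeral back as an integer."""
    if not (term[0] == "lam" and term[2][0] == "lam"):
        raise ResugarError(term)
    f, x, body = term[1], term[2][1], term[2][2]
    n = 0
    while body != ("var", x):
        if not (body[0] == "app" and body[1] == ("var", f)):
            raise ResugarError(term)
        body = body[2]  # peel off one application of f
        n += 1
    return n


def resugar_with_fallback(term, fallback):
    """Try regular resugaring; hand the offending term to the fallback on failure."""
    try:
        return resugar_strict(term)
    except ResugarError as err:
        return fallback(err.args[0])


def apply_f(arg):
    return ("app", ("var", "f"), arg)


# The final term of the reduction above, λfx.f(f(f(fx))), in tuple form.
four = ("lam", "f", ("lam", "x", apply_f(apply_f(apply_f(apply_f(("var", "x")))))))
result = resugar_with_fallback(four, church_to_int)
```

Here the fallback turns the evaluated core term into the surface value 4, where plain resugaring would have left the user with the core-level representation.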

6.7 Annotations

During the evaluation of the Lambda Calculus evaluator, we found that it is sometimes necessary to be able to alter the origin information of terms. Furthermore, for defining custom functions, it is necessary to have the same degree of control that other transformations have. As such, we kept using Rascal’s annotations to annotate terms. Since this is intentional (there is really no other reason why we use annotations), we do not consider this to be a leaky abstraction. Annotations can easily be removed from terms, and terms can easily be displayed without annotations. As such, we believe that it is of great value that annotations are accessible to the user, and that the disadvantages are minimal.6

6 Implementation detail: Annotations can easily be replaced by anything else, such as an internally defined datatype or keyword parameters. The process of attaching origin information to terms can be found in the org.rascalmpl.interpreter.sugar.SugarParameters class.

6.8 Node Identity

Finally, we needed to ensure that terms originating from the core evaluation sequence are hidden in the evaluation sequence in order to adhere to the Abstraction property. The first paper describes the use of Body-tags that are attached to desugared nodes. If, after resugaring, there are any Body-tags still present, this means that resugaring has failed, since the terms that have these tags originate from the core language. In the second paper, however, every node is tagged with a unique identifier to ensure that all nodes in a resugaring originate from the same desugaring. Since the performance impact of this technique is relatively small and this technique could solve potential inconsistencies, we decided to attach unique node identities to nodes instead.

Chapter 7

Formal Justification Sketch

In Chapter 5, we looked at how our implementation was designed from a user perspective. In this chapter, we sketch what the formal justification of that design may look like, accounting for the algorithms and techniques we used. We leave the actual proofs for future work.

The techniques we use are similar to J. Pombrio and S. Krishnamurthi’s techniques[1][2]. However, as we allow for different types of transformations and custom sugar functions that do not return patterns, we had to rewrite the desugaring and resugaring functions to allow for the use of arbitrary functions that adhere to the Abstraction, Emulation and Coverage properties. As this impacted nearly the entire design, we want to illustrate that our design not just "looks" correct, but is formally correct as well.

7.1 Properties

Central to our design, we specified the following properties. We borrowed these properties from the papers by Pombrio and Krishnamurthi. Most properties were borrowed from the first paper, since they are more generalized in nature, whereas the properties in the second paper are more specific to their techniques. The first property, however, originates from the second paper, since it is essentially an extension of the respective property in the first paper.

• Emulation: Every surface term desugars to (a term isomorphic to) the core term it purports to represent. - [2]
• Abstraction: Code introduced by desugaring is never revealed in the surface evaluation sequence, and code originating from the original input program is never hidden by resugaring. - [1]
• Coverage: Resugaring is attempted on every core step, and as few core steps are skipped as possible. - [1]

We do not prove any hygiene-related properties, since hygiene is outside the scope of this thesis.

    D(f, t)                    traverse t and desugar t whenever possible using function map f.
    R(t)                       traverse t and resugar t whenever possible.
    d(f, t)                    desugar t using function map f.
    r(t)                       resugar t.
    D?(f, t)                   ⊤ if D(f, t) does not fail, ⊥ otherwise.
    R?(t)                      ⊤ if R(t) does not fail, ⊥ otherwise.
    d?(f, t)                   ⊤ if d(f, t) does not fail, ⊥ otherwise.
    r?(t)                      ⊤ if r(t) does not fail, ⊥ otherwise.
    {v1 → f1, ..., vn → fn}    function map
    {v1 → t1, ..., vn → tn}    variable map
    σ                          variable map
    t                          term
    P                          pattern
    a                          atom
    ⊤                          true
    ⊥                          false
    λ                          lambda function

Table 7.1: Overview of definitions

7.2 Outline

In Chapter 7.3, we introduce the formal definition of our design. Using our formal definition, we prove the following properties in Chapter 7.4:

1. Resugaring is isomorphic, iff the desugaring functions are isomorphic;
2. Intermediate desugarings are isomorphic;
3. Compositional desugarings are isomorphic;
4. Resugaring and desugaring produce a core term iff the desugaring functions produce a core term;
5. Intermediate desugarings produce a core term;
6. Compositional desugarings produce a core term.

Finally, in Chapters 7.5 and 7.6 we discuss why our design adheres to the Abstraction and Coverage properties respectively.

7.3 Formal Definition

In this section, we formally specify our design, which we use to sketch the justification of our specifications in the following sections. Each formal definition is mapped in some way to the design in Chapter 5, as we discuss along the way. Note that we do not make any assumptions about matching and substitution, other than (((t/P1)P2)/P2)P1 = t unless t/P1 fails (where / is the matching operator that produces a variable map σ, and σ P means that the variables in the map σ are applied on P, producing a term in which the variables in P are substituted by their values in σ). We also do not make assumptions about the term language: we abstract the term language away by only using nodes for inductively defined datatypes and constants for atoms. The reason for these abstractions is that we want to prove that our design is correct, independent of the respective datatypes and internal mechanisms in Rascal itself.

Furthermore, note that we use Lambda functions (denoted as λxs.y, for parameters xs and body y). We define f as the map from patterns to available functions, constants as a, and nodes using node(t1, ..., tn). t is used to denote terms. We use the helper functions d(f, t) and r(t), which perform a single desugaring and a single resugaring operation on a term, respectively. d(f, t) first looks for an appropriate function in the function map f and returns the body of that function applied to t. r(t) simply executes the first argument in the Origin tag. d?(f, t) and r?(t) are used to determine whether desugaring or resugaring is possible for the provided term using function map f. Finally, true and false are denoted using ⊤ and ⊥ respectively. An overview of all the definitions we use is given in Table 7.1.

We will now look at the formal definition of our design. Note that desugaring does not allow for Origin tags, meaning that for most desugarings (except for identity transformations) R(R(D(f, D(f, t)))) is invalid (where D is the main desugaring function and R is the main resugaring function). However, we propose a solution for layered desugaring and resugarings in Chapter 9. Furthermore, note that we do not embed Body-tags (tags that are used to track the origin of a node) in our proofs, in order to keep them simple and brief. However, we discuss Abstraction (for which Body-tags are necessary) in Chapter 7.5. We start by defining the desugar function D as follows:

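\[
D(f, a) = a \tag{7.1}
\]
\[
D(f, \mathrm{node}(t_1, \ldots, t_n)) =
\left\{
\begin{array}{lll}
d(f, \mathrm{node}(t_1, \ldots, t_n)) & \text{if } d?(f, \mathrm{node}(t_1, \ldots, t_n)) & (7.2) \\
\mathrm{node}(D(f, t_1), \ldots, D(f, t_n)) & \text{if } \neg d?(f, \mathrm{node}(t_1, \ldots, t_n)) & (7.3)
\end{array}
\right.
\]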

D(f, t) corresponds with the desugar-all function defined in Chapter 5, where f points to overloaded Rascal functions and t is the term to desugar. We define the resugaring function R(t) as follows:

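\[
R(a) = a \tag{7.4}
\]
\[
R(\mathrm{node}(t_1, \ldots, t_n)) = \mathrm{node}(R(t_1), \ldots, R(t_n)) \tag{7.5}
\]
\[
R((\mathrm{Origin}\ f, \mathrm{node}(t_1, \ldots, t_n))) =
\left\{
\begin{array}{ll}
r((\mathrm{Origin}\ f, \mathrm{node}(t_1, \ldots, t_n))) & \text{if } r?((\mathrm{Origin}\ f, \mathrm{node}(t_1, \ldots, t_n))) \\
\text{fail} & \text{if } \neg r?((\mathrm{Origin}\ f, \mathrm{node}(t_1, \ldots, t_n)))
\end{array}
\right.
\tag{7.6}
\]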

R(t) corresponds to the resugar-all function defined in Chapter 5, where t is the term to resugar. Finally, we define the transformations we use as follows, where the first transformation is the intermediate desugaring transformation and the second transformation is the compositional desugaring transformation:

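\begin{align}
(P_1 \overset{*}{\Rightarrow} P_2 \mid f)(t) &= (\mathit{Origin}\ (\lambda t'.\,(R(t')/P_2)\,P_1),\ D(f, (t/P_1)\,P_2)) \tag{7.7} \\
(P_1 \Rightarrow P_2 \mid \{v_1 \to f_1, \ldots, v_n \to f_n\})(t) &= (\mathit{Origin}\ (\lambda t'.\,R'(t'/P_2)\,P_1), \tag{7.8} \\
&\phantom{{}= (} D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t/P_1)\,P_2) \tag{7.9}
\end{align}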

where D′ and R′ are helper functions for desugaring and resugaring over a variable-to-value map:

\begin{align}
D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, \{v_1 \to t_1, \ldots, v_n \to t_n\}) &= \{v_1 \to D(f_1, t_1), \ldots, v_n \to D(f_n, t_n)\} \tag{7.10} \\
R'(\{v_1 \to t'_1, \ldots, v_n \to t'_n\}) &= \{v_1 \to R(t'_1), \ldots, v_n \to R(t'_n)\} \tag{7.11}
\end{align}

The former transformation function corresponds to the intermediate desugar function defined in 5.2.1 and the latter corresponds to the transformation function defined in 5.2.2. Note that the syntax of these functions differs from that of our design in three ways:

1. We omit the name, visibility, type, throws-catch definitions, tags and when-conditions in the formal definition. name essentially refers to the key in the map f, whereas visibility essentially defines whether name is in f. tags are used for ellipsis variables, but since we abstracted the pattern matching system away, they are irrelevant here. The same goes for type. when-conditions essentially alter P1 and P2 in the transformations. Finally, throws-catch definitions deliberately break the three resugaring properties. As such, we leave them out of our formal specifications, since there is nothing to "prove" about these definitions in relation to our justification context.
2. In our formal definition, we require every break-out function to be specified. In our design, however, we omit caller-referring break-out functions. Furthermore, the syntax allows them to be omitted entirely (including the "|" token in intermediate sugar function definitions and the "→" token in intermediate sugar function definitions in our design).
3. Instead of "→", we use "|" in our formal definition for the break-out function in intermediate sugar function declarations, for brevity.

Finally, note that we omit custom functions in our definitions, as they are simply Rascal functions that can be used in f. However, Theorems 1 and 4 show that any custom function can be used, provided that it adheres to the Emulation property and is isomorphic, and that it adheres to the Abstraction and Coverage properties in the same way as described in 7.5 and 7.6.

7.4 Emulation

Recall the Emulation property: Emulation Every surface term desugars to (a term isomorphic to) the core term it purports to represent. - [2]

It follows from this textual description that we have to prove that terms are isomorphic after desugaring and that terms desugar into the "core terms" they are meant to represent. As such, the first step in proving this property is to prove that our core mechanisms desugar terms into terms that are "isomorphic" [2].

Theorem 1. Assume that for arbitrary f, r(d(f, t)) = t and d?(f, t) ↔ r?(d(f, t)). Then R(D(f, t)) = t for arbitrary f and t.

Proof. We prove this by induction on the structure of t.

Base case:
R(D(f, a))
⇒ R(a) (by definition 7.1)
⇒ a (by definition 7.4)

Induction step: Assume R(D(f, t1)) = t1, ..., R(D(f, tn)) = tn. Then it must hold that R(D(f, node(t1, ..., tn))) = node(t1, ..., tn).

If d?(f, node(t1, ..., tn)), then:
R(D(f, node(t1, ..., tn)))
⇒ R(d(f, node(t1, ..., tn))) (by definition 7.2)
⇒ r(d(f, node(t1, ..., tn))) (by definition 7.6, since d?(f, t) → r?(d(f, t)))
⇒ node(t1, ..., tn) (since r(d(f, t)) = t)

If ¬d?(f, node(t1, ..., tn)), then:
R(D(f, node(t1, ..., tn)))
⇒ R(node(D(f, t1), ..., D(f, tn))) (by definition 7.3)
⇒ node(R(D(f, t1)), ..., R(D(f, tn))) (by definition 7.5)
⇒ node(t1, ..., tn) (since R(D(f, t1)) = t1, ..., R(D(f, tn)) = tn)

Since we have shown that R(D(f, t)) = t holds for the base case and the induction step, it follows by mathematical induction that R(D(f, t)) = t for arbitrary f and t, given r(d(f, t)) = t and d?(f, t) ↔ r?(d(f, t)).
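As an illustration only, the definitions of D and R and the round trip of Theorem 1 can be sketched in Python rather than Rascal. The sketch is not part of the prototype: the classes `Node` and `Origin`, the representation of the function map f as a list of partial functions, and the example rule `not_rule` (an intermediate-style rule not(x) ⇒* if(x, false, true)) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple, Union


@dataclass(frozen=True)
class Node:
    name: str
    kids: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class Origin:
    inverse: Callable[["Term"], "Term"]  # first argument of the Origin tag
    term: "Term"                         # the desugared term it wraps


Term = Union[str, int, Node, Origin]


def D(f, t):
    """Desugar-all: f is a list of rules that return None when they do not match."""
    if not isinstance(t, Node):
        return t                                         # (7.1) atoms
    for rule in f:
        out = rule(t)
        if out is not None:                              # d?(f, t) holds
            return out                                   # (7.2)
    return Node(t.name, tuple(D(f, k) for k in t.kids))  # (7.3)


def R(t):
    """Resugar-all: run the inverse stored in every Origin tag."""
    if isinstance(t, Origin):
        return t.inverse(t.term)                          # (7.6); may raise on failure
    if isinstance(t, Node):
        return Node(t.name, tuple(R(k) for k in t.kids))  # (7.5)
    return t                                              # (7.4)


def not_rule(f):
    """Hypothetical rule not(x) =>* if(x, false, true), shaped like definition 7.7."""
    def rule(t: Node) -> Optional[Origin]:
        if t.name != "not" or len(t.kids) != 1:
            return None
        (x,) = t.kids
        core = D(f, Node("if", (x, "false", "true")))     # D(f, (t/P1)P2)

        def inverse(c):
            r = R(c)                                      # R(t')
            if isinstance(r, Node) and r.name == "if" and r.kids[1:] == ("false", "true"):
                return Node("not", (r.kids[0],))          # (R(t')/P2)P1
            raise ValueError("resugaring failed")

        return Origin(inverse, core)
    return rule


f = []
f.append(not_rule(f))  # the rule's break-out function is the map f itself

t1 = Node("and", (Node("not", ("a",)), "b"))
t2 = Node("not", (Node("not", ("a",)),))
```

For both example terms, `R(D(f, t))` rebuilds the original term, which is the behaviour Theorem 1 states under its two assumptions. The sketch raises an exception where definition 7.6 says fail.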

We now want to show that our transformation rules are isomorphic, since our desugaring and resugaring functions depend on them. We define d(f, t) as the function that finds a function in function list f that can desugar t. Similarly, r(t) is the function that applies the function in the Origin tag's first argument to the term wrapped in the tag's second argument. We start with intermediate desugarings. Note that the paper on which intermediate desugarings are based already proves this property, but only for situations in which this transformation is the only possible transformation.

Theorem 2. Assume that R(D(f, t)) = t, given r(d(f, t)) = t and d?(f, t) ↔ r?(d(f, t)). Then
\[ r(d(\{\ldots, P_1 \to (P_1 \overset{*}{\Rightarrow} P_2 \mid f_2), \ldots\}, t)) = t \]
for $d(f_1, t) = (P_1 \overset{*}{\Rightarrow} P_2 \mid f_2)(t)$.

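Proof.
\begin{align*}
& r(d(\{\ldots, P_1 \to (P_1 \overset{*}{\Rightarrow} P_2 \mid f_2), \ldots\}, t)) \\
\Rightarrow{} & r((P_1 \overset{*}{\Rightarrow} P_2 \mid f_2)(t)) && \text{(by definition of } d\text{)} \\
\Rightarrow{} & r((\mathit{Origin}\ (\lambda t'.\,(R(t')/P_2)\,P_1),\ D(f_2, (t/P_1)\,P_2))) && \text{(by definition 7.7)} \\
\Rightarrow{} & (\lambda t'.\,(R(t')/P_2)\,P_1)(D(f_2, (t/P_1)\,P_2)) && \text{(by definition of } r\text{)} \\
\Rightarrow{} & (R(D(f_2, (t/P_1)\,P_2))/P_2)\,P_1 && \text{(by reduction of } \lambda\text{)} \\
\Rightarrow{} & t && \text{(since } (((t/P_1)\,P_2)/P_2)\,P_1 = t \text{ and } R(D(f, t)) = t\text{)}
\end{align*}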

Next, we show that compositional desugarings are isomorphic.

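Theorem 3. Assume that R(D(f, t)) = t, given r(d(f, t)) = t and d?(f, t) ↔ r?(d(f, t)). Then
\[ r(d(\{\ldots, P_1 \to (P_1 \Rightarrow P_2 \mid \{v_1 \to f_1, \ldots, v_n \to f_n\}), \ldots\}, t)) = t \]
whenever d selects the rule $(P_1 \Rightarrow P_2 \mid \{v_1 \to f_1, \ldots, v_n \to f_n\})$ for t.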

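Proof.
\begin{align*}
& r(d(\{\ldots, P_1 \to (P_1 \Rightarrow P_2 \mid \{v_1 \to f_1, \ldots, v_n \to f_n\}), \ldots\}, t)) \\
\Rightarrow{} & r((\mathit{Origin}\ (\lambda t'.\,R'(t'/P_2)\,P_1),\ D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t/P_1)\,P_2)) && \text{(by definition of } d \text{ and 7.8, 7.9)} \\
\Rightarrow{} & (\lambda t'.\,R'(t'/P_2)\,P_1)(D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t/P_1)\,P_2) && \text{(by definition of } r\text{)} \\
\Rightarrow{} & R'((D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t/P_1)\,P_2)/P_2)\,P_1 && \text{(by reduction of } \lambda\text{)} \\
\Rightarrow{} & t && \text{(since } (((t/P_1)\,P_2)/P_2)\,P_1 = t\text{, provided that } R'(D'(f, \mathit{vars})) = \mathit{vars}\text{)}
\end{align*}

Thus we need to prove that R′(D′(f, vars)) = vars.

Proof:
\begin{align*}
& R'(D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, \{v_1 \to t_1, \ldots, v_n \to t_n\})) \\
\Rightarrow{} & R'(\{v_1 \to D(f_1, t_1), \ldots, v_n \to D(f_n, t_n)\}) && \text{(by definition 7.10)} \\
\Rightarrow{} & \{v_1 \to R(D(f_1, t_1)), \ldots, v_n \to R(D(f_n, t_n))\} && \text{(by definition 7.11)} \\
\Rightarrow{} & \{v_1 \to t_1, \ldots, v_n \to t_n\} && \text{(by Theorem 1)}
\end{align*}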

We have now shown that our transformation rules are isomorphic. We must now prove that each term desugars into "the core term it purports to represent" [2].

That is, we must show that desugar(ts) = tc, where ts is an arbitrary surface term and tc is a core term. We first assume that d(f, ts) = tc, and show that our desugaring algorithm adheres to this property (Theorem 4). Furthermore, we assume that ¬d?(f, t) holds iff the root of t is a core construct, which is how the assumption is used in the proof of Theorem 4. Finally, we assume that transformation rules do not use a constant as their root term (similar to Pombrio and Krishnamurthi's techniques).

Theorem 4. D(f, ts) = tc for ts is a surface term and tc is a core term.

Proof. D(f, a) = a (which is a core term.)

If d?(f, node(t1, ..., tn)), then: D(f, node(t1, ..., tn)) ⇒ d(f, node(t1, ..., tn)) (which is a core term since d(f, t) is a core term.)

If ¬d?(f, node(t1, ..., tn)), then: D(f, node(t1, ..., tn)) ⇒ node(D(f, t1), ..., D(f, tn)) (node is a core term, since ¬d?(f, node(t1, ..., tn)).)

Since D(f, t1), ..., D(f, tn) are core terms by the induction hypothesis, it follows by induction that D(f, ts) = tc.

Next, we must prove that this property holds for our transformation rules. Note that the second transformation rule is constrained in that P2 must be a core term, as discussed in Chapter 5. Recall that the transformation rules are defined as:

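\begin{align*}
(P_1 \overset{*}{\Rightarrow} P_2 \mid f)(t) &= (\mathit{Origin}\ (\lambda t'.\,(R(t')/P_2)\,P_1),\ D(f, (t/P_1)\,P_2)) \\
(P_1 \Rightarrow P_2 \mid \{v_1 \to f_1, \ldots, v_n \to f_n\})(t) &= (\mathit{Origin}\ (\lambda t'.\,R'(t'/P_2)\,P_1), \\
&\phantom{{}= (} D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t/P_1)\,P_2)
\end{align*}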

The next step is to show that this will also hold for the transformations (Theorems 5 and 6).

Theorem 5. d(f, ts) = tc for ts is a surface term and tc is a core term for $d(f, t_s) = (P_1 \overset{*}{\Rightarrow} P_2 \mid f)(t_s)$.

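Proof.
\begin{align*}
& d(f, t_s) = (P_1 \overset{*}{\Rightarrow} P_2 \mid f)(t_s) \\
\Rightarrow{} & (\mathit{Origin}\ (\lambda t'.\,(R(t')/P_2)\,P_1),\ D(f, (t_s/P_1)\,P_2)) && \text{(by definition 7.7)} \\
\Rightarrow{} & (\mathit{Origin}\ (\lambda t'.\,(R(t')/P_2)\,P_1),\ d') && \text{(where } d' = D(f, (t_s/P_1)\,P_2)\text{)}
\end{align*}
\begin{align*}
& d' = D(f, (t_s/P_1)\,P_2) \\
\Rightarrow{} & d' = D(f, t'') && \text{(for } t'' = (t_s/P_1)\,P_2\text{)} \\
\Rightarrow{} & d' = t_c && \text{(since } D(f, t_s) = t_c \text{ by Theorem 4)} \\
\Rightarrow{} & d(f, t_s) = (\mathit{Origin}\ (\lambda t'.\,(R(t')/P_2)\,P_1),\ t_c) && \text{(by re-embedding } d'\text{)}
\end{align*}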

Since Origin is not a term but a tag, we can thus conclude that d(f, ts) represents a core term.

Theorem 6. d(f, ts) = tc for ts is a surface term and tc is a core term for d(f, ts) = (P1 ⇒ P2 | {v1 → f1, ..., vn → fn})(ts).

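Proof.
\begin{align*}
& d(f, t_s) = (P_1 \Rightarrow P_2 \mid \{v_1 \to f_1, \ldots, v_n \to f_n\})(t_s) \\
\Rightarrow{} & (\mathit{Origin}\ (\lambda t'.\,R'(t'/P_2)\,P_1),\ D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t_s/P_1)\,P_2) \\
\Rightarrow{} & (\mathit{Origin}\ (\lambda t'.\,R'(t'/P_2)\,P_1),\ d') && \text{(for } d' = D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t_s/P_1)\,P_2\text{)}
\end{align*}
\begin{align*}
& d' = D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t_s/P_1)\,P_2 \\
& d' = \sigma_c P_2 && \text{(for } \sigma_c = D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, t_s/P_1)\text{)} \\
& d' = t_c && \text{(since } P_2 \text{ was restricted to a core pattern, provided that } \sigma_c \text{ only contains desugared variables)}
\end{align*}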

Thus we need to prove that D′(f, σ) returns a map of desugared variables. Proof:

Since $D'(\{v_1 \to f_1, \ldots, v_n \to f_n\}, \{v_1 \to t_1, \ldots, v_n \to t_n\}) = \{v_1 \to D(f_1, t_1), \ldots, v_n \to D(f_n, t_n)\}$, every variable in this map is obtained by calling D, and D always returns a core term by Theorem 4. It therefore follows that all variables in the map are desugared into their core representation.

Since Origin is not a term but a tag, we can thus conclude that d(f, ts) represents a core term.

Having split the textual description of the Emulation property into six theorems and sketched their validity, we have now illustrated that the Emulation property holds for our design.

7.5 Abstraction

Recall the Abstraction property (cited): Abstraction Code introduced by desugaring is never revealed in the surface evaluation sequence, and code originating from the original input program is never hidden by resugaring. - [1]

In our formal specification, we have not yet introduced measures for adhering to the Abstraction property. We now describe how we adhere to the Abstraction property using Body-tags similar to J. Pombrio and S. Krishnamurthi's Body-tags. Note that the desugaring and resugaring techniques described earlier are oblivious to Body-tags. Our approach using Body-tags and inverse annotations is very similar to J. Pombrio and S. Krishnamurthi's first resugaring approach [1].

To adhere to the first part of this property, when a term t is desugared using pattern P, we tag each node that was transformed with a Body tag that contains the term t and a unique identifier, by calling tagId on the terms that are matched to a pattern, as shown in Definitions 7.12 and 7.13. Note that we use m to denote the terms that are matched to pattern P.

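\begin{align}
\mathit{tagId}(a) &:= (\mathit{Body}\ a\ \mathit{uniqueId}()) \tag{7.12} \\
\mathit{tagId}(\mathit{node}(n_1, \ldots, n_y)) &:= (\mathit{Body}\ \mathit{node}(\mathit{tagId}(n_1), \ldots, \mathit{tagId}(n_y))\ \mathit{uniqueId}()) \tag{7.13} \\
\mathit{accIds}(m) &:= [\, i \mid (\mathit{Body}\ t\ i) \leftarrow m \,] \tag{7.14} \\
\mathit{postMatch}(m, h) &:= \mathit{hash}(\mathit{accIds}(m)) = h \tag{7.15}
\end{align}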

We then accumulate the ids in m using accIds, as shown in Definition 7.14. When a term is matched against a pattern during resugaring, we check whether that term's identifiers are the same using postMatch(m, h), where h is the hashed version of the original term, as shown in Definition 7.15. This ensures that terms are not wrongly captured during resugaring.

Whenever resugaring is completed, it will either have failed due to pattern matching constraints (in which case we cannot show the resugared term at all), or it will be checked that the output does not contain any Body-tags. If there are no Body-tags, this means that no terms originated from desugared code, since all desugared nodes were tagged with Body-tags.

The second part of this property refers to code that is wrongly resugared. For example, if there is a desugaring from `not(a and b)` → `not(a) or not(b)` and the surface program itself contains `not(a) or not(b)`, then the core-level representation of that program must never resugar to `not(a and b)`.
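Definitions 7.12 to 7.15 can be illustrated with a small Python sketch. It is an approximation under stated assumptions, not the prototype's code: `Node`, `Body`, `id_hash` (SHA-256 over the id list) and `contains_body` are names invented for this example, m is taken to be a list of matched terms, and `acc_ids` collects only the ids of the Body tags that are direct elements of m, which is one reading of Definition 7.14.

```python
import hashlib
import itertools
from dataclasses import dataclass
from typing import Tuple, Union

_ids = itertools.count(1)


def unique_id() -> int:
    return next(_ids)


@dataclass(frozen=True)
class Node:
    name: str
    kids: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class Body:
    term: "Term"
    ident: int


Term = Union[str, int, Node, Body]


def tag_id(t):
    if isinstance(t, Node):
        tagged = Node(t.name, tuple(tag_id(k) for k in t.kids))
        return Body(tagged, unique_id())          # (7.13)
    return Body(t, unique_id())                   # (7.12)


def acc_ids(m):
    return [t.ident for t in m if isinstance(t, Body)]   # (7.14)


def id_hash(ids):
    # Assumed hash function; any stable hash over the id list would do here.
    return hashlib.sha256(",".join(map(str, ids)).encode()).hexdigest()


def post_match(m, h):
    return id_hash(acc_ids(m)) == h               # (7.15)


def contains_body(t):
    """Final check after resugaring: does any Body tag remain in the output?"""
    if isinstance(t, Body):
        return True
    return isinstance(t, Node) and any(contains_body(k) for k in t.kids)


matched = [tag_id(Node("not", ("a",))), tag_id("b")]
h = id_hash(acc_ids(matched))
lookalike = [tag_id(Node("not", ("a",))), tag_id("b")]  # same shape, fresh ids
```

In this sketch `post_match(matched, h)` succeeds, while the structurally identical `lookalike` is rejected because its identifiers differ, which is the capture check described above.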

This does not happen in our implementation, since a term is resugared only if there is an inverse transformation function attached to a node. Since our design adheres to both parts of the Abstraction property, we can conclude that our design adheres to the Abstraction property. However, what we "forgot" to mention is the support for "resugaring fallback", as discussed in Chapter 6. Although we believe there are perfectly good reasons for utilizing resugaring fallbacks, they purposely break abstraction. As it is their purpose to break the Abstraction property, we explicitly ignore this use case in our formal justification sketch.

7.6 Coverage

Recall the Coverage property (cited): Coverage Resugaring is attempted on every core step, and as few core steps are skipped as possible.[1] This refers to an evaluation stepper. However, our resugaring algorithm always resugars whenever it is possible. It only fails whenever a transformation is not possible, or when the Abstraction property is not met. As such, we adhere to this property.

Chapter 8

Evaluation

In this chapter, we discuss the evaluation of our design and corresponding prototype using different case studies.

8.1 Expressiveness Evaluation: Fun to Lambda Calculus

In this case study, we demonstrate how we can desugar the functional programming language Fun¹ into the Lambda Calculus, step through the evaluation sequence and resugar intermediate terms whenever possible using our prototype. This programming language features numbers, booleans, variables, functions, several operations on numbers and booleans, pairs, conditionals and recursion. For the full source code, see Appendix D.4.

8.1.1 Syntax Definition

We started by implementing the syntax of Fun in Rascal. The syntax of this language is shown in Figure 8.1 and the transformation rules are shown in Figure 8.2. Our program accepts the same input, but uses an abstract syntax representation for the Lambda Calculus to avoid syntactical challenges during the evaluation of Lambda Calculus expressions:

    data NameData = id(str id);

    data LambdaData
      = var(NameData n)
      | app(LambdaData l1, LambdaData l2)
      | app1(LambdaData l1, LambdaData l2)
      | lambda(NameData name, LambdaData lambdaData)
      ;

Note that we define two different "types" of Lambda applications, which are semantically equivalent to the application operation in the Lambda Calculus, but differ in the way they track the origin of core terms. app will, after application of app, set the origin annotations of the applied app to its child Lambda expression, while app1 does not. app can be used, for example, for calculating plus(1, 1) with fallback resugaring to numbers, while app1 can be used, for example, for let-expressions, which are no longer relevant after they are applied.

¹ From: http://cs.au.dk/ mis/lambda.pdf

Figure 8.1: Syntax of Fun. Screenshot captured from: http://cs.au.dk/ mis/lambda.pdf

8.1.2 Church Numerals

The next step was to implement support for desugaring and resugaring numbers. Recall that numbers in the Lambda Calculus are represented by so-called Church Numerals (see Table 8.1). To desugar and resugar these numbers, we built a custom function that takes a number in the surface level representation and produces the corresponding Lambda-expression with an inverse function.

    Number   Church Encoding
    0        λfx.x
    1        λfx.fx
    2        λfx.f(fx)
    3        λfx.f(f(fx))
    n        λfx.fⁿx

Table 8.1: Church Numerals
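To make the encoding of Table 8.1 concrete, the following Python sketch builds and decodes Church Numerals on a small term datatype. It only mirrors the idea of the custom function; `Var`, `App`, `Lam`, `church` and `unchurch` are hypothetical names chosen for this example and do not appear in the prototype.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


Term = Union[Var, App, Lam]


def church(n: int) -> Lam:
    """Build \\f.\\x.f^n x: apply f to x exactly n times."""
    body: Term = Var("x")
    for _ in range(n):
        body = App(Var("f"), body)
    return Lam("f", Lam("x", body))


def unchurch(t: Term) -> Optional[int]:
    """Inverse direction: return n if t has the shape \\f.\\x.f^n x, else None."""
    if not (isinstance(t, Lam) and isinstance(t.body, Lam)):
        return None
    f, x, body = t.param, t.body.param, t.body.body
    if f == x:
        return None
    n = 0
    while isinstance(body, App) and body.fn == Var(f):
        body, n = body.arg, n + 1
    return n if body == Var(x) else None
```

Returning `None` plays the role that `fail` plays in a resugaring function: a term that is not a numeral is simply not resugared to a number.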

    LambdaData sugar(original:(Exp)``) {
      LambdaData fs = lambda(id("f"), lambda(id("x"), createFsx(toInt(""))));
      return fs[@__resugarFunction=(Exp) (lambda(id("f"), lambda(id("x"), LambdaData fsx))) {
        if (canUnFsx(fsx)) {
          Integer iResult = parse(#Integer, "");
          return (Exp)`` <<< original;
        }
        fail;
      }];
    }

    int unFsx(app(var(id("f")), LambdaData inner)) = 1 + unFsx(inner);
    int unFsx(var(id("x"))) = 0;

Figure 8.2: Transformation rules of Fun. Screenshots captured from: http://cs.au.dk/ mis/lambda.pdf

Note that we could also have written this using when-conditions. Finally, we wanted to catch the MaybeNumber-s that are thrown when a term could represent a number after evaluation but could not be resugared. As such, we defined the fallback resugaring method for numbers as:

    Exp sugar("MaybeNumber", lambda(id("f"), lambda(id("x"), LambdaData fsx))) {
      if (canUnFsx(fsx)) {
        Integer iResult = parse(#Integer, "");
        return (Exp)``;
      }
      fail;
    }

8.1.3 Matching-And-Substitution-Based Transformations

The next step was to create simple matching-and-substitution-based desugaring transformations for each of the language's surface constructs. These are straightforward, and are of the form:

    // ⇒ (Lambda)`\\xy.x`;
    LambdaData sugar((Exp)`true`) catch MaybeBoolean
      ⇒ lambda(x, lambda(y, var(x)))
      when x := name("x"), y := name("y");

This is the transformation rule of true, which catches MaybeBoolean to allow the resugaring of functions that reduce to a boolean. Similarly, the transformation of plus (which throws MaybeInteger) is defined as:

    // ⇒ (Lambda)`(\\mnfx.mf(nfx))()()`;
    LambdaData sugar((Exp)`plus (, )`) throws MaybeInteger
      ⇒ app(app(lambda(m, lambda(n, lambda(f, lambda(x,
           app(
             app(var(m), var(f)),
             app(app(var(n), var(f)),
               var(x))))))), LambdaData e1), LambdaData e2)
      when f := name("f"), x := name("x"), m := name("m"), n := name("n");

Note that we often use when-conditions for names, to make the actual Lambda expression shorter and easier to read. name(name) is simply a function that returns id(name) (representing a variable in the Lambda Calculus).

8.1.4 Evaluator

The Lambda Calculus Evaluator contains a public function evalShowSteps that desugars its input expression, applies alpha renaming on the input and steps through the desugared input. When a step successfully resugars, the stepper displays the result in terms of the surface language. Stepping through Lambda code is done using the call-by-name strategy [21], in which functions are called without evaluating their arguments first (comparable to, but simpler than, Haskell's mechanism, which uses the call-by-need strategy that adds memoization).
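The call-by-name strategy can be illustrated with a minimal Python stepper. This is a sketch under assumptions and not the evaluator of the prototype: it stops at weak head normal form (it never reduces inside a lambda body or an argument), it generates fresh names of the form `y_0` on the assumption that they do not clash with existing names, and all identifiers (`step`, `trace`, `subst`, `free_vars`) are invented for this example.

```python
import itertools
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


Term = Union[Var, App, Lam]
_fresh = itertools.count()


def free_vars(t: Term) -> set:
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, App):
        return free_vars(t.fn) | free_vars(t.arg)
    return free_vars(t.body) - {t.param}


def subst(t: Term, name: str, value: Term) -> Term:
    """Capture-avoiding substitution t[name := value]."""
    if isinstance(t, Var):
        return value if t.name == name else t
    if isinstance(t, App):
        return App(subst(t.fn, name, value), subst(t.arg, name, value))
    if t.param == name:
        return t                                   # name is shadowed below this binder
    if t.param in free_vars(value):
        new = f"{t.param}_{next(_fresh)}"          # rename the binder to avoid capture
        t = Lam(new, subst(t.body, t.param, Var(new)))
    return Lam(t.param, subst(t.body, name, value))


def step(t: Term) -> Optional[Term]:
    """One call-by-name step, or None if no step applies."""
    if isinstance(t, App):
        if isinstance(t.fn, Lam):
            return subst(t.fn.body, t.fn.param, t.arg)   # argument is passed unevaluated
        fn = step(t.fn)
        if fn is not None:
            return App(fn, t.arg)
    return None


def trace(t: Term) -> list:
    """All intermediate terms; does not terminate on diverging programs."""
    steps = [t]
    while (nxt := step(steps[-1])) is not None:
        steps.append(nxt)
    return steps


K = Lam("x", Lam("y", Var("x")))
delta = Lam("z", App(Var("z"), Var("z")))
omega = App(delta, delta)                          # diverges if it is ever evaluated
program = App(App(K, Var("a")), omega)
```

Here `trace(program)` ends in `Var("a")`: the diverging argument `omega` is discarded without being evaluated, which is the defining behaviour of call-by-name. A stepper that also resugars would attempt to resugar each element of the trace.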

8.1.5 Usage Examples

Using the evaluator, we can now step through the Fun language with the Lambda Calculus evaluator as its core interpreter. We will now show some examples and the output produced after interpretation.

Let and Arithmetics

Given the very simple program, which calculates y · (x + 1) for x, y = 2:

    let y = 2 in let x = 2 in mult(y, plus(x,1))

we get the following evaluation sequence:

    let y = 2 in let x = 2 in mult(y, plus(x,1))
    let x = 2 in mult(2, plus(x,1))
    mult(2, plus(2,1))
    6

while the final value is represented by the Lambda Calculus expression (correctly corresponding to Church Numeral 6)²:

    (\f.(\x.(f (f (f (f (f (f x))))))))

Fibonacci

Using recursion, we can also define a simple Fibonacci calculator.

    let fib(x, y, n, fib) =
      let z = plus(x, y) in
        if (iszero(n)) z
        else cons(z, fib(y, z, pred(n), fib))
    in let fibStart(n) = cons(0, cons(1, fib(0, 1, pred(pred(pred(n))), fib)))
    in fibStart(8)

This program produces the following surface evaluation sequence:

    let fib(x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib( y, z, pred(n), fib)) in let fibStart(n) = cons(0, cons(1, fib(0, 1, pred( pred(pred(n))), fib))) in fibStart(8)

    cons(0, cons(1, y, n, fib) = let z = plus(0, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)), 1, pred(pred(pred(8))), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)))))

    cons(0, cons(1, n, fib) = let z = plus(0, 1) in if (iszero(n)) z else cons(z, fib(1, z, pred(n), fib)), pred(pred(pred(8))), x, y, n, fib) = let z = plus( x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)))))

    cons(0, cons(1, fib) = let z = plus(0, 1) in if (iszero(pred(pred(pred(8))))) z else cons(z, fib(1, z, pred(pred(pred(pred(8)))), fib)), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib) ))))

    cons(0, cons(1, cons(1, y, n, fib) = let z = plus(1, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)), plus(0, 1), pred(pred(pred(pred(8)))), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib))))))

    cons(0, cons(1, cons(1, n, fib) = let z = plus(1, plus(0, 1)) in if (iszero(n)) z else cons(z, fib(plus(0, 1), z, pred(n), fib)), pred(pred(pred(pred(8)))) , x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y , z, pred(n), fib))))))

    cons(0, cons(1, cons(1, fib) = let z = plus(1, plus(0, 1)) in if (iszero(pred( pred(pred(pred(8)))))) z else cons(z, fib(plus(0, 1), z, pred(pred(pred( pred(pred(8))))), fib)), x, y, n, fib) = let z = plus(x, y) in if (iszero(n) ) z else cons(z, fib(y, z, pred(n), fib))))))

    cons(0, cons(1, cons(1, cons(2, y, n, fib) = let z = plus(plus(0, 1), y) in if ( iszero(n)) z else cons(z, fib(y, z, pred(n), fib)), plus(1, plus(0, 1)), pred(pred(pred(pred(pred(8))))), x, y, n, fib) = let z = plus(x, y) in if ( iszero(n)) z else cons(z, fib(y, z, pred(n), fib)))))))

    cons(0, cons(1, cons(1, cons(2, n, fib) = let z = plus(plus(0, 1), plus(1, plus (0, 1))) in if (iszero(n)) z else cons(z, fib(plus(1, plus(0, 1)), z, pred( n), fib)), pred(pred(pred(pred(pred(8))))), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)))))))

    cons(0, cons(1, cons(1, cons(2, fib) = let z = plus(plus(0, 1), plus(1, plus(0, 1))) in if (iszero(pred(pred(pred(pred(pred(8))))))) z else cons(z, fib( plus(1, plus(0, 1)), z, pred(pred(pred(pred(pred(pred(8)))))), fib)), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, y, n, fib) = let z = plus(plus(1, plus (0, 1)), y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)), plus (plus(0, 1), plus(1, plus(0, 1))), pred(pred(pred(pred(pred(pred(8)))))), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib))))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, n, fib) = let z = plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))) in if (iszero(n)) z else cons(z , fib(plus(plus(0, 1), plus(1, plus(0, 1))), z, pred(n), fib)), pred(pred( pred(pred(pred(pred(8)))))), x, y, n, fib) = let z = plus(x, y) in if ( iszero(n)) z else cons(z, fib(y, z, pred(n), fib))))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, fib) = let z = plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))) in if (iszero(pred(pred(pred(pred( pred(pred(8)))))))) z else cons(z, fib(plus(plus(0, 1), plus(1, plus(0, 1)) ), z, pred(pred(pred(pred(pred(pred(pred(8))))))), fib)), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib) )))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, cons(5, y, n, fib) = let z = plus(plus( plus(0, 1), plus(1, plus(0, 1))), y) in if (iszero(n)) z else cons(z, fib(y , z, pred(n), fib)), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus (0, 1)))), pred(pred(pred(pred(pred(pred(pred(8))))))), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib))))) ))))

    cons(0, cons(1, cons(1, cons(2, cons(3, cons(5, n, fib) = let z = plus(plus(plus (0, 1), plus(1, plus(0, 1))), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1))))) in if (iszero(n)) z else cons(z, fib(plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))), z, pred(n), fib)), pred (pred(pred(pred(pred(pred(pred(8))))))), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)))))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, cons(5, fib) = let z = plus(plus(plus(0, 1), plus(1, plus(0, 1))), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus (1, plus(0, 1))))) in if (iszero(pred(pred(pred(pred(pred(pred(pred(8))))))) )) z else cons(z, fib(plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))), z, pred(pred(pred(pred(pred(pred(pred(pred(8)))))))), fib)), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)))))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, cons(5, cons(8, y, n, fib) = let z = plus(plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))), y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib)), plus(plus(plus(0, 1), plus(1, plus(0, 1))), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1))))), pred(pred(pred(pred(pred(pred(pred(pred(8)))))))), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred( n), fib))))))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, cons(5, cons(8, n, fib) = let z = plus( plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))), plus(plus( plus(0, 1), plus(1, plus(0, 1))), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))))) in if (iszero(n)) z else cons(z, fib(plus(plus( plus(0, 1), plus(1, plus(0, 1))), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1))))), z, pred(n), fib)), pred(pred(pred(pred(pred(pred( pred(pred(8)))))))), x, y, n, fib) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib))))))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, cons(5, cons(8, fib) = let z = plus(plus (plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))), plus(plus(plus (0, 1), plus(1, plus(0, 1))), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))))) in if (iszero(pred(pred(pred(pred(pred(pred(pred( pred(8)))))))))) z else cons(z, fib(plus(plus(plus(0, 1), plus(1, plus(0, 1))), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1))))), z, pred(pred(pred(pred(pred(pred(pred(pred(pred(8))))))))), fib)), x, y, n, fib ) = let z = plus(x, y) in if (iszero(n)) z else cons(z, fib(y, z, pred(n), fib))))))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, cons(5, cons(8, plus(plus(plus(1, plus (0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))), plus(plus(plus(0, 1), plus (1, plus(0, 1))), plus(plus(1, plus(0, 1)), plus(plus(0, 1), plus(1, plus(0, 1)))))))))))))

    cons(0, cons(1, cons(1, cons(2, cons(3, cons(5, cons(8, 13)))))))

² Note that we represent Lambda functions using \ and that we show every parenthesis, to avoid confusion about possible ambiguities.

During the evaluation, there were 766 core evaluation steps, of which 21 could be successfully resugared. The final value on the core level was represented by the following Lambda Calculus expression:

    (\.(( (\f.(\x.x))) (\.(( (\f.(\x.(f x)))) (\.((< pair> (\f.(\x.(f x)))) (\.(( (\f.(\x.(f (f x))))) (\.((< pair> (\f.(\x.(f (f (f x)))))) (\.(( (\f.(\x.(f (f (f (f (f x))) ))))) (\.(( (\f.(\x.(f (f (f (f (f (f (f (f x))))))))))) (\f.(\x .(f (f (f (f (f (f (f (f (f (f (f (f (f x)))))))))))))))))))))))))))))

Factorial

Another example that uses recursion is that of a program that calculates the factorial of a number N.

    letrec mul(x, y) =
      if (iszero(pred(y))) x
      else plus(x, mul(x, pred(y)))
    in letrec fac(n) =
      if (iszero(pred(n))) n
      else mul(n, fac(pred(n)))
    in fac(N)

Using N = 3, this program produces 6, whereas with N = 4, the program produces 24.

8.1.6 Conclusion

With this case study, we have demonstrated how the techniques described throughout this thesis can be used, and that they are sufficiently expressive to create a stepper for a simple language on top of the Lambda Calculus.

8.2 Expressiveness Evaluation: ES6 to ES5

In this case study, we demonstrate how we can desugar and resugar ES (ECMAScript) 6 to and from ES 5 respectively using our prototype, as a means to gather evidence that our solution works in real-world situations. Note that we do not step through the code as was the case with the Fun evaluator (although it is most certainly possible [22]). This study is based on the work of M. Heimensen and T. van der Storm³, who devised an application for desugaring ES 6 to ES 5 [20]. We examine how we can add support for resugaring to their implementation for three categories of syntactic sugar:

1. ES6 Arrow Functions to (ES5) Functions;
2. ES6 Rest and Keyword Parameters to (ES5) Functions;
3. ES6 Binary and Octal Numbers to (ES5) Decimal Numbers.

8.2.1 Arrows to Functions

M. Heimensen and T. van der Storm [20] demonstrated how arrow functions in ES6 can be desugared into semantically equivalent functions in ES5. Arrow functions are functions that lexically bind this variables, after which they cannot be altered. The approach of Heimensen and van der Storm was to convert arrow functions into functions that bind _this variables, and to call that function to generate the required function. The body of the function is then altered to replace occurrences of this with _this and of arguments with _arguments. We define these transformations as follows:

    Expression arrowSugar((Expression)`() =\> { }` | body -> arrowSugarInner)
      ⇒ (Expression)`(function(_this,_arguments) {
                    '  return function() {
                    '
                    '  };
                    '})(this, undefined)`;

    Expression arrowSugar((Expression)` =\> { }` | body -> arrowSugarInner)
      ⇒ (Expression)`(function(_this,_arguments) {
                    '  return function() {
                    '
                    '  };
                    '})(this, undefined)`;

    Expression arrowSugar((Expression)`() =\> ` | body -> arrowSugarInner)
      ⇒ (Expression)`(function(_this,_arguments) {
                    '  return function() {
                    '    return ;
                    '  };
                    '})(this, undefined)`;

    Expression arrowSugar((Expression)` =\> ` | body -> arrowSugarInner)
      ⇒ (Expression)`(function(_this,_arguments) {
                    '  return function() {
                    '    return ;
                    '  };
                    '})(this, undefined)`;

³ The original source code is available at: https://github.com/matthisk/RMonia

Note that we use break-out functions to arrowSugarInner in our transformations. arrowSugarInner replaces occurrences of arguments and this with _arguments and _this respectively, until it finds a nested function:

    Expression arrowSugarInner((Expression)`this`) ⇒ (Expression)`_this`;

    Expression arrowSugarInner((Expression)`arguments`) ⇒ (Expression)`_arguments`;

    Expression arrowSugarInner((Statement)`` | f -> arrowSugar)
      ⇒ (Expression)``;

    Expression arrowSugarInner((Expression)`` | f -> arrowSugar)
      ⇒ (Expression)``;

    Expression arrowSugarInner(e:(Expression)`() =\> { }`)
      = arrowSugar(e);

    Expression arrowSugarInner(e:(Expression)` =\> { }`)
      = arrowSugar(e);

    Expression arrowSugarInner(e:(Expression)`() =\> `)
      = arrowSugar(e);

    Expression arrowSugarInner(e:(Expression)` =\> `)
      = arrowSugar(e);

Using these definitions, we were able to successfully desugar and resugar the following examples (for each example we show the surface-level representation first, followed by the desugared term):

    Surface:
        a ⇒ b;
    Desugared:
        (function(_this,_arguments) {
          return function(a) {
            return b;
          };
        })(this, undefined);

    Surface:
        (a, b) ⇒ c;
    Desugared:
        (function(_this,_arguments) {
          return function(a, b) {
            return c;
          };
        })(this, undefined);

    Surface:
        a ⇒ this;
    Desugared:
        (function(_this,_arguments) {
          return function(a) {
            return _this;
          };
        })(this, undefined);

Note that this is replaced by _this.

    Surface:
        a ⇒ function(x) { this; };
    Desugared:
        (function(_this,_arguments) {
          return function(a) {
            return function(x) { this; };
          };
        })(this, undefined);

Note that this is not replaced by _this, since it is placed in an embedded function.
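The scoping rule behind these four examples can be summarised in a short Python sketch over a toy syntax tree. The tree encoding (tuples whose first element is the node kind, lists for sequences) and the function names `rewrite_this` and `desugar_arrow` are assumptions made purely for illustration; the actual transformations are the Rascal rules given above, and nested arrow functions are not modelled here.

```python
def rewrite_this(n):
    """Rename this/arguments, but stop at ordinary functions, which rebind them."""
    if isinstance(n, list):
        return [rewrite_this(c) for c in n]
    if not isinstance(n, tuple):
        return n                                   # plain strings such as names
    if n == ("id", "this"):
        return ("id", "_this")
    if n == ("id", "arguments"):
        return ("id", "_arguments")
    if n[0] == "function":
        return n                                   # embedded function: leave untouched
    return (n[0],) + tuple(rewrite_this(c) for c in n[1:])


def desugar_arrow(params, body):
    """Wrap an expression-bodied arrow function as in the examples above."""
    inner = ("function", params, [("return", rewrite_this(body))])
    outer = ("function", ["_this", "_arguments"], [("return", inner)])
    return ("call", outer, [("id", "this"), ("id", "undefined")])


# a => this
simple = desugar_arrow(["a"], ("id", "this"))

# a => function(x) { this; }
nested_fn = ("function", ["x"], [("id", "this")])
nested = desugar_arrow(["a"], nested_fn)
```

In `simple` the returned expression becomes `_this`, whereas in `nested` the embedded function is returned unchanged, matching the last two examples.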

8.2.2 Binary and Octal Numbers to Decimal Numbers

The next case we studied was desugaring ES6’s binary and octal numbers to ES5’s decimal numbers. As we had already demonstrated in the Fun to Lambda Calculus case study how we can do this using custom functions, we will now demonstrate how this can be done using when-conditions. We used the function convertFromBase defined by M. Heimensen and T. van der Storm to transform a number from another base to a decimal number:

    private int convertFromBase( int base, str input ) {
      int s = 0;
      int n = size(input);
      for( str d ← split("",input) ) {
        n -= 1;
        s += toInt(d) * floor( pow(base,n) );
      }
      return s;
    }

We then defined a helper function that takes the concrete number and two integers that denote the length of the prefix ("0b" or "0o") of the number and the base, calls convertFromBase to change the number representation and finally translates the outcome to a concrete Number.

    private Numeric numberSugar( Numeric n, int begin, int base ) {
      return ([Numeric]"", begin))>");
    }
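To make the arithmetic concrete, the following is a minimal Python sketch of the same positional conversion. It is an illustrative port and not part of the prototype; the names convert_from_base and strip_prefix_and_convert are ours. Like the Rascal function, it does not validate digits against the base, so the digit string 88 in base 8 evaluates to 8*8 + 8 = 72.

```python
def convert_from_base(base: int, digits: str) -> int:
    """Interpret `digits` as a number written in `base` and return its value."""
    total = 0
    remaining = len(digits)
    for d in digits:
        remaining -= 1
        # Each digit is weighted by base^(its position counted from the right).
        total += int(d) * base ** remaining
    return total


def strip_prefix_and_convert(literal: str, prefix_len: int, base: int) -> int:
    """Drop a prefix such as 0b or 0o, then convert the remaining digits."""
    return convert_from_base(base, literal[prefix_len:])
```

For example, strip_prefix_and_convert("0b10110", 2, 2) drops the two-character prefix and evaluates 16 + 4 + 2.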

Finally, we defined the transformations, which are trivial using when-conditions:

    Expression sugar( (Expression)‘‘ )
      ⇒ (Expression)‘‘
      when out := numberSugar((Numeric)‘‘, 2, 2);

    Expression sugar( (Expression)‘‘ )
      ⇒ (Expression)‘‘
      when out := numberSugar((Numeric)‘‘, 2, 8);

Using this implementation, we can now desugar and resugar octal and binary numbers accordingly, as we illustrate in the following examples (surface-level representation on the left, desugared term on the right):

    0b10110                            22
    function() { return 0b10110; };    function() { return 22; };
    0o88                               72
    function() { return 0o88; };       function() { return 72; };

8.2.3 Rest and Keyword Parameters

Finally, we demonstrate how rest and keyword parameters can be desugared and resugared. Functions with rest and keyword parameters are, respectively, functions of the form function(x, y...) and function(a = b, c = d, e). As the source code treats six similar but slightly different situations and is therefore quite extensive, we demonstrate two key examples, one for each of the function types. The full source code can be found in Appendix D.1. The first example is that of desugaring rest parameters with additional parameters.

    Function functionSugar((Function)‘function (<{Param ","}* ps>, ...) { < Statement* body> }‘)
      * ⇒ (Function)‘function( <{Param ","}* ps> ) {
        ’
        ’
        ’}‘
      when Statement initBody := spreadParameter( rest, size( (Params)‘<{Param ","}* ps>‘ ), generateUId );

The transformation itself is straightforward. Rest parameters are removed, and an initialization body is generated using the helper function spreadParameter, which accumulates the rest parameters by iterating over the arguments. spreadParameter is defined as follows:

    private Statement spreadParameter( Id param, int pos, Id(str) generateUId )
      = restInit
      when
        Id len := generateUId("_len"),
        Expression key := [Expression]"",
        Statement restInit := (Statement)‘for (var = arguments.length, = Array( \> 1 ? - 1 :0 ), _key = ; _key \< ; _key++) {
          ’ [_key - 1] = arguments[_key];
          ’ }‘;
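The runtime effect of the generated initialization loop can be emulated with a short Python sketch. This is an illustration under an assumption: we assume the loop copies every argument from index pos onward into a freshly sized array, which matches the description above but is not taken verbatim from the prototype. The name collect_rest is hypothetical.

```python
def collect_rest(arguments: list, pos: int) -> list:
    """Emulate the assumed generated loop: copy arguments[pos:] into a new list."""
    length = len(arguments)
    # Pre-size the result (assumed shape: len > pos ? len - pos : 0).
    rest = [None] * (length - pos if length > pos else 0)
    for key in range(pos, length):
        rest[key - pos] = arguments[key]
    return rest
```

Calling collect_rest with fewer than pos arguments yields an empty list, which corresponds to a call that supplies no rest arguments.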

This function utilizes a unique identifier generator, which is used to name the variable in which the length of the arguments is stored (_len). The generator is defined as:

    Id(str) generateNamer() {
      set[str] names = {};
      Id generateUniqueId( str name ) {
        if( name in names ) name = gensym(names, name);
        names += name;
        return [Id]name;
      }
      return generateUniqueId;
    }
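The closure pattern used by generateNamer can be sketched in Python as follows. The sketch is illustrative only: the source does not show how gensym derives a fresh name, so the gensym below (appending a counter) is a hypothetical stand-in, and identifiers are plain strings rather than Id values.

```python
from typing import Callable, Set


def gensym(names: Set[str], name: str) -> str:
    """Hypothetical stand-in: append a counter until the name is unused."""
    counter = 1
    while f"{name}{counter}" in names:
        counter += 1
    return f"{name}{counter}"


def generate_namer() -> Callable[[str], str]:
    """Return a function that hands out names unique within this namer."""
    names: Set[str] = set()

    def generate_unique_id(name: str) -> str:
        if name in names:
            name = gensym(names, name)
        names.add(name)
        return name

    return generate_unique_id
```

Because the set of used names lives inside the closure, creating a new namer for each desugaring run starts again from an empty set.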

Note that we have not altered the unique identifier generator. The generateUId variable is set to generateUId = generateNamer(); each time desugaring takes place. Next, we look at an example for keyword parameters. This example is notable because we had to fix the length of the ellipsis variables to ensure that no match ambiguity can occur, as we discussed in Chapter 6.

    @fixedLength{bef}
    @fixedLength{rest}
    Function functionSugar((Function)‘function (<{Param ","}* bef>, = < Expression defVal>, <{Param ","}* rest>) { }‘)
      * ⇒ (Function)‘function(<{Param ","}* bef>, , <{Param ","}* rest>) {
        ’
        ’
        ’}‘
      when Statement initBody := defaultParameter( pr, defVal, size((Params)‘<{Param ","}* bef>‘) );

Here, defaultParameter is defined as follows:

    private Statement defaultParameter( Id param, Expression defVal, int pos )
      = defInit
      when
        Expression pos := [Expression]"",
        Statement defInit := (Statement)‘var = arguments[] === undefined ? :arguments[];‘;

This function generates ES5 code that checks whether a parameter is set and otherwise assigns the parameter its default value. We used the @fixedLength tag to notify the resugaring operation that the lengths of bef and rest are fixed.

8.2.4 Conclusion

We have now demonstrated that ES6 constructs can be successfully desugared to ES5 and resugared for three distinct cases. Nevertheless, the case studies heavily impacted our design and prototype, and more case studies may eventually lead to additional extensions.

8.3 Performance Benchmark

In this case study, we compare the performance of the intermediate and compositional transformation techniques.

8.3.1 Setup

We defined a simple concrete language and 8 sugar functions that expand a-s to a number of b-s. The syntax is defined in Rascal as follows:

    syntax Exp = "a" Exp | "b" Exp | ".";

The 8 sugar functions are defined as follows. Note that the only difference between the sugarN transformations and the sugarIntN transformations (where N is the number of b-s produced) is that the former use compositional desugarings, whereas the latter use intermediate desugarings.

    Exp sugar1((Exp)‘a‘) ⇒ (Exp)‘b‘;
    Exp sugar4((Exp)‘a‘) ⇒ (Exp)‘bbbb‘;
    Exp sugar16((Exp)‘a‘) ⇒ (Exp)‘bbbbbbbbbbbbbbbb‘;
    Exp sugar64((Exp)‘a‘) ⇒ (Exp)‘bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb‘;
    Exp sugarInt1((Exp)‘a‘) * ⇒ (Exp)‘b‘;
    Exp sugarInt4((Exp)‘a‘) * ⇒ (Exp)‘bbbb‘;
    Exp sugarInt16((Exp)‘a‘) * ⇒ (Exp)‘bbbbbbbbbbbbbbbb‘;
    Exp sugarInt64((Exp)‘a‘) * ⇒ (Exp)‘bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb‘;

Next, we defined two functions that desugar and resugar the input using the different techniques:

    private void testCompositional(Exp exp) {
      if (resugar-all> != exp)
        throw "Unequivalent terms!";
      if (resugar-all> != exp)
        throw "Unequivalent terms!";
      if (resugar-all> != exp)
        throw "Unequivalent terms!";
      if (resugar-all> != exp)
        throw "Unequivalent terms!";
    }

    private void testIntermediate(Exp exp) {
      if (resugar-all> != exp)
        throw "Unequivalent terms!";
      if (resugar-all> != exp)
        throw "Unequivalent terms!";
      if (resugar-all> != exp)
        throw "Unequivalent terms!";
      if (resugar-all> != exp)
        throw "Unequivalent terms!";
    }

Finally, we defined two functions that run testCompositional and testIntermediate for 1, 4, 16 and 24 a-s.

    public void testCompositional() {
      testCompositional((Exp)‘a.‘);
      testCompositional((Exp)‘aaaa.‘);
      testCompositional((Exp)‘aaaaaaaaaaaaaaaa.‘);
      testCompositional((Exp)‘aaaaaaaaaaaaaaaaaaaaaaaa.‘);
    }
    public void testIntermediate() {
      testIntermediate((Exp)‘a.‘);
      testIntermediate((Exp)‘aaaa.‘);
      testIntermediate((Exp)‘aaaaaaaaaaaaaaaa.‘);
      testIntermediate((Exp)‘aaaaaaaaaaaaaaaaaaaaaaaa.‘);
    }

The desugaring and resugaring commands were adjusted to repeat every desugaring and resugaring 1000 times and to calculate the median of the measured times. We did not print any timing-related information within the benchmark itself, but within Rascal’s own source code. This way, the measurements covered only the actual desugaring and resugaring iterations and excluded any other processing. The tests were run normally in Eclipse (not in Debug Mode) using a Lenovo X220 with an i7-2640M 2.8GHz processor, 4GB of RAM and 4GB of swap memory on a 128GB SSD drive, running on Linux Mint 17.2 Cinnamon Edition.
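The measurement scheme described above (repeat an operation many times, keep the median) can be sketched in Python. This is not the instrumentation that was added to Rascal; median_runtime_us and expand are hypothetical names, and expand is only a toy stand-in for a sugar function that rewrites every a into n b-s.

```python
import statistics
import time
from typing import Callable


def median_runtime_us(action: Callable[[], object], repetitions: int = 1000) -> float:
    """Run `action` `repetitions` times; return the median duration in microseconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter_ns()
        action()
        samples.append((time.perf_counter_ns() - start) / 1000.0)
    return statistics.median(samples)


def expand(term: str, n: int) -> str:
    """Toy stand-in for a sugar function: replace every a by n b-s."""
    return term.replace("a", "b" * n)
```

Taking the median rather than the mean limits the influence of occasional slow iterations, for instance those interrupted by garbage collection.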

8.3.2 Results and Conclusion

As we can see in Table 8.2, desugaring and resugaring using our compositional techniques took less time than using the intermediate techniques in every case. In the worst case, we only got a performance increase of a factor 2.05, whereas in the best case, we got a performance increase of a factor 7.1.

Although we ran each test 1000 times and calculated the median to obtain our results, the results in Figure 8.3 are a bit fuzzy. In contrast to what we expected, there is no clear correlation between the ratio of the input versus output term size and the performance of the algorithms. This is likely to be caused by background processes, garbage collection or other distorting factors during the execution of our tests (although we had closed all other desktop applications during our tests). Nevertheless, in general we see that shorter terms are processed relatively more quickly by compositional desugaring and resugaring than by intermediate desugaring and resugaring.

Table 8.2: Benchmark Results

    #(in_a)  a ⇒ #b  #(out_b)  t_d.int  t_r.int  t_d.comp  t_r.comp  t_d.int/t_d.comp  t_r.int/t_r.comp  T_int/T_comp
    1        1       1         778      197      139       57        5.60              3.46              4.97
    4        1       4         768      250      173       85        4.44              2.94              3.95
    16       1       16        2057     325      326       203       6.31              1.60              4.50
    24       1       24        4619     1126     772       436       5.98              2.58              4.76
    1        4       4         445      210      165       121       2.70              1.74              2.29
    4        4       16        1363     731      259       193       5.26              3.79              4.63
    16       4       64        4769     1235     644       464       7.41              2.66              5.42
    24       4       96        21460    4703     2154      1532      9.96              3.07              7.10
    1        16      16        1600     925      469       599       3.41              1.54              2.36
    4        16      64        5278     1842     863       873       6.12              2.11              4.10
    16       16      256       19865    5297     2391      1975      8.31              2.68              5.76
    24       16      384       68759    17168    8542      6332      8.05              2.71              5.78
    1        64      64        2172     1355     682       1036      3.18              1.31              2.05
    4        64      256       6964     2523     1281      1467      5.44              1.72              3.45
    16       64      1024      26154    7248     3576      3052      7.31              2.37              5.04
    24       64      1536      103273   25932    12763     9654      8.09              2.69              5.76

Performance benchmarking results. t stands for time (in microseconds), int for intermediate desugaring, comp for compositional desugaring, d for desugaring, r for resugaring and T for total. #(in_a) is the number of a-s served as input, a ⇒ #b is the number of b-s the sugar function returns for each a, and #(out_b) is the number of b-s found in the output.
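The ratio columns of Table 8.2 follow directly from the four raw timings, with the total ratio computed as (t_d.int + t_r.int) / (t_d.comp + t_r.comp). The Python sketch below recomputes them for three rows copied from the table: the first row, the row with the best total ratio and the row with the worst total ratio. The function name ratios is ours.

```python
# Raw timings in microseconds, copied from three rows of Table 8.2:
# (t_d.int, t_r.int, t_d.comp, t_r.comp)
FIRST_ROW = (778, 197, 139, 57)
BEST_CASE = (21460, 4703, 2154, 1532)
WORST_CASE = (2172, 1355, 682, 1036)


def ratios(d_int: int, r_int: int, d_comp: int, r_comp: int) -> tuple:
    """Return (desugaring ratio, resugaring ratio, total ratio), rounded to 2 decimals."""
    return (
        round(d_int / d_comp, 2),
        round(r_int / r_comp, 2),
        round((d_int + r_int) / (d_comp + r_comp), 2),
    )
```

The total ratios of the last two rows reproduce the factors 7.1 and 2.05 quoted in the text.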

8.4 Performance Evaluation Case Study

In this case study, we evaluate the performance of desugaring the Fun language from one of the Banana Algebra [19] examples (see footnote 4) into the Lambda Calculus and resugaring the result (note that this Fun language is different from the Fun language described in Chapter 8.1). The Banana Algebra is an algebra introduced by Jacob Andersen and Claus Brabrand as a means for syntactically extending languages using constructive catamorphisms [19]. All applications of the algebra can be normalized into a single catamorphism. We used the Fun language in one of the Banana Algebra examples as part of our evaluation. Catamorphisms are often written using banana brackets (| and |) [23], hence the name "Banana Algebra".

4 See: http://www.itu.dk/people/brabrand/banana-algebra/

Figure 8.3: Relation between term-length and performance difference. [Plot of T_int/T_comp (vertical axis, 0 to 8) against the number of input a-s (horizontal axis, 0 to 24), with one series per sugar function (a ⇒ 1, 4, 16 and 64 b-s).]

8.4.1 Banana Algebra Fun Language

The Fun Language was designed to demonstrate the expressiveness of the Banana Algebra. As such, the Fun language was written in Banana Algebra, which could be transformed by the Banana Algebra ”normalizer” into a single catamorphism. This normalized file had to be transpiled into Rascal in order for us to evaluate our prototype. However, either due to issues related to our syntax definition or Rascal itself, we were unable to parse the file using Rascal’s internal parsing mechanisms. As we could not use some of Rascal’s key features, we built the transpiler in Common Lisp (see Appendix C).

8.4.2 Setup

The tests were run normally in Eclipse (not in Debug Mode) using a Lenovo X220 with an i7-2640M 2.8GHz processor, 4GB of RAM and 4GB of swap memory on a 128GB SSD drive, running on Linux Mint 17.2 Cinnamon Edition. In contrast to our setup in Chapter 8.3, we ran our tests 100 times instead of 1000 times. Our tests consisted of desugaring and resugaring 10 different Fun expressions.

8.4.3 Results and Conclusion

The results are shown in Table 8.3 and Figure 8.4. What we can see in Figure 8.4 is that there seems to be a linear correlation between the size of the desugared terms and the time to process those terms: it takes about 24 microseconds to process each output character. Furthermore, our prototype is able to desugar and resugar a little over 50,000 characters in just over a second.

Although there seems to be a correlation between the size of the terms and the time to perform desugaring and resugaring, these results are inconclusive. We have only desugared and resugared terms that are very small in comparison to the output term. However, both performance benchmarks combined do provide an indication of the performance of our algorithms, and show a promising result.

Table 8.3: Banana Algebra’s example Fun language case study results (times in microseconds)

    Term                       Desugaring Time  Resugaring Time  Total Time  Characters
    1                          24,858           16,341           41,199      1,804
    1 + 1                      85,172           57,566           142,738     6,416
    car(cons(1,2))             99,658           68,063           167,721     7,357
    positive? 5                126,847          87,996           214,843     9,415
    let fn(a) = a*2 in fn(5)   244,197          151,623          395,820     16,384
    let x = 5 in x + 5         256,977          173,757          430,734     18,444
    3 - 8                      267,394          186,534          453,928     19,948
    #t ? 5 : 10                463,220          302,314          765,534     32,236
    100                        641,731          413,693          1,055,424   43,620
    22 * 10 + 5 - 2            753,402          472,756          1,226,158   50,271
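The figure of roughly 24 microseconds quoted above can be recomputed from Table 8.3 as total time divided by the number of output characters. The following Python sketch does this for every row; the data are copied from the table, and the names RESULTS and per_char are ours.

```python
# (term, total time in microseconds, output characters), copied from Table 8.3.
RESULTS = [
    ("1", 41199, 1804),
    ("1 + 1", 142738, 6416),
    ("car(cons(1,2))", 167721, 7357),
    ("positive? 5", 214843, 9415),
    ("let fn(a) = a*2 in fn(5)", 395820, 16384),
    ("let x = 5 in x + 5", 430734, 18444),
    ("3 - 8", 453928, 19948),
    ("#t ? 5 : 10", 765534, 32236),
    ("100", 1055424, 43620),
    ("22 * 10 + 5 - 2", 1226158, 50271),
]

# Microseconds spent per output character for each term.
per_char = [total / chars for _, total, chars in RESULTS]
```

Every row lands between 22 and 25 microseconds per character, which is consistent with the roughly linear relation visible in Figure 8.4.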

Figure 8.4: Relation between number of characters and total time. [Plot of time in µs (vertical axis, 0 to 1·10^6) against output characters (horizontal axis, 0 to 5·10^4), with three series: desugaring and resugaring combined, desugaring, and resugaring.]

Chapter 9

Discussion

In this chapter, we discuss the research questions using the results from the previous chapters, the limitations of our prototype, our recommendations for future work and the threats to the validity of this thesis. Using these discussion topics, we conclude our thesis in the following chapter.

9.1 Research Questions

Our approach to studying the research questions central to this thesis was to build a prototype based on a literature study. We tested our prototype against different case studies and gradually improved it until it allowed us to successfully evaluate them. When our prototype was complete, we evaluated it using the cases we had used to develop that artefact. Finally, we presented a number of proof sketches to argue that the design of our prototype was correct. We believe that we have now gathered sufficient evidence to answer our research questions appropriately.

9.1.1 Efficiency

Our first research question was: Are the resugaring techniques found throughout literature efficient enough for practical applications?

We observed that the techniques described in the first paper by J. Pombrio and S. Krishnamurthi had fundamental problems related to their performance. The underlying reason for these problems was that the algorithm tried to expand every term one or more times, before and after desugaring. In contrast, the techniques in their second paper would only try to expand each term once.

As such, we found that transforming a term of 6,416 characters in length took multiple minutes using the techniques based on their first paper. Yet the compositional techniques were capable of desugaring and resugaring that same term in approximately 150 milliseconds. In fact, even a term of over 50,000 characters in length took just over a second to process with the latter set of techniques. During our performance benchmark we found that, even for extremely small languages, the difference was already in the range of a factor 2 to 7, and that this difference seemed to increase as the output term of a transformation grew.

As such, for the cases we studied, we believe that the performance of the techniques based on the second paper by J. Pombrio and S. Krishnamurthi on resugaring was sufficient for practical use in the contexts of our case studies. As for the techniques in the first paper, however, we believe it was not.

9.1.2 Expressivity

The second research question was: Are the resugaring techniques found throughout literature expressive enough for practical applications?

What we observed was that the techniques in the first paper were fairly limited in terms of expressing syntactic sugar transformations. First of all, they were only capable of simple matching-and-substitution based desugaring, based on a fixed set of rules. Furthermore, we noticed that this technique was incapable of handling typed terms.

The second paper illustrated techniques that allowed for Turing-complete functions to be used for desugaring. As such, it should be possible to desugar (almost) any kind of syntactic sugar using these techniques. However, we found that resugaring was too restricted for handling situations beyond matching-and-substitution. To overcome this problem, we introduced a resugaring fallback mechanism for terms that would otherwise fail to resugar, contrary to the author’s intention. Using this mechanism, we were able to resugar terms that could not be resugared otherwise. Unfortunately, however, this mechanism deliberately broke the properties central to J. Pombrio and S. Krishnamurthi’s approach.

In conclusion, the techniques described in the first paper by J. Pombrio and S. Krishnamurthi were not sufficiently expressive for addressing our (practical) case studies. While the techniques described in their second paper were sufficiently expressive for desugaring, for resugaring they were not.

9.1.3 Integration

Our third research question was: What are the challenges related to integrating resugaring into a readily existing meta-programming language?

We limited our research to integrating resugaring into Rascal. As such, some of the following challenges may apply to integrating resugaring into other meta-programming languages, whereas others are Rascal-specific.

We had to relate the data structures in the papers by J. Pombrio and S. Krishnamurthi to Rascal datatypes. We chose to use annotations for origin tracking and to use Rascal’s nodes to represent terms. While the term languages in the papers were fairly limited (i.e. restricted to nodes and atoms), we were able to utilize practically every datatype in Rascal. However, since only nodes, ADTs and concrete terms were allowed to carry annotations, we found that we could not track the origins of atoms, sets, lists and other datatypes. Finally, we could not use functions in the same way as they were used in the second paper, since Rascal did not have a pattern datatype. We solved this issue by using functions to represent the syntactic sugar transformations.

All the techniques in the papers were based on pattern matching and substitution. We could use Rascal’s expressions to produce terms, which utilized a syntax very similar to patterns. However, as not every pattern could be written as an expression, and not every expression could be written as a pattern, we had to integrate additional syntax into Rascal to represent the intersection of both (with the addition of typed variables). Furthermore, we had to ensure that resugared terms would have the same tags and annotations as their original term. As such, we had to introduce our own substitution mechanism that would attach all the necessary information from the original term to the resugared term.

Our next problem was to maintain abstraction. We did not want to undermine any design decisions taken by other developers. Instead, our goal was to somewhat "strictly" extend Rascal to support resugaring.
As such, one of the major trade-offs we had to make was to represent syntactic sugar transformations using functions instead of integrating a new datatype for patterns. This design decision rippled through the entire design, which led to the addition of extra techniques such as when-conditions and custom sugar functions in order to comply with our design goals.

We wanted our algorithms to feel "native" within the Rascal environment. As such, we had to design our syntax in a similar way as other Rascal facilities. We believe that we achieved this goal: the two integrated transformation functions have a syntax very similar to expression functions, and we allow custom sugar functions to be used to integrate any logic the user desires using Rascal constructs.

Finally, the implementation itself was quite an undertaking. We had to study Rascal’s source code extensively and experiment with different techniques to produce the desired results. For instance, one of our main goals was to come up with a solution that performed well enough for real-world situations. As such, we had to constantly improve our algorithms, profile them and decrease their execution time to a minimum. In total, over 65 Rascal source code files had to be altered or created, excluding generated Java files or log files.

9.2 Prototype Limitations

While we paid a lot of attention to perfecting our prototype, it still has some important limitations within the scope of this thesis. We believe that none of the limitations that we are aware of are serious threats to the validity of our work, since many of these limitations are merely "implementation details". Nevertheless, it is important to acknowledge them. In this section, we discuss the most important ones.

9.2.1 Layouts

Our substitution mechanism is designed to track the layout of terms with respect to the original representation. As such, we keep track of spaces and other characters that are important to the external representation of terms, but not necessarily relevant for their semantics. Currently, however, layouts of concrete ellipses are not always correctly tracked. For example, if a user executes the command:

    resugar-all>

the prototype produces the result:

    Function: ‘function x(x=a, y, z){;}‘

instead of:

    Function: ‘function x(x=a,y, z){;}‘

Notice that the layouts of the original and resugared terms are almost equivalent, except that the number of spaces between the ‘,‘ and the ‘y‘ in the function signature differs.

9.2.2 Fixing Ellipses’ Lengths

In order to ensure that terms matched against patterns containing ellipses are correctly resugared, the user needs to specify which ellipses’ lengths need to be fixed. However, there is currently no mechanism in place to notify the user that it is necessary to fix the length of certain ellipses. As such, when the user does not carefully consider which ellipses’ lengths need to be fixed, this may break the Emulation property. To illustrate this, consider the following definitions:

    data TestData = a() | b() | c(list[TestData] datas1);

    @fixedLength{xs}
    public TestData test1(c([*TestData xs, b(), *TestData ys]))
      ⇒ c([*TestData xs, *TestData ys]);
    public TestData test2(c([*TestData xs, b(), *TestData ys]))
      ⇒ c([*TestData xs, *TestData ys]);

Now, if we desugar and resugar the following term using test1 which fixes the length of ellipsis xs, we get:

    c([a(), b(), a()])
    ⇒ (after desugaring and resugaring)
    c([a(), b(), a()])

However, if we process this term using test2, which does not fix the length of xs, we get:

    c([a(), b(), a()])
    ⇒ (after desugaring and resugaring)
    c([b(), a(), a()])
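The ambiguity can be reproduced with a small Python model of the two transformations. This sketch is an illustration under stated assumptions: terms are modelled as lists of strings, and we assume that the matcher tries the shortest binding for xs first, which is consistent with the output shown above. The function names and the fixed_xs_len parameter are hypothetical and do not correspond to the prototype's implementation.

```python
from typing import List, Optional, Tuple


def desugar(term: List[str]) -> Tuple[List[str], int]:
    """Match [*xs, 'b', *ys], drop that 'b', and record len(xs)."""
    split = term.index("b")  # shortest xs for which the pattern matches
    return term[:split] + term[split + 1:], split


def resugar(core: List[str], fixed_xs_len: Optional[int] = None) -> List[str]:
    """Rebuild [*xs, 'b', *ys] from the core term [*xs, *ys]."""
    # Without a recorded length every split point matches [*xs, *ys];
    # a shortest-first matcher then binds xs to the empty list.
    split = 0 if fixed_xs_len is None else fixed_xs_len
    return core[:split] + ["b"] + core[split:]
```

With the recorded length, resugaring the core term ['a', 'a'] restores the original position of b; without it, b ends up at the front.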

9.2.3 When-Conditions

We use when-conditions to add additional expressivity to our sugar functions. However, since when-conditions do not actually change the patterns used for transforming syntactic sugar, but instead inject the values of variables into the transformations, they are treated in the same way as other present variables. As a result, these variables may be changed in the core representation and still successfully resugar, which breaks the Emulation property. To eliminate this problem, we added a tag @ensureUnchanged{name} that can be prepended to sugar functions to ensure that the value of name stays the same. However, this is not a very elegant solution, and no validation checks are currently present to enforce the use of these tags. As a result, when the user does not carefully consider which variables’ values need to be fixed, this may break the Emulation property (see footnote 1).

9.2.4 Typing-Related Resugaring Failure

Some terms can be desugared but cannot be resugared afterwards due to typing issues, as discussed in Chapter 6.1.2. This is caused by the fact that no type validation takes place during intermediate steps of term synthesis, while these checks are present during resugaring.

9.2.5 Layering Syntactic Sugar

Our current implementation does not support piping distinct layers of syntactic sugar transformations. By this we mean that, whenever a term and all of its subterms are fully desugared, it cannot be desugared once more (e.g. using a different set of rules), or resugaring may not work appropriately otherwise. There are a couple of technical reasons at play causing this limitation. Recall that we shifted the responsibility for desugaring subterms to the sugar functions. This means that sugar functions call desugar-all or resugar-all themselves. As such, desugaring is decentralized, meaning that there is no physical barrier separating the different layers of syntactic sugar. Since intermediate sugar functions repeat the process of resugaring multiple times, this may lead to an entanglement of different layers of syntactic sugar, which may cause unreliable output.

1 For an example of this limitation, see Appendix B.5.

A simple trick to circumvent this limitation, however, is to move the annotations found after desugaring a layer of syntactic sugar to another set of annotations before piping the result. During resugaring, then, these annotations should be restored in reverse order.

9.3 Recommendations

Our design was based on three properties from J. Pombrio and S. Krishnamurthi’s papers on resugaring. We have shown that these properties hold for most of our design decisions under a set of constraints and assumptions. Nevertheless, we deliberately ignored the fact that the resugaring fallback mechanism breaks the properties fundamental to resugaring. What we have noticed is that there is a need for such a mechanism for resugaring to be useful in some situations. In our case studies, we believe that breaking these properties was justified. In fact, J. Pombrio and S. Krishnamurthi themselves had to break these properties in their first paper on resugaring in order to obtain a satisfying result. Therefore, we believe that this is a deficiency of their properties central to resugaring, and that it should be addressed in future work.

9.4 Threats to Validity

In this section, we identify the most important threats to the validity of this thesis. We subdivided the threats into three categories: Performance, Formal Justification and Expressiveness.

9.4.1 Performance

While we tested our prototype under many different circumstances, bugs may still be present in our artefact. While some bugs may have led to a decrease in performance, we may also have accidentally benefited from omissions that would require more processing power if handled properly (e.g. missing validation checks).

Furthermore, we have only studied our software on a single operating system and a single computer. While we may consider our techniques to be "sufficiently performant" for practical use on a relatively modern computer, our opinion on this matter may have been different if we had used a less performant computer. Conversely, we could have considered the intermediate desugaring and resugaring techniques to be quick enough if we had used a more powerful computer.

The same argument goes for the platform we used to build our prototype upon, Rascal. Perhaps we would have found the intermediate desugaring and resugaring techniques to be sufficiently performant if we had built our prototype on top of another platform, and vice versa for the compositional techniques. For instance, we noted that one of the main problems related to the performance of desugaring using the intermediate desugaring technique was related to the performance of Rascal’s pattern matching algorithms. However, this could have been much less evident if pattern matching had been implemented more efficiently.

Probably the biggest threat to this thesis’s validity, however, is the subjectivity on the matter of performance. We have intuitively drawn conclusions about our findings on performance, but we have not clearly defined the boundaries for a system to be named "practically usable". We did not use any reference material to empirically substantiate these interpretations; we simply interpreted the performance results based on our own intuition.

Finally, we studied the performance of our prototype using a trivial performance benchmark and a performance benchmark based on an actual programming language for a small number of cases. While this may have provided a good indication of the performance of our prototype, the number and the size of the studies we conducted on performance may be too limited to draw general conclusions from.

9.4.2 Formal Justification

We have only sketched the proof of correctness of our design. During this process, we made different assumptions to formally justify our design. For example, we took the pattern matching system for granted, making the formal justification much simpler. However, as we have found during the evaluation, these assumptions do not always hold. For instance, we had to restrict the lengths of ellipses to ensure that our desugarings were isomorphic.

Similarly, our proof sketch may simply be too sparse. We have only addressed a small portion of our design, i.e. the parts we believed to be the most fundamental. For example, we did not prove anything about our solution for fixing the lengths of ellipses or anything about when-conditions. Furthermore, we simply showed that the Abstraction and Coverage properties are essentially (restricted) forms of the properties discussed in their respective papers instead of actually addressing them formally.

Our justification is just a sketch as of now, and more effort is required to finish this work. While our proof sketch provides evidence that our design is correct, and while it may function as a good starting point for future work, it is insufficient as of yet.

9.4.3 Expressiveness

We have only studied a relatively small number of cases. In most case studies, we had to add new functionality to our prototype to support the cases we studied. Therefore, it seems as if we have only scratched the surface of the practical constraints to resugaring. However, using a different platform, we could also have encountered the opposite situation: perhaps some of the techniques we introduced would not have been necessary if we had chosen another platform to build upon.

As for the techniques we introduced, since we built our prototype based on observations from a relatively small number of case studies, they are at risk of being so specialized that they would meet, and only meet, the goals of our project.

Finally, we have only studied a subset of the techniques that are available for desugaring. For instance, we ignored hygiene. Needless to say, however, hygiene and other techniques are very important for desugaring (and resugaring) as well, and our results could have differed significantly if we had considered them.

Chapter 10

Conclusion

In this thesis, we explored the practical challenges related to resugaring. We studied how the resugaring algorithms by J. Pombrio and S. Krishnamurthi behaved when integrated into a readily existing programming language. We have shown what the limitations of their techniques are, and demonstrated how we could mitigate them. Finally, we evaluated their techniques using our prototype against different case studies, thereby studying their performance and expressiveness in a practical setting.

For our case studies, we found that our prototype’s techniques based on the paper "Lifting Evaluation Sequences through Syntactic Sugar" were neither efficient nor expressive enough. As for the techniques we built based on the paper "Hygienic resugaring of compositional desugaring", we found these techniques to be sufficiently performant, yet too constrained for resugaring terms that are altered after desugaring.

Our prototype was not a one-to-one mapping from their techniques to the techniques we used. We tried to capture their essence to form a usable artefact. As such, we had to face many obstacles to get to a satisfying result. This ultimately led to a prototype with a significantly different design than the papers it was based on, while the essence of our techniques was the same.

We tried to capture the practical problems related to resugaring instead of just the theoretical ones, and have come a long way doing so. Nevertheless, it seems as if we have only scratched the surface of this very promising technique.

Chapter 11

References

[1] Justin Pombrio and Shriram Krishnamurthi. Resugaring: Lifting evaluation sequences through syntactic sugar. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, pages 361–371, New York, NY, USA, 2014. ACM.

[2] Justin Pombrio and Shriram Krishnamurthi. Hygienic resugaring of compositional desugaring. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming, ICFP 2015, pages 75–87, New York, NY, USA, 2015. ACM.

[3] Shriram Krishnamurthi. Desugaring in practice: Opportunities and challenges. In Proceedings of the 2015 Workshop on Partial Evaluation and Program Manipulation, PEPM '15, pages 1–2, New York, NY, USA, 2015. ACM.

[4] Peter J. Landin. The mechanical evaluation of expressions. Computer Journal, 6(4):308–320, January 1964.

[5] Eugene Kohlbecker, Daniel P. Friedman, Matthias Felleisen, and Bruce Duba. Hygienic macro expansion. In Proceedings of the 1986 ACM Conference on LISP and Functional Programming, LFP ’86, pages 151–161, New York, NY, USA, 1986. ACM.

[6] Paul Klint, Tijs van der Storm, and Jurgen J. Vinju. Rascal: A domain specific language for source code analysis and manipulation. In Ninth IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2009, Edmonton, Alberta, Canada, September 20-21, 2009, pages 168–177. IEEE Computer Society, 2009.

[7] Jeroen van den Bos, Mark Hills, Paul Klint, Tijs van der Storm, and Jurgen J. Vinju. Rascal: From algebraic specification to meta-programming. In Proceedings Second International Workshop on Algebraic Methods in Model-based Software Engineering, AMMSE 2011, Zurich, Switzerland, 30th June 2011, pages 15–32, 2011.

[8] Paul Klint, Tijs van der Storm, and Jurgen J. Vinju. Easy meta-programming with Rascal. Leveraging the extract-analyze-synthesize paradigm for meta-programming. In Proceedings of the 3rd International Summer School on Generative and Transformational Techniques in Software Engineering (GTTSE'09), LNCS. Springer, 2010. To appear.

[9] A. Van Deursen, P. Klint, and F. Tip. Origin tracking. Journal of Symbolic Computation, 15:523–545, 1992.

[10] J. Nathan Foster, Michael B. Greenwald, Jonathan T. Moore, Benjamin C. Pierce, and Alan Schmitt. Combinators for bidirectional tree transformations: A linguistic approach to the view-update problem. ACM Trans. Program. Lang. Syst., 29(3), May 2007.

[11] David Pacheco. Postmortem debugging in dynamic environments. Queue, 9(10):12:10–12:21, October 2011.

[12] Junsong Li, Justin Pombrio, Joe Gibbs Politz, and Shriram Krishnamurthi. Slimming languages by reducing sugar: A case for semantics-altering transformations. In 2015 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward!), Onward! 2015, pages 90–106, New York, NY, USA, 2015. ACM.

[13] Alonzo Church. An unsolvable problem of elementary number theory. American Journal of Mathematics, 58(2):345–363, April 1936.

[14] David Hilbert and Wilhelm Ackermann. Grundzüge der theoretischen Logik (1928). Benutzte Ausgabe: Berlin: Springer, 1, 1938.

[15] A. M. Turing. On computable numbers, with an application to the Entscheidungsproblem. 42:230–265, 1936.

[16] Chris F. Kemerer. An empirical validation of software cost estimation models. Commun. ACM, 30(5):416–429, May 1987.

[17] J. W. Klop. Handbook of logic in computer science (vol. 2). Chapter Term Rewriting Systems, pages 1–116. Oxford University Press, Inc., New York, NY, USA, 1992.

[18] Paul Stansifer and Mitchell Wand. Romeo: A system for more flexible binding-safe programming. In Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming, ICFP '14, pages 53–65, New York, NY, USA, 2014. ACM.

[19] Jacob Andersen, Claus Brabrand, and David Raymond Christiansen. Banana algebra: Compositional syntactic language extension. Science of Computer Programming, 78(10).

[20] M. Heimensen and T. van der Storm. Tool-supported language extensions for JavaScript. Unpublished Master's Thesis, 2015.

[21] John Maraist, Martin Odersky, David N. Turner, and Philip Wadler. Call-by-name, call-by-value, call-by-need, and the linear lambda calculus. 1994.

[22] Arjun Guha, Claudiu Saftoiu, and Shriram Krishnamurthi. The essence of JavaScript. In Proceedings of the 24th European Conference on Object-oriented Programming, ECOOP'10, pages 126–150, Berlin, Heidelberg, 2010. Springer-Verlag.

[23] Erik Meijer, Maarten Fokkinga, and Ross Paterson. Functional programming with bananas, lenses, envelopes and barbed wire. In Proceedings of the 5th ACM Conference on Functional Programming Languages and Computer Architecture, pages 124–144, New York, NY, USA, 1991. Springer-Verlag New York, Inc.

[24] Peter Seibel. Practical Common Lisp. Apress, Berkeley, CA, USA, 1st edition, 2012.

Appendix A

Prototype Implementation

In this appendix, we provide an overview of our prototype implementation. Due to its size, we do not list the source code here. The full source code of our prototype can be found at:

https://github.com/wnederhof/rascal

In total, over 141 files were changed or created in the repository (including generated files and log files) according to GitHub, of which 65 were neither generated files nor log files.1 In the next section, we list and discuss the altered and newly created files. In Section A.2, we discuss how our prototype can be installed.

A.1 Altered Files

We have altered or created the following classes (excluding generated classes, log files and files that have become irrelevant):

A.1.1 Operations on patterns and terms

Visit Match Results

The following classes were altered to allow for visiting match results matched to a term:

org.rascalmpl.interpreter.matching.AbstractMatchingResult
org.rascalmpl.interpreter.matching.AntiPattern
org.rascalmpl.interpreter.matching.ConcreteAmbiguityPattern
org.rascalmpl.interpreter.matching.ConcreteApplicationPattern
org.rascalmpl.interpreter.matching.ConcreteListPattern
org.rascalmpl.interpreter.matching.ConcreteListVariablePattern
org.rascalmpl.interpreter.matching.ConcreteOptPattern
org.rascalmpl.interpreter.matching.DescendantPattern
org.rascalmpl.interpreter.matching.GuardedPattern
org.rascalmpl.interpreter.matching.IMatchingResult
org.rascalmpl.interpreter.matching.ListPattern
org.rascalmpl.interpreter.matching.LiteralPattern
org.rascalmpl.interpreter.matching.MapPattern
org.rascalmpl.interpreter.matching.MultiVariablePattern
org.rascalmpl.interpreter.matching.NegativePattern
org.rascalmpl.interpreter.matching.NodePattern
org.rascalmpl.interpreter.matching.QualifiedNamePattern
org.rascalmpl.interpreter.matching.RegExpPatternValue
org.rascalmpl.interpreter.matching.ReifiedTypePattern
org.rascalmpl.interpreter.matching.SetPattern
org.rascalmpl.interpreter.matching.TuplePattern
org.rascalmpl.interpreter.matching.TypedMultiVariablePattern
org.rascalmpl.interpreter.matching.TypedVariablePattern
org.rascalmpl.interpreter.matching.ValuePattern
org.rascalmpl.interpreter.matching.VariableBecomesPattern

1 See: https://github.com/cwi-swat/rascal/compare/master...wnederhof:master

Classes Using Visit Match Results

The following files are used for substitution, nodeId tagging and analysis:

org.rascalmpl.interpreter.SubstitutionEvaluator
org.rascalmpl.interpreter.matching.visitor.IValueMatchingResultVisitor
org.rascalmpl.interpreter.matching.visitor.IdentityValueMatchingResultVisitor
org.rascalmpl.interpreter.matching.visitor.InnerOuterVisitor
org.rascalmpl.interpreter.matching.visitor.PatternNodeIdAccumulator
org.rascalmpl.interpreter.matching.visitor.SubstitutionVisitor
org.rascalmpl.interpreter.matching.visitor.TagDesugared

A.1.2 Environments

The following environments were altered or created:

org.rascalmpl.interpreter.env.EmptyVariablesEnvironment
org.rascalmpl.interpreter.env.Environment
org.rascalmpl.interpreter.env.StayInScopeEnvironment

EmptyVariablesEnvironment is used for creating an empty environment; StayInScopeEnvironment is used to ensure that an returned by a function cannot alter that function's data.

A.1.3 Desugaring and Resugaring

The following classes were used to implement the semantics for desugaring and resugaring:

org.rascalmpl.interpreter.result.sugar.DesugarFunction
org.rascalmpl.interpreter.result.sugar.FallbackSugarFunction
org.rascalmpl.interpreter.result.sugar.ResugarFunction
org.rascalmpl.interpreter.sugar.Desugar
org.rascalmpl.interpreter.sugar.DesugarTransformer
org.rascalmpl.interpreter.sugar.Resugar
org.rascalmpl.interpreter.sugar.ResugarTransformer
org.rascalmpl.interpreter.sugar.SugarParameters

SugarParameters is used for tagging terms. The "Transformer" classes are visitors that visit each node and attempt to desugar or resugar that node.

A.1.4 Bootstrapping and Syntax

The following files were altered for the adjusted syntax of our prototype:

src/org/rascalmpl/library/lang/rascal/grammar/Bootstrap.rsc
src/org/rascalmpl/library/lang.rascal.syntax/Rascal.rsc
org.rascalmpl.library.lang.rascal.syntax.RascalParser
org.rascalmpl.parser.ASTBuilder

A.1.5 Semantics

The following classes were altered to add semantics to our definitions:

org.rascalmpl.semantics.dynamic.Declaration
org.rascalmpl.semantics.dynamic.Expression
org.rascalmpl.semantics.dynamic.FunctionDeclaration
org.rascalmpl.semantics.dynamic.Import
org.rascalmpl.semantics.dynamic.Statement
org.rascalmpl.semantics.dynamic.SyntaxDefinition
org.rascalmpl.semantics.dynamic.Tree

Finally, the following class was altered to fix a bug in Rascal:2

org.rascalmpl.values.uptr.RascalValueFactory

A.2 Installation Instructions

Our prototype can be installed by following the instructions at:

https://github.com/cwi-swat/rascal/wiki/Rascal-Developers-Setup—Step-by-Step

and by substituting our implementation's clone URL for Rascal's clone URL:

https://github.com/wnederhof/rascal.git

Note that this process may fail if any backwards-incompatible changes have been made to the repositories using Rascal. However, since our latest local merge with Rascal was on Aug 17, 2015, our implementation should work with the repository versions from that date. This may also apply to the page containing the installation instructions.

2Bug fix: https://github.com/cwi-swat/rascal/issues/867

Appendix B

Unit Tests

We believe that our case studies provide sufficient evidence that our prototype works in most "general" cases. As such, we only provide unit tests for several important (edge) cases.1

B.1 Emulation Property

This simple test is designed to demonstrate the emulation property that our prototype adheres to. We change a value in the core representation, and validate that this value is also altered in the surface representation.

module testcases::Isomorphism
import IO;

data TestData = a() | b() | e(TestData a1, TestData a2) | f(TestData a1, TestData a2);

public TestData sugarIsomorphism(e(TestData a1, TestData a2)) ⇒ f(TestData a1, TestData a2);

/**
 * In this test we demonstrate that we can change a value on the desugared tree which
 * reflects to the value in the resugared tree.
 */
public test bool testIsomorphism() {
    TestData orig = e(a(), a());
    TestData desugared = desugar-all;
    TestData y = b();
    return resugar-all == e(a(), b());
}

/**
 * In this test we demonstrate that we can change a value on the desugared tree which
 * reflects to the value in the resugared tree.
 */
public test bool testIsomorphismControl() {
    TestData x = e(a(), a());
    TestData y = desugar-all;
    return resugar-all == e(a(), a());
}

1 The full source code of our case studies can be found at: https://github.com/wnederhof/casestudies-thesis
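The idea behind this test can be illustrated outside Rascal. The following Python sketch is only a toy model under stated assumptions: terms are nested tuples, the single rule e(x, y) ⇒ f(x, y) is hard-coded, and the names desugar_all and resugar_all are hypothetical stand-ins rather than the prototype's API. Node identity is deliberately ignored here.

```python
# Toy model of the emulation property: terms are nested tuples such as
# ("e", ("a",), ("a",)). The rule e(x, y) => f(x, y) is hard-coded and the
# function names are hypothetical; this is not the prototype's implementation.

def desugar_all(term):
    """Rewrite every e(x, y) node to f(x, y), bottom-up."""
    head, *args = term
    args = [desugar_all(arg) for arg in args]
    if head == "e" and len(args) == 2:
        head = "f"
    return (head, *args)


def resugar_all(term):
    """Rewrite every f(x, y) node back to e(x, y), bottom-up."""
    head, *args = term
    args = [resugar_all(arg) for arg in args]
    if head == "f" and len(args) == 2:
        head = "e"
    return (head, *args)


A, B = ("a",), ("b",)
orig = ("e", A, A)
core = desugar_all(orig)          # the core (desugared) representation
edited = (core[0], core[1], B)    # change the second argument in the core term
surface = resugar_all(edited)     # the change shows up in the surface term
```

In this model, editing the second argument of the core term yields a surface term whose second argument has changed accordingly, which mirrors what testIsomorphism checks.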

B.2 Fixing Ellipses Lengths

This test demonstrates that ellipses’ lengths are fixed correctly.

module testcases::FixedLength

private data TestData = a() | b() | c(list[TestData] datas1) | d(list[TestData] datas2);

@fixedLength{x}
public TestData sugarWithFixedLength(c([*TestData x, b(), *TestData y]))
    ⇒ d([*TestData x, *TestData y]);
public TestData sugarWithoutFixedLength(c([*TestData x, b(), *TestData y]))
    ⇒ d([*TestData x, *TestData y]);

/*
 * In this test, we check if the fixedLength tag works properly.
 */
public test bool testEllipses() {
    TestData orig = c([a(), b(), a()]);
    return resugar-all> == orig;
}

/*
 * Control test.
 */
public test bool testEllipsesControl() {
    TestData orig = c([a(), b(), a()]);
    return resugar-all> != orig;
}
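Why the length of an ellipsis needs to be fixed can be seen in a small model. The Python sketch below is a hedged illustration, not the prototype's algorithm: terms are (constructor, list) pairs, the rule c([*x, b, *y]) ⇒ d([*x, *y]) is hard-coded, and all function names are hypothetical. It shows that several surface terms desugar to the same core term, and that remembering the length of x selects the original one.

```python
# Toy model of the rule c([*x, b, *y]) => d([*x, *y]).
# All names are hypothetical; the prototype's matching strategy may differ.

def desugar(term):
    """Desugar c([...]) and return the core term together with len(x)."""
    head, items = term
    assert head == "c"
    i = items.index("b")              # the first b splits the list into x and y
    x, y = items[:i], items[i + 1:]
    return ("d", x + y), len(x)


def resugar_candidates(term):
    """Candidate surface terms when the length of x is unknown."""
    _, items = term
    return [("c", items[:i] + ["b"] + items[i:]) for i in range(len(items) + 1)]


def resugar_fixed(term, x_len):
    """Resugar with the length of x fixed to the value seen while desugaring."""
    _, items = term
    return ("c", items[:x_len] + ["b"] + items[x_len:])


orig = ("c", ["a", "b", "a"])
core, x_len = desugar(orig)
candidates = resugar_candidates(core)
```

For c([a, b, a]) there are three candidates; only the one with len(x) = 1 equals the original term, which is the situation the @fixedLength tag is meant to resolve.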

B.3 Node Identity

This test demonstrates that nodes’ identities are tracked.

module testcases::NodeIdentity

private data TestData = a() | b();

public TestData simpleSugar(a()) ⇒ b();

resugarable node;

/*
 * In this test, we remove a node's identity and check if it does not resugar.
 */
public test bool testRemovingIdentity() {
    TestData t = desugar-all;
    t = t[@__nodeId=""];
    return resugar-all != a();
}

/*
 * Control test.
 */
public test bool testWithIdentity() {
    TestData t = desugar-all;
    return resugar-all == a();
}
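The role of the node identity in this test can be sketched as follows. This is a minimal Python model under assumptions: the field node_id stands in for the __nodeId annotation, the identifier value "n1" is arbitrary, and the functions desugar and resugar are hypothetical simplifications of the rule a() ⇒ b().

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Node:
    name: str
    node_id: Optional[str] = None   # stand-in for the __nodeId annotation


def desugar(node: Node) -> Node:
    """a() => b(), tagging the result so it can be traced back later."""
    if node.name == "a":
        return Node("b", node_id="n1")   # "n1" is an arbitrary illustrative id
    return node


def resugar(node: Node) -> Node:
    """Only b() nodes that still carry an identity are turned back into a()."""
    if node.name == "b" and node.node_id:
        return Node("a")
    return node


core = desugar(Node("a"))
stripped = replace(core, node_id="")     # mimics t[@__nodeId=""] in the test
```

With the identity intact, the core node resugars to a(); once the identity is cleared, the node is left as it is.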

B.4 Output Type

This test demonstrates that the output type of a sugar function is checked.

module testcases::OutputType

import IO;

private data TestData = a() | b();

public int testWrongType(a()) * ⇒ b();

/*
 * This test checks whether the output type of the core pattern corresponds with the
 * desugared output type or fails.
 *
 * NOTE: This test must be run separately and fail!!
 */
public bool testConstraints() {
    println("Fail: Type mismatch: > with int");
    return false;
}

B.5 When Conditions

This test demonstrates that when-conditions and the @ensureUnchanged tags work.

module testcases::WhenConditions

syntax Exp = "a" Exp rest | "b";

@ensureUnchanged{nextExp}
public Exp testWhenCondWithTag((Exp)`aab`) ⇒ (Exp)`a`
    when Exp nextExp := (Exp)`b`;

public Exp testWhenCondNoTag((Exp)`aab`) ⇒ (Exp)`a`
    when Exp nextExp := (Exp)`b`;

/*
 * In this test, we check that variables tracked using @ensureUnchanged
 * cannot be changed after desugaring, or resugaring will fail.
 */
public test bool testWhenConditions() {
    Exp b = testWhenCondWithTag((Exp)`aab`);
    Exp d = (Exp)`aaaaab`;
    Exp c = (Exp)`a` <<< b;
    if (resugar-all == (Exp)`aab`) {
        return false;
    }
    d = (Exp)`b`;
    c = (Exp)`a` <<< b;
    if (resugar-all != (Exp)`aab`) {
        return false;
    }
    return true;
}

/*
 * Control test.
 */
public test bool testWhenConditionsNoTag() {
    Exp b = testWhenCondNoTag((Exp)`aab`);
    Exp d = (Exp)`aaaaab`;
    Exp c = (Exp)`a` <<< b;
    if (resugar-all != (Exp)`aab`) {
        return false;
    }
    d = (Exp)`b`;
    c = (Exp)`a` <<< b;
    if (resugar-all != (Exp)`aab`) {
        return false;
    }
    return true;
}

B.6 Automatic Test Generation

This test demonstrates that our prototype produces isomorphic terms using terms built from abstract datatypes.

module testcases::Arbitrary

import IO;

private data TestData
    = a()
    | b(TestData v1)
    | c(TestData v1, TestData v2)
    | d(set[TestData] elems)
    | e(list[TestData] elemsLst)
    ;

private data TestDataOut
    = _a()
    | _b(TestDataOut v1)
    | _c(TestDataOut v1, TestDataOut v2)
    | _d(set[TestDataOut] elems)
    | _e(list[TestDataOut] elemsLst)
    ;

private TestDataOut sugarFunctionCompositional(a()) ⇒ _a();
private TestDataOut sugarFunctionCompositional(b(TestData v1)) ⇒ _b(TestDataOut v1);
private TestDataOut sugarFunctionCompositional(c(TestData v1, TestData v2)) ⇒ _c(TestDataOut v1, TestDataOut v2);
private TestDataOut sugarFunctionCompositional(d(set[TestData] elems)) ⇒ _d(set[TestDataOut] elems);
private TestDataOut sugarFunctionCompositional(e(list[TestData] elemsLst)) ⇒ _e(list[TestDataOut] elemsLst);

private TestData sugarFunctionIntermediate(a()) ⇒ b(a());
private TestData sugarFunctionIntermediate(b(TestData t)) ⇒ c(TestData t, a());
private TestData sugarFunctionIntermediate(c(TestData t1, TestData t2)) ⇒ a();
private TestData sugarFunctionIntermediate(d(set[TestData] tds)) ⇒ a();
private TestData sugarFunctionIntermediate(e(list[TestData] tds)) ⇒ a();

public test bool testArbitraryCompositional(TestData t) {
    if (resugar-all> != t) {
        println(t);
        println(resugar-all>);
        return true;
    }
    return true;
}

public test bool testArbitraryIntermediate(TestData t) {
    if (resugar-all> != t) {
        println(t);
        println(resugar-all>);
        return true;
    }
    return true;
}

B.7 Duplicates

This test demonstrates that, in contrast to the techniques by J. Pombrio and S. Krishnamurthi, we allow for variable reuse in syntactic sugar transformations.

module testcases::Duplicate

import IO;

data TestData = a() | b() | c(TestData v1, TestData v2) | d(TestData v1, TestData v2);

public TestData sugarDuplicate(c(TestData v1, v1)) ⇒ TestData v1;
public TestData sugarDuplicate2(c(TestData v1, v1)) ⇒ d(TestData v1, v1);

/*
 * First check: desugaring according to the pattern succeeds.
 */
public test bool testDuplicate() {
    return desugar-all == a()
        && desugar-all == b();
}

/*
 * Second check: resugaring succeeds.
 */
public test bool testDuplicateResugars() {
    return resugar-all> == c(a(), a())
        && resugar-all> == c(b(), b());
}

/*
 * Third check: pattern matching does not work when the first v1 is different from the second.
 */
public test bool testDuplicateFail() {
    return desugar-all == c(a(), b())
        && desugar-all == c(b(), a());
}

/*
 * Fourth check: altering the term replacing v1 at d(_, v1) disallows for resugaring.
 */
public test bool testDuplicateFailAfterResugaring() {
    TestData t = desugar-all;
    TestData x = b();
    return resugar-all == d(a(), b());
}

/*
 * Fifth check: changing v1 to a() *will* allow for resugaring
 */
public test bool testDuplicateFailAfterResugaringControl1() {
    TestData t = desugar-all;
    TestData x = a();
    return resugar-all == c(a(), a());
}

/*
 * Final check: sugarDuplicate2 normally desugars and resugars if all parameters are correct.
 */
public test bool testDuplicateFailAfterResugaringControl2() {
    TestData t = desugar-all;
    return resugar-all == c(a(), a());
}
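The behaviour that the checks above exercise for sugarDuplicate2 is that of a non-linear pattern: a variable that occurs twice must be bound to equal terms in both positions, in either direction. The Python sketch below models only that aspect; the tuple encoding and the function names are assumptions made for illustration and do not reflect how the prototype matches patterns.

```python
# Toy model of the rule c(v1, v1) => d(v1, v1) with a reused variable.
# Terms are tuples such as ("c", "a", "a"); all names are hypothetical.

def desugar(term):
    """Apply the rule only when both arguments of c are equal."""
    if term[0] == "c" and len(term) == 3 and term[1] == term[2]:
        return ("d", term[1], term[2])
    return term


def resugar(term):
    """Undo the rule only when both arguments of d are still equal."""
    if term[0] == "d" and len(term) == 3 and term[1] == term[2]:
        return ("c", term[1], term[2])
    return term


core = desugar(("c", "a", "a"))
changed = (core[0], core[1], "b")   # alter the second occurrence of v1
```

In this model, a term whose two occurrences of v1 differ is left untouched by both directions, matching the third and fourth checks.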

Appendix C

Banana Algebra Transpiler

The following program was used to convert the Fun programming language found in the Banana Algebra[19] examples1 into Rascal. This program was written in Common Lisp, due to issues related to parsing large files in Rascal2. Note that the sole purpose of this artefact was to bootstrap the Fun language in Rascal, and that we altered the generated Rascal program afterwards. Once we were sufficiently pleased with the result, we made no further changes to the generator. To replicate the conversion process, alter the filenames defined in main, and call (main) using a Common Lisp[24] interpreter (we used SBCL3)4.

1 See: http://www.itu.dk/people/brabrand/banana-algebra/
2 See: https://github.com/cwi-swat/rascal/issues/866
3 See: http://www.sbcl.org/
4 The full source code of our case studies can be found at: https://github.com/wnederhof/casestudies-thesis

(defvar *curp* "")
(defvar *l-from* nil)
(defvar *l-to* nil)
(defvar *from-to-map* nil)
(defvar *input-stream* nil)
(defvar *rascal-keywords*
  '("o" "syntax" "keyword" "lexical" "int" "break" "continue" "rat" "true"
    "bag" "num" "node" "finally" "private" "real" "list" "fail" "filter"
    "if" "tag" "resugarable" "extend" "append" "rel" "lrel" "void" "non-assoc"
    "assoc" "test" "anno" "layout" "data" "join" "it" "bracket" "in" "import"
    "false" "all" "dynamic" "solve" "type" "try" "catch" "notin" "else" "insert"
    "switch" "return" "case" "while" "str" "throws" "visit" "tuple" "for"
    "assert" "loc" "default" "map" "alias" "any" "module" "mod" "bool" "public"
    "one" "throw" "set" "start" "datetime" "value" "unexpand-one" "desugar-all"
    "resugar-all" "loc" "node" "num" "type" "bag" "int" "rat" "rel" "lrel"
    "real" "tuple" "str" "bool" "void" "datetime" "set" "map" "list"))

(defun is-rascal-keyword (name)
  (some (lambda (x) (string= x name)) *rascal-keywords*))

(defun escape-name-if-necessary (name)
  (if (is-rascal-keyword name) (concatenate 'string "\\" name) name))

(defun next-char ()
  "Reads the next character from the input stream."
  (let ((c (read-char *input-stream*)))
    c))

(defun prev-char (c)
  "Puts the provided char back onto the input stream."
  (unread-char c *input-stream*))

(defun trim-stream ()
  "Removes all redundant whitespace characters from stream."
  (if (or (char= #\Tab (peek-char nil *input-stream*))
          (char= #\Return (peek-char nil *input-stream*))
          (char= #\Space (peek-char nil *input-stream*))
          (char= #\Linefeed (peek-char nil *input-stream*)))
      (progn
        (next-char)
        (trim-stream))
      nil))

(defun repeat-fn (n a)
  (if (> n 0) (progn
                (funcall a)
                (repeat-fn (- n 1) a))
      nil))

(defmacro repeat (n a)
  "Very simple macro for repeating a n-times."
  `(repeat-fn ,n (lambda () ,a)))

(defun accepts (str)
  "Checks if the provided string can be accepted."
  (if (= (length str) 0)
      t
      (let ((c (char str 0))
            (r (next-char)))
        (let ((res (and (equal r c) (accepts (subseq str 1)))))
          (prev-char r)
          res))))

(defun accept (str)
  "Shift the parser's string when the provided string is accepted. Throws 'no-accept if unacceptable input."
  (trim-stream)
  (if (not (accepts str))
      (throw 'no-accept t)
      (progn
        (repeat (length str) (next-char))
        str)))

(defun is-next-char-special ()
  (let ((r (peek-char nil *input-stream*)))
    (if (or (char= #\$ r)
            (char= #\_ r))
        r nil)))

(defun is-next-char-az ()
  (let ((r (peek-char nil *input-stream*)))
    (if (or (and (char<= #\a r) (char>= #\z r))
            (and (char<= #\A r) (char>= #\Z r)))
        r nil)))

(defun is-next-char-09 ()
  "TODO: Takes n steps. Use char>= and char<=."
  (let ((r (peek-char nil *input-stream*)))
    (if (or (char= #\0 r) (char= #\1 r) (char= #\2 r) (char= #\3 r) (char= #\4 r)
            (char= #\5 r) (char= #\6 r) (char= #\7 r) (char= #\8 r) (char= #\9 r))
        r nil)))

(defun is-next-char-az-09-dot ()
  (let ((dot (char= #\. (peek-char nil *input-stream*))))
    (or dot (is-next-char-az) (is-next-char-09))))

(defun read-name ()
  (if (or (is-next-char-az-09-dot) (is-next-char-special))
      (concatenate 'string (string (next-char)) (read-name))
      ""))

(defun accept-name ()
  (trim-stream)
  (if (not (or (is-next-char-special) (is-next-char-az)))
      (throw 'no-accept t)
      (read-name)))

(defun accepts-string ()
  (trim-stream)
  (char= (peek-char nil *input-stream*) #\"))

(defun read-string ()
  (let ((c (next-char)))
    (if (char= c #\")
        ""
        (concatenate 'string (string c) (read-string)))))

(defun accept-string ()
  (if (not (accepts-string))
      (throw 'no-accept t)
      (progn
        (next-char)
        (read-string))))

(defun accept-func ()
  "Accepts a function-style statement."
  (let ((name (accept-name)))
    (accept "(")
    (let ((args (accept-args)))
      (accept ")")
      `(,name ,@args))))

(defun accept-num ()
  (if (is-next-char-09)
      (let ((c (string (next-char))))
        (concatenate 'string c (accept-num)))
      ""))

(defun accept-ref ()
  (accept "$")
  `($ref ,(accept-num)))

(defun accepts-ref ()
  "Accepts a dollar. Note, however, this function does not check if of form $."
  (accepts "$"))

(defun accept-arg ()
  (if (accepts-string)
      (accept-string)
      (if (accepts-ref)
          (accept-ref)
          (accept-func))))

(defun accept-args ()
  "Accept arguments of the form: {Arg a ','}* until ')' is hit."
  (if (accepts ")")
      '()
      (let ((arg (accept-arg)))
        (if (accepts ",")
            (progn
              (accept ",")
              (if (not (accepts ")"))
                  (append `(,arg) (accept-args))
                  (throw 'no-accept t)))
            (if (not (accepts ")"))
                (throw 'no-accept t)
                `(,arg))))))

(defun parse-cata ()
  "Parse a single Constant Catamorphism (the c in `(| L -> L [T] c |)`)"
  (let ((name (accept-name)))
    (accept "=")
    (let ((arg (accept-arg)))
      (accept ";")
      `(,name ,arg))))

(defun parse-catas ()
  (trim-stream)
  (if (accepts "|)")
      '()
      (append (parse-cata) (parse-catas))))

(defun accept-lexical ()
  (trim-stream)
  (let ((c (peek-char nil *input-stream*)))
    (if (char= c #\;)
        ""
        (concatenate 'string (string (next-char)) (accept-lexical)))))

(defun accept-syntactical-part ()
  (trim-stream)
  (if (accepts-string)
      `(string ,(accept-string))
      `(name ,(accept-name))))

(defun accept-syntax ()
  (trim-stream)
  (if (not (accepts ";"))
      (append `(,(accept-syntactical-part)) (accept-syntax))
      '()))

;(defun store-language-def (name val)
; (setf (gethash name *language-table*) val))

(defun accept-language-rule ()
  (let ((name (accept-name)))
    (trim-stream)
    (if (accepts ":")
        (progn
          (accept ":")
          `(syntax ,name ,@(accept-syntax)))
        (if (accepts "=")
            (progn
              (accept "=")
              `(lexical ,name ,(accept-lexical)))
            (throw 'no-accept t)))))

(defun accept-language-rules ()
  (trim-stream)
  (if (not (accepts "}"))
      (let ((r (accept-language-rule)))
        (accept ";")
        (append `(,r ,@(accept-language-rules))))
      '()))

(defun accept-language ()
  (accept "{")
  (let ((rules (accept-language-rules)))
    (accept "}")
    rules))

(defun accept-type ()
  (let ((from (accept-name)))
    (accept "->")
    (let ((to (accept-name)))
      `(,from ,to))))

(defun accept-types ()
  (trim-stream)
  (if (not (accepts "]"))
      (let ((type (accept-type)))
        (trim-stream)
        (if (not (accepts "]"))
            (progn
              (accept ",")
              (append `(,type) (accept-types)))
            `(,type)))
      '()))

(defun parse (s)
  (setf *input-stream* s)
  (accept "(|")
  (let ((l-from (accept-language))) (accept "->")
    (let ((l-to (accept-language))) (accept "[")
      (let ((types (accept-types))) (accept "]")
        (let ((catas (parse-catas))) (accept "|)")
          (replace-underscores `(,l-from ,l-to ,types ,catas)))))))

(defun create-l-map2 (m ls)
  (let ((l (car ls)))
    (if (not l)
        m
        (progn
          (setf (gethash (cadr l) m) (cddr l))
          (create-l-map2 m (cdr ls))))))

(defun create-l-map (ls)
  (create-l-map2
   (make-hash-table :test 'equal) ls))

(defun create-from-to-map2 (m types)
  (if (not (car types))
      m
      (let ((from (caar types)) (to (cadar types)))
        (setf (gethash from m) to)
        (create-from-to-map2 m (cdr types)))))

(defun create-from-to-map (types)
  (create-from-to-map2
   (make-hash-table :test 'equal) types))

(defun canonical-name (n)
  (if (= (length n) 0) ""
      (let ((first (subseq n 0 1))
            (rest (subseq n 1)))
        (if (string= first ".")
            ""
            (concatenate 'string first (canonical-name rest))))))

(defun to-type (cata-name)
  (gethash (canonical-name cata-name) *from-to-map*))

(defun lhs-syntax-apply (cata-parts idx)
  (if (not cata-parts)
      ""
      (let ((type (caar cata-parts))
            (value (cadar cata-parts)))
        (cond
          ((eq type 'string)
           (concatenate 'string
                        value " "
                        (lhs-syntax-apply (cdr cata-parts) idx)))
          ((eq type 'name)
           (concatenate 'string ;value
                        "<" value " arg" (write-to-string (+ 1 idx)) "> "
                        (lhs-syntax-apply (cdr cata-parts) (+ 1 idx))))
          (T (print type) (throw 'unknown-type t))))))

(defun lhs-syntax (cata-name)
  (string-trim " \n\r"
               (lhs-syntax-apply
                (gethash cata-name *l-from*) 0)))

(defun replace-if (c c-match res)
  (if (string= c c-match)
      res c))

(defun escape-rule-rhs (r)
  (if (= 0 (length r))
      ""
      (concatenate 'string
                   (let ((c (subseq r 0 1)))
                     (replace-if
                      (replace-if c "<" "\\<")
                      ">" "\\>"))
                   (escape-rule-rhs (subseq r 1)))))

(defun transform-fun2 (to eval-args)
  (if (stringp (car to)) (car eval-args)
      (cond ((eq (caar to) 'name)
             (concatenate 'string
                          (car eval-args)
                          (transform-fun2 (cdr to) (cdr eval-args))))
            ((eq (caar to) 'string)
             (concatenate 'string
                          (escape-rule-rhs (cadar to))
                          (transform-fun2 (cdr to) eval-args))))))

(defun transform-fun (eval-fun eval-args)
  (let ((to (gethash eval-fun *l-to*)))
    (transform-fun2 to eval-args)))

(defun is-reference (cata) (eq (car cata) '$ref))

(defun ref-arg (cata) (concatenate 'string "")) ; to do

(defun eval-rhs (cata)
  (cond ((stringp cata) cata)
        ((is-reference cata) (ref-arg cata))
        (T (let ((eval-fun (car cata))
                 (eval-args (map 'list #'eval-rhs (cdr cata))))
             (transform-fun eval-fun eval-args)))))

(defun rhs-syntax (cata-content) (eval-rhs cata-content))

(defun interpret-cata (cata-name cata-content)
  (concatenate 'string
               "private " (to-type cata-name)
               " desugar((" (canonical-name cata-name) ")`" (lhs-syntax cata-name) "`)
⇒ (" (canonical-name cata-name) ") (" (to-type cata-name) ")`" (rhs-syntax cata-content) "`;
"))

(defun interpret-catas (catas)
  (if catas
      (concatenate 'string
                   (interpret-cata (car catas) (cadr catas))
                   (interpret-catas (cddr catas)))
      ""))

(defun replace-underscores (where)
  "TO DO: Obviously not very generic..."
  (subst "WS__" "__" (subst "WS_" "_" where :test 'equal) :test 'equal))

(defun sub-name (n)
  (if (= (length n) 0) ""
      (let ((first (subseq n 0 1))
            (rest (subseq n 1)))
        (if (string= first ".")
            rest
            (sub-name rest)))))

(defun create-syntax-rhs-el (syntax-rhs-el)
  (if (eq (car syntax-rhs-el) 'string)
      (concatenate 'string
                   "\"" (cadr syntax-rhs-el) "\" ")
      (concatenate 'string
                   (cadr syntax-rhs-el) " ")))

(defun create-syntax-rhs (syntax-rhs)
  (if syntax-rhs
      (concatenate 'string
                   (create-syntax-rhs-el (car syntax-rhs))
                   (create-syntax-rhs (cdr syntax-rhs)))
      ""))

(defun create-syntax (syntax)
  (if (eq (car syntax) 'syntax)
      (concatenate 'string "syntax " (canonical-name (cadr syntax))
                   " = " (escape-name-if-necessary (sub-name (cadr syntax)))
                   ": " (create-syntax-rhs (cddr syntax)) ";
")
      (create-syntax-rhs (cdddr syntax))))

(defun create-syntaxes (language)
  (if language
      (concatenate 'string
                   (create-syntax (car language))
                   (create-syntaxes (cdr language)))
      ""))

(defun interpret (l-from l-to types catas)
  (setf *l-from* (create-l-map l-from))
  (setf *l-to* (create-l-map l-to))
  (setf *from-to-map* (create-from-to-map types))
  (concatenate 'string
               "

"
               (create-syntaxes l-from)
               "

"
               (create-syntaxes l-to)
               "

"
               (interpret-catas catas)))

(defun write-file (name content)
  (with-open-file (stream name
                          :direction :output
                          :if-exists :overwrite
                          :if-does-not-exist :create)
    (format stream content)))

(defun main ()
  (with-open-file (stream "/home/wouter/Downloads/bananana/fun/fun2lambda.pretty" :direction :input)
    (write-file "~/fun2lambda.rascal"
                (apply 'interpret
                       (parse stream)))))

Appendix D

Case Studies

In this Appendix, we will list the source code for our case studies. Note that we omit the source code of the transformations for the Banana Algebra Fun language due to the size of the output transformations1.

D.1 ES6 to ES5 Resugaring

D.1.1 Arrow Functions

1 module ecma::extensions::arrow::Desugar 2 3 extend ecma::extensions::arrow::Syntax; 4 5 Expression arrowSugarInner((Expression)‘this‘) ⇒ (Expression)‘_this‘; 6 7 Expression arrowSugarInner((Expression)‘arguments‘) ⇒ (Expression)‘_arguments‘; 8 9 Expression arrowSugarInner((Statement)‘‘ | f -> arrowSugar) 10 ⇒ (Expression)‘‘; 11 12 Expression arrowSugarInner((Expression)‘‘ | f -> arrowSugar) 13 ⇒ (Expression)‘‘; 14 15 Expression arrowSugarInner(e:(Expression)‘() =\> { }‘) 16 = arrowSugar(e); 17 18 Expression arrowSugarInner(e:(Expression)‘ =\> { }‘) 19 = arrowSugar(e); 20 21 Expression arrowSugarInner(e:(Expression)‘() =\> ‘) 22 = arrowSugar(e); 23

¹ The full source code of our case studies can be found at: https://github.com/wnederhof/casestudies-thesis

// The pattern is bound to e so that it can be passed on to arrowSugar.
Expression arrowSugarInner(e:(Expression)` =\> `)
  = arrowSugar(e);

Expression arrowSugar((Expression)`() =\> { }` | body -> arrowSugarInner)
  ⇒ (Expression)`(function(_this,_arguments) {
  '  return function() {
  '
  '  };
  '})(this, undefined)`;

Expression arrowSugar((Expression)` =\> { }` | body -> arrowSugarInner)
  ⇒ (Expression)`(function(_this,_arguments) {
  '  return function() {
  '
  '  };
  '})(this, undefined)`;

Expression arrowSugar((Expression)`() =\> ` | body -> arrowSugarInner)
  ⇒ (Expression)`(function(_this,_arguments) {
  '  return function() {
  '    return ;
  '  };
  '})(this, undefined)`;

Expression arrowSugar((Expression)` =\> ` | body -> arrowSugarInner)
  ⇒ (Expression)`(function(_this,_arguments) {
  '  return function() {
  '    return ;
  '  };
  '})(this, undefined)`;

Source desugar(Source src) = desugar-all;
Source desugar_resugar(Source src) = resugar-all>;

D.1.2 Literals

module ecma::extensions::literal::Desugar
extend ecma::extensions::literal::Syntax;

import String;
import util::Math;
import IO;

@ensureUnchanged{out}
Expression sugar( (Expression)`` ) ⇒ (Expression)``
  when out := numberSugar((Numeric)``, 2, 2);

@ensureUnchanged{out}

Expression sugar( (Expression)`` ) ⇒ (Expression)``
  when out := numberSugar((Numeric)``, 2, 8);

private int convertFromBase( int base, str input ) {
  int s = 0;
  int n = size(input);
  for( str d ← split("",input) ) {
    n -= 1;
    s += toInt(d) * floor( pow(base,n) );
  }
  return s;
}

private Numeric numberSugar( Numeric n, int begin, int base ) {
  return ([Numeric]"", begin))>");
}
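The `convertFromBase` helper above weights each digit by a power of the base. As an illustration only, the same positional computation can be sketched in Python, which runs without the prototype's Rascal extensions. The names `convert_from_base` and `desugar_literal` and the sample literals are assumptions made for this sketch. So is the reading of `begin` as the length of a two-character prefix such as `0b` or `0o`.

```python
def convert_from_base(base: int, digits: str) -> int:
    """Interpret `digits` as a non-negative integer written in `base` (2 to 10)."""
    total = 0
    remaining = len(digits)
    for digit in digits:
        remaining -= 1
        value = int(digit)
        if value >= base:
            raise ValueError(f"digit {digit!r} is not valid in base {base}")
        # Each digit contributes value * base^(position from the right).
        total += value * base ** remaining
    return total


def desugar_literal(literal: str, begin: int, base: int) -> str:
    """Drop the first `begin` characters (assumed to be the prefix) and
    return the value as decimal text, as an ES5 literal would be written."""
    return str(convert_from_base(base, literal[begin:]))


print(desugar_literal("0b1010", 2, 2))
print(desugar_literal("0o17", 2, 8))
```

Unlike the listing, the sketch uses integer exponentiation rather than `floor(pow(...))`, so it involves no floating-point arithmetic.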

D.1.3 Parameters

module ecma::extensions::parameters::Desugar
extend ecma::extensions::parameters::Syntax;

resugarable node;

value generateUId;

Id(str) generateNamer() {
  set[str] names = {};
  Id generateUniqueId( str name ) {
    if( name in names ) name = gensym(names, name);
    names += name;
    return [Id]name;
  }
  return generateUniqueId;
}

@ensureUnchanged{initBody}
Function functionSugar((Function)`function (...) { }`)
  * ⇒ (Function)`function () {
  '
  '
  '}`
  when Statement initBody := spreadParameter( rest, 0, generateUId );

@ensureUnchanged{initBody}
Function functionSugar((Function)`function(...) { }`)
  * ⇒ (Function)`function() {
  '
  '
  '}`
  when Statement initBody := spreadParameter( rest, 0, generateUId );

@ensureUnchanged{initBody}
Function functionSugar((Function)`function (<{Param ","}* ps>, ...) { }`)
  * ⇒ (Function)`function ( <{Param ","}* ps> ) {
  '
  '
  '}`
  when Statement initBody := spreadParameter( rest, size( (Params)`<{Param ","}* ps>` ), generateUId );

@ensureUnchanged{initBody}
Function functionSugar((Function)`function (<{Param ","}* ps>, ...) { <Statement* body> }`)
  * ⇒ (Function)`function( <{Param ","}* ps> ) {
  '
  '
  '}`
  when Statement initBody := spreadParameter( rest, size( (Params)`<{Param ","}* ps>` ), generateUId );

@fixedLength{bef}
@fixedLength{rest}
@ensureUnchanged{initBody}
Function functionSugar((Function)`function (<{Param ","}* bef>, = , <{Param ","}* rest>) { }`)
  * ⇒ (Function)`function (<{Param ","}* bef>, , <{Param ","}* rest>) {
  '
  '
  '}`
  when Statement initBody := defaultParameter( pr, defVal, size((Params)`<{Param ","}* bef>`) );

@fixedLength{bef}
@fixedLength{rest}
@ensureUnchanged{initBody}
Function functionSugar((Function)`function (<{Param ","}* bef>, = <Expression defVal>, <{Param ","}* rest>) { }`)
  * ⇒ (Function)`function(<{Param ","}* bef>, , <{Param ","}* rest>) {
  '
  '
  '}`
  when Statement initBody := defaultParameter( pr, defVal, size((Params)`<{Param ","}* bef>`) );

private int size((Params)`<{Param ","}* ps>`) = (0 | it + 1 | _ ← ps);

Source desugar(Source src) {
  generateUId = generateNamer();
  return desugar-all;
}
Source desugar_resugar(Source src) = resugar-all;

private Statement spreadParameter( Id param, int pos, Id(str) generateUId )
  = restInit
  when
    Id len := generateUId("_len"),
    Expression key := [Expression]"",
    Statement restInit := (Statement)`for (var = arguments.length, = Array( \> 1 ? - 1 :0 ), _key = ; _key \< ; _key++) {
    ' [_key - 1] = arguments[_key];
    ' }`;

private Statement defaultParameter( Id param, Expression defVal, int pos )
  = defInit
  when
    Expression pos := [Expression]"",
    Statement defInit := (Statement)`var = arguments[] === undefined ? :arguments[];`;

D.2 Performance Benchmark

module performance::benchmark::PerformanceBenchmark

syntax Exp = "a" Exp | "b" Exp | ".";

Exp sugar1((Exp)`a`) ⇒ (Exp)`b`;
Exp sugar4((Exp)`a`) ⇒ (Exp)`bbbb`;
Exp sugar16((Exp)`a`) ⇒ (Exp)`bbbbbbbbbbbbbbbb`;
Exp sugar64((Exp)`a`) ⇒ (Exp)`bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb`;

Exp sugarInt1((Exp)`a`) * ⇒ (Exp)`b`;
Exp sugarInt4((Exp)`a`) * ⇒ (Exp)`bbbb`;
Exp sugarInt16((Exp)`a`) * ⇒ (Exp)`bbbbbbbbbbbbbbbb`;
Exp sugarInt64((Exp)`a`) * ⇒ (Exp)`bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb`;

resugarable node;

private void testCompositional(Exp exp) {
  if (resugar-all> != exp)
    throw "Unequivalent terms!";
  if (resugar-all> != exp)
    throw "Unequivalent terms!";
  if (resugar-all> != exp)
    throw "Unequivalent terms!";
  if (resugar-all> != exp)
    throw "Unequivalent terms!";
}

private void testIntermediate(Exp exp) {
  if (resugar-all> != exp)

    throw "Unequivalent terms!";
  if (resugar-all> != exp)
    throw "Unequivalent terms!";
  if (resugar-all> != exp)
    throw "Unequivalent terms!";
  if (resugar-all> != exp)
    throw "Unequivalent terms!";
}

public void testCompositional() {
  testCompositional((Exp)`a.`);
  testCompositional((Exp)`aaaa.`);
  testCompositional((Exp)`aaaaaaaaaaaaaaaa.`);
  testCompositional((Exp)`aaaaaaaaaaaaaaaaaaaaaaaa.`);
}
public void testIntermediate() {
  testIntermediate((Exp)`a.`);
  testIntermediate((Exp)`aaaa.`);
  testIntermediate((Exp)`aaaaaaaaaaaaaaaa.`);
  testIntermediate((Exp)`aaaaaaaaaaaaaaaaaaaaaaaa.`);
}

D.3 Banana Fun Language Performance Benchmark

D.3.1 Syntax

module performance::language::Syntax

extend lang::std::Layout;

keyword ReservedKeywords = "car" | "cdr" | "boolean" | "integer" | "negative" | "null" | "pair" | "positive" | "procedure" | "zero" | "cons"
  | "pair" | "let" | "letrec" | "null" | "in";

lexical ShortId = [a-zA-Z];
lexical Id = ([A-Za-z_] !<< [A-Z_a-z] [0-9A-Z_a-z]* !>> [0-9 A-Z _ a-z]) \ ReservedKeywords; // Borrowed from Rascal's syntax source code.

syntax AndExp = and: AndExp "&" NotExp ;
syntax AndExp = notexp: NotExp ;
syntax Digit = eight: "8" ;
syntax Digit = five: "5" ;
syntax Digit = four: "4" ;
syntax Digit = nine: "9" ;
syntax Digit = \one: "1" ;
syntax Digit = seven: "7" ;
syntax Digit = six: "6" ;
syntax Digit = three: "3" ;
syntax Digit = two: "2" ;
syntax Digit = zero: "0" ;
syntax Exp = letexp: LetExp ;
syntax FacNoPar = app1: FacNoPar Prim ;

syntax FacNoPar = app2: FacParen Prim ;
syntax FacNoPar = car: "car" Prim ;
syntax FacNoPar = cdr: "cdr" Prim ;
syntax FacNoPar = isbool: "boolean?" Prim ;
syntax FacNoPar = isint: "integer?" Prim ;
syntax FacNoPar = isneg: "negative?" Prim ;
syntax FacNoPar = isnull: "null?" Prim ;
syntax FacNoPar = ispair: "pair?" Prim ;
syntax FacNoPar = ispos: "positive?" Prim ;
syntax FacNoPar = isproc: "procedure?" Prim ;
syntax FacNoPar = iszero: "zero?" Prim ;
syntax FacNoPar = pred: "--" Prim ;
syntax FacNoPar = prim: Prim ;
syntax FacNoPar = succ: "++" Prim ;
syntax FacParen = app: Factor "(" LetExp ")" ;
syntax FacParen = car: "car" "(" LetExp ")" ;
syntax FacParen = cdr: "cdr" "(" LetExp ")" ;
syntax FacParen = cons: "cons" "(" LetExp "," LetExp ")" ;
syntax FacParen = isbool: "boolean?" "(" LetExp ")" ;
syntax FacParen = isint: "integer?" "(" LetExp ")" ;
syntax FacParen = isneg: "negative?" "(" LetExp ")" ;
syntax FacParen = isnull: "null?" "(" LetExp ")" ;
syntax FacParen = ispair: "pair?" "(" LetExp ")" ;
syntax FacParen = ispos: "positive?" "(" LetExp ")" ;
syntax FacParen = isproc: "procedure?" "(" LetExp ")" ;
syntax FacParen = iszero: "zero?" "(" LetExp ")" ;
syntax FacParen = paren: "(" LetExp ")" ;
syntax FacParen = pred: "--" "(" LetExp ")" ;
syntax FacParen = succ: "++" "(" LetExp ")" ;
syntax Factor = facparen: FacParen ;
syntax Factor = noparen: FacNoPar ;
syntax IntConst = digit: Digit ;
syntax IntConst = more: IntConst Digit ;
syntax LetExp = \if: OrExp "?" LetExp ":" LetExp ;
syntax LetExp = letfun: "let" Id "(" Id ")" "=" LetExp "in" LetExp ;
syntax LetExp = letrec: "letrec" Id "(" Id ")" "=" LetExp "in" LetExp ;
syntax LetExp = letvar: "let" Id "=" LetExp "in" LetExp ;
syntax LetExp = orexp: OrExp ;
syntax NotExp = not: "!" RelExp ;
syntax NotExp = relexp: RelExp ;
syntax OrExp = andexp: AndExp ;
syntax OrExp = or: OrExp "|" AndExp ;
syntax OrExp = xor: OrExp "^" AndExp ;
syntax Prim = \false: "#f" ;
syntax Prim = intConst: IntConst ;
syntax Prim = null: "#null" ;
syntax Prim = \true: "#t" ;
syntax Prim = var: Id ;
syntax RelExp = eq: SimpleExp "=" SimpleExp ;
syntax RelExp = gt: SimpleExp "\>" SimpleExp ;
syntax RelExp = gte: SimpleExp "\>=" SimpleExp ;
syntax RelExp = lt: SimpleExp "\<" SimpleExp ;
syntax RelExp = lte: SimpleExp "\<=" SimpleExp ;

syntax RelExp = neq: SimpleExp "!=" SimpleExp ;
syntax RelExp = simple: SimpleExp ;
syntax SimpleExp = add: SimpleExp "+" Term ;
syntax SimpleExp = sub: SimpleExp "-" Term ;
syntax SimpleExp = term: Term ;
syntax SimpleExp = uminus: "-" Term ;
syntax SimpleExp = uplus: "+" Term ;
syntax Term = factor: Factor ;
syntax Term = mul: Term "*" Factor ;


syntax Body = exp: "." Exp!letexp e;
syntax Body = lam: Name n Body b;
syntax Exp = app: Exp!letexp e Short s;
syntax Exp = short: Short s;
syntax Name = id: "\<" Id lId "\>" ;
syntax Name = shortid: ShortId sId;
syntax Short = lam: "\\" Name n Body b;
syntax Short = paren: "(" Exp!letexp e ")" ;
syntax Short = var: Name n;

D.4 Fun to Lambda Calculus

D.4.1 Syntax

module fun::Syntax

import Prelude;
import String;

layout Whitespace = [\t-\n\r\ ]*;

keyword ReservedKeywords
  = "true" | "false" | "succ" | "pred" | "iszero" | "plus" | "mult" | "not" | "and" | "or" | "pair"
  | "first" | "second" | "cons" | "head" | "tail" | "if" | "let" | "in" | "letrec"
  | "else" | "let*" | "apply" | "Y";

// Id Borrowed from Rascal's syntax source code.
lexical Id = ([A-Za-z_] !<< [A-Z_a-z] [0-9A-Z_a-z]* !>> [0-9 A-Z _ a-z]) \ ReservedKeywords;
lexical Integer = [0-9] !<< [0-9]+ !>> [0-9];

syntax FunctionCallInner
  = FunctionCallInner functionCallInner "," Exp exp
  | Id id "(" Exp exp;

syntax LetIdInner
  = Id id "," LetIdInner
  | Id id ")" "=" Exp e;

syntax LetRecIdInner
  = Id id "," LetRecIdInner
  | Id id ")" "=" Exp e;

start syntax Exp
  = bracket "(" Exp ")"
  | Integer
  | "true" | "false"
  | Id
  | "(" Exp e ")"
  | "succ" "(" Exp e ")"
  | "pred" "(" Exp e ")"
  | "iszero" "(" Exp e ")"
  | "plus" "(" Exp e1 "," Exp e2 ")"
  | "mult" "(" Exp e1 "," Exp e2 ")"
  | "not" "(" Exp e ")"
  | "and" "(" Exp e1 "," Exp e2 ")"
  | "or" "(" Exp e1 "," Exp e2 ")"
  | "pair" "(" Exp e1 "," Exp e2 ")"
  | "first" "(" Exp e ")"
  | "second" "(" Exp e ")"
  | "cons" "(" Exp e1 "," Exp e2 ")"
  | "head" "(" Exp e ")"
  | "tail" "(" Exp e ")"
  | "if" "(" Exp e1 ")" Exp e2 "else" Exp e3
  | FunctionCallInner functionCallInner ")"
  | "let" Id id "=" Exp e1 "in" Exp e2
  | "let*" Id id "=" Exp e1 "in" Exp e2
  | "let" Id "(" LetIdInner letIdInner "in" Exp e2
  | "let*" Id "(" LetIdInner letIdInner "in" Exp e2
  | "letrec" Id "(" LetRecIdInner letIdInner "in" Exp e2
  | "apply" "(" Exp e1 "," Exp e2 ")"
  | "Y" "(" Exp e ")"
  > func: " " Id "." Exp
  ;

resugarable node;

D.4.2 Sugar

module fun::Sugar

extend fun::CustomSugar;

import fun::LambdaData;
import fun::Syntax;
import Prelude;
import String;


LambdaData sugar((Exp)`()`) ⇒
  LambdaData e;

LambdaData sugar((Exp)` . `) ⇒
  lambda(NameData id, LambdaData exp);


// ⇒ (Lambda)`\\xy.x`;
LambdaData sugar((Exp)`true`) catch MaybeBoolean
  ⇒ lambda(x, lambda(y, var(x)))
  when x := shName("x"), y := shName("y");

// ⇒ (Lambda)`\\xy.y`;
LambdaData sugar((Exp)`false`) catch MaybeBoolean
  ⇒ lambda(x, lambda(y, var(y)))
  when x := shName("x"), y := shName("y");

// ⇒ (Lambda)`(\\nfx.f(nfx))()`;
LambdaData sugar((Exp)`succ ()`) throws MaybeInteger
  ⇒ app(
      lambda(n,
        lambda(f,
          lambda(x,
            app(var(f), app(app(var(n), var(f)), var(x)))))),
      LambdaData e)
  when f := shName("f"), x := shName("x"), n := name("n");

// ⇒ (Lambda)`(\\nfx.n(\\gh.h(gf))(\\u.x)(\\u.u))()`
LambdaData sugar((Exp)`pred ()`) throws MaybeInteger
  ⇒ app(
      lambda(n, lambda(f, lambda(x,
        app(app(app(var(n),lambda(g, lambda(h, app(var(h), app(var(g), var(f)))))),
          lambda(u, var(x))),
          lambda(u, var(u)))))), LambdaData e)
  when f := shName("f"), x := shName("x"), u := shName("u"), n := name("n"), g := name("g"), h := name("h");

// ⇒ (Lambda)`(\\n.n(\\x.(\\xy.y))(\\xy.x))()`;
LambdaData sugar((Exp)`iszero ()`) throws MaybeBoolean
  ⇒ app(
      lambda(
        n,
        app(
          app(
            var(n),
            lambda(
              x,
              lambda(
                x,
                lambda(
                  y, var(y))))),
          lambda(
            x,
            lambda(
              y,

              var(x))))), LambdaData e)
  when x := shName("x"), y := shName("y"), n := name("n");

// ⇒ (Lambda)`(\\mnfx.mf(nfx))()()`;
LambdaData sugar((Exp)`plus (, )`) throws MaybeInteger
  ⇒ app(app(
      lambda(m,
        lambda(n,
          lambda(f,
            lambda(x,
              app(
                app(
                  var(m),
                  var(f)),
                app(
                  app(
                    var(n),
                    var(f)),
                  var(x))))))), LambdaData e1), LambdaData e2)
  when f := shName("f"), x := shName("x"), m := name("m"), n := name("n");

// ⇒ (Lambda)`(\\mnf.n(mf))()()`;
LambdaData sugar((Exp)`mult (, )`) throws MaybeInteger
  ⇒ app(app(
      lambda(m,
        lambda(n,
          lambda(f,
            app(
              var(n),
              app(
                var(m),
                var(f)))))), LambdaData e1), LambdaData e2)
  when f := shName("f"), m := name("m"), n := name("n");

// ⇒ (Lambda)`(\\x.x(\\xy.y)(\\xy.x))()`;
LambdaData sugar((Exp)`not ()`) throws MaybeBoolean
  ⇒ app(
      lambda(x,
        app(
          app(
            var(x),
            lambda(x,
              lambda(y,
                var(y)))),
          lambda(x,
            lambda(y,
              var(x))))), LambdaData e)
  when x := shName("x"), y := shName("y");

// ⇒ (Lambda)`(\\xy.xy(\\xy.y))()()`;
LambdaData sugar((Exp)`and (, )`) throws MaybeBoolean
  ⇒ app(app(lambda(x,
      lambda(y,

        app(
          app(
            var(x),
            var(y)),
          lambda(x,
            lambda(y,
              var(y)))))), LambdaData e1), LambdaData e2)
  when x := shName("x"), y := shName("y");

// ⇒ (Lambda)`(\\xy.x(\\xy.x)y)()()`;
LambdaData sugar((Exp)`or (, )`) throws MaybeBoolean
  ⇒ app(app(lambda(id("x"),
      lambda(y,
        app(
          app(
            var(x),
            lambda(x,
              lambda(y,
                var(x)))),
          var(y)))), LambdaData e1), LambdaData e2)
  when x := shName("x"), y := shName("y");

// ⇒ (Lambda)`(\\abx.xab)()()`;
LambdaData sugar((Exp)`pair (, )`)
  ⇒ lambda(uniqueName, app(app(var(uniqueName), LambdaData e1), LambdaData e2))
  when uniqueName := name("pair");

// ⇒ (Lambda)`(\\abx.xab)()()`;
LambdaData sugar((Exp)`cons (, )`)
  * ⇒ (Exp)`pair (, )`;

// ⇒ (Lambda)`(\\p.p(\\xy.x))()`;
LambdaData sugar((Exp)`first ()`)
  ⇒ app1(
      lambda(p,
        app1(
          var(p),
          lambda(x,
            lambda(y,
              var(x))))),
      LambdaData e)
  when p := shName("p"), x := shName("x"), y := shName("y");

// ⇒ (Lambda)`(\\p.p(\\xy.y))()`;
LambdaData sugar((Exp)`second ()`)
  ⇒ app1(lambda(p,
      app1(
        var(p),
        lambda(x,
          lambda(y,
            var(y))))), LambdaData e)
  when p := shName("p"), x := shName("x"), y := shName("y");

LambdaData sugar((Exp)`head ()`)
  * ⇒ (Exp)`first ()`;

LambdaData sugar((Exp)`tail ()`)
  * ⇒ (Exp)`second ()`;

// ⇒ (Lambda)`()()()`;
LambdaData sugar((Exp)`if () else `)
  ⇒ app1(app1(LambdaData e1, LambdaData e2), LambdaData e3);

// ⇒ (Lambda)`(\\\<\>.())()`;
LambdaData sugar((Exp)`let = in `)
  ⇒ app1(lambda(NameData idd, LambdaData e2), LambdaData e1);

LambdaData sugar((Exp)`let* = in `)
  ⇒ app1(lambda(NameData idd, LambdaData e2), LambdaData e1);

NameData sugar((Id)``) {
  return id("_")[@__resugarFunction=(Id) (id(str _id)) {
    return parse(#Id, substring(_id, 1));
  }];
}

LambdaData sugar((Exp)``) {
  return var(id("_"))[@__resugarFunction=(Exp) (var(id(str _id))) {
    return parse(#Exp, substring(_id, 1));
  }];
}

LambdaData sugar((Exp)`Y()`)
  ⇒ app(app(
      lambda(id("x"), lambda(id("y"), app(var(id("y")), app(app(var(id("x")), var(id("x"))), var(id("y")))))),
      lambda(id("x"), lambda(id("y"), app(var(id("y")), app(app(var(id("x")), var(id("x"))), var(id("y"))))))),
      LambdaData e);

LambdaData sugar((FunctionCallInner)` (`)
  ⇒ app1(var(NameData id), LambdaData exp);

LambdaData sugar((FunctionCallInner)`, `)
  ⇒ app1(LambdaData functionCallInner, LambdaData exp);

LambdaData sugar((Exp)`)`)
  ⇒ LambdaData functionCallInner;

LambdaData sugar((LetIdInner)`) = `)
  ⇒ lambda(NameData id, LambdaData e);


LambdaData sugar((LetIdInner)`, `)
  ⇒ lambda(NameData id, LambdaData letIdInner);

LambdaData sugar((LetRecIdInner)`) = `)
  ⇒ lambda(NameData idd, LambdaData e);

LambdaData sugar((LetRecIdInner)`, `)
  ⇒ lambda(NameData idd, LambdaData letIdInner);

LambdaData sugar((Exp)`let ( in `)
  ⇒ app1(lambda(NameData id, LambdaData e), LambdaData letIdInner);

LambdaData sugar((Exp)`let* ( in `)
  ⇒ app1(lambda(NameData id, LambdaData e), LambdaData letIdInner);

LambdaData sugar((Exp)`letrec ( in `)
  ⇒ app1(lambda(NameData v1, LambdaData v4),
      app(app(
        lambda(id("x"), lambda(id("y"), app(var(id("y")), app(app(var(id("x")), var(id("x"))), var(id("y")))))),
        lambda(id("x"), lambda(id("y"), app(var(id("y")), app(app(var(id("x")), var(id("x"))), var(id("y"))))))),
        lambda(v1, LambdaData v23)));

LambdaData sugar((Exp)`apply(, )`)
  ⇒ app1(LambdaData e1, LambdaData e2);

D.4.3 CustomSugar

module fun::CustomSugar

import Prelude;
import fun::Syntax;
import fun::LambdaData;

map[str, int] alphaMap = ();

private NameData name(str prefix) {
  return id(prefix);
  if (prefix notin alphaMap) {
    alphaMap[prefix] = 0;
  }
  alphaMap[prefix] = alphaMap[prefix] + 1;
  return id("");
}

// Shared name...
// Do not alter without altering CustomSugar.
private NameData shName(str prefix) {
  return id("");
}

Exp sugar("MaybeInteger", lambda(id("f"), lambda(id("x"), LambdaData fsx))) {

  if (canUnFsx(fsx)) {
    Integer iResult = parse(#Integer, "");
    return (Exp)``;
  }
  fail;
}

bool canUnFsx(app(var(id(/^f[0-9]*/)), LambdaData inner)) = canUnFsx(inner);
bool canUnFsx(var(id(/^x[0-9]*/))) = true;
default bool canUnFsx(_) = false;

int unFsx(app(var(id(/^f[0-9]*/)), LambdaData inner)) = 1 + unFsx(inner);
int unFsx(var(id(/^x[0-9]*/))) = 0;

LambdaData createFsx(NameData f, NameData x, 0) = var(x);
default LambdaData createFsx(NameData f, NameData x, int n) = app(var(f), e)
  when e := createFsx(f, x, n - 1);

LambdaData sugar(original:(Exp)``) {
  NameData uF = id("f");
  NameData uX = id("x");
  LambdaData fs = lambda(uF, lambda(uX, createFsx(uF, uX, toInt(""))));
  return fs[@__resugarFunction=(Exp) (lambda(id(/^f[0-9]*/), lambda(id(/^x[0-9]*/), LambdaData fsx))) {
    if (canUnFsx(fsx)) {
      Integer iResult = parse(#Integer, "");
      return setAnnotations((Exp)``, getAnnotations(original));
    }
    fail;
  }];
}
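The helpers `createFsx`, `canUnFsx` and `unFsx` above build and recognise Church numerals. The integer n desugars to a term of the shape λf.λx.f(f(...(x))) with n applications of f, and resugaring counts those applications again. As a rough illustration, written in Python so that it runs without the prototype, the round trip can be sketched as follows. The classes `Var`, `App` and `Lam` and the functions `encode` and `decode` are assumptions made for this sketch. Unlike the listing, the sketch matches the names `f` and `x` exactly rather than by the patterns `/^f[0-9]*/` and `/^x[0-9]*/`.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


Term = Union[Var, App, Lam]


def encode(n: int) -> Term:
    """Desugar a non-negative integer into its Church numeral."""
    body: Term = Var("x")
    for _ in range(n):
        body = App(Var("f"), body)  # wrap one more application of f
    return Lam("f", Lam("x", body))


def decode(term: Term) -> Optional[int]:
    """Resugar a Church numeral; return None if the term has another shape."""
    if not (isinstance(term, Lam) and term.param == "f"):
        return None
    inner = term.body
    if not (isinstance(inner, Lam) and inner.param == "x"):
        return None
    body, count = inner.body, 0
    while isinstance(body, App) and body.fn == Var("f"):
        count += 1
        body = body.arg
    return count if body == Var("x") else None


print(decode(encode(3)))
```

Returning `None` for a term of another shape plays the role of `fail` in the listing: a term that is not a numeral is left for other resugaring rules.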

D.4.4 LambdaData

module fun::LambdaData

// It is necessary to store id under a node since
// strings cannot be annotated.
data NameData = id(str id);

data LambdaData
  = var(NameData n)
  | app(LambdaData l1, LambdaData l2)
  | app1(LambdaData l1, LambdaData l2)
  | lambda(NameData name, LambdaData lambdaData)
  ;

D.4.5 Evaluator

module fun::Evaluator

import fun::Sugar;

import fun::LambdaData;
import Prelude;
import String;
import Node;
import String;
extend fun::Syntax;

anno int NameData@alpha;

bool inSurfaceLanguage(node n) {
  visit(n) {
    case node n:
      if ("__nodeId" in getAnnotations(n)) {
        return false;
      }
  }
  return true;
}

&D<:node setAnno(str name, node annoSource, &D<:node applyOnto) {
  try {
    map[str, value] annos = getAnnotations(annoSource);
    if (name in annos) {
      return setAnnotations(applyOnto, getAnnotations(applyOnto) + (name: annos[name]));
    }
    return applyOnto;
  } catch value v: {
    return applyOnto;
  }
}

LambdaData apply(NameData n:id(name), LambdaData l1, LambdaData l2, set[NameData] bound) {
  return top-down-break visit(l1) {
    case var(n2:id(name2)) ⇒ l2
      when name == name2 && n@alpha == n2@alpha
    case ll:lambda(n2:id(name2)) ⇒ ll
      when name == name2 && n@alpha == n2@alpha
  }
}

map[str, int] globalBindings = ();
int bind(str v) {
  if (v in globalBindings) {
    globalBindings[v] = globalBindings[v] + 1;
  } else {
    globalBindings[v] = 1;
  }
  return globalBindings[v];
}

int sBound(bound, v) {

  return v in bound ? bound[v] : 0;
}

LambdaData alpha(LambdaData a, map[str, int] bound) {
  return top-down-break visit(a) {
    case original:id(v) ⇒
      original[@alpha=w]
      when w := sBound(bound, v)
    case original:lambda(oid:id(v), rest) ⇒
      setAnnos(original, lambda(
        setAnnos(oid, id("")[@alpha=w]),
        alpha(rest, bound + (v: w))))
      when w := bind(v)
  }
}

&D<:node setAnnos(node annoSource, &D<:node applyOnto) {
  return
    setAnno("__nodeId", annoSource,
      setAnno("__resugarFunction", annoSource, applyOnto));
}

LambdaData applyRedex(org:app1(lambda(NameData x, LambdaData y), LambdaData z), set[NameData] bound)
  = alpha(apply(x, y, z, bound), ())
  ;

LambdaData applyRedex(org:app(lambda(NameData x, LambdaData y), LambdaData z), set[NameData] bound)
  = setAnnos(org, alpha(apply(x, y, z, bound), ()))
  ;

default LambdaData applyRedex(a:app(LambdaData x, LambdaData y), set[NameData] bound) {
  try {
    return setAnnos(a, app(applyRedex(x, bound), y));
  } catch value error: {
    return setAnnos(a, app(x, applyRedex(y, bound)));
  }
}

default LambdaData applyRedex(a:app1(LambdaData x, LambdaData y), set[NameData] bound) {
  try {
    return setAnnos(a, app1(applyRedex(x, bound), y));
  } catch value error: {
    return setAnnos(a, app1(x, applyRedex(y, bound)));
  }
}

default LambdaData applyRedex(doubleo:lambda(x,y), set[NameData] bound) {
  return setAnnos(doubleo, lambda(x, applyRedex(y, bound)));
}

default LambdaData applyRedex(v:var(_), set[NameData] bound) {
  throw "no changes";
}

LambdaData stepLambda(a:app(LambdaData x, LambdaData y), set[NameData] bound) = applyRedex(setAnnos(a, app(x, y)), bound);
LambdaData stepLambda(a:app1(LambdaData x, LambdaData y), set[NameData] bound) = applyRedex(setAnnos(a, app1(x, y)), bound);
LambdaData stepLambda(original:var(id(str id)), set[NameData] bound) = original;
LambdaData stepLambda(original:lambda(NameData x, LambdaData y), set[NameData] bound) = setAnnos(original, lambda(x,stepLambda(y, bound)));

str printLambda(i:id(str idStr)) = size(idStr) > 1 ? "\<\>" : "";
str printLambda(var(NameData n)) = printLambda(n);
str printLambda(app(LambdaData e1, LambdaData e2)) = "( <printLambda(e2)>)";
str printLambda(app1(LambdaData e1, LambdaData e2)) = "( <printLambda(e2)>)";
str printLambda(lambda(NameData e1, LambdaData e2)) = "(.<printLambda(e2)>)";


bool alphaEq(i1:id(str idStr), i2:id(str idStr)) = i1@alpha == i2@alpha;
bool alphaEq(var(NameData n1), var(NameData n2)) = alphaEq(n1, n2);
bool alphaEq(app(LambdaData e1, LambdaData e2), app(LambdaData _e1, LambdaData _e2))
  = alphaEq(e1, _e1) && alphaEq(e2, _e2);
bool alphaEq(app1(LambdaData e1, LambdaData e2), app1(LambdaData _e1, LambdaData _e2))
  = alphaEq(e1, _e1) && alphaEq(e2, _e2);
bool alphaEq(lambda(NameData e1, LambdaData e2), lambda(NameData _e1, LambdaData _e2))
  = alphaEq(e1, _e1) && alphaEq(e2, _e2);

public void evalShowSteps(Exp e) {
  LambdaData l = desugar-all;
  evalShowSteps(l);
}

public void evalShowSteps(LambdaData l) {
  int step = 0;
  globalBindings = ();
  l = alpha(l,());
  LambdaData l0;
  do {
    l0 = l;
    value unexpand = resugar-all;
    if (inSurfaceLanguage(unexpand)) {
      println(unexpand);
    }
    try {
      l = stepLambda(l, {});

    } catch value v: {
      break;
    }
    step = step + 1;
  } while(!(l0 == l && alphaEq(l0, l)));
  println("Total # of steps:");
  println(step);
  println("AST:");
  println(l);
  println("Core language representation:");
  println(printLambda(l));
}
