Towards a Language Parametric World: The Language Parametric Read-Eval-Print Loop

Author: Jeroen Lappenschaar
Supervisor: Paul Klint

Amsterdam, June 16, 2014

Abstract

The Read-Eval-Print Loop (REPL) has proven itself to be a useful tool for software developers. New languages, especially DSLs, often do not have such dedicated tools. This thesis investigates whether a REPL can be parametrized for any language and, if so, how it should be instantiated. This is done in three parts: (1) an analysis is made of the features of a REPL and their language dependencies, (2) an overview of 13 popular REPLs and their features is given, and finally (3) the results of a proof-of-concept implementation are discussed. The thesis contributes a domain analysis of the REPL and concludes with pointers on the parametrization of languages and their features.

Contents

1 Introduction
    1.1 Research
        1.1.1 Language-parametric REPL
        1.1.2 Exploratory research
    1.2 Set-up

2 Background
    2.1 An introduction
    2.2 The definition
    2.3 Implementation
    2.4 Related topics
        2.4.1 Language workbenches
        2.4.2 Reusable REPLs
    2.5 Summary

3 A REPL Comparison
    3.1 Hands-on
    3.2 Selection
        3.2.1 REPL Selection
        3.2.2 Feature Selection
    3.3 REPL feature overview
    3.4 Results

4 Parametrization
    4.1 REPL Parametrization
    4.2 Feature parametrization
        4.2.1 Available information
        4.2.2 Statement Actions
        4.2.3 Statement Features
        4.2.4 Information Features
        4.2.5 Session Actions
        4.2.6 Code-file Actions
    4.3 Dependency Conclusion
    4.4 Feature Relations Conclusion

5 A LP-REPL Implementation
    5.1 Environment set-up
        5.1.1 Java
        5.1.2 Rascal
        5.1.3 Evaluator integration
    5.2 The Implementation
        5.2.1 Metrics
        5.2.2 Architecture
        5.2.3 Basic REPL
        5.2.4 Features
    5.3 Resulting interfaces
        5.3.1 Command interface
        5.3.2 Java Evaluators
        5.3.3 Rascal DSL Evaluator
        5.3.4 Interface analysis
    5.4 Summary

6 Conclusion

Bibliography

A Choice of REPLs

B REPL hands-on experience

C Java Interface

Chapter 1

Introduction

Software development tools have proven their worth from the day they were first created. They give more insight into problems (e.g., debuggers, profilers), they save developers from mind-numbing tasks (e.g., automatic builders, unit testing), and they help or save time in many other ways (e.g., bug databases, code generation, code formatting). There are many more tools to name, and as many reasons for using them. Some of these tools can be used for any language (e.g., code revision, bug databases), but most of them are dedicated to a single language.

With the development of a language, the development of these tools often quickly follows. They provide an easier introduction for new developers and pave the way for the adoption of the language. The development of these tools takes time, however. Specifically for Domain Specific Languages (DSLs) this is not desirable: DSLs are meant to be developed quickly and put to use quickly, so it is not desirable to keep creating dedicated development tools for every newly developed language.

Language workbenches have been developed to solve exactly this problem. They offer the developer an IDE that directly integrates with the language that is being developed. Most of these workbenches already offer integrated features like syntax highlighting, code folding, error marking and refactoring [7].

A tool that has not yet been added to any of these language workbenches in a parametrized fashion is the Read-Eval-Print Loop (REPL). A REPL is often described as an interactive console where one can run code line by line and instantaneously see its result. It is the console that most people will know from languages like Haskell, Scala, Python or any of the LISP dialects. Figure 1.1 shows an example of a typical REPL.

Figure 1.1: An example of a typical REPL. In this case we see a Linux terminal where the default Ruby REPL (irb) is running. The already run code shows how a user can test statements, create variables and call library functions.

The main advantage of a REPL is the direct interaction that it offers with code. To test just a specific function, a developer often creates a special main function, creates helper functions and/or comments out parts of the code. The REPL removes this necessity by giving access to any code in development and allowing interaction with that code. This helps developers in the following cases:

• Testing and debugging: One can quickly test any function with any parameter.¹

• Learning the language: The REPL instantaneously displays the result of the entered code, which is ideal for users who want to explore or experiment with the language.

In short: “REPLs facilitate exploratory programming and debugging because the read-eval-print loop is usually much faster than the classic edit-compile-run-debug cycle.” [34].

1.1 Research

So far we have seen the need for multi-language tools and a language-specific tool, the REPL. Combining the two, a REPL that can be parametrized for different languages, would bring all the benefits of the REPL to any language. This would especially benefit newly developed languages, which need to be tested and learned by their users. To examine this, the research question of this thesis is:

How can we design a language-parametric REPL, and how can it be instantiated for different languages?

The purpose of this research is twofold:

1. To give an example implementation of a language-parametrized REPL.

2. To serve as exploratory research into the language parametrization of development tools in the context of DSLs.

Both purposes will be discussed next.

1.1.1 Language-parametric REPL

Creating a language-parametric REPL will make all advantages of a REPL easily available for all languages. Next to the already mentioned advantages of a REPL, this has added benefits in some cases:

• First, it supports language development. The language developer often has to test simple statements to see if everything (syntax, grammar, parser, ...) is working as hoped. As we have seen, the REPL is an ideal tool for this job.

• Secondly, a new language needs to be learned by the users. In the case of DSLs there are higher chances that these people have no previous experience with programming. The experimental and explorative benefits of the REPL will help them here.

• Lastly, the REPL offers a user-friendly interface for the new language that can instantly be accessed.

¹ Some REPLs even have the option to control the debugger from the console.

1.1.2 Exploratory research

The second purpose of this thesis is to see how tools in general can be made language-parametric. Information gathered here could be used to improve on existing language workbenches. The REPL, with all its features, affects many aspects of a language, and the analysis of this can give insight into:

• What parts of a tool or feature are language specific.

• How the necessary information from the languages can be shared.

1.2 Set-up

To answer the research question, a domain analysis of the REPL is performed. This is done by performing background research in the literature (Chapter 2) and a thorough analysis of the most popular or outstanding REPLs (Chapter 3). It turns out that a big part of a REPL is formed by its many possible features. We defined these features and analysed how they could be parametrized for different languages (Chapter 4). All gathered information and analyses are then used to create a proof-of-concept implementation (Chapter 5) that shows a possible interface for a language-parametrized REPL.

The gathered results show that, in order to create a full-featured language-parametrized REPL, one needs a lot of additional language-specific information. This information is, however, often already available in language workbenches, and such a REPL could therefore be easily added to any existing language workbench.

Chapter 2

Background

2.1 An introduction

The REPL has long been recognized as a useful tool. In the scientific literature it is mainly described in two papers that each discuss the development of an IDE. Those IDEs are DrScheme, an IDE for the Lisp dialect Scheme, and DrJava, an IDE for Java created for educational purposes. Both papers affirm the benefits of the REPL and say the following about it:

DrJava [2] The REPL offers the advantages of “alternative entry-points”, “quickly access the various components of a program without recompiling it, or otherwise modifying the program text” and “also serves as a flexible tool for debugging and testing programs and experimenting with new libraries”

DrScheme [8] “Interactivity is primarily used for program exploration, the process of evaluating expressions in the context of a program to determine its behaviour. Frequent program exploration during development saves large amounts of conventional debugging time. Programmers use interactive environments to test small components of their programs and determine where their programs go wrong. They also patch their programs with the REPL in order to test potential improvements or bug fixes by rebinding names at the top-level.”

The REPL was presumably introduced in the first version of Lisp, where it served as the way to start running code. This contrasts with the ‘main’ function that is used as a starting point in other implementations or languages: a single point of entry versus multiple points of entry [19]. From the first moment it gave programmers access to all benefits of the REPL. In the following years the REPL kept playing an important role for many functional languages as the interface to the language. Other types of languages followed as well. Some REPLs were added to the native development environment (e.g., Scala [27], Python [22]), and in some cases they were added by supporters (e.g., DrJava for Java [2], FireBug for Javascript [9]).

Next to the basic read-eval-print part, a REPL almost always offers more features. The first Lisp implementations came with a feature that allowed a predefined symbol to recall a previously returned result. In the Lisp REPL the symbols *, ** and *** are replaced by the last, second-to-last and third-to-last result, respectively. Through the years the REPL has been extended with many other features, such as history management, syntax highlighting and search.

DrScheme was the first to integrate the REPL in an IDE [8]. This offered an even tighter integration with the development process and opened the door to new features like debugging and hyperlinking of code. In Chapter 3 an overview is given of features encountered in the most popular REPLs.
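The result-recall symbols mentioned above can be illustrated with a small sketch. It is written in Python rather than Lisp, and the class name `ResultHistory` and its methods are hypothetical names chosen for this illustration; it only shows the bookkeeping a REPL would need, not how any Lisp implementation actually does it.

```python
from collections import deque


class ResultHistory:
    """Keeps the three most recent results, newest first (illustrative only)."""

    SYMBOLS = ("*", "**", "***")

    def __init__(self):
        # A bounded deque silently drops the oldest result once it is full.
        self._results = deque(maxlen=len(self.SYMBOLS))

    def record(self, value):
        """Store the result of the statement that was just evaluated."""
        self._results.appendleft(value)

    def lookup(self, symbol):
        """Return the result that `symbol` currently stands for."""
        index = self.SYMBOLS.index(symbol)  # raises ValueError for other symbols
        if index >= len(self._results):
            raise LookupError(f"no result stored for {symbol!r} yet")
        return self._results[index]


history = ResultHistory()
for value in (10, 20, 30, 40):
    history.record(value)

print(history.lookup("*"), history.lookup("**"), history.lookup("***"))
```

After four recorded results, only the last three remain reachable, which matches the behaviour described for the three symbols.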

2.2 The definition

There is some discussion about the definition of a REPL. Even though the abbreviation seems clear-cut, there is disagreement about what to categorize as a REPL. There are two main discussion points.

The first discussion originated when languages other than Lisp also started getting shells with an eval function, which were also categorized as REPLs. Lisp supporters claimed that Lisp had the only real REPL due to its handling of data. The Lisp REPL is able to work with real data as input and output: Read reads an expression that can be parsed as an AST¹, true data, where other REPLs only read a string of data. Print prints an expression in the same form as Read accepts it, as an expression. Secondly, Lisp offers an eval function that can truly evaluate this data, where other evaluators often have to perform some tricks to get a proper interpreter to evaluate that data. According to critics, these two things give the Lisp REPL a purity that other REPLs do not have.

The second discussion point originates from the use of the term REPL in daily language. Instead of using the term for a specific type of shell for a specific programming language, the term is used in a wider context. The definition is broadened to the literal meaning of the abbreviation: all software that asks for textual input and returns an output falls under this category. Consoles and shells meant for command languages would then also be included in the definition.

The fact that ‘REPL’ is not an established term is visible in the naming of the language tools that offer REPL functionality. Haskell [13], Lisp, DrJava and Scheme advertise their tools as a REPL. Other languages, however, use the adjective ‘interactive’ or the name ‘shell’ (e.g., Interactive Ruby [26] and Python shell [23]). All these tools nevertheless offer the same kind of Read-Eval-Print Loop functionality.

When looking at the use of the word REPL in scientific research, we notice that it is only used in relation to programming languages, but is not limited to Lisp. When we look at uses of the word REPL for software, we likewise encounter it only for programming languages. All these REPLs work with languages that have the ability to keep state. Executing statements without a state to keep means only executing commands, which would significantly decrease the usability of a REPL.

We believe that the term REPL was originally used for a very specific type of data input/output and evaluation technique, but that it has evolved beyond that definition. This is supported by the languages mentioned above, as well as by the hundreds of scientific papers published since 2010². A quick sample of those papers shows that the name REPL is used only in the scope of programming languages. To conclude, we have come to the following definition for a REPL:

    A textual interface that loops through: reading a statement, evaluating this statement, and printing the output of that evaluation. This while keeping state and in a context exclusively for programming languages.

Under programming languages we do not include command languages. It could follow that the Lisp REPL is a REPL in its purest form, one that handles expressions as I/O and has a native eval implementation.

¹ An S-expression in the case of Lisp to be exact, which can be defined as an atom or a list of S-expressions.
² Google Scholar returns approximately 313 results when searching for papers since 2010 with the text “read-eval-print-loop” [28].

To complete the definition, let us define every part of the REPL:

• Read: Reads in statements in any textual form and processes them into a data format the evaluator can understand.

• Eval: Evaluates the statement while keeping state, i.e., updates the world-state with the given command and obtains the return result.

• Print: Prints the result of the command.

• Loop: Loops this Read-Eval-Print process until the user terminates.

Every implementation of a REPL is of course free to perform additional actions in between these tasks to allow for other features.
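The four parts defined above, including the requirement to keep state, can be sketched as follows. This is a minimal illustration that uses Python itself as the hosted language; the function names `read`, `evaluate` and `repl` are chosen for this sketch, and the input is passed in as a list of lines so that the example can run without a terminal. Error handling is deliberately left out here.

```python
def read(text):
    """Read: parse the text into a code object the evaluator understands."""
    try:
        return compile(text, "<repl>", "eval"), True    # an expression
    except SyntaxError:
        return compile(text, "<repl>", "exec"), False   # a statement


def evaluate(code, is_expression, state):
    """Eval: run the code against the world-state and return its result."""
    if is_expression:
        return eval(code, state)
    exec(code, state)          # statements update the state and return nothing
    return None


def repl(lines, state=None, out=print):
    """Loop: read, evaluate and print every line until the input ends."""
    state = {} if state is None else state
    for text in lines:
        code, is_expression = read(text)
        result = evaluate(code, is_expression, state)
        if result is not None:
            out(repr(result))  # Print: show the result to the user
    return state


printed = []
repl(["x = 20", "x + 1", "x * 2"], out=printed.append)
print(printed)
```

The variable `x` defined by the first line is still available to the later lines, which is the state-keeping property that separates a REPL from a plain command executor in the definition above.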

2.3 Implementation

Implementing the essentials of a REPL seems simple, as illustrated for an imperative and a functional language in Listings 2.1 and 2.2, respectively.

Listing 2.1: Example code for a REPL in an imperative language

    String command;
    // stop when there is no more input
    while ((command = readInput()) != null) {
        Object result = eval(command);
        print(result);
    }

Listing 2.2: Example code for a REPL in a functional language

    (define (repl-loop)
      (print (eval (read)))
      (repl-loop))

Apart from these basics, REPLs always offer more features, such as history management, code completion and pretty-printing. These features make the simple Read-Eval-Print process more complex. Other properties that make the implementation of a REPL more difficult are:

• Evaluation should be handled in a separate thread. This makes sure the REPL does not freeze during a long evaluation, or crash if the program is caught in an infinite loop, which would harm the user experience of the REPL [18]. In addition, any statement that is submitted while another statement is being executed should not affect the current execution and has to wait its turn before being executed itself.

• Errors have to be handled. The REPL is meant to be the environment where the user can safely experiment with the language and any created code. The REPL should be able to deal with any error and return a proper error message to the user. This may require tight integration with the evaluator [10]. When an error occurs, the REPL should not crash, and the user should be able to continue entering new statements.

Besides additional features and implementation difficulties, there are some fundamental difficulties that arise with the implementation of a REPL:

• There is the problem of conflicting declarations. A recurring use case is a user adding a function to the REPL, testing it, realizing it has a bug and then re-entering an improved version of that function. Not all languages allow redeclaration of a code entity by default. Note that this is also largely dependent on the implementation of the evaluator. Then there are the conflicting declarations between editor and REPL-entered code. Should the REPL display an error? Or should it use the newly declared function? And what happens if another function depends on that function? There is a high risk of unwanted behaviour.

• Secondly, there is the issue of direct interaction with the editor: the ability of the user to change a code file and directly have access to these changes via the REPL. This can give rise to conflicts, since changing a file may influence the world-state in which the results of previous statements are saved. Another problem is that removing a function from a file does not guarantee that it is also automatically deleted from the memory of the REPL. Developers can take different approaches to overcome this issue of live linking with the editor. Some REPL implementations accept the conflicting declarations, but most will force the user to start a new session in their REPL.³

Note that not all REPLs offer integration with an editor, so not all of them have to deal with these issues.
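The two implementation properties listed in this section, evaluation on a separate thread and error handling that never brings the REPL down, can be sketched together. The sketch below again uses Python as the hosted language; `BackgroundEvaluator` and its methods are hypothetical names for this illustration. Statements are queued so that each one waits its turn, and every error is turned into a message instead of propagating.

```python
import queue
import threading


class BackgroundEvaluator:
    """Evaluates statements one at a time on a worker thread (illustrative)."""

    def __init__(self):
        self._state = {}
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, text):
        """Queue a statement; the returned queue later receives (ok, value)."""
        reply = queue.Queue(maxsize=1)
        self._pending.put((text, reply))
        return reply

    def _run(self):
        while True:
            text, reply = self._pending.get()  # statements wait their turn
            try:
                try:
                    code, is_expression = compile(text, "<repl>", "eval"), True
                except SyntaxError:
                    code, is_expression = compile(text, "<repl>", "exec"), False
                if is_expression:
                    value = eval(code, self._state)
                else:
                    exec(code, self._state)
                    value = None
                reply.put((True, value))
            except Exception as error:
                # Report the error to the user; the worker keeps running.
                reply.put((False, f"{type(error).__name__}: {error}"))


evaluator = BackgroundEvaluator()
replies = [evaluator.submit(s) for s in ("x = 6", "x * 7", "1 / 0", "x + 1")]
results = [reply.get(timeout=5) for reply in replies]
print(results)
```

Because the front end only waits on the reply queue, it can use a timeout (as in `reply.get(timeout=5)`) and stay responsive while a long evaluation is still running. Actually interrupting a runaway evaluation needs more than this sketch provides, for example running the evaluator in a separate process.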

2.4 Related topics

2.4.1 Language workbenches

In the introduction we briefly mentioned language workbenches. “Language workbenches [...] are tools that support the efficient definition, reuse and composition of languages and their IDEs” [7]. These workbenches support the language-oriented approach and give any newly developed language immediate access to a fully featured IDE. Erdweg et al. give a thorough overview of some existing language workbenches and their features. A major advantage of a language workbench is that, since the language is also developed in this environment, it has a lot of information available about the language. It has access to the concrete syntax as well as to the abstract syntax tree, which makes it easier to create IDE features and tools.

2.4.2 Reusable REPLs

There are several tools that allow the instantiation of REPLs that support multiple languages. We will briefly discuss two of them here:

repl.it A website offering a REPL that can be instantiated for 18 different languages [25]. For all these languages an interpreter is written in (or compiled to) Javascript. The REPL offers an editor, history management and the saving of sessions. The main purpose of the website is the exploration of new languages for users [24].

Emacs As they call it: “GNU Emacs is an extensible, customizable text editor — and more” [6]. The editor is open for extension and it has on more than one occasion been used to implement a REPL, examples are Lisp [30] and Clojure [4].

Although these REPLs are being reused across languages, they lack true parametrization. repl.it requires the settings for every language to be hard-coded again. Every Emacs REPL implementation is different, since Emacs does not offer a REPL interface but a plug-in interface for editors.

³ And some, like DrScheme, follow a batch-oriented approach when restarting: they run all previous statements again in a fresh state to prevent conflicts.

2.5 Summary

This chapter has provided an overview of the REPL as it has been discussed in the literature. DrJava and DrScheme are two IDEs that have integrated the REPL and have stated its benefits for the user. We have provided a definition for the term REPL, as well as for its individual parts: Read, Eval, Print and Loop. Next to this we have discussed two implementation challenges for the REPL, as well as two fundamental problems that might arise when creating a REPL: conflicting declarations and interaction with the editor.

In the context of language parametrization we have seen that there are several REPLs whose code base is shared to support different languages, but that no true parametrization is present. We have also seen that language workbenches form an ideal environment to support a language-parametric REPL: new languages are being developed there, which provides the use case, and a lot of information about these languages is already available.

Chapter 3

A REPL Comparison

To obtain a better understanding of REPLs, a variety of REPLs is analysed. The Read-Eval-Print part forms the basis for every REPL and does not really offer a way of comparing REPLs. In practice, however, REPLs contain many more features that try to make the life of the user easier. The REPLs used in daily practice turn out to contain many distinctive features. This chapter provides an overview of a variety of REPLs and exactly which features they support. The chapter concludes with an analysis of this overview.

3.1 Hands-on

Hands-on experience with various REPLs shows how all REPLs have the same bare essentials; they all allow the user to run statements and get an evaluated output, and all REPLs are stateful. Yet, there are many variations. Some REPLs are very bare (e.g., Hugs) while others are built around an extensive IDE (e.g., Matlab, DrJava). Some REPLs run in a shell (e.g., IRB, see Figure 1.1) while others are stand-alone applications. Some REPLs have split their window in a pane for history/output and a pane for the user entering statements (e.g., DreamPie, see Figure 3.1) and some come integrated into an editor (e.g., DrRacket, see Figure 3.2).

Figure 3.1: The DreamPie REPL. Noticeable is how the window is split in two parts. The upper part displays the history and output and is not editable; the lower part is where the user can enter statements.

Figure 3.2: The DrRacket REPL in its IDE. The window is split in two parts: the upper part is the editor of the IDE, while the lower part is the actual REPL. The example code shows how a function defined in the editor is available in the REPL.

Apart from the visual differences and the degree of integration, the most notable finding from the hands-on experience is the set of implemented features. These features form a big part of the user experience. They allow the user to enter statements faster, rerun previous statements, obtain visual information, etc. Yet REPLs differ widely in the kinds of features they implement.

3.2 Selection

To make a comparison of REPLs first a selection of REPLs had to be made. Secondly a standard way of comparing the REPLs is necessary, which will be done with the implemented features of each REPL.

3.2.1 REPL Selection

The list of REPLs available is extensive and we tried out many of them. By using them and exploring their possibilities we got a good feeling for the current field of REPLs. A selection was made for further analysis based on three criteria: (1) the language or REPL is popular in daily use, (2) the REPL is a good implementation or (3) the REPL has one or more unique features. Appendix A lists all REPLs with their reason of choice and the tested version.

3.2.2 Feature Selection

Although REPLs can have different looks and implementations, the unique user experience lies in the additional features of the REPLs. These features are therefore used as the basis for comparison. Every feature we encountered in our hands-on experience was written down. Afterwards a selection was made of these features. Features that were too simple or added little to usability were omitted; examples are Undo and Redo, key-bindings, current line-highlighting and the ability to only copy code. A total of 24 features remained, and we put them in five categories depending on the way they are used.

Statement actions: An action that the user can take and that affects the written part of the current statement.

Statement features: A feature of the REPL that influences the way a statement is evaluated.

Information features: Features that provide additional (visual) information to the user.

Session actions: Actions which the user can take that influence the REPL itself or the session in which the user is working.

Code-file interactions: Features or actions that interact with source-files.

The complete list of features is shown in Table 3.1, together with the definition that was used for each feature during the analysis. The next section will dive deeper into these definitions by providing a short explanation and, where possible, an example.

Some of the listed features are related to features in editors and IDEs. Some features are similar in both, e.g., syntax highlighting, statement completion and code referencing. Others are shared but work slightly differently in the context of the REPL, e.g., debugging and output referencing. Lastly, there are features that are exclusive to the REPL, e.g., history management, end of statement detection and saved outcome.
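End of statement detection, one of the REPL-exclusive features named above, can be illustrated with a short sketch. It assumes Python as the hosted language and uses `compile_command` from Python's standard `codeop` module, which returns a code object for complete input, `None` for input that could still be completed, and raises an error for invalid input. The wrapper `statement_status` is a hypothetical helper written for this illustration.

```python
import codeop


def statement_status(source):
    """Classify REPL input as 'complete', 'incomplete' or 'invalid'."""
    try:
        code = codeop.compile_command(source, "<repl>", "single")
    except (SyntaxError, ValueError, OverflowError):
        return "invalid"
    # None means: syntactically fine so far, but more input is needed.
    return "incomplete" if code is None else "complete"


for source in ("x = 1", "def f(x):", "(1 +", "x = )"):
    print(repr(source), "->", statement_status(source))
```

A REPL with this feature would keep prompting for more lines while the status is 'incomplete' and only hand the input to the evaluator once it is 'complete'. For languages without such a helper, the same decision needs information from the parser of the language.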

3.3 REPL feature overview

The selected REPLs were compared against the selected features. This gives a proper overview of the features that are implemented by REPLs. Tables 3.2 and 3.3 show the result of the analysis. Appendix B provides a small hands-on experience with each REPL.

| Category | Feature | Definition |
|---|---|---|
| Statement actions | Manual multi-line | Command that adds a new line to current input without executing it. |
| Statement actions | Command history | “allows the user to recall, edit and rerun previous commands” [33] |
| Statement actions | History completion | Complete the current statement with a matching one from history. |
| Statement actions | Debug functionality | The ability to debug a statement entered in the REPL. |
| Statement actions | Code completion | “(...) involves the program predicting a word or phrase that the user wants to type in without the user actually typing it in completely” [31]. |
| Statement features | Multiple statements | Allow multiple statements to be executed in a single run. |
| Statement features | Finished statement detection | Continue to ask for more input when detecting that the current input is not (yet) a valid statement after pressing evaluate. |
| Statement features | Automatic import | Automatic import, or the suggestion of an import, when an entity refers to a piece of code that is in a not yet imported file. |
| Statement features | Saved outcome | The returned value from the user executed statement is automatically assigned to a variable, or made available via a magic variable, so that it can be used in the next statement. |
| Information features | Brace matching | “highlights matching sets of braces” [32] |
| Information features | Syntax highlighting | “display text (...) in different colors and fonts according to the category of terms.” [29] |
| Information features | Error reporting | Report any errors that occur during evaluation to the user. |
| Information features | Graphical output | The REPL is able to display graphical output. |
| Information features | Documentation provider | Display code-documentation to the user. |
| Information features | Output folding | The ability for the user to fold and unfold long output with the purpose of not cluttering the screen with irrelevant information. |
| Session actions | Save session | Save the current session in a file. |
| Session actions | Load session | The possibility to continue a previously saved session. |
| Session actions | Import interface | An interface that makes importing files easier. |
| Session actions | Search | Searching through the printed output. |
| Session actions | Magic functions | Written commands that are not part of the language but are used to control the REPL or language. |
| Code-file interaction | Editor integration | The REPL is integrated with or in an editor for developing code. |
| Code-file interaction | Direct interaction | The user is able to interact with code from the editor or imported files without the need to restart the REPL. |
| Code-file interaction | Code references | Code-entities contain hyperlinks that navigate to the declaration of the selected code-entity. |
| Code-file interaction | Output references | Output or errors contain hyperlinks to the location where that error or output was generated. |

Table 3.1: Overview of the features that existing REPLs implement. See Chapter 4 for complete definitions.

REPLs (language): DrJava (Java), DrRacket (Racket), IDLE (Python), DreamPie (Python), IPython notebook (Python), WinGHCi (Haskell), Hugs (Haskell)
Statement actions: Multi-line (manual), Command history, History completion, Debug functionality, Code completion
Statement features: Multiple statements, End of statement detection, Automatic import, Saved outcome
Information features: Brace matching, Syntax highlighting, Error reporting, Graphical output, Documentation provider, Output folding
Session actions: Save session, Load session, Import interface, Search, Magic functions
Code-file interaction: Editor integration, Direct interaction, Hyperlinking of entities, Hyperlinking of output or errors

Table 3.2: An overview of the most popular REPLs and their features.

REPLs (language): Internet Explorer (Javascript), Firebug (Javascript), Chrome (Javascript), IRB (Ruby), Scala (Scala), Matlab (Matlab)
Features: the same rows as in Table 3.2

Table 3.3: An overview of the most popular REPLs and their features (continued).

For our analysis we checked for every feature whether it was implemented or not. The feature did not have to work in all cases; we simply wanted to know whether the REPL supported it. For example, if finished-statement detection did not work in all cases but did in some, we put it down as a yes. The same goes for code and output referencing: if some parts had a hyperlink, we put the feature down as a yes. Table 3.4 summarizes the results and shows, for each feature, the number of analysed REPLs that implement it.

| Category | Feature | Number of REPLs |
|---|---|---|
| Statement actions | Multi-line (manual) | 7 |
| Statement actions | Command history | 13 |
| Statement actions | History completion | 5 |
| Statement actions | Debugging | 4 |
| Statement actions | Code completion | 9 |
| Statement features | Multiple statements | 10 |
| Statement features | Finished statement detection | 8 |
| Statement features | Automatic import | 2 |
| Statement features | Saved outcome | 7 |
| Information features | Brace matching | 6 |
| Information features | Syntax highlighting | 7 |
| Information features | Error reporting | 13 |
| Information features | Graphical output | 2 |
| Information features | Documentation provider | 3 |
| Information features | Output folding | 4 |
| Session actions | Save session | 5 |
| Session actions | Load session | 3 |
| Session actions | Import interface | 8 |
| Session actions | Search | 5 |
| Session actions | Magic functions | 8 |
| Codefile interaction | Integrated editor | 3 |
| Codefile interaction | Direct interaction | 1 |
| Codefile interaction | Code referencing | 2 |
| Codefile interaction | Output referencing | 3 |

Table 3.4: Summary of the analysis of REPLs, showing the total number of REPLs that implemented the particular feature.

3.4 Results

The acquired data shows us that the field of REPLs is divided: there is not a single REPL that contains all features; instead there is a wide variety of implemented features. There are only two features that are implemented by all REPLs (Command history and Error reporting).

The hands-on experience also makes the different use cases for REPLs visible. Next to the main purpose of simply running code, these are:

• DrJava really focusses on the teaching part by providing helpful messages and auto-importing files, making the step for beginner programmers as low as possible.

• Javascript REPLs focus on inspecting running code: debugging, investigating structures, etc. This is also visible from the fact that none of these REPLs have the option to save their session or import Javascript files.

• Matlab is built around a REPL and is about displaying output and calling functions, not about creating and testing new functions in it. In fact, Matlab does not even allow functions to be defined in the REPL; this can only be done in source files. This is a unique property of the Matlab IDE: it is the only REPL that does not allow its full language in the REPL.

Besides the different use-cases, languages themselves also have different properties that affect the feasibility of features for REPLs. Python, for example, contains a function that can return the documentation of any given entity. This makes it a lot easier to implement the documentation provider feature for a REPL than for a language that does not have this functionality. Another example is the referencing to error locations. In case of an error, a lot of languages return the location of that error. However, when a language does not do that, it is difficult (if not impossible) to make a link/reference. Finally, the REPL is also dependent on how the language is implemented. In an interpreted language it can, for example, be easier to implement the Saved outcome feature, since the world-state is more easily accessible.

The division of features between REPLs in different languages can be explained by those two factors: a combination of the properties of the host language and the intended purpose of the REPL dictates which features add the most to the experienced usability and are also feasible, and are thus more likely to be implemented. Some other notable observations that can be drawn from the results:

• All REPLs have the ability to display errors. This makes sense since this feedback forms an important part of the educational and testing purpose the REPL provides.

• Command History is probably considered the first and default feature of any REPL. That is also pointed out by the numbers, every REPL offers the functionality of a history.

• Integration with an IDE is a rare thing. This is notable since a lot of the advantages of a REPL come with its integration (e.g., direct testing, learning on the job). Although the same effect can be reached with the REPL running as a separate running program, better integration would result in an even higher advantage.

Chapter 4

Parametrization

In our domain analysis we have seen which parts a REPL consists of (Chapter 2) and that a big part of a fully functional REPL is formed by its features (Chapter 3). With this information we can analyse whether the REPL can be parametrized to obtain a Language Parametric REPL (LP-REPL). The analysis is divided in two parts. First, the REPL without any features is analysed. Second, all REPL features are individually analysed and discussed. This chapter concludes with a summary of the language dependencies of the features and the relations between features.

4.1 REPL Parametrization

To investigate how we can instantiate a REPL for different languages we first need to know which parts of a REPL are language specific. To do this we analyse the REPL in pieces:

Read: Reading in the characters that the user enters requires no specific language information. To evaluate the statement, however, it needs to be parsed, and this requires knowledge of the syntax. Most eval-functions accept a string as input and do the parsing themselves.

Eval: Evaluating the given statement requires knowledge of how to evaluate it. In other words, we need an evaluator for the language.

Print: Print the return value of the eval-function. If the eval-function always returns a printable object (e.g., a number or a String) there is no dependency. If the eval-function returns another type of object we need a function that converts it to some printable information.

We can conclude from this that we need two things to make the basic REPL language-parametrized: an evaluator and a function that transforms the returned value of the eval-function into a printable object. An interface for both of these functions is enough to parametrize the REPL.
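The two parameters identified above can be sketched as a pair of Java interfaces with a feature-less loop around them. This is a minimal sketch and not the interface of the proof-of-concept implementation; the names Evaluator, ResultPrinter and BasicRepl are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;

/** Hypothetical interface: evaluates one statement given as a string. */
interface Evaluator<V> {
    V eval(String statement) throws Exception;
}

/** Hypothetical interface: turns an evaluation result into printable text. */
interface ResultPrinter<V> {
    String toPrintable(V value);
}

/** A REPL without features; its only language-specific parts are its two parameters. */
final class BasicRepl<V> {
    private final Evaluator<V> evaluator;
    private final ResultPrinter<V> printer;

    BasicRepl(Evaluator<V> evaluator, ResultPrinter<V> printer) {
        this.evaluator = evaluator;
        this.printer = printer;
    }

    /** Reads lines until end of input, evaluates each one and prints the result. */
    void run(BufferedReader in, PrintWriter out) {
        try {
            String line;
            while ((line = in.readLine()) != null) {            // Read
                try {
                    V value = evaluator.eval(line);              // Eval
                    out.println(printer.toPrintable(value));     // Print
                } catch (Exception e) {
                    out.println("error: " + e.getMessage());
                }
            }                                                    // Loop
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        out.flush();
    }
}
```

Instantiating the REPL for another language then amounts to passing a different evaluator and printer; the loop itself stays untouched.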

4.2 Feature parametrization

The previous section showed that the basic Read-Eval-Print part can be parametrized for languages. The next step is to see if we can parametrize its features and, with that, the full functionality of the REPL. To do so we will dive deeper into each feature and analyse which parts of it are language dependent. For every feature we will give its definition, a description, its dependencies and its relation to other features. A dependency of a feature is some language-specific piece of information or code that the LP-REPL needs to make that feature work for that language.

4.2.1 Available information

To see which dependencies a feature has it is best to first see what kind of information is available. A basic evaluator that can execute a single statement at a time is considered to be given since this is the core functionality of the REPL. Other pieces of language information can be:

Syntax: Also known as concrete syntax. This provides information about the structure of entered statements.

Abstract Syntax Tree: Also known as AST, it “[...] represents the hierarchical syntactic structure of the source program” [1]. This is an intermediate representation for the evaluator.

State: The stored information that the evaluator maintains and is used when a new statement is being evaluated.

Although every language usually has its own parser, we don’t consider this to be a dependency, since a parser can be derived from the syntax and we want to focus on the lowest level of information required. Additionally, if the language-parametric REPL has control over its own parser it can add extra functionality that can be used for specific features.

For the sake of argument, the examples and arguments are all given in an imperative programming style.

4.2.2 Statement Actions

For each of the 24 features the definition, a description, its language dependency and its relation to other features are provided here. The dependency analysis of every feature is kept short and to the point, because the implementation of some of these features can be a research topic by itself. The purpose of this analysis is to obtain, by providing a theoretical implementation, the minimal language-specific information requirements. Although implementation techniques may vary, we believe the dependencies will hold, since the analysis will show that the language-specific information cannot be generated and, hence, needs to be provided by the language implementer.

Manual multi-line

Example:

    if(animal.hasTail() && animal.hasWisker() && //new line here
       animal.getPaws() == 4 && animal.likesMice())
        animal.say("miauw");

Definition: Command that adds a new line to current input without executing it.

Description: While normally the input would be evaluated on the press of Return, this command lets the REPL know that the current input should not be evaluated yet and adds a new line where the user can continue the statement. Often a key combination of a modifier key and Return is used. This allows the user to split a statement over multiple lines, giving a better overview (see example), or to enter and execute multiple statements at once (see: Multiple statements).

Dependency: Every language has its rules about where a statement may validly be broken by a new line. However, detecting this is not part of the feature; that is up to the user. The feature does not have any language dependencies.

Related to: This feature is necessary for some situations of Multiple statements. For single statements it is made superfluous by the feature Finished statement detection: if the user enters an incomplete statement, that feature will make sure that the REPL continues to ask for input on the new line.

Command History

Definition: “allows the user to recall, edit and rerun previous commands” [33]

Description: Often the up and down arrow keys are used to go through the executed commands by replacing the current input by that command. This saves the user the time of rewriting complex statements again.

Dependency: This feature has no dependencies.

Related to: This feature needs access to the history of the current session.

History completion

Definition: Complete the current statement with a matching one from history.

Description: For testing a user often wants to run the same statement over and over. This feature allows the user to type only the first characters and have them replaced by a previously run statement. Often Tab is used as a trigger. Searching history can in some cases also be seen as history completion.

Dependency: History completion is a simple textual match of the already entered statement versus the ones in the history. This is independent of whether the statement is valid or not. There are no dependencies.

Related to: This feature uses the history and therefore depends on it.
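Because history completion is a purely textual match, it can be sketched without any language parameter. The class below is an illustrative assumption (the name CommandHistory and the choice to prefer the most recent match are not taken from any analysed REPL).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Session history with textual prefix completion; no language knowledge is used. */
final class CommandHistory {
    private final List<String> entries = new ArrayList<>();

    /** Records a statement, whether or not it was valid in the language. */
    void add(String statement) {
        entries.add(statement);
    }

    /** Returns the most recent entry that starts with the typed prefix, if any. */
    Optional<String> complete(String typedPrefix) {
        for (int i = entries.size() - 1; i >= 0; i--) {
            if (entries.get(i).startsWith(typedPrefix)) {
                return Optional.of(entries.get(i));
            }
        }
        return Optional.empty();
    }
}
```

The same list can back the Command History feature, which only needs a cursor that moves through the entries.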

Debug functionality

Definition: The ability to debug a statement entered in the REPL.

Description: Being able to debug a statement entered in the REPL brings extra power to the testing purpose of the REPL; besides quickly testing code with different parameters users can now also inspect the code they run. Although a debugger works best with an integrated editor, some REPLs offer the ability to control the debugger using magic functions, i.e., designated textual commands interpreted by the REPL itself.

Dependency: There are different ways a debugger can be implemented, e.g., it can be integrated with the evaluator and be called with a separate parameter, or it can be a completely different entity that is called via a hook when the evaluator reaches a breakpoint. For argument’s sake we presume that the debugger is a black box that conforms to the evaluator’s interface. In that case the minimal dependency for this feature is access to the language’s debugger. Special controls like pause, step, run, put breakpoint, etc. should be added to the interface of the LP-REPL’s evaluator to control the debugger. For usability it is best that the REPL is integrated with an editor, so that the debugger can be better controlled (e.g., putting breakpoints) and the different states can be better visualised (e.g., current execution line, state of variables).

Related to: This feature benefits from an integrated editor.

Code completion

Definition: “(. . . ) involves the program predicting a word or phrase that the user wants to type in without the user actually typing it in completely” [31]. Also known as auto-complete, word-completion or word-prediction.

Description: Often a Ctrl key combination is used as a trigger, displaying a list of possibilities or, when there is only one possibility, completing the word. This feature helps the user by speeding up interaction, minimizing typing errors and reminding the user of the correct vocabulary of the language [14].

Dependency: In “Code Completion Framework for Rascal Developers” [3] code-completion is separated in three categories:

• Syntax completion: “. . . refers to the completion of syntactic elements of a language such as keywords and layout”. To enable this part these syntactic elements need to be known to the LP-REPL. This information can be obtained from the syntax of the language.

• Template completion: “. . . is a mechanism for language users to quickly insert pieces of code which are often used in certain situations”. This is the mechanism that places boilerplate code (e.g., the body of a function or class). These templates can be given, or the user could create custom ones. To enable the default templates for the LP-REPL the templates need to be given as a parameter.

• Semantic completion: “. . . analyses the current structure of the source code and attempts to extract dynamic, semantic facts from it”. This is the best-known and most useful form of code-completion, the mechanism that completes text with the user’s own defined entities. This feature needs several forms of analysis:

  – Name analysis recovers all declared entities.
  – Type analysis extracts any type information from the declaration of these entities.
  – Scope analysis checks the visibility of these entities.
  – Additional analysis might be necessary for language specific constructs (e.g., public/private constructs for visibility).

  This complete analysis cannot be generated by the LP-REPL, since the thorough analysis required is language-specific. Hence, to enable this feature the language designer needs to implement an analyser for the language. This analyser can then provide its completion suggestions to the LP-REPL. The LP-REPL should be provided with an interface for this ‘content provider’ function.

There are also other ways to implement code completion in a language parametrized way (see Bierlee’s thesis [3] for an analysis on how the language workbenches XText and Spoofax work). However, these approaches have limitations in the analysis of the language’s structure, and hence in their suggestions. The discussed approach does not have these limitations. The determination of the most likely completions is another topic by itself. Since this will impact our LP-REPL minimally we will leave it out here.
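The ‘content provider’ function mentioned above could look roughly like the following Java interface. This is a hedged sketch: CompletionProvider and VocabularyCompletionProvider are hypothetical names, and the toy provider only matches a fixed vocabulary, standing in for the name, type and scope analysis that a language implementer would supply.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hook that the language implementer fills in with real analysis. */
interface CompletionProvider {
    List<String> suggest(String input, int cursorOffset);
}

/** Toy provider: completes the identifier before the cursor from a fixed vocabulary. */
final class VocabularyCompletionProvider implements CompletionProvider {
    private final List<String> vocabulary;

    VocabularyCompletionProvider(List<String> vocabulary) {
        this.vocabulary = vocabulary;
    }

    @Override
    public List<String> suggest(String input, int cursorOffset) {
        // Walk back from the cursor to the start of the current identifier.
        int start = cursorOffset;
        while (start > 0 && Character.isJavaIdentifierPart(input.charAt(start - 1))) {
            start--;
        }
        String prefix = input.substring(start, cursorOffset);
        List<String> matches = new ArrayList<>();
        for (String word : vocabulary) {
            if (!prefix.isEmpty() && word.startsWith(prefix)) {
                matches.add(word);
            }
        }
        return matches;
    }
}
```

The LP-REPL only depends on the interface; how the suggestions are computed, and how they are ranked, stays on the language side.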

4.2.3 Statement Features

Multiple statements

Example:

    int x = 1;
    int y = 2;

    void f() {
        print("test");
    }
    f();

Definition: Allow multiple statements to be executed in a single run.

Description: A user normally runs statement by statement. However, allowing multiple statements gives the user some additional advantages. For example, a user might want to create a function and immediately call it to test it (see the second example). Alternatively, a user might want to copy and paste code consisting of multiple statements.

Dependency: A basic evaluator might only be able to handle a single statement at a time. To circumvent this problem the input needs to be parsed to extract any possible statements from it. These statements can in turn be executed one by one. To parse the input we need to know the syntax of the language.

Related to: The default behaviour for a REPL when the user presses Return is to execute the given statement, provided that the statement is considered finished (see Finished Statement Detection). Entering a second statement on a new line without execution would be impossible without a way to enter a new line, which makes this feature dependent on Manual multi-line. For entering multiple statements on a single line this relation does not exist.
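The dependency of Multiple statements on the syntax can be made concrete with a small Java sketch. StatementSplitter is a hypothetical hook that a parser generated from the language's syntax would implement; SemicolonSplitter is a toy stand-in for a C-like syntax and deliberately ignores string literals and comments, which a real parser would handle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hook: separates the input into statements, normally backed by a parser. */
interface StatementSplitter {
    List<String> split(String input);
}

/** Toy splitter: a statement ends at ';' or '}' when no braces are open. */
final class SemicolonSplitter implements StatementSplitter {
    @Override
    public List<String> split(String input) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : input.toCharArray()) {
            current.append(c);
            if (c == '{') depth++;
            if (c == '}') depth--;
            boolean endsHere = depth == 0 && (c == ';' || c == '}');
            if (endsHere) {
                statements.add(current.toString().trim());
                current.setLength(0);
            }
        }
        // Keep trailing text so an unterminated statement is not silently dropped.
        String rest = current.toString().trim();
        if (!rest.isEmpty()) {
            statements.add(rest);
        }
        return statements;
    }
}
```

Each returned statement can then be handed to the single-statement evaluator in order.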

Finished Statement Detection

Example:

    if(foo=="bar") //press Return here
        //continue to ask for input here

After entering an if-condition and pressing Return, the REPL will not evaluate, since it can detect that the statement is not finished yet; instead it will continue to ask for more input.

Definition: Continue to ask for more input when detecting that the current input is not (yet) a valid statement after pressing evaluate.

Description: Most REPLs will execute the entered statement after Return is pressed; this feature, however, allows the user to enter more code. This has the advantage that the user can quickly type his code as he is used to, with for example an if-statement on the first line and the concluding statement on the next. This feature is often implemented such that when it is triggered, it adds a new line to the input and lets the user continue there. However, the REPL could also just continue to ask for more input on the same line.

Dependency: There are two possible ways to implement this:

• Parse the input using the language’s syntax and detect whether this is a finished statement or not. This makes the feature dependent on the language syntax.

• A specific function could perform simple checks, like checking whether the number of open braces matches the number of closed braces. The downside is that it can only do simple checks and can never replace a syntax. The advantage is that it is easy to implement. The dependency would be that the LP-REPL needs access to this function.

Related to: This feature affects some use cases of Manual multi-line.
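The second, syntax-free option can be sketched in a few lines of Java. The class name BracketBalanceCheck and the assumption that string literals are delimited by double quotes with backslash escapes are illustrative choices, not part of any analysed REPL.

```java
/** Simple check without a grammar: are all brackets outside string literals closed? */
final class BracketBalanceCheck {
    static boolean looksFinished(String input) {
        int open = 0;
        boolean inString = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                       // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '(' || c == '{' || c == '[') {
                open++;
            } else if (c == ')' || c == '}' || c == ']') {
                open--;
            }
        }
        // A surplus closing bracket is left for the evaluator to report as an error.
        return open <= 0 && !inString;
    }
}
```

The limitation named above shows up immediately: the input if(foo=="bar") has balanced brackets, so this check reports it as finished, whereas a parser for the language would recognise that the body of the if-statement is still missing.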

Automatic Import

Definition: Automatic import, or the suggestion of an import, when an entity refers to a piece of code that is in a not yet imported file.

Description: This feature helps when the user forgets to import the related file. It is especially useful for beginning programmers, who might forget about this aspect of coding. This feature can only be implemented for languages that support importing.

Dependency: To enable this feature it is necessary to be able to import files. The dependencies for this are the same as for the feature Import interface. Additionally, the LP-REPL needs to be aware of which files contain which entities. That goes for any library files as well as any files in the workspace. The LP-REPL therefore needs access to the parser to read and understand other files in order to suggest a file to import. This requires the language’s syntax, but also a mechanism for the LP-REPL to understand where entities are declared. To be able to suggest imports for code in library files the LP-REPL also needs access to these files.

Related to: This feature is only possible if importing files is possible.

Saved outcome

Example:

    >> Math.sqrt(2)
    1.4142135623
    >> ans * ans
    2

Definition: The returned value from the user-executed statement is automatically assigned to a variable, or made available via a magic variable, so that it can be used in the next statement.

Description: By making the last result available the user can build result upon result, which keeps the user in the flow and avoids long nested statements. An often used identifier is ans. Some REPLs make not only the last result available but save every previous result, using for example the identifiers res0, res1, res2, etc.

Dependency: There are several possible ways of implementing this feature, each with their own limitations and dependencies:

• Textual substitution: From an evaluated statement, save its returned value in a string the way it is printed. Before evaluating a new statement, replace the used term (e.g., ans) by the saved text. This is a magic variable, since it is not part of the language. A limitation of this approach is that it will only work for values that can be evaluated the way they are printed. For a language like Java this would mean that it works for primitives like integers and strings but not for objects.

• Assignment: Before evaluating the statement, put an assignment at the beginning of the statement. For example, the user enters 4+4; behind the scenes the LP-REPL changes it to ans = 4+4 and continues to evaluate it. A dependency of this approach is that the LP-REPL needs to know how an assignment is done in the language. A limitation is that this only works for dynamically typed languages, since the type of the returned value is unknown beforehand. An implementation difficulty is that not every statement can be rewritten to an assignment, meaning that the LP-REPL should be able to detect which statements can be, or handle the resulting errors in the background.

• Access the variable-environment: To keep state, every evaluator holds its own list of assigned variables. After evaluation, a value (and in statically-typed languages a type) is returned. This can then be added to the variable-environment so that it is available to the user in the next statement. A limitation is that not every evaluator gives access to this variable-environment, which makes this approach impossible for those evaluators.

One general dependency is the name of the variable in which the result is saved. This name cannot be a keyword or contain any symbols that are not allowed in the language. For example, some languages don’t allow numbers in variable names, and the character ‘*’ as used in Lisp might also not be available.
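The textual substitution variant, with the variable name as its one parameter, can be sketched as follows. SavedOutcome is a hypothetical class; the sketch assumes the name consists of word characters (so a name such as ‘*’ would need a different matching rule) and it does not skip occurrences inside string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Textual substitution of a magic variable; the variable name is a language parameter. */
final class SavedOutcome {
    private final Pattern variable;
    private String lastPrinted = null;

    SavedOutcome(String variableName) {
        // Word boundaries keep identifiers such as "answer" from being rewritten.
        this.variable = Pattern.compile("\\b" + Pattern.quote(variableName) + "\\b");
    }

    /** Stores the last result exactly as it was printed. */
    void remember(String printedResult) {
        lastPrinted = printedResult;
    }

    /** Replaces the magic variable by the saved text before the statement is evaluated. */
    String substitute(String statement) {
        if (lastPrinted == null) {
            return statement;
        }
        return variable.matcher(statement).replaceAll(Matcher.quoteReplacement(lastPrinted));
    }
}
```

For the example above, remembering 1.4142135623 turns the input ans * ans into 1.4142135623 * 1.4142135623 before it reaches the evaluator.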

4.2.4 Information Features

Brace Matching

Definition: “highlights matching sets of braces” [32]

Description: When the cursor is placed next to a brace, the matching brace is highlighted, allowing the user to maintain an overview of complex algorithms.

Dependency: Troubling cases:

    print("(("); //ignore the brackets between quotation marks.
    Tuple t = <1, 4>; //Some languages use angle brackets for a certain data type
    if(i < 4) print (i); //‘Less than’ should not be confused here.

The above defined cases show that bracket matching is not as simple as counting brackets. Instead, the syntax should be known so that every statement can be parsed. To determine which brackets should be matched and which not, additional information is required. One way of providing this information could be with a special keyword in the syntax.
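To illustrate why extra information is needed, the sketch below matches brackets with a stack while skipping double-quoted string literals. It is an assumption-laden approximation (BraceMatcher is a hypothetical name): it handles the first troubling case, but it simply leaves angle brackets out, because telling a tuple bracket from ‘less than’ requires the grammar.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Finds the partner of the bracket at a given offset, skipping string literals. */
final class BraceMatcher {
    private static final String OPEN = "({[";
    private static final String CLOSE = ")}]";

    /** Returns the offset of the matching bracket, or -1 if there is none. */
    static int match(String text, int offset) {
        Deque<Integer> stack = new ArrayDeque<>();
        boolean inString = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                       // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (OPEN.indexOf(c) >= 0) {
                stack.push(i);
            } else if (CLOSE.indexOf(c) >= 0 && !stack.isEmpty()) {
                int open = stack.pop();
                boolean samePair = OPEN.indexOf(text.charAt(open)) == CLOSE.indexOf(c);
                if (samePair && open == offset) return i;
                if (samePair && i == offset) return open;
            }
        }
        return -1;
    }
}
```

Which characters count as brackets and which regions must be skipped are hard-coded here; in an LP-REPL this is exactly the information that would have to come from the language's syntax.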

Syntax highlighting

Example:

    String s = "text"; //Keywords, types and even comments get a different color

Definition: “display text (. . . ) in different colors and fonts according to the category of terms.” [29]

Description: By giving different types of text different colors the user maintains a better overview of the displayed code.

Dependency: To determine what the different types of text are, the text needs to be parsed. To make that possible the parser needs to know the syntax of the language. To categorize the groups of text, some form of category/color mapping must exist. Since grammars often use the same categories, a default mapping might be considered. However, those categories might not always be the best fit for a language. This is a problem since we want to deal with all possible languages, not just imperative programming languages but also everything in the scope of a DSL. Therefore a separate, language-specific mapping is needed. This gives the additional advantage that languages can have their own ‘style’.
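A default mapping that a language can override per category could be sketched like this. The class HighlightTheme, the category names and the color names are all assumptions for illustration; the categories themselves would come from the language's grammar.

```java
import java.util.HashMap;
import java.util.Map;

/** Default category-to-color mapping that a language can extend or override. */
final class HighlightTheme {
    private final Map<String, String> colors = new HashMap<>();

    HighlightTheme() {
        // Assumed default categories; a grammar may define entirely different ones.
        colors.put("keyword", "blue");
        colors.put("string", "green");
        colors.put("comment", "gray");
    }

    /** Adds or replaces the color for one category and returns this theme for chaining. */
    HighlightTheme with(String category, String color) {
        colors.put(category, color);
        return this;
    }

    /** Unknown categories fall back to the plain text color. */
    String colorOf(String category) {
        return colors.getOrDefault(category, "default");
    }
}
```

A DSL with, say, a category for physical units could then register its own entry without touching the defaults.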

Error reporting

Definition: Report any errors that occur during evaluation to the user.

Description: By reporting errors the user knows when something is wrong and why. This feedback is essential if the REPL wants to uphold its purpose of testing and teaching.

Dependency: Errors occur during evaluation; if the evaluator returns these to the LP-REPL, the LP-REPL can display them to the user. That makes this feature dependent on the evaluator, which needs to return the error messages according to a predefined interface. The errors returned from the evaluator may be written in an evaluator-specific format and might need to be reformatted. In a REPL one might, for example, choose to display some errors differently, such as pointing to the place where a syntax error occurred in the statement, or omitting the stack trace. Although not essential for error reporting, this is more user-friendly. To support this the LP-REPL needs to have access to a function that formats the error messages.
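The formatting function can be pictured as one more interface next to the evaluator. In this sketch EvalError, ErrorFormatter and CaretErrorFormatter are hypothetical names, and the assumption is made that the evaluator can report a character offset into the statement (or -1 when it cannot).

```java
/** Hypothetical error value returned by the evaluator. */
final class EvalError {
    final String message;
    final int offset;   // position in the statement, or -1 when unknown

    EvalError(String message, int offset) {
        this.message = message;
        this.offset = offset;
    }
}

/** Hypothetical hook: the language decides how its errors are shown in the REPL. */
interface ErrorFormatter {
    String format(String statement, EvalError error);
}

/** Example formatter: repeats the statement and puts a caret under the error position. */
final class CaretErrorFormatter implements ErrorFormatter {
    @Override
    public String format(String statement, EvalError error) {
        if (error.offset < 0 || error.offset > statement.length()) {
            return "Error: " + error.message;      // no usable location, print message only
        }
        StringBuilder caretLine = new StringBuilder();
        for (int i = 0; i < error.offset; i++) {
            caretLine.append(' ');
        }
        caretLine.append('^');
        return statement + "\n" + caretLine + "\nError: " + error.message;
    }
}
```

A formatter like this also gives the LP-REPL a natural place to drop a stack trace or to extract a location for the Output references feature.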

Graphical Output

Definition: The REPL is able to display graphical output.

Description: Running the code new Circle(RED, 10) would display a red circle of size 10 in the REPL or in a linked window. This feature allows the user to instantaneously see the result of any work done with graphical objects.

Dependency: To implement this the REPL should check the type of the value returned by the evaluator. If it is of a drawable type, the image should be rendered and displayed in the REPL. This means that the REPL should know which types are drawable and be able to call the method that makes them draw something. If the object is to be drawn inside the REPL, the REPL and the object should make use of the same underlying graphics library. This is a demand that won’t work for many languages. Another solution would be to display the graphical output in a separate window. This would require the REPL to be able to create such a window and add the object to it.

Documentation-provider

Definition: Display code-documentation to the user.

Description: Some languages offer the possibility of attaching documentation to a function in the code, telling the prospective user what the function does and how it should be used. Examples are Javadoc and the Python Documentation. This documentation is available for entities like variables, functions and classes. There are two ways in which this functionality can be implemented:

• Internally: The REPL itself shows the documentation in a message box of some form. This method has the possible advantage of showing the documentation without disturbing the work-flow of the user.

• Externally: The documentation is shown in a separate panel or window. For example, in DrRacket, when the cursor is placed on a function name, the user can press F1, which will open a website with the function description.

Dependency: To be able to provide the documentation the REPL must have a way to obtain it. For library functions the documentation could be provided via some sort of documentation file. To obtain the documentation for self-defined code or imported code, two ways are available:

• If the language has built-in support for obtaining this documentation, then using it is preferred. For example, Python has the built-in function help() that displays the documentation of the given parameter. In this case the documentation can be provided with the help of the evaluator.

• If the language does not have this kind of support, it would be necessary to parse the code and obtain the documentation manually. This requires the syntax as a dependency, as well as a way to know the scope (to locate the correct reference).

Besides this, a formatter might be required, depending on the style in which the documentation is written. The documentation could for example contain keywords or links which need to be formatted to be displayed properly.

Output folding

Definition: The ability for the user to fold and unfold long output with the purpose of not cluttering the screen with irrelevant information.

Description: The default setting of this feature is to fold long output so that the user is not disturbed with irrelevant data. If desired, the user can unfold the output (often by double- or right-clicking) and fold it again later. Next to output, this feature can also be applied to long results.

Dependency: This feature has no dependencies.
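That folding needs no language information is visible in a sketch: it only counts lines of text. OutputFolder and the wording of the placeholder line are assumptions for illustration, and the sketch expects a non-negative line limit.

```java
/** Folds output that exceeds a line limit; purely textual and language-independent. */
final class OutputFolder {
    static String fold(String output, int maxLines) {
        String[] lines = output.split("\n", -1);
        if (lines.length <= maxLines) {
            return output;                      // short output is shown unchanged
        }
        StringBuilder folded = new StringBuilder();
        for (int i = 0; i < maxLines; i++) {
            folded.append(lines[i]).append('\n');
        }
        folded.append("... (").append(lines.length - maxLines).append(" more lines folded)");
        return folded.toString();
    }
}
```

Unfolding is then simply a matter of displaying the original string again, which the REPL keeps alongside the folded view.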

4.2.5 Session Actions

Save session

Definition: Save the current session in a file.

Description: Allows the users to have a look at their work another time, or to send it to somebody else. This is especially useful for help with debugging. Together with Load Session this feature offers more advantages.

Dependency: This feature has no dependencies.

Related to: To save the session this feature needs to know all the statements that have been entered and executed by the user. This feature therefore needs access to the history.

Load session

Definition: The possibility to continue a previously saved session.

Description: Allows the user to continue a previous session; this can be the user’s own session or somebody else’s.

Dependency: This feature has no dependencies.

Related to: This feature is only possible if there is a Save session-feature. This feature affects the history since the loaded statements should be added to it.
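Saving and loading can be sketched as writing the history to a text file and reading it back. SessionStore is a hypothetical class, and the one-statement-per-line file format is an assumption; statements entered over multiple lines would need an escaping scheme that is left out here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Stores a session as its history, one statement per line (assumed format). */
final class SessionStore {
    static void save(List<String> history, Path file) {
        try {
            Files.write(file, history, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the saved statements in the order they were entered. */
    static List<String> load(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

After loading, the statements would be added to the history; replaying them through the evaluator is one way to also restore the evaluator's state.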

Import-interface

Definition: An interface that makes importing files easier.

Description: Writing an import statement is error-prone and time-consuming. Moreover, an evaluator often can’t import every file but is limited to importing files from a specific group of files or objects, often named the workspace. This feature makes importing files easier and faster, which in turn makes it easier for users to test their own code without pasting large chunks of code in the REPL.

Dependency: Importing is a job for the evaluator: it has to read a file, parse it, and evaluate it. Manually reading a file and evaluating every line might not work due to references to other libraries or files. These references are also the reason languages often contain something like a workspace, to keep the references limited. Therefore this feature is dependent on the evaluator offering an interface to import files. Next, for languages which work with a workspace, the LP-REPL needs to be able to set the workspace directory.

Search

Definition: Searching through the printed output.

Description: Searching through any printed output can be a useful feature for a user, especially when the user is running a longer session or when a lot of output is printed.

Dependency: This feature only needs access to any printed information, this is not language specific and therefore there are no dependencies.

Magic functions

Example:

    :help //displays the help message of the REPL, not part of the language

    exit() //closes the REPL, not part of the language

Definition: Written commands that are not part of the language but are used to control the REPL or language.

Description: This feature allows users to control the REPL without switching to another input device (mouse) or interface (graphical). For REPLs in a terminal this is the only way to control the REPL. To make this feature work the input should be parsed/scanned first, and only be passed to the evaluator if it is not a magic function. If it is, the REPL should handle the functionality itself instead of the evaluator. Many REPLs offer the commands exit and clear.

Dependency: Checking statements can be done before the eval function is called, therefore there are no dependencies. However, the functions might conflict with functions or identifiers of the running language. Avoiding this would require parametrizing the names of the magic functions for every language. This is not a desirable situation, since it would eliminate the possibility of a unified way of controlling the LP-REPL.
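The check before evaluation can be sketched as a small dispatcher. MagicFunctions is a hypothetical class; making the prefix (such as the ‘:’ in :help) the only configurable part is an assumption of this sketch, chosen so that the command names themselves can stay the same for every language.

```java
import java.util.HashMap;
import java.util.Map;

/** Intercepts REPL commands before the input reaches the evaluator. */
final class MagicFunctions {
    private final String prefix;
    private final Map<String, Runnable> commands = new HashMap<>();

    MagicFunctions(String prefix) {
        this.prefix = prefix;
    }

    void register(String name, Runnable action) {
        commands.put(name, action);
    }

    /** Runs the command and returns true if the input is a known magic function. */
    boolean tryHandle(String input) {
        String trimmed = input.trim();
        if (!trimmed.startsWith(prefix)) {
            return false;
        }
        Runnable action = commands.get(trimmed.substring(prefix.length()));
        if (action == null) {
            return false;                       // unknown command: leave it to the evaluator
        }
        action.run();
        return true;
    }
}
```

The REPL loop would call tryHandle first and only pass the input to the evaluator when it returns false.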

4.2.6 Code-file Actions

Integrated editor

Definition: The REPL is integrated with or in an editor for developing code.

Description: An example of this is an IDE with an integrated REPL, in contrast to a stand-alone REPL that has no possibility to edit files. The integration of an editor offers many advantages in usability: the user doesn’t have to switch windows to test code, tools such as the debugger can be shared and better controlled, hyperlinking to code is possible, etc. On an implementation level it also offers the advantage of sharing resources and tools, e.g., parser, syntax highlighter, debugger, etc.

Dependency: There are two ways to add an editor: either create one or integrate with an existing one. Either way, if the editor is kept as simple as a text-editor, that is without any features, no language-specific information is required. To extend the editor with features (e.g., syntax highlighting, code-completion, etc.), the information from the REPL’s own features can be used.

Related to: This feature can make the following features easier to use: Debugger, Code referencing, Output referencing. Other features that can be shared with the editor are: Brace-matching, Syntax highlighting, Code completion, Documentation-provider and Error reporting.

Direct interaction

Definition: The user is able to interact with code from the editor or imported files without the need to restart the REPL.

Description: This feature allows the user to fix a bug in an imported file and then continue the session without the need to restart the REPL and lose any defined entities. For DrJava [2] and DrScheme [8] this feature was discussed as a difficult problem, due to the conflict that arises with updating pointers to code entities. It was also considered error-prone, since a user might be confused about which function he would be calling.

Dependency: Implementing this feature for any possible language would mean keeping track of all code declarations and every change to a file. Even if that can be done properly it would still result in confusing situations where it is hard to decide which code is valid: the latest from the REPL, the latest changed version from the imported file, or the original value? In some cases it could be a useful feature, for example for languages that do not keep state, or if state changes could by some form of restriction only be made in the REPL and not in the editor. This, however, does not fit the idea of a unifying REPL for all languages and therefore falls outside the scope of this research.

Related to: To allow direct interaction with a separate file, these files need to be able to be imported. The feature could however also be implemented in a way that the file that is currently open in the integrated editor is available in the REPL. Direct interaction needs therefore either one of these features, or both.

Output references Definition: Output or errors contain hyperlinks to the location where that error or output was generated.

Description: This feature allows quick navigation to the source of an error or output, saving the user the time and trouble of locating it himself. This feature differs from Code referencing since it is limited to specific functions. Some REPLs offer this feature for parts of their code; for example, the output of a logger is linked to the position where that logger is called.

Dependency: In most languages an error contains the location of its source. To support error hyperlinking, any returned error must be comprehensible to the LP-REPL. With a specific function that formats these errors, similar to the feature Error reporting, the location can be given to the LP-REPL. This only works if the error contains the location. The LP-REPL can then use it to create a link from the error and open the location when clicked. To support output, and errors that don't contain a location, a more complicated approach is necessary. Since it is impossible to trace back the origin of the call, this feature can only be supported if the output function itself provides its location. That might require adjustments to the language itself. The location can point to one of two places. It can lie in an imported file, which requires the LP-REPL to let the editor open the file; for use with an external editor a format for communication should be established. It can also lie in the REPL itself, i.e., at an entity declared in the REPL, in which case the LP-REPL needs to draw the user's attention to that line.
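The error-formatting function described above can be sketched as a small per-language hook. This is a minimal illustration, not the thesis implementation: the names `SourceLocation`, `ErrorLocator` and `PythonTracebackLocator` are hypothetical, and the example assumes CPython-style traceback lines of the form `File "name.py", line N`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value type: where an error or output originated. */
final class SourceLocation {
    final String file;
    final int line;

    SourceLocation(String file, int line) {
        this.file = file;
        this.line = line;
    }
}

/** Hypothetical per-language hook: extracts a location from raw error text, if one is present. */
interface ErrorLocator {
    Optional<SourceLocation> locate(String errorText);
}

/** Example locator for CPython-style tracebacks (an assumption about the error format). */
class PythonTracebackLocator implements ErrorLocator {
    private static final Pattern FRAME = Pattern.compile("File \"([^\"]+)\", line (\\d+)");

    @Override
    public Optional<SourceLocation> locate(String errorText) {
        Matcher m = FRAME.matcher(errorText);
        SourceLocation last = null;
        while (m.find()) { // keep the last frame, which is closest to the failure
            last = new SourceLocation(m.group(1), Integer.parseInt(m.group(2)));
        }
        return Optional.ofNullable(last); // empty when the error carries no location
    }
}
```

An empty result corresponds to the case discussed above in which the error does not contain a location, so no hyperlink can be created.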

Related to: This feature is similar to Code referencing. If an integrated editor is available, this will be used, otherwise an external editor will be used.

Code references Definition: Code-entities contain hyperlinks that navigate to the declaration of the selected code-entity.

Description: This feature allows quick navigation to user-declared entities. The feature is often implemented in editors, where Ctrl+Click is commonly used as the trigger to navigate. Besides user-declared entities, it can also be possible to navigate to library functions.

Dependency: Finding the location of a declared entity is complicated. There are several ways in which this can be implemented, depending on the language constructs and on whether and where the language holds locations of code-entities. One approach is to hold locations in the AST-tree as a form of source map. This is a rather brute-force approach since it requires the LP-REPL to parse all imported files as well as the REPL history. However, it would work for all languages. Since it cannot be detected from the syntax itself what a declaration is, this information should somehow be passed in the syntax, or be given to the LP-REPL in another way. Another approach could be to let the evaluator return the location of all declared entities. This requires an evaluator that supports this. The evaluator could, via a special interface, let the LP-REPL know the locations of these entities. To visualise the location in the LP-REPL or editor, the same dependencies hold as mentioned in Output referencing.

Related to: This feature is similar to Output referencing. If an integrated editor is available, this will be used, otherwise an external editor will be used.

4.3 Dependency Conclusion

Of the 24 analysed features, two do not fit in the context of a language parametric REPL: Magic functions and Direct interaction. These features do not work in a unifying way for different languages.

We can divide the remaining features into four groups, depending on the amount of information they require:

Independent features: These are features that could be implemented without any language-specific information besides the evaluator, meaning they can be implemented in the LP-REPL and work for any language. We call these features independent because they do not depend on language-specific information.

A: These are the features that are syntax-dependent. When the syntax is given and the LP-REPL can parse the language, these features work. They have no need for additional information.

B: Features that require a specific interface to the evaluator or require some form of additional information fall under this category.

C: Any features that require more specific information fall under this category; examples are features that need access to the AST-tree or that require a specific function to be built for every language.

Table 4.1 gives an overview of the features and to which category they belong.

| Category | Information | Implementable features |
|---|---|---|
| Independent | - | Manual multi-line, Command history, History completion, Output folding, Save session, Load session, Search, Integrated editor |
| A | Syntax | Multiple statements, Finished statement detection |
| B | Syntax, simple interfaces or information | Bracket matching, Syntax highlighting, Error reporting, Import interface, Output references |
| C | Specific function, access to the AST-tree | Debug-interface, Code-completion, Automatic Import, Saved outcome, Graphical Output, Documentation provider, Code references |

Table 4.1: Categorization of the features depending on the amount of language-specific information required to implement.

Note that the division created in this analysis depends on the way of implementation. For example, bracket matching could also be implemented with a function instead of with the syntax. In that case it would belong to category C. All features are categorized according to the analysis in the previous section, where, in the case of multiple possibilities, the implementation with the lowest requirement for information was preferred.

To conclude: when building the features for a language-parametric REPL, there are eight features that work for any language without any language-specific information. There are two features that require the syntax of the language, so that the LP-REPL can parse the given statements and provide these features with a parsed statement. Five features require some form of an interface with the evaluator to communicate, or some specific information. Seven features are the hardest to implement: for every language a specific function has to be written.
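The categorization can be read as a lookup from the information a language implementer provides to the features an LP-REPL could enable. The sketch below encodes Table 4.1 under a simplifying assumption: each category is treated as a single switch, whereas in practice information would be supplied per feature. The names `Dependency`, `Feature` and `FeatureSelector` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** The four groups of Table 4.1 (constant names are illustrative). */
enum Dependency { INDEPENDENT, A_SYNTAX, B_INTERFACE, C_FUNCTION }

/** The 22 features that fit a language-parametric REPL, tagged with their group. */
enum Feature {
    MANUAL_MULTI_LINE(Dependency.INDEPENDENT),
    COMMAND_HISTORY(Dependency.INDEPENDENT),
    HISTORY_COMPLETION(Dependency.INDEPENDENT),
    OUTPUT_FOLDING(Dependency.INDEPENDENT),
    SAVE_SESSION(Dependency.INDEPENDENT),
    LOAD_SESSION(Dependency.INDEPENDENT),
    SEARCH(Dependency.INDEPENDENT),
    INTEGRATED_EDITOR(Dependency.INDEPENDENT),
    MULTIPLE_STATEMENTS(Dependency.A_SYNTAX),
    FINISHED_STATEMENT_DETECTION(Dependency.A_SYNTAX),
    BRACKET_MATCHING(Dependency.B_INTERFACE),
    SYNTAX_HIGHLIGHTING(Dependency.B_INTERFACE),
    ERROR_REPORTING(Dependency.B_INTERFACE),
    IMPORT_INTERFACE(Dependency.B_INTERFACE),
    OUTPUT_REFERENCES(Dependency.B_INTERFACE),
    DEBUG_INTERFACE(Dependency.C_FUNCTION),
    CODE_COMPLETION(Dependency.C_FUNCTION),
    AUTOMATIC_IMPORT(Dependency.C_FUNCTION),
    SAVED_OUTCOME(Dependency.C_FUNCTION),
    GRAPHICAL_OUTPUT(Dependency.C_FUNCTION),
    DOCUMENTATION_PROVIDER(Dependency.C_FUNCTION),
    CODE_REFERENCES(Dependency.C_FUNCTION);

    final Dependency dependency;

    Feature(Dependency dependency) {
        this.dependency = dependency;
    }
}

class FeatureSelector {
    /** Returns the features that can be switched on given the provided kinds of information. */
    static List<Feature> enabled(Set<Dependency> provided) {
        List<Feature> result = new ArrayList<>();
        for (Feature f : Feature.values()) {
            // Independent features only need the evaluator, so they are always available.
            if (f.dependency == Dependency.INDEPENDENT || provided.contains(f.dependency)) {
                result.add(f);
            }
        }
        return result;
    }
}
```

With only an evaluator, `FeatureSelector.enabled(EnumSet.noneOf(Dependency.class))` yields the eight independent features; adding `A_SYNTAX` yields ten.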

Figure 4.1: This diagram shows the features and their relation towards other features. The diagram shows which features are required for other features and which features affect each other. Features that have no relation towards other features are omitted from the diagram.

4.4 Feature Relations Conclusion

Figure 4.1 shows a diagram with the result of the analysis of the relations between the features. This information can be useful when creating an LP-REPL to see where features come together and code should interact. Most features are not in the diagram, meaning that they can be implemented independently. History and the editor have the most relations, suggesting that they have a high impact on the implementation of other features.

Chapter 5

A LP-REPL Implementation

To lend weight to our theoretical approach, this chapter presents the implementation of a language parametric REPL as a proof of concept. It combines the information from the domain analysis (Chapters 2 and 3) with the theoretical parametrization of a REPL (Chapter 4) and puts it into practice. This proof of concept can instantiate and run several languages and it implements several of the analysed REPL features. We start with an introduction to the environment chosen to implement the LP-REPL. The following section discusses the result of the implementation: the metrics and architecture of the code, followed by the implementation difficulties for the REPL and the features. In Section 5.3 the results of the interfaces used to parametrize the different languages are discussed. This chapter ends with a discussion about the difficulties of parametrization.

5.1 Environment set-up

To build the language-parametric REPL we used the Java Swing library. To help obtain the necessary language-information we used the language workbench Rascal.

5.1.1 Java

As the language for implementation we chose Java. Java's Swing library was used for the implementation of the GUI. The JTextPane [20] class, part of Swing, offers all the functionality of a text field, and more, that is necessary to build a REPL. It handles all keyboard events, offers full cursor functionality and has an undo/redo ability. Besides that, it also possesses interfaces that allow the implementation of additional functionality necessary to build a REPL:

• The ability to prevent the user from editing text. This is necessary since no changes may be made to the executed statements or to any of the output.

• The ability to insert and remove text on the spot. Among other things, this is required to implement Command history, so that on the assigned key-event the current statement is removed and one from history is inserted.

• Allow custom fonts and colors; necessary to implement Syntax Highlighting.
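The first two abilities in this list can be sketched with Swing's `DocumentFilter`, which intercepts every edit to a text document. This is an assumption about one possible mechanism, not necessarily the one used in the proof of concept; the class names `ReadOnlyPrefixFilter` and `ReplDocument` are hypothetical.

```java
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.DocumentFilter;

/** Rejects every edit that starts before a movable "input start" offset. */
class ReadOnlyPrefixFilter extends DocumentFilter {
    int inputStart = 0; // everything before this offset is read-only

    @Override
    public void insertString(FilterBypass fb, int offset, String text, AttributeSet attrs)
            throws BadLocationException {
        if (offset >= inputStart) super.insertString(fb, offset, text, attrs);
    }

    @Override
    public void remove(FilterBypass fb, int offset, int length) throws BadLocationException {
        if (offset >= inputStart) super.remove(fb, offset, length);
    }

    @Override
    public void replace(FilterBypass fb, int offset, int length, String text, AttributeSet attrs)
            throws BadLocationException {
        if (offset >= inputStart) super.replace(fb, offset, length, text, attrs);
    }
}

/** Hypothetical wrapper around the document that backs the REPL text pane. */
class ReplDocument {
    final DefaultStyledDocument doc = new DefaultStyledDocument();
    private final ReadOnlyPrefixFilter filter = new ReadOnlyPrefixFilter();

    ReplDocument() {
        doc.setDocumentFilter(filter);
    }

    /** Appends executed statements or output, then freezes everything up to the new end. */
    void appendFrozen(String text) {
        try {
            doc.insertString(doc.getLength(), text, null);
            filter.inputStart = doc.getLength();
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Command history: swaps the statement being typed for one from history. */
    void replaceCurrentStatement(String statement) {
        try {
            int start = filter.inputStart;
            doc.replace(start, doc.getLength() - start, statement, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Simulates a user edit; it is silently ignored when it targets frozen text. */
    void userInsert(int offset, String text) {
        try {
            doc.insertString(offset, text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    String text() {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a GUI the document would be handed to the pane with `new JTextPane(replDocument.doc)`, so that keyboard edits pass through the same filter.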

5.1.2 Rascal

From the feature analysis we know that a lot of information about a language is required. We have also seen that language workbenches already offer a lot of information about languages. An integration would make this information available to the LP-REPL. Besides language information, integration with an existing workbench offers the following benefits for the implementation of an LP-REPL:

• Make use of syntax and parser mechanisms.

• Make use of any IDE tool integration the workbench already offers.

As an added benefit, the LP-REPL could also be integrated in the workbench, making it instantly available for both new and existing languages developed in the workbench. As a language workbench we used Rascal [17]. Besides being a language workbench, Rascal is also a programming language that allows the building of new languages. Rascal is built in Java and uses Eclipse [5] as an IDE. Syntax for new languages can be defined in Rascal, and the workbench uses this information to offer some IDE features dedicated to the developed language. The workbench already offers language support for [7]:

• Syntax highlighting

• Code folding

• Outlining

• Semantic completion

• Reference resolution

• Code formatting

Rascal’s syntax will be used as a format to parametrize the syntax of different languages. This can be used to instantiate the particular features.

5.1.3 Evaluator integration

To test the language parametrization of the REPL, we test it on three different kinds of evaluators:

• Command Interface: The ability to interact with an existing REPL that would normally run in a terminal/shell.

• Interface for a Java Evaluator: Any evaluator written in Java, or that can be called via Java, can be used in this interface.

• Interface for a Rascal language: Any language written in Rascal can be used via this interface.

5.2 The Implementation

The implementation resulted in a fully functional language parametrized REPL. Figure 5.1 shows the LP-REPL running Lisp and Figure 5.2 shows the LP-REPL running Rascal (both belonging to the category of Java evaluators). The screenshots show several executed statements, and one can already see several features such as error displaying, output folding and syntax highlighting. In addition to the features discussed in Chapter 4, the REPL offers some additional features, like Current-line highlighting, to make the life of the user easier.

The complete code is available on GitHub [12]. To run the project with syntax highlighting, the source of the Rascal parser is necessary; this code is also available on GitHub, see [11].

Figure 5.1: The LP-REPL running Lisp.

Figure 5.2: The LP-REPL running Rascal.

Next to the Rascal Parser the project uses some additional libraries to load the evaluators of the languages that could be run with the LP-REPL. No other libraries were used. In the next two subsections we will take a look at the size of the project and its architecture.

5.2.1 Metrics

Table 5.1 shows the lines of code (LoC)¹ and number of classes of the LP-REPL implementation. We separated the metrics into two categories: metrics from the LP-REPL implementation itself and metrics from the package that contains all the language information. The metrics do not include any Rascal code, nor do they include the external libraries.

| | LP-REPL | Language settings |
|---|---|---|
| LoC | 2928 | 843 |
| Nr of classes | 67 | 15 |

Table 5.1: Metrics of the proof-of-concept LP-REPL implementation.

The LP-REPL implementation itself consists of approximately 3000 lines of code. This is just an indication of what is necessary to build a language parametric REPL. On the one hand, not all features are implemented (see Table 5.2). On the other hand, we tried to make the LP-REPL as user-friendly as possible, which resulted in additional lines of code because we added features like current-line highlighting and key-events. The language-specific settings contain over 800 lines of code. This seems a lot, but a closer look shows that most of these lines are used to implement the interfaces. Every language is defined in two special classes (a general language settings file and an evaluator) that both implement an interface. These interfaces contain many methods, which causes the large number of lines of code. Section 5.3 shows what the language interface looks like and provides better insight.

5.2.2 Architecture

The implementation of the LP-REPL was done in levels to keep the different functionalities separated. Figure 5.3 shows a UML class diagram with the most important classes of the LP-REPL implementation. The four REPL classes are explained here:

AbstractREPLPanel: This class is an extension of JPanel and places the JTextPane in it. This pane is used as the REPL interface for the user, the place where statements are entered and output is being displayed. The AbstractREPLPanel provides an API for any REPL actions on this pane. There is built-in support for different styles, allowing for syntax highlighting and output in different colors.

BasicREPLPanel: This class extends the AbstractREPLPanel and implements the most basic REPL functionality: it manages execution, the output and error stream and it also manages the history.

ExtendedREPLPanel: This class extends the BasicREPLPanel with all other implemented features like importing, search and the loading and saving of a session.

StandAloneREPL: This class provides the GUI and the frame in which the ExtendedREPLPanel is placed. It offers no new REPL functionality; it only provides the interface for the interaction with the user.

¹ Counted all new lines except blank lines, JavaDoc and comments.

Figure 5.3: UML Class Diagram displaying the most important classes of the LP-REPL implementation.

5.2.3 Basic REPL

In Chapter 2 two implementation difficulties were discussed that followed from a study of the literature: the ability to run the evaluation in a separate thread to prevent the GUI from freezing, and the handling of errors.

• Making the LP-REPL multi-threaded went without problems. The evaluator runs in a separate thread so that the user interface stays active. Notably, the GUI did freeze when there was a continuous stream of output tasks. Folding the output and results (very long lists) actually helped in keeping the GUI responsive.

• To handle the errors, in our special case of supporting multiple evaluators, we designed an interface to which the evaluator must adhere. This transfers any risks to the language implementer, who has to make sure that any errors occurring during evaluation are properly returned via the interface.

From the fundamental problems discussed in Chapter 2:

• The problem of conflicting declarations is a very real problem for any REPL. However, this is not dealt with in our proof-of-concept since this has to be dealt with via the evaluation process. Since the evaluators are given as a parameter, the LP-REPL has no influence upon the declarations.

• Direct interaction is not supported in our proof-of-concept, meaning that if the user makes a change in a code file he will have to reload the file, and the session of the evaluator will be reset. To make our proof-of-concept a bit more user-friendly, the LP-REPL offers a button that lets the evaluator start with a clean slate and reloads all loaded imports.

To support printing output, the evaluators are provided with an output stream. The evaluators have to make sure that any output generated with their language's print statement is sent to this stream so that it is properly displayed in the LP-REPL. An additional feature implemented in the LP-REPL is the ability to abort a running execution. If the user ends up in an endless loop, aborting the statement is the only way the user can save his session.
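The combination of a separate evaluation thread, evaluator-provided streams and an abort action can be sketched as follows. This is a minimal sketch under stated assumptions: `Evaluator` here is a hypothetical contract, not the interface from the thesis appendix, and aborting by interruption only works for evaluators that respond to thread interruption.

```java
import java.io.PrintStream;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical evaluator contract: prints to the given streams and returns a result (or null). */
interface Evaluator {
    String eval(String statement, PrintStream out, PrintStream err) throws Exception;
}

class EvaluationRunner {
    // A single daemon worker keeps evaluation off the GUI thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "evaluator");
        t.setDaemon(true);
        return t;
    });
    private final Evaluator evaluator;
    private final PrintStream out;
    private final PrintStream err;
    private Future<String> running;

    EvaluationRunner(Evaluator evaluator, PrintStream out, PrintStream err) {
        this.evaluator = evaluator;
        this.out = out;
        this.err = err;
    }

    /** Starts evaluating and returns immediately, so the user interface stays responsive. */
    synchronized Future<String> submit(String statement) {
        running = worker.submit(() -> {
            try {
                return evaluator.eval(statement, out, err);
            } catch (Exception e) {
                err.println(e.getMessage()); // failures are routed to the error stream
                return null;
            }
        });
        return running;
    }

    /** Aborts the most recently submitted statement by interrupting the worker thread. */
    synchronized boolean abort() {
        return running != null && running.cancel(true);
    }

    /** Waits for a result; returns null if the evaluation failed or was aborted. */
    static String await(Future<String> future) {
        try {
            return future.get();
        } catch (InterruptedException | ExecutionException | CancellationException e) {
            return null;
        }
    }
}
```

A Swing front end would typically not block on `await` but would publish the result to the text pane via `SwingUtilities.invokeLater` once the future completes.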

5.2.4 Features

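| Category | Feature | Analysed REPLs | Dependency category |
|---|---|---|---|
| Statement actions | Multi line (manual) | 7 | Independent |
| Statement actions | Command history | 13 | Independent |
| Statement actions | History completion | 5 | Independent |
| Statement actions | Debug functionality | 4 | C |
| Statement actions | Code completion | 9 | C |
| Statement features | Multiple Statements | 10 | A |
| Statement features | Finished Statement Detection | 8 | A |
| Statement features | Automatic Import | 2 | C |
| Statement features | Saved outcome | 7 | C |
| Information features | Brace Matching | 6 | B |
| Information features | Syntax highlighting | 7 | B |
| Information features | Error reporting | 13 | B |
| Information features | Graphical output | 2 | C |
| Information features | Documentation | 3 | C |
| Information features | Output folding | 4 | Independent |
| Session actions | Save session | 5 | Independent |
| Session actions | Load Session | 3 | Independent |
| Session actions | Import interface | 8 | B |
| Session actions | Search | 5 | Independent |
| Session actions | Magic functions | 8 | - |
| Codefile interaction | Integrated editor | 3 | Independent |
| Codefile interaction | Direct interaction | 1 | - |
| Codefile interaction | Code referencing | 2 | C |
| Codefile interaction | Output referencing | 3 | B |

(The LP-REPL column, which marks per feature whether it is implemented in the proof-of-concept, is not recoverable here.)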

Table 5.2: Overview of the implemented features of the proof-of-concept compared to the total amount of implemented features of the analysed REPLs in Chapter 3.

Table 5.2 provides an overview of the features implemented in our proof-of-concept and sets them against the number of analysed REPLs that implement each feature, as obtained from our hands-on experience with REPLs in Chapter 3. Remember that we have independent features (require no information), features in category A (need a syntax definition), category B (need a small amount of information) and category C (need specifically designed integration).

Some of the features are quite complex to implement, and due to time and resource constraints the number of features implemented is limited. A total of eleven out of the 24 features were implemented. No features from category C were implemented due to their complexity.

5.3 Resulting interfaces

To test the LP-REPL with different types of evaluators, three interfaces were created. These interfaces were already mentioned in Subsection 5.1.3: the command interface (interact with REPLs normally running in the command prompt), the Java interface (interact with evaluators written in Java) and the Rascal DSL interface (interact with DSLs written in Rascal). We take a closer look at these three interfaces in this section. To test the interfaces, and with that our proof of concept, we tested every type of evaluator with different languages. Table 5.3 shows which languages were parametrized and tested.

| Type of interface | Implemented languages |
|---|---|
| Command Interface | Ruby, Python |
| Java Evaluator | Rascal, Python (Jython [16]), Lisp (Jatha [15]) |
| Rascal DSL Evaluator | Pico |

Table 5.3: Overview of the implemented languages of the proof-of-concept.

The LP-REPL was designed to take care of three types of output:

OutputStream: Any output written by, for example, print. The LP-REPL prints these messages in green.

ErrorStream: Any errors that occur are printed via this stream. The LP-REPL displays these errors in the panel in red so that the user knows it is an error message.

Return statement: The returned result of a statement, the LP-REPL displays this in black.

The colors help the user to quickly distinguish between the different types of output (as is clearly visible in Figures 5.1 and 5.2). It quickly became apparent, however, that this division could not be maintained for all evaluators.
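The three kinds of output described above map naturally onto character attributes of a Swing `StyledDocument`. The following sketch shows the idea; the names `OutputKind` and `ColoredTranscript` are hypothetical and the exact shade of green is an assumption.

```java
import java.awt.Color;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

/** The three output types, each with its own foreground color. */
enum OutputKind {
    OUTPUT(new Color(0, 128, 0)), // printed output: green (shade is an assumption)
    ERROR(Color.RED),             // error stream: red
    RESULT(Color.BLACK);          // returned result of a statement: black

    final SimpleAttributeSet style = new SimpleAttributeSet();

    OutputKind(Color color) {
        StyleConstants.setForeground(style, color);
    }
}

/** Hypothetical transcript that appends text in the color of its output kind. */
class ColoredTranscript {
    final StyledDocument doc = new DefaultStyledDocument();

    void append(OutputKind kind, String text) {
        try {
            doc.insertString(doc.getLength(), text, kind.style);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the foreground color of the character at the given offset. */
    Color colorAt(int offset) {
        return StyleConstants.getForeground(doc.getCharacterElement(offset).getAttributes());
    }
}
```

An evaluator that prints its result to the output stream instead of returning it would end up entirely in the `OUTPUT` color, which is the kind of mismatch the following subsections describe.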

5.3.1 Command interface

The command interface allows running REPLs that are normally run in the command prompt.

The interface

To load a REPL that normally runs in a command prompt, the user should first make sure it is installed on the system and is available via the command prompt. The necessary information to load the REPL in the LP-REPL can be provided via an XML file. Listing 5.1 gives an example of how Python is loaded into the LP-REPL using the command interface. The name and command parameters are the only mandatory tags. The name is used to identify the language and to display it to the user. The command is the command the user would give in a terminal to start the REPL. The LP-REPL needs no additional information to function.

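Listing 5.1: Example of the parametrization of Python using the command interface, for the REPL that comes by default with Python

<name>Python</name>
<command>python</command>
<syntax>python-syntax.rsc</syntax>
<syntaxCategories>python-syntax-categories.xml</syntaxCategories>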

The syntax tag is used to define where the Rascal syntax file is located. This is used to make the features from category A possible. The syntaxCategories tag is used to make the Syntax highlighting feature possible, which is a feature from category B. This tag contains the location of an XML file in which the color mapping is defined.
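Reading such a settings file can be sketched with the JDK's built-in DOM parser. The four tag names come from the text above; the enclosing `<language>` root element used in the usage example and the class name `CommandLanguageConfig` are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical holder for the settings of one command-interface language. */
class CommandLanguageConfig {
    final String name;             // mandatory: shown to the user
    final String command;          // mandatory: starts the external REPL
    final String syntax;           // optional: location of the Rascal syntax file
    final String syntaxCategories; // optional: location of the color mapping

    private CommandLanguageConfig(String name, String command, String syntax, String syntaxCategories) {
        this.name = name;
        this.command = command;
        this.syntax = syntax;
        this.syntaxCategories = syntaxCategories;
    }

    /** Parses the XML text and enforces the two mandatory tags. */
    static CommandLanguageConfig parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("settings are not well-formed XML", e);
        }
        String name = text(doc, "name");
        String command = text(doc, "command");
        if (name == null || command == null) {
            throw new IllegalArgumentException("both name and command are mandatory");
        }
        return new CommandLanguageConfig(name, command, text(doc, "syntax"), text(doc, "syntaxCategories"));
    }

    /** Returns the trimmed text of the first element with the given tag, or null if absent. */
    private static String text(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }
}
```

For example, `CommandLanguageConfig.parse("<language><name>Python</name><command>python -i</command></language>")` yields a configuration without syntax information, so only the independent features would be available.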

Results

When implementing the command evaluators, several troubling situations arose that required language-specific information.

General:

• Calling the REPLs is done via Java's ProcessBuilder class [21], which allows an output and an error stream to be set. This interface does not allow any other data to be returned, meaning that the LP-REPL cannot obtain the results; instead they are often automatically printed by the REPL via the output stream.

• The LP-REPL is unable to detect when an execution is finished. Since the statement is sent to another process that runs in parallel, the LP-REPL does not know when the evaluation is ready. It is also unknown whether there is going to be any output and, if there is, how much output there will be.

• The tested REPLs provide a welcome message before they start. This is a small side effect of running another REPL. One can choose to print this output or throw it away. Again, it is impossible to see whether there is going to be any output and when the output is finished.

• Magic functions are possible. As explained in Chapter 4, these functions are not desirable for a language parametric REPL. However, since these magic functions are integrated in these existing REPLs, there is no way to exclude them.

Ruby:

• Specifically requires a statement to end with a newline character.

• Can only return output. There is no separate stream for error messages.

Python:

• The Python REPL detects that it is not running in the command prompt. To force the python command to enter the so-called 'interactive mode', the additional parameter '-i' has to be provided.

• The welcome message is given via the error stream. This is unexpected and might let the user think he did something wrong.

• The Python REPL returns no output at all in some cases. This makes it even harder for the LP-REPL to detect when it is ready.
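Two of the points above, the separate output and error streams and the need for a trailing newline, can be sketched with `ProcessBuilder`. The class `CommandRepl` and its method names are hypothetical; the sketch configures the process and frames statements, and deliberately leaves out how to decide when the output is complete, since that is the unsolved problem described above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for talking to an external REPL process. */
class CommandRepl {

    /** Configures, but does not start, the external REPL with separate output and error streams. */
    static ProcessBuilder configure(String... command) {
        ProcessBuilder builder = new ProcessBuilder(Arrays.asList(command));
        builder.redirectErrorStream(false); // keep stderr apart so errors can be shown differently
        return builder;
    }

    /** Ensures a statement ends with exactly one newline, as the Ruby REPL requires. */
    static String frame(String statement) {
        return statement.replaceAll("[\\r\\n]+\\z", "") + "\n";
    }

    /** Writes one statement to the standard input of a running REPL process. */
    static void send(Process process, String statement) throws IOException {
        OutputStream stdin = process.getOutputStream();
        stdin.write(frame(statement).getBytes(StandardCharsets.UTF_8));
        stdin.flush(); // without flushing, the external REPL may never see the statement
    }
}
```

A caller would start the process with `CommandRepl.configure("python", "-i").start()` and read `getInputStream()` and `getErrorStream()` on separate threads, since either stream may stay silent for an unknown amount of time.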

Of the REPLs tested in Chapter 3, Scala was, next to Python and Ruby, also a REPL running in the command prompt. This REPL was also loaded, but it was unable to run statements when called via a Java interface. This suggests that not all REPLs that run in a command prompt can be run via this LP-REPL.

Interface conclusion

Running the command prompt REPLs proved to be possible for two out of three REPLs. In the two working situations the user is able to perform all expected functions. However, because of all the small language differences, the interface had to be tweaked several times to work. If other REPLs behave differently again, it might be hard to find an interface that works for all situations. Imagine a REPL that requires a statement not to end with a newline character: this would conflict with Ruby. Secondly, although the LP-REPL can run all these commands, it is not really user-friendly. It is unclear when a statement is finished, and therefore statements and output quickly start to mix. As a result, the user can no longer properly see which output belongs to which statement.

5.3.2 Java Evaluators

The Java implementations were less troublesome than the Command Evaluators. All three could be implemented to make proper use of the output and error stream. Rascal and Jatha return the result of the execution. This has the advantage that the LP-REPL knows the difference between output (such as that printed by a print command) and a result. This could in turn be used to implement the feature Saved outcome. Jython, however, just returns whether the execution was a success or not and sends the result automatically to the output stream. Some notable results of the three implementations:

• Rascal in some cases requires a semicolon at the end of a statement. While this is normal for the language in an editor, it is unexpected behaviour in a REPL.

• Jython prints an additional newline after every output, which results in a blank line after each executed statement.

• The running Jatha implementation returns an error as a return result. We had to perform a trick to catch the error, and display it via the errorstream.

No other side effects were observed for the three Java evaluators, meaning that small tweaks like those needed for the command evaluators are not necessary here. Appendix C.1 shows the full interface for the Java evaluators. As with the Command Interface, the language implementer only has to provide two things: the name and, in this case, the evaluator. The evaluator should adhere to a specific interface so that the difference between errors, output and a return value is clear to the LP-REPL. Listing C.2 shows the minimal requirements for the interface to which an evaluator should adhere.

5.3.3 Rascal DSL Evaluator

The Rascal DSL interface is similar to the Java interface, except that a default implementation is present. This implementation makes sure that the features in dependency category A are automatically implemented, as well as Error reporting and Syntax highlighting with a default color mapping. The language designer can choose to override any of the defined functions and provide the LP-REPL interface with specific information about the language.

Since the DSL evaluator is basically the same as the Rascal Evaluator there was no trouble implementing this.

5.3.4 Interface analysis

Looking at the interfaces we can see that some additional pieces of information are required. Examples are the ability to reset and abort the evaluator. Some of the features also require more information, for example in the case of finished statement detection: what to do when the statement is not finished? Should a new line be inserted or not? If so, should the line be indented or not? The answer depends on the language.

This information requirement did not arise in our analysis because there we assumed the simplest possible case. Only by actually creating the LP-REPL could we see this need for additional information. This need sometimes arose from better use-case scenarios, but also from the differences between languages. An example of the latter is the trimming of output to prevent empty lines after every result.

The Command interface has the limitation that the information is loaded via an XML file and can therefore only be provided as static information². The Java and Rascal interfaces offer better support for dynamic data, like functions.

Overall, the proof of concept is a working REPL that can instantiate several languages and provide most of the functionality of the language (difficult topics are importing, etc.).
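The questions about unfinished statements show why dynamic data is useful: the answers are easier to express as overridable functions than as static settings. The sketch below illustrates this with default methods; all names are hypothetical, and the bracket-counting indentation is a stand-in heuristic that ignores strings and comments, not the syntax-based detection of category A.

```java
/** Hypothetical per-language policy for statements that are not yet finished. */
interface UnfinishedStatementPolicy {
    /** Whether pressing Enter on an unfinished statement inserts a new line. */
    default boolean insertNewline() {
        return true;
    }

    /** The indentation to place at the start of the continuation line. */
    default String indentFor(String partialStatement) {
        return "";
    }
}

/** Example override: indents two spaces per bracket that is still open. */
class BracketIndentPolicy implements UnfinishedStatementPolicy {
    @Override
    public String indentFor(String partialStatement) {
        int depth = 0;
        for (char c : partialStatement.toCharArray()) {
            if ("([{".indexOf(c) >= 0) depth++;
            else if (")]}".indexOf(c) >= 0) depth--;
        }
        return "  ".repeat(Math.max(depth, 0));
    }
}

class Continuation {
    /** Applies a policy to the text typed so far and returns the new input text. */
    static String apply(UnfinishedStatementPolicy policy, String partialStatement) {
        if (!policy.insertNewline()) {
            return partialStatement;
        }
        return partialStatement + "\n" + policy.indentFor(partialStatement);
    }
}
```

A language that needs no special treatment can rely on the defaults, which mirrors how the Rascal DSL interface ships a default implementation that the language designer may override.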

5.4 Summary

The proof of concept shows a fully functional Language Parametric REPL. The user has full access to the REPL and the ability to quickly switch to another language. The REPL offers several features that are enabled depending on the provided information. Compared to our theoretical analysis in the previous chapter, the LP-REPL conforms to the minimal standards. However, it also showed that for better integration and usability the number of dependencies starts to grow. This is true for the basic REPL part as well as for the feature part. These dependencies are hard to predict since they can differ per language. Besides all the small language differences, one problem is the different ways in which evaluators or REPLs return their output, errors and results. This makes it hard to provide a unified experience.

² Note that these REPLs could also be called via Java or Rascal, but in that case they would belong to the Java/Rascal category.

Chapter 6

Conclusion

The REPL has been affirmed in the literature as a useful tool that allows programmers to use, learn, explore, and test a language. The term itself is applied in different contexts and for that reason we provided a more precise definition, which prescribes the term to be used for, and only for, programming languages.

Many different REPLs exist, but we discovered that the biggest difference lies in the features they possess. We gave an overview of 24 selected features implemented by thirteen REPLs. There is no pattern among the features and REPLs, no REPL possesses all features, and only two features (Error Reporting and Command History) are implemented by all REPLs. This variation between feature sets can be explained by two factors. Firstly, every language offers different functionality that can make it easier, or more logical, to implement certain features. Secondly, REPLs are used in different domains. Some are only used to run code, for example, while others are mainly used to inspect code. This makes certain features more usable than others.

To investigate the parametrization of a REPL we made an analysis, both theoretical and practical, of the language dependencies. The basic Read-Eval-Print part only needs an evaluator. We divided the features into four different categories according to the amount of language information required. In total, there were eight features that had no dependencies, meaning that with just an evaluator one could create a language-parametrizable REPL that has built-in support for at least these eight features. Two features require the syntax to work (category A) and five features require some form of static information (category B). Seven features are so complex that the language implementer should provide the functionality himself, which could, via an interface, be used for the LP-REPL (category C).

Our theoretical analysis is based upon a theoretical implementation and this is a restriction of the research.
Even though we have not discovered any counter-evidence in our proof-of-concept, it is possible that better implementation techniques exist that could reduce language dependencies. Some features of a REPL are a research topic by themselves, especially the features in category C. These features are not implemented in our proof-of-concept, so our conclusions about them rely on the theoretical analysis.

Apart from dependencies, the relations among features were analysed. A lot of features share information with the command history. Some features would benefit from integration with an editor. We found this striking because integration with an editor is rare, as we have seen in our survey. This is certainly a place where REPLs could improve their usability.

The proof-of-concept showed that the number of language dependencies grows noticeably with the tightness of integration. It is easy to predict basic dependencies, but the tighter integration gets, the more dependencies arise. Languages have different properties and there are many different use cases, which require specific information. Since this is hard to predict (every language can have some unique property), it is hard to fully parametrize a tool. Languages do not let themselves be stored in data files; they are not black boxes that require just a simple input and output interface. Instead, they require many interfaces so that they can convey all their information. This is in line with the reason why there are so many dedicated language-specific tools, and it is the challenge for language workbenches.

We believe we have given a thorough overview and analysis of the REPL and its parametrization. Future research may include finding proper ways to parametrize a language and continuing the investigation of the more difficult features (category C) of the REPL. The information offered about parametrization can be useful for any software development tool, but becomes more and more relevant with the rise of language-oriented programming and language workbenches.

Acknowledgements

This thesis has been a personal trial for the last 18 months. That it is finished feels like a miracle, and now that I send it off into the world there are a number of people I need to thank. Firstly, I would like to thank my supervisor, professor Dr. Paul Klint. His thoughts and feedback helped me pave the road through the immense REPL/thesis-jungle I found myself in. And if it weren’t for his fast feedback this thesis would have dragged on that much longer. Secondly, I thank Vlad Lep and Jasper Timmer for their brainstorm sessions with me. Your thoughts have helped me form this thesis. I also thank Ravish Gopal, Mart Hagenaars, Koen Hanselman and Johanneke Lamberink for their thoughts and proofreading of earlier versions of this thesis. I thank Lieke Smit for sticking with me through this personal test, and for putting me back on track when I got lost again. My parents can also not be forgotten. I thank them for their continuous support and for providing me with the time and space that I needed to finish this thesis. Your support has been greatly appreciated. Finally, I want to thank all the people that showed their support to me during this period. Whether it was via the promise of Bossche bollen or just by nagging (“is it done yet?”), you have my gratitude. Although it might not always have been appreciated at the time, it was still good to know you cared.

46 Bibliography

[1] Alfred V Aho, Ravi Sethi, and Jeffrey Ullman. : Principles, Techniques, and Tools. Addison Wesley, 1985.

[2] Eric Allen, Robert Cartwright, and Brian Stoler. Drjava: A lightweight pedagogic envi- ronment for java. In ACM SIGCSE Bulletin, volume 34, pages 137–141. ACM, 2002.

[3] Mike Bierlee. Code completion framework for rascal developers. http://dare.uva.nl/ document/460160, September 2012.

[4] Clojure-doc. Clojure with emacs. http://clojure-doc.org/articles/tutorials/ emacs.html, April 2014.

[5] Eclipse. Eclipse - the eclipse foundation open source community website. http://eclipse. org/, May 2014.

[6] Emacs. Gnu emacs. http://www.gnu.org/software/emacs/, April 2014.

[7] Sebastian Erdweg, Tijs van der Storm, Markus V¨olter,Meinte Boersma, Remi Bosman, William R Cook, Albert Gerritsen, Angelo Hulshout, Steven Kelly, Alex Loh, et al. The state of the art in language workbenches. In Software Language Engineering, pages 197–217. Springer, 2013.

[8] Robert Bruce Findler, John Clements, Cormac Flanagan, Matthew Flatt, Shriram Krish- namurthi, Paul Steckler, and Matthias Felleisen. Drscheme: A programming environment for scheme. Journal of , 12(2):159–182, 2002.

[9] Firebug. Firebug web development evolved. http://getfirebug.com/, December 2013.

[10] Matthew Flatt, Robert Bruce Findler, Shriram Krishnamurthi, and Matthias Felleisen. Programming languages as operating systems (or revenge of the son of the lisp machine). In ACM SIGPLAN Notices, volume 34, pages 138–147. ACM, 1999.

[11] Github. cwi-swat/rascal. https://github.com/cwi-swat/rascal, June 2014.

[12] Github. lappie/lp-repl. https://github.com/lappie/LP-REPL, June 2014.

[13] Haskell. Ides. http://www.haskell.org/haskellwiki/IDEs, April 2014.

[14] Eero Hyv¨onenand Eetu M¨akel¨a.Semantic autocompletion. In The Semantic Web–ASWC 2006, pages 739–751. Springer, 2006.

[15] Jatha. Jatha - common lisp library in java. http://jatha.sourceforge.net, December 2013.

[16] Jython. The jython project. http://www.jython.org, December 2013.

47 [17] Paul Klint, Tijs Van Der Storm, and Jurgen Vinju. Easy meta-programming with rascal. In Generative and Transformational Techniques in Software Engineering III, pages 222–289. Springer, 2011.

[18] David A Kranz, Robert H Halstead Jr, and Eric Mohr. Mul-t: A high-performance parallel lisp. ACM SIGPLAN Notices, 24(7):81–90, 1989.

[19] Kurt Nørmark. Systematic unit testing in a read-eval-print loop. J. UCS, 16(2):296–314, 2010.

[20] Oracle. Jtextpane (java platform se 7 ). http://docs.oracle.com/javase/7/docs/api/ javax/swing/JTextPane.html, May 2014.

[21] oracle. Processbuilder (java platform se 7 ). http://docs.oracle.com/javase/7/docs/ api/java/lang/ProcessBuilder.html, June 2014.

[22] Python. 2. using the python interpreter. http://docs.python.org/2/tutorial/ interpreter.html, December 2013.

[23] Python. 24.6. idle. https://docs.python.org/2/library/idle.html, April 2014.

[24] repl.it. help. http://www.repl.it/help, April 2014.

[25] repl.it. repl.it. http://www.repl.it, April 2014.

[26] Ruby. Ruby in twenty minutes. https://www.ruby-lang.org/en/documentation/ quickstart/, April 2014.

[27] Scala. A taste of 2.8: The interactive interpreter (repl). http://www.scala-lang.org/ old/node/2097, December 2013.

[28] Google Scholar. repl programming language. http://scholar.google.nl/scholar?as_ ylo=2010&q=read+eval+print+loop+&hl=nl&as_sdt=0,5, April 2014.

[29] Sherry Shavor, Jim D’Anjou, Scott Fairbrother, Dan Kehn, John Kellerman, and Pat McCarthy. The Java Developer’s Guide to Eclipse. Addison-Wesley Longman Publishing Co., Inc., 2003.

[30] SLIME. Slime: The superior lisp interaction mode for emacs. http://common-lisp.net/ project/slime/, April 2014.

[31] Wikipedia. Autocomplete. http://en.wikipedia.org/wiki/Autocomplete, December 2013.

[32] Wikipedia. Brace matching. http://en.wikipedia.org/wiki/Brace_matching, Decem- ber 2013.

[33] Wikipedia. Command history. http://en.wikipedia.org/wiki/Command_history, De- cember 2013.

[34] Wikipedia. Read eval print loop. http://en.wikipedia.org/wiki/Read%E2%80%93eval% E2%80%93print_loop, December 2013.

Appendix A

Choice of REPLs

REPL name          Language    Version                Tested OS  Reason
DrJava             Java        drjava-20130901-r5756  Windows    Literature, nr of features
DrRacket           Racket      5.3.6                  Windows    Literature, nr of features
IDLe               Python      3.3.2                  Windows    History
DreamPie           Python      1.2.1                  Windows    Nr of features
IPython            Python      1.1                    Linux      Unique features
WinGHCi            Haskell     1.0.6                  Windows    History
Hugs               Haskell     Sep 2006               Windows    History
Internet Explorer  Javascript  11                     Windows    REPL domain
Firebug            Javascript  1.12.6                 Windows    REPL domain
Chrome             Javascript  32.0.1700.107          Windows    REPL domain
IRB                Ruby        1.9.3                  Windows    History
Matlab             Matlab      R2013a                 Windows    Unique REPL position
Scala              Scala       2.10.3                 Windows    History

Table A.1: An overview of the chosen REPLs, the language each one hosts, and the reason why each was chosen.

Appendix B

REPL hands-on experience

This appendix gives a brief account of our hands-on experience with each REPL.

DrJava

DrJava calls its REPL an ‘interaction pane’. DrJava focuses on students who are starting to program, and this is visible in the user-friendly error messages as well as in the additional feature of suggesting which file might need to be imported if a code entity is unavailable. While stepping through code with the debugger, the user can still use the REPL to display the values of variables or to set variables to different values.

DrRacket

DrRacket, the successor of DrScheme, offers a nice editor consisting of two parts: an editing box and a REPL. The code from the editing box can be executed and then used in the REPL. A rare feature is the ability of DrRacket to display images in the REPL: the code ‘(circle 10)’ gives an output line with a circle of 10 pixels in it. Another striking feature is the documentation provider: by placing the cursor on a function and pressing F1, the browser opens and displays the documentation for that function. DrRacket also allows different dialects to be loaded, ranging from previous implementations of the language to different language levels that are easier for students to comprehend. The editor integrated with DrRacket offers a debugger, but this debugger is not available in the REPL.

IDLe

IDLe is the default environment for Python. The history is a bit hidden but can be accessed with a key combination. Although the Python environment, of which IDLe is part, comes with an editor, it is not integrated directly in the REPL. Files are edited in a separate window.

DreamPie

DreamPie offers a lot of features. Striking for DreamPie are the two ‘boxes’: the code box, where the user enters code, and the history box, where a history of the commands plus their output is displayed. Although DreamPie offers many features, it doesn’t offer any additional support for importing source files.

DreamPie offers some rare features:

• Tabs: multiple REPLs can be open at the same time.

• Folding and unfolding of output.

• Saving a file as an HTML file, which displays the history box.

• Copy code only: allows users to copy only the commands from the history section.

IPython

Although IPython also provides a ‘normal’ REPL, we tested the ‘Notebook’ version because of its unique character. A notebook is almost like a document to which headers and text can be added. The code and its output integrate nicely into this document view, and even visual output can be displayed in the notebook. Although IPython does not have any special key bindings, it does offer magic functions: special functions that can be called to control the REPL. By calling ‘%man func’ one can see the help about the function ‘func’. There is no search feature, but the browser’s search function is always close at hand. Graphs can be plotted in-line after importing matplotlib (and some tweaking). The regular REPL of IPython contained the same features except for the extensive mark-up functionality.

WinGHCi

WinGHCi is the graphical variant for Windows of GHCi, which itself runs in a console. The REPL offers additional features. One of the best is the way it keeps track of imported files and can reload these files at the press of a button. This speeds up development when the user changes something in a source file.

Hugs

Hugs is one of the simplest REPLs out there: it runs in a console and offers few features. However, it played an important role in the early days of Haskell. Syntax highlighting does not go further than the printing of errors in red. Despite the lack of other features, Hugs includes a special library which can be used to debug objects, with the ability to keep track of primitives and classes and to set breakpoints. It requires the user to first let the object they want to debug implement the class ‘Observable’. It is debatable whether this falls under the category debugger.

Firebug

Firebug is a plug-in for browsers that opens a console from which the user can interact with the existing Javascript, HTML and CSS on the current page. We tested Firebug in its original environment: Firefox. Firebug has a special command bar where the user can type input; this bar is a bit limited in functionality. It does not have bracket matching or syntax highlighting, whereas the output panel does. However, one can extend it, or, when a multi-line statement is pasted, the console automatically switches to a special editor pane to the right of the console, giving the user space for multiple statements. This editor does support syntax highlighting and bracket matching.

Firebug does not automatically save the latest outcome to a predefined variable. However, by right-clicking, the user can assign this outcome to the variable $p. Inspected elements can also be assigned to a variable. This qualifies as the feature Saved outcome.

Hyperlinking of Javascript code is not available, but clicking on a printed HTML object brings the user to that element in the DOM inspector. Errors are linked. The help does contain hyperlinks to function descriptions, but this is help for Firebug’s magic functions, not for native Javascript functions.

It wouldn’t make much sense for Firebug to have graphical output in its REPL, since the user is controlling the webpage that is visible just above it. However, Firebug can show more than just text: the function help shows a table. Firebug also allows console messages to be styled using CSS, showing that the console is capable of graphical output. However, we cannot display a graphical object, other than text, in the console.

Chrome Javascript Console

The Javascript Console that comes integrated with Chrome is similar to that of Firebug. Chrome does not allow clicking on functions. However, if one uses the debug function console.log, the output carries a hyperlink that brings the user to the part of the code where the function was called. The same goes for error messages: not all messages are linked, but some of them are. Although the console does not hide large pieces of output, Chrome does allow hiding elements of the error message. Like Firebug, Chrome allows styling of messages, but not the display of objects. The Chrome REPL comes with a specific Javascript library of functions, for example clear() to clear the console. We do not consider these magic functions, since they are part of the language and handled by the evaluator. Chrome handles HTML a bit differently from Javascript in the REPL: Javascript is not highlighted, nor can it be folded, whereas HTML is syntax-highlighted and can be folded and unfolded.

Internet Explorer Javascript Console

The Javascript Console that comes integrated with Internet Explorer is similar to those of both Firebug and Chrome. Like Firebug, Internet Explorer has a separate command box that can be extended to handle multiple statements. And like Chrome, Internet Explorer syntax-highlights HTML but not the Javascript itself. This HTML can be folded.

IRB

IRB, the REPL for Ruby, also runs in a command prompt. A notable feature is that a user can start a subsession with a magic function. This allows users to run sessions next to each other, keeping a different state in every session.

Scala

The Scala REPL runs in the command prompt. There are no notable features or differences compared to other REPLs.

Matlab

Matlab has the unique property that everything is built around its REPL. It offers basic syntax highlighting and supports many of the researched features. Since Matlab has its own language, it is hard to determine whether a function has been added to the language specifically for the REPL. However, there are some commands that are used to control the environment rather than the program itself. These were classified as magic functions. Unique about Matlab is its ability to interact directly with source files (direct interaction). Matlab solves the problems that come with this feature by more or less splitting the language in two: source files may contain function definitions, while declarations can only be made in the REPL itself. This way, when a statement is evaluated, the language can always grab the latest version from the source files without worrying about the scope.

Appendix C

Java Interface

Listing C.1: The interface to provide language-specific information to the proof-of-concept implementation.

    public interface ILanguageSettings {
        public IEvaluator getEvaluator();

        public String getPostUnfinishedStatement();

        public boolean hasFunctionHelpCommand();

        public IFunctionHelpProvider getFunctionHelpProvider();

        public String getFileExtention();

        public String getLanguageName();

        public boolean hasImport();

        public boolean hasWorkspace();
    }
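
As an illustration of how the capability flags in Listing C.1 might be consumed, the following minimal sketch derives the set of REPL commands to expose for a given language. It is not taken from the proof-of-concept: MiniSettings is a cut-down stand-in for ILanguageSettings, and CommandTable, ToySettings and the command names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decides which REPL commands to expose for a language.
final class CommandTable {
    static List<String> commandsFor(MiniSettings settings) {
        List<String> commands = new ArrayList<>();
        // Assumed here to need no language information (illustrative names).
        commands.add("history");
        commands.add("clear");
        // Optional commands are switched on by the capability flags.
        if (settings.hasImport()) commands.add("import");
        if (settings.hasWorkspace()) commands.add("workspace");
        if (settings.hasFunctionHelpCommand()) commands.add("help");
        return commands;
    }

    public static void main(String[] args) {
        System.out.println(commandsFor(new ToySettings()));
    }
}

// Cut-down stand-in for ILanguageSettings; only the simple getters are kept.
interface MiniSettings {
    String getLanguageName();
    String getFileExtention(); // spelling follows Listing C.1
    boolean hasFunctionHelpCommand();
    boolean hasImport();
    boolean hasWorkspace();
}

// Hypothetical toy language that supports imports but nothing else.
final class ToySettings implements MiniSettings {
    public String getLanguageName() { return "Toy"; }
    public String getFileExtention() { return "toy"; }
    public boolean hasFunctionHelpCommand() { return false; }
    public boolean hasImport() { return true; }
    public boolean hasWorkspace() { return false; }
}
```

For the toy settings above, only the two always-available commands and the import command are enabled. A language implementer would switch on further commands by returning true from the corresponding flag.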

Listing C.2: The interface to which the evaluator should comply in order to work with the proof-of-concept implementation.

    import java.io.File;

    public interface IEvaluator {
        public boolean isComplete(String statement);

        public void clear();

        public AbstractResult execute(String statement);

        public boolean terminate();

        public AbstractResult doImport(String module);

        public String getName();

        public String getLanguage();

        public void load(REPLOutputStream out, REPLErrorStream err);

        public void setWorkspace(File workspace);

        public File getWorkspace();

        public void close();
    }
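
A minimal sketch of how a basic Read-Eval-Print loop might drive an evaluator of the kind described in Listing C.2. The names are assumptions and not part of the proof-of-concept: MiniEvaluator keeps only isComplete and execute and returns a String instead of AbstractResult, while MiniRepl and SumEvaluator (a toy language of integer sums) are hypothetical.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical driver: the loop depends on the language only through the evaluator.
final class MiniRepl {
    private final MiniEvaluator evaluator;
    private final List<String> history = new ArrayList<>();

    MiniRepl(MiniEvaluator evaluator) { this.evaluator = evaluator; }

    List<String> history() { return history; }

    void run(Scanner in, PrintStream out) {
        StringBuilder pending = new StringBuilder();
        while (in.hasNextLine()) {
            pending.append(in.nextLine());                    // read
            if (!evaluator.isComplete(pending.toString())) {
                pending.append('\n');                         // wait for the rest of the statement
                continue;
            }
            String statement = pending.toString();
            pending.setLength(0);
            history.add(statement);                           // command history needs no language knowledge
            out.println(evaluator.execute(statement));        // eval and print
        }
    }

    public static void main(String[] args) {
        new MiniRepl(new SumEvaluator()).run(new Scanner("(1 +\n2)\n40 + 2\n"), System.out);
    }
}

// Simplified stand-in for IEvaluator.
interface MiniEvaluator {
    boolean isComplete(String statement);
    String execute(String statement);
}

// Toy language: sums of integers, optionally wrapped in parentheses.
final class SumEvaluator implements MiniEvaluator {
    public boolean isComplete(String statement) {
        int depth = 0;
        for (char c : statement.toCharArray()) {
            if (c == '(') depth++;
            if (c == ')') depth--;
        }
        return depth <= 0; // complete once every opened parenthesis is closed
    }

    public String execute(String statement) {
        String flat = statement.replaceAll("[()\\s]", "");
        try {
            long sum = 0;
            for (String term : flat.split("\\+")) sum += Long.parseLong(term);
            return Long.toString(sum);
        } catch (NumberFormatException e) {
            return "error: " + e.getMessage();
        }
    }
}
```

Running main feeds a two-line statement followed by a single-line one and prints 3 and 42. The loop itself never inspects the statement text, which is why swapping in another evaluator is enough to host another language in this sketch.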
