Developing a common web interface to various verification tools Bachelor’s Thesis Report

Roland Meyer

Supervised by: Malte Schwerhoff, Prof. Dr. Peter Müller

ETH Zürich
July 27, 2012

Abstract

This Bachelor’s thesis introduces Tuwin, a system that allows a user to try out a command-line based tool, e.g. a verification tool or a program analyzer, online, without having to install it. This is achieved by giving such tools a common web interface similar to Microsoft Research’s rise4fun [1], such that users can execute the tools from their browsers. The system consists of three parts: a central server, a graphical web interface and a number of remote web applications, the tool hosters, each of which hosts a tool. A user can send input data to the server, which redirects it to the corresponding tool hoster. The tool hoster executes its tool with the user’s input and eventually returns its output through the server to the user. All communication between the three components uses standard HTTP/1.1 and messages are JSON [2] or MIME encoded. The server relies on a database for logging and both the server and the tool hoster are configurable through a configuration file. Through this file the tool hoster supports features such as user-selectable command-line parameters, multiple output files of different types, e.g. images or PDF files, or syntax highlighting of the input text.

Contents

1 Introduction
1.1 Motivation
1.2 Goals
1.3 Outline
2 System Overview
3 User interface
3.1 Admin page
4 Execution Example
5 Implementation
5.1 User Interface
5.2 Tool Hoster
5.2.1 Configuration
5.2.2 Languages
5.2.3 Examples
5.2.4 Options
5.2.5 Results
5.3 Tuwin Server
5.3.1 Configuration
5.3.2 Database
5.3.3 Permalinks
5.4 Error handling
5.4.1 Error messages and classes
5.4.2 Logging and displaying errors
6 Communication Protocol
7 Quality Assurance
7.1 Project management
7.2 Documentation
7.3 Testing
7.3.1 Unit tests
7.3.2 Stress tests
7.3.3 UI compatibility
7.4 Security
8 Conclusions
8.1 Future Work
8.2 Acknowledgments
A Database tables
B List of Error Codes

1 Introduction

1.1 Motivation

Many research groups in computer science develop various tools such as code verifiers, compilers or program analysers. These tools are often not officially released, i.e., they are not available in a stable major version since they are under constant development. There may be frequent or unforeseen changes to the APIs, and dependencies and build processes may not be thoroughly documented. This makes the tools hard to adopt: interested users have to keep track of the latest version and find out how to build and use it, or they might not even be allowed to install the tool due to legal issues.

1.2 Goals

The goal of this Bachelor’s thesis was to develop a web service that would allow an interested user to find a command-line based tool and use it without having to download and install it on his/her own computer. The web service should work in a similar way to Microsoft Research’s rise4fun [1]. The user interface should be easy to understand and easy to use, such that a user could find a tool, enter their code and run it with just a few clicks. A tool developer should be able to add his/her tool to the service with as little effort as possible. The system should be configurable such that it can adapt well to many different tools. A tool developer should have the possibility to provide example input code to their users as well as to allow them to use command-line parameters. Tools that produce multiple output files should be supported. The communication in the system should use a well-defined protocol. Output from a tool should be displayed to the user as soon as it is available, i.e., intermediate results have to be visible. All user input and output from the tools, as well as errors, should be logged in a centralized way to allow debugging.

The web service was to be implemented in the Scala programming language [3] and the implementation was to be well-documented and thoroughly tested to facilitate future maintenance. Security and performance were not key concerns but were considered throughout the course of the project.

1.3 Outline

The product of this Bachelor’s thesis is called the Tuwin system or Tuwin for short. It stands for Tool Use Without Installation. Section 2 gives an overview of the Tuwin system and explains its components. Section 3 shows screenshots of the user interface and explains how it is used. Section 4 shows an execution example to give an impression of how the components communicate and section 5 discusses implementation details. Section 6 explains the Tuwin communication protocol in detail and is followed by section 7 discussing quality assurance and finally section 8 which concludes the report with a brief discussion about future work.

2 System Overview

The Tuwin system consists of three conceptual components: the user interface, the server and the tool hosters (figure 1).

[Diagram: the UI communicates with the Tuwin server, and the Tuwin server communicates with tool hosters A, B and C, in both cases via HTTP and JSON. The server and each tool hoster have their own config file; the server is also connected to a database.]

Figure 1: The components of Tuwin

The Tuwin tool hoster is a web application of which a developer can install an instance on his/her computer and configure it to use a tool that he/she developed and installed on the same machine. In this way the tool hoster provides a web interface to that tool, allowing external users to feed input to it and receive back output. There may be a large number of tool hosters in the system, one for each tool. The Tuwin tool hoster is a sample implementation of a tool hoster, designed to be as generic as possible, but tool developers are not bound to use it. Instead they may want to use their own implementation which behaves according to the communication protocol.

The Tuwin server is the central unit. It serves as a mediator between the users and the tool hosters by providing a user-friendly web interface (see section 3). The server redirects user requests to the corresponding tool hosters and monitors the execution there by polling. It relies on a database to log input, output, and errors. Even though it may be a bottleneck, there are a number of reasons why the server is used as a middle-man. By not letting the user interface communicate directly with the tool hosters the server can ensure that all communication with the tool hosters conforms to the protocol. This is both a security measure and a way of making sure that logging works correctly. It also adds flexibility because tool hosters can be exchanged without the users noticing.

The communication between the three components uses standard HTTP and JSON [2] encoded messages, as well as file downloads. Both the server and the tool hosters are configured through a JSON encoded configuration file. In this report the terms tool hoster and (Tuwin) server may refer to the components as such or to the computers they are running on. The user interface or UI is part of the server but is treated as a separate component because its code is executed on the client, i.e., the user’s computer.

When a user enters the Tuwin server’s URL in their browser they are presented with a list of all registered tools (figure 2). Upon clicking a tool from this list they are redirected to an editor page (figure 3) where they can enter their input. Once they click the “Run” button this input is sent to the server using its HTTP interface. The term HTTP interface refers to a component’s set of HTTP paths used for communication and is not to be confused with the user interface. The server forwards the request to the tool hoster’s HTTP interface and the tool hoster executes its tool using the user’s input. In order to know when the execution is finished, the server repeatedly polls the tool hoster to find out about the current execution status. The user interface code does the same with the server. As soon as output is available on the tool hoster, the server fetches it and stores it locally. The user interface code then fetches the output from the server and displays it to the user. In this report output is referred to as a result (file) or multiple result files.

3 User interface

The first thing users see when they access the Tuwin service in their browsers is the tool selection page (figure 2). It contains a list of available tools and each tool is presented as a clickable box with its name and description inside. On top of these boxes there is an input field which can be used to filter the tools.

Figure 2: Screenshot of the tool selection page

When the user selects a tool from the list they are taken to the editor page (figure 3). At the top the tool’s name and description, and if available its author and version number, are displayed. Below it is a large text area, the code input area, where users can enter their code. Initially this text area contains example input to give the user an impression of what the tool is about. On the right-hand side there is a sidebar labeled “Examples”. It contains a list of example input files. When the user clicks an entry from this list the example’s content is copied into the code input area. Below the sidebar there are two buttons, one labeled “Examples” and the other “Options”. These buttons toggle the sidebar between the list of examples and the tool options. The latter is a list of form elements such as checkboxes and input fields which give the user the possibility to pass options, which have been declared by the tool hoster, to the tool. The form elements’ types correspond to the types of the options, i.e., checkboxes for Boolean options, dropdown lists for list options, and so on.

Figure 3: Screenshot of the editor page where users can run a tool

On the left-hand side, below the code input area, users can find the “Run” button. Clicking it will start a session with the server, which in turn starts a session with the corresponding tool hoster. Until the tool is finished (or fails) a running indicator (spinner) is visible next to the “Run” button. As soon as results are available, which may be even before the tool finishes, a box appears below the “Run” button. This is where all results are displayed. If the tool produces multiple result files they can be accessed by clicking the tab headers at the top of this box. Once the tool finishes a tab labeled “Run Information” is added. Clicking it reveals information about the execution such as its duration and the exit value. Furthermore, upon termination a new button appears next to the “Examples” button labeled “Permalink”. Clicking it opens a dialog with a URL in it. A user can copy this URL to gain permanent access to the current execution, e.g., to bookmark it or to show it to others (see section 5.3.3).

Figure 4: Example of an error dialog, informing the user that the execution could not be started because of a network problem.

If an error occurs, at the start of or during the execution, an error dialog (figure 4) is displayed to inform the user.

3.1 Admin page

In addition to the tool selection page and the editor page the server provides an admin page (figure 5). For security reasons there are no links to it on the other pages and it can only be accessed if the admin secret, a secret passphrase set by the server’s administrator, is provided. The admin page allows the administrator to register new tool hosters to the server, as well as to unregister old ones. To add a tool hoster, all he/she has to provide is the tool hoster’s base URL, i.e., the URL from which the server can pull information about the tool, and a tool tag, i.e., a short unique name for the tool, containing only letters, digits and underscores. After clicking the “Add” button, the administrator is presented with a reload URL. Opening it in the browser will make the server reload the information it stores about the tool from the tool hoster. It should be given to the tool hoster’s administrators so they can tell the server to reload whenever they have made changes to their tool. Tool hosters can be removed from the server by selecting them in the list on the admin page and clicking the “Remove” button.

Other changes to the server’s configuration, e.g., changing the access information for the database, have to be made manually in the server’s configuration file (see section 5.3.1). The admin page features a button, labeled “Reload”, which allows the administrator to reload this file after changes have been made, without having to restart the server.

Figure 5: Screenshot of the admin page

4 Execution Example

This section illustrates how the communication protocol (see section 6) is used by giving an example. Figure 6 shows what information is passed between the components of the Tuwin system when a user wishes to run the tool “Foo”, which has one option with option tag “bar” (see section 5.2.4). The user first navigates to the editor page of the tool “Foo”, enters input code and chooses a value for the one option that is available. Once they click the “Run” button the client-side Javascript sends off an HTTP POST request (using AJAX) to the server’s /comm/foo/run method. The request includes the parameters “input” and “option_bar”, the latter containing the value for the tool’s option “bar”. The server then mimics this behavior by forwarding the input and option values to Foo’s tool hoster. The tool hoster now starts a new session by starting Foo in a new process. It generates a random session key and returns it in the response. When the server receives this response it starts its own session with the same session key, then returns the key to the client-side script in the response to the original run request.

[Sequence diagram with three lanes: user interface, server, and tool hoster for “foo”. The user clicks “Run”; the UI sends POST /comm/foo/run (input, option_bar) and the server sends POST /run (input, option_bar). The process and both sessions start, and the key is returned in each response. Both sides then poll: the server with GET /run-status (key), the UI with GET /comm/foo/run-status (key). While the tool runs the responses carry run-status (running); after the tool has produced a result file and the process has terminated they carry run-status (finished, 1 result). The server requests GET /result (key, result_id) and stores the result file; the UI requests GET /comm/foo/result (key, result_id) and displays the result. Finally the sessions expire.]

Figure 6: Details of an execution of tool “foo” in the Tuwin system

From now on the server and the client follow the same strategy, namely they are polling the run-status. The server repeatedly makes HTTP GET requests to the tool hoster’s /run-status method, including the session key as a parameter to identify itself. The response from the tool hoster, i.e., the run-status, contains a JSON object with a field named “status”. While the value of this field equals “running” the server continues polling, with a short waiting interval of 1.5 to 3 seconds between two requests. Once the process terminates (assuming it does not time out, fail or get aborted) the tool hoster changes its run-status information such that future polling responses contain a “status” field with the value “finished”. This is the signal for the server to stop polling. The client script does the same; it repeatedly polls the server until the returned run-status indicates that the process has finished. The server keeps its own copy of the last run-status it received from the tool hoster, which it returns to the client when requested.

In addition to the “status” field the run-status also contains a “results” field. As soon as a result file is available on the tool hoster it is listed in this field. Upon receiving a run-status response with a new result entry, the server pauses its polling and instead requests this result file by means of an HTTP GET request to the tool hoster’s /result method. The received file is stored locally on the server and the server continues polling. The client behaves similarly, requesting result files from the server as soon as they show up in the results list in the run-status. Instead of storing them the client presents them to the user. Eventually, the sessions expire on the server and the tool hoster, meaning they can no longer be accessed through the HTTP interface.
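The polling strategy described in this section can be summarized in a few lines of code. The following is a minimal sketch in Python, not the Scala code of the Tuwin server. The callables get_run_status and get_result are hypothetical stand-ins for the HTTP GET requests to /run-status and /result, and the assumption that the “results” field is a plain list of result ids is made only for this illustration.

```python
import time
from typing import Callable, Dict


def poll_session(
    get_run_status: Callable[[], dict],
    get_result: Callable[[str], bytes],
    interval: float = 1.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Poll until the status is no longer "running"; fetch each new result once.

    get_run_status and get_result are hypothetical stand-ins for the HTTP GET
    requests to /run-status and /result. The shape of the "results" field
    (a list of result ids) is an assumption made for this sketch.
    """
    fetched: Dict[str, bytes] = {}
    while True:
        run_status = get_run_status()
        # Polling pauses here while newly listed result files are fetched.
        for result_id in run_status.get("results", []):
            if result_id not in fetched:
                fetched[result_id] = get_result(result_id)
        if run_status["status"] != "running":
            return fetched
        sleep(interval)
```

Passing the two requests in as functions keeps the loop independent of any HTTP library; the same loop describes the server polling a tool hoster and the client polling the server.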

5 Implementation

This section discusses implementation details of the three parts that make up the Tuwin system: the user interface, the server and the tool hoster. Furthermore it includes a short section on error handling and logging.

5.1 User Interface

The term user interface or UI refers to the web pages provided by the server, i.e., the tool selection page, the editor page and the admin page. Users can access the UI in their browser as it consists of standard HTML, CSS and Javascript. The content is dynamically produced by the server using Scalate [4], a Scala-based template engine which renders HTML pages on the fly from Scaml [5] template files. Client-side scripting is done in Javascript. The jQuery UI library [6], which builds on the jQuery library [7] and extends Javascript with many useful UI widgets and communication abstractions, is used for some of the functionality, such as displaying error dialogs and the result tabs, as well as for communication with the server through asynchronous HTTP requests, i.e., AJAX (asynchronous Javascript and XML) requests. The code input area is enhanced with features such as syntax highlighting and line numbers by the Javascript plugin CodeMirror [8]. CodeMirror uses a separate script file, written in Javascript, which contains the information on how to apply syntax highlighting. This file is referred to as a language file and it is provided by the tool hoster as will be discussed in section 5.2.

The admin page contains no client-side scripts and the tool selection page only a few lines of code for the filter bar. The editor page however, being the core part of the UI, is more complex. When the editor page is loaded the browser sends off two AJAX requests to the server, one to get the available examples and one for the options. A random example is then displayed in the code input area. When the “Run” button is clicked the content of the code input area and the values of all the option fields are sent to the server, where a session is started. The server responds with a session key which the client code then uses to poll the server repeatedly, leaving an interval of 1.5 seconds between two poll requests. For performance reasons the interval increases over time up to 4 seconds. The polling is done using AJAX requests as well and continues until the server responds with a status other than “running”, i.e., the execution finished, failed, timed out or was aborted (see section 6 for details). The polling response, i.e., the run-status, is a JSON encoded object, and apart from the execution’s status it also contains information about what result files are available on the server. Depending on their (MIME) type they are requested and displayed in tabs. Plain text and images are displayed as such and PDF files are embedded. There is also a special MIME type “text/tuwin-table” which is displayed as a clickable list (see section 5.2.5). Result files of other types are not displayed; instead a link is shown that allows the user to download them.
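The growing poll interval can be illustrated with a small generator. This is a sketch in Python rather than the client’s Javascript. The start value of 1.5 seconds and the upper bound of 4 seconds are taken from the text above; the linear step of 0.5 seconds is an assumption, since the report does not state how the interval grows.

```python
from itertools import islice
from typing import Iterator


def poll_intervals(start: float = 1.5, maximum: float = 4.0,
                   step: float = 0.5) -> Iterator[float]:
    """Yield waiting times that grow from start up to maximum, then stay there.

    start and maximum come from the text; the linear step of 0.5 s is an
    assumption made for this sketch.
    """
    interval = start
    while True:
        yield interval
        interval = min(maximum, interval + step)


# The first seven waiting times under the assumed step size.
first_seven = list(islice(poll_intervals(), 7))
```

Capping the interval bounds the delay between a result becoming available and its display, while longer-running executions cause fewer requests per minute.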

5.2 Tool Hoster

The Tuwin tool hoster is a sample implementation of a tool hoster. It is a web application implemented in the Scala programming language that makes heavy use of the Scalatra web framework [9], which builds on Java servlets [10]. It contains one Scalatra servlet, the tool servlet, which provides the tool hoster’s HTTP interface.

Using this interface, a client, typically the Tuwin server, can start a session with the tool hoster. At the beginning of a session the client sends code input and a number of option values to the tool hoster. A new unique random session key, consisting of the current date and time and 24 random hexadecimal digits, is produced and returned to the requester to be used as a handle to the session in future requests. A session directory is created and the code input is stored in a file within that directory. The tool is then started in a new thread using the standard Java Process library [11]. During a session, this thread keeps track of the tool’s state and waits for it to terminate. This information, i.e., the run-status, can be accessed through the HTTP interface. It is used to poll the tool hoster to find out when the tool execution terminates and to find out about (intermediate) results.
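A session key of the described form could be generated as in the following Python sketch, which is an illustration and not the tool hoster’s Scala code. The timestamp layout and the “-” separator are assumptions; the text only states that the key consists of the current date and time and 24 random hexadecimal digits.

```python
import secrets
from datetime import datetime
from typing import Optional


def new_session_key(now: Optional[datetime] = None) -> str:
    """Return a key made of the current date and time plus 24 random hex digits.

    The timestamp layout and the "-" separator are assumptions; the report
    only states which two parts the key consists of.
    """
    timestamp = (now or datetime.now()).strftime("%Y%m%d%H%M%S")
    # token_hex(12) draws 12 random bytes, i.e. 24 hexadecimal digits.
    return f"{timestamp}-{secrets.token_hex(12)}"
```

The random part is what makes the key hard to guess; the timestamp mainly makes session directories easy to sort and inspect.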

In addition to the thread hosting the process, for each session there is a second thread that monitors the running time. It sleeps until the running time exceeds a certain timeout. Once that happens it wakes up, aborts the process and updates the run-status information accordingly. On Windows systems processes are aborted using the open-source library Winp [12] which serves as a wrapper around the Java process. The reason for this is that Winp provides a method to recursively abort all child processes that were forked during execution. Java processes lack this feature, leading to the problem of orphan processes. On non-Windows systems this is still an open issue and should be addressed in future work. Currently however, this is acceptable since Tuwin is only installed on a Windows server.

The Tuwin protocol does not specify in which order result files have to be requested. Because of this, and also to account for (network) delays, sessions are not removed from the tool hoster immediately after they terminate but instead the tool hoster waits for an expiration timeout. All result files should be requested during this time. After the timeout the session is removed. The session directory however is not deleted as it may be of interest to the tool developer. In fact all input and output files are stored indefinitely, allowing the tool developers to see how their tools are used and what problems may arise with them. This possibility might prove to be quite useful for developers to refine their software.

The tool hoster’s servlet also provides HTTP methods to get information about the tool, such as name, description and version number, but also about example files, available options or the language file. See section 6 for a complete description of the servlet’s HTTP interface and its protocol and section 4 for an example execution.
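The interplay of the thread hosting the process and the monitoring thread can be sketched as follows. This is a simplified Python illustration, not the tool hoster’s Scala implementation. The layout of the returned dictionary is an assumption, and, like a plain Java process, Popen.kill() does not abort child processes, so the sketch shares the orphan-process limitation described above.

```python
import subprocess
import sys
import threading


def run_with_timeout(command: list, timeout: float) -> dict:
    """Run command and abort it if it is still running after timeout seconds.

    Returns a dict resembling a run-status; its layout is an assumption.
    Popen.kill() only stops the process itself, not any child processes.
    """
    run_status = {"status": "running"}
    process = subprocess.Popen(command)

    def abort() -> None:
        # Runs in the watchdog thread once the timeout has elapsed.
        if process.poll() is None:
            run_status["status"] = "timeout"
            process.kill()

    watchdog = threading.Timer(timeout, abort)
    watchdog.start()
    exit_value = process.wait()  # the hosting thread waits for termination
    watchdog.cancel()            # no effect if the watchdog already fired
    if run_status["status"] == "running":
        run_status["status"] = "finished"
        run_status["exit_value"] = exit_value
    return run_status
```

For example, run_with_timeout([sys.executable, "-c", "pass"], 30) returns a status of “finished”, whereas a command that sleeps longer than the timeout ends with the status “timeout”.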

5.2.1 Configuration

Since the tool hoster is designed to be as generic as possible it can be configured in many ways. A configuration file stores all this information in JSON format on the tool hoster’s system and the servlet context, typically a context file for the servlet container, must contain the path to this file. Table 1 lists and explains all parameters this file may contain and listing 1 shows an example of what it may look like. To further enhance genericity all file and directory paths within the configuration file may be absolute or relative with respect to the base directory of the servlet container. A documentation file is available for the tool hoster which contains more details.

name The name of the tool as it is displayed to the user.

description A short description of the tool. It is displayed on the tool selection page as well as above the code input area on the editor page. Restricted HTML markup is allowed (see the tool hoster documentation).

version Optional. The version number of the tool which is displayed next to the tool’s name.

author Optional. The name of the developer or organization which is displayed next to the tool’s name (after the version number).

language Optional. This is used for syntax highlighting by the CodeMirror plugin. See section 5.2.2 for more information.

command The command or path to the tool including command-line parameters. The following placeholders may be used:

• $input$: The input file.

• $outputX$: The Xth result file (zero-based).

• $options$: The parameter string given by the options. See section 5.2.4.

• $dir$: The session directory.

• $configdir$: The directory where the configuration file is located.

At least $input$ should be included to tell the tool about the user’s input.

fetch_output Optional. A Boolean value. If set to true the tool hoster will redirect the standard output from the tool into the first result file. This should be set to false or left out if the tool handles file output by itself or if pipes are used to redirect output into the file.

directory Optional. The path to the directory where the tool hoster stores all input and result files. If not set, a temporary directory is used.

results A list of result files the tool can produce. The list should contain at least one entry which is used to store the standard output from the tool. See section 5.2.5 for more information.

options Optional. A list of options that the user has when executing the tool. See section 5.2.4 for more information.

examples Optional. A list of example inputs from which the user can choose. See section 5.2.3 for more information.

timeout Optional. The time in seconds after which a running tool is automatically aborted and the run-status’ “status” field is set to “timeout”. If not set, a default timeout of 5 minutes is used.

max_file_size Optional. The maximum file size in bytes a result file may have. If a result file exceeds this value it is not sent and a warning is inserted into the run-status. If not set, a default value of 500 MB is used. The default value is also an upper bound for this parameter.

concurrency_limit Optional. The maximum number of concurrently running sessions. If this limit is reached new requests to execute the tool are rejected until a session terminates.

hidden Optional. A Boolean value. If set to true the tool will not show up on the server’s tool selection page but can otherwise be fully used.

Table 1: Configuration file parameters

{
  "name": "Chalice",
  "version": "1.0",
  "description": "Chalice is a language for reasoning about concurrent programs with constructs for thread creation, locking, and channels. The language has built-in specification constructs, and specifications are written in the style of implicit dynamic frames with fractional permissions. Chalice is also an automatic static program verifier. Give it a try! http://research.microsoft.com/en-us/projects/chalice/",
  "command": "\"$configdir$/chalice-nightly/chalice.bat\" $input$ $options$ > $output0$ 2>&1",
  "fetch_output": false,
  "directory": "sessions-dir",
  "language": {
    "mode": "chalice",
    "file": "../chalice.js"
  },
  "timeout": 180,
  "max_file_size": 536000,
  "options": [
    {
      "name": "Boogie output",
      "tag": "boogie",
      "type": "boolean",
      "default": false,
      "true_value": "/print:out.bpl"
    },
    {
      "name": "Defaults",
      "tag": "defaults",
      "type": "integer",
      "prefix": "/defaults:",
      "default": 0
    }
  ],
  "results": [
    {
      "name": "Output",
      "result_type": "text/tuwin-table",
      "file_name": "output.txt"
    },
    {
      "name": "Boogie",
      "result_type": "text/plain",
      "file_name": "out.bpl"
    }
  ],
  "examples": [
    {
      "name": "Cell",
      "file": "../chalice-nightly/examples/cell.chalice"
    },
    {
      "name": "Fork Join",
      "file": "../chalice-nightly/examples/ForkJoin.chalice"
    }
  ]
}

Listing 1: Example configuration file for tool “Chalice” [13]
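To illustrate how the placeholders of the “command” parameter turn into an executable command line, the following Python sketch performs a plain textual substitution. It is not the tool hoster’s Scala code. The function name expand_command, the paths used in the example and the decision to leave unknown placeholders untouched are assumptions made for this illustration.

```python
import re
from typing import Dict


def expand_command(template: str, values: Dict[str, str]) -> str:
    """Replace every $name$ placeholder in template by values[name].

    Unknown placeholders are left untouched. Quoting of paths that contain
    spaces is not handled here.
    """
    def substitute(match: re.Match) -> str:
        return values.get(match.group(1), match.group(0))

    return re.sub(r"\$(\w+)\$", substitute, template)


# Hypothetical values; a real tool hoster would use the session's file paths.
command = expand_command(
    '"$configdir$/tool.bat" $input$ $options$ > $output0$ 2>&1',
    {
        "configdir": "/opt/hoster",
        "input": "input.txt",
        "options": "/print:out.bpl /defaults:2",
        "output0": "output.txt",
    },
)
```

With these values, command becomes "/opt/hoster/tool.bat" input.txt /print:out.bpl /defaults:2 > output.txt 2>&1, with the double quotes around the batch file path preserved.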

5.2.2 Languages

The “language” parameter in the configuration is used for syntax highlighting by the CodeMirror plugin. The rules about how to apply syntax highlighting are defined in a Javascript file, i.e., the language file, and they are referred to by a mode (see the CodeMirror documentation for more information). The “language” parameter consists of the two parameters “mode”, containing the mode, and “file”, containing the path to the language file.

5.2.3 Examples

The tool hoster can provide the user with a list of example inputs. This list is specified by the “examples” parameter in the configuration file. An example consists of a name and a file containing the example input (see table 2).

name The name of the example as it is displayed to the user.

file The path to the file containing the example input.

Table 2: Example parameters

5.2.4 Options

In addition to the main input, e.g., source code, users may also have the possibility to provide command-line parameters to the tool by using options. Options are displayed on the editor page as HTML form elements and allow users to enter or pick a value. Once they click the “Run” button these values are sent along with their input, then formatted and passed to the tool as command-line parameters. Options are specified by the “options” parameter in the configuration file. The concatenation of all formatted option values is inserted into the “command” string from the configuration file where the placeholder $options$ is used. Every option has a unique tag, a name and a type, as well as optional parameters, some of which depend on the option’s type (see table 3). There are four types of options: Boolean options which are displayed as checkboxes, integer and string options which are displayed as text input fields, and list options which are displayed as dropdown menus.

tag A unique tag to identify the option. It should be a short string consisting of only letters, digits and underscores.

name The name of the option as it is displayed to the user.

type The type of the option. Supported types are “boolean”, “integer”, “list” and “string”.

prefix Optional. This string is always included in front of the value. No space is inserted between the prefix and the value.

suffix Optional. This string is always included after the value. No space is inserted between the value and the suffix.

default Optional. This is used as the default value in the web form. In a list option this needs to be the zero-based index of a value from the “values” parameter.

true_value Boolean option only. This string is used as the value if the user selects true for this option.

false_value Boolean option only, optional. This string is used as the value if the user selects false for this option.

values List option only. A list of values from which the user can choose one. Each entry in the list consists of a parameter “name” and a parameter “value”. The names are displayed to the user, the values are only used to construct the command-line parameter.

escaped String option only. If set to true, only letters, digits, underscores, dashes and slashes are allowed in the option value. All other characters are removed. This can be used as a security measure to prevent users from inserting arbitrary command-line parameters.

Table 3: Option parameters

It is important to note that backslashes and quotes in the value of a string option received from the user are escaped for security reasons. It is highly recommended to also use the prefix and suffix to surround a string option’s value with quotes, because otherwise the user may be able to add arbitrary command-line parameters. The tool hoster always checks whether the values entered by users are allowed for the option’s type, e.g., it checks if the value provided for an integer option is indeed an integer.

To get an impression of how options are used to produce command-line parameters, the reader may look at the example configuration in listing 1 where two options, “Boogie output”, a Boolean option, and “Defaults”, an integer option, are defined. Assuming the user selects the first option and enters the value 2 for the second, the resulting command-line parameter string would look like this:

/print:out.bpl /defaults:2
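The formatting rules for options can be sketched in Python as follows; this is an illustration, not the tool hoster’s Scala code. It covers Boolean, integer and string options only. That a Boolean value arrives as the string “true” or “false”, that an unselected Boolean without “false_value” contributes nothing, that “quotes” means double quotes, and that fragments are joined by single spaces are all assumptions made for this sketch.

```python
import re
from typing import Dict, List


def format_option(option: dict, raw: str) -> str:
    """Turn one submitted option value into a command-line fragment.

    Covers Boolean, integer and string options only. The encoding of raw
    values and the handling of an absent false_value are assumptions.
    """
    kind = option["type"]
    if kind == "boolean":
        key = "true_value" if raw == "true" else "false_value"
        value = option.get(key, "")
        if not value:
            return ""
    elif kind == "integer":
        value = str(int(raw))  # raises ValueError for non-integer input
    elif kind == "string":
        if option.get("escaped", False):
            # Keep only letters, digits, underscores, dashes and slashes.
            value = re.sub(r"[^A-Za-z0-9_/-]", "", raw)
        else:
            # Escape backslashes first, then double quotes.
            value = raw.replace("\\", "\\\\").replace('"', '\\"')
    else:
        raise ValueError(f"unsupported option type: {kind}")
    # Prefix and suffix are attached without any separating space.
    return option.get("prefix", "") + value + option.get("suffix", "")


def format_options(options: List[dict], submitted: Dict[str, str]) -> str:
    """Join the fragments of all options, in declaration order, with spaces."""
    fragments = (format_option(o, submitted[o["tag"]]) for o in options)
    return " ".join(f for f in fragments if f)
```

Applied to the two options from listing 1 with the submitted values {"boogie": "true", "defaults": "2"}, format_options returns the parameter string /print:out.bpl /defaults:2 shown above.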

5.2.5 Results

The parameter “results” in the configuration file specifies a list of results. Each entry is a placeholder for a result file the tool can produce. While a session is running, the tool hoster will look for these result files and present them to the user if they are available. The list must contain at least one entry, because the first entry is used to capture the standard output from the tool. This file is available through the $output0$ placeholder in the configuration file’s “command” parameter. Each result entry consists of a name, a type and a file name (see table 4).

name The name of the result as it is displayed to the user.

result type The MIME type of the result.

file name The name of the result file.

Table 4: Result parameters

Certain values of a result’s “result type” parameter influence how the result is presented to the user:

• “text/plain”: The result is displayed as plain text.

• “text/tuwin-table”: The result is displayed as a clickable list where each row may point to a row in the input. See listing 2 for an example of the expected JSON format.

• “image/jpeg”, “image/gif”, “image/png”: The result is displayed as an image.

• “application/pdf”: The result is displayed as an embedded PDF file.

Every other result type is treated as a binary file and instead of displaying its content, a download link is presented to the user.

{
  "prolog": "This is the optional prolog message.",
  "epilog": "This is the optional epilog message.",
  "lines": [
    {"line_nr": 5, "column_nr": 1, "icon": "w", "message": "Warning on line 5, column 1"},
    {"line_nr": 5, "icon": "i", "message": "Information for line 5, no column number"},
    {"line_nr": 25, "message": "Message without icon"},
    {"icon": "e", "message": "Error without line number"},
    {"icon": "t", "message": "Tick without line number"}
  ]
}

Listing 2: Example to explain the syntax of the “table” result type

It is assumed that the tool hoster’s administrator is the developer of the tool he/she hosts and therefore has the possibility to adapt the tool to produce output conforming to the “text/tuwin-table” format. If this is not the case, an adapter program can be used which runs the tool and transforms its output into the right syntax. An example implementation of such an adapter (for the tool Chalice) is included in the source files of the project. See the tool hoster’s documentation for more information.
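To give an idea of what such an adapter does, the following Python sketch converts tool output into the JSON structure of listing 2. It is not the Chalice adapter shipped with the project. The input format "<line>.<column>: <severity>: <message>" is a hypothetical example, while the field names and icon letters are those of listing 2.

```python
import json
import re

# Hypothetical tool output format: "<line>.<column>: <severity>: <message>"
LINE_PATTERN = re.compile(r"^(\d+)\.(\d+): (warning|error|info): (.*)$")
ICONS = {"warning": "w", "error": "e", "info": "i"}

def to_tuwin_table(tool_output):
    """Convert raw tool output into a "text/tuwin-table" JSON document."""
    lines = []
    for raw in tool_output.splitlines():
        text = raw.strip()
        match = LINE_PATTERN.match(text)
        if match:
            line_nr, column_nr, severity, message = match.groups()
            lines.append({
                "line_nr": int(line_nr),
                "column_nr": int(column_nr),
                "icon": ICONS[severity],
                "message": message,
            })
        elif text:
            # Anything that does not match is kept as a plain message row.
            lines.append({"message": text})
    return json.dumps({"lines": lines}, indent=2)

print(to_tuwin_table("5.1: warning: unused variable\nDone."))
```

A real adapter would additionally start the tool as a child process and read its standard output before applying such a conversion.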

5.3 Tuwin Server

Like the tool hoster, the Tuwin server is implemented in Scala and uses Scalatra. In contrast to the tool hoster, it contains two Scalatra servlets which serve two different purposes: the GUI servlet is responsible for rendering the three pages of the user interface, while the Communication servlet provides the actual HTTP interface used by the UI to communicate with the server. The server’s HTTP interface looks very similar to the one provided by the tool hoster. The reason for this is to facilitate maintenance and future development. See section 6 for details about the HTTP interfaces and their protocol.

Similar to the tool hoster, the server is configured through a JSON encoded file, the server configuration file (see section 5.3.1), and relies on a database (see section 5.3.2) to log errors, sessions and information about their results. The result files themselves are stored in the file system instead of the database for reasons of performance, as they may grow very large and the database connection may be too slow to provide a good user experience.

A client, typically the UI, can start a session with the server through its HTTP interface. At the beginning of a session, the server sends the user’s code input and option values to the corresponding tool hoster and requests a new session there. It then returns the session key it obtained from the tool hoster to the user and also uses it for its own session. To avoid session key clashes, the server references sessions internally by the combination of their session key and the unique tag of the corresponding tool. The server communicates with the tool hosters using the Dispatch library [14] to send HTTP requests to their HTTP interfaces. The responses often contain JSON encoded objects, which are converted to and from Scala case classes by the lift-json library [15].

As on the tool hoster, a session on the server runs in its own thread. This thread repeatedly polls the tool hoster’s run-status to find out if the tool is still running and if there are result files ready to be downloaded. When a result file is available, the server requests it and copies it into a local session directory. Throughout the course of the session, the client who started it has access to it through the server’s HTTP interface in the same way he/she would have with a tool hoster, i.e., he/she can poll the run-status and request result files by providing the session key. Once a session terminates, the server logs it in the database and removes the session from the system after an expiration timeout. The session directory is not deleted because the result files may be needed for permalinks (see section 5.3.3).
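The polling behavior of a server session can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the server's Scala code: the shape of the run-status object ("status", "results"), the two callables standing in for HTTP requests to the tool hoster, and the dictionary standing in for the session directory are all hypothetical.

```python
import time

def mirror_session(fetch_status, fetch_result, session_dir, poll_interval=1.0):
    """Poll the tool hoster until the session ends, copying new result files."""
    downloaded = set()
    while True:
        status = fetch_status()
        for result_id in status["results"]:
            if result_id not in downloaded:
                session_dir[result_id] = fetch_result(result_id)
                downloaded.add(result_id)
        if status["status"] != "running":
            return status["status"]
        time.sleep(poll_interval)

# A fake tool hoster that reports two results over three polls.
_statuses = iter([
    {"status": "running", "results": []},
    {"status": "running", "results": [0]},
    {"status": "finished", "results": [0, 1]},
])

# Sessions are referenced by the pair (tool tag, session key) to avoid key clashes.
sessions = {}
session_dir = sessions.setdefault(("foo", "abc123"), {})
final = mirror_session(
    lambda: next(_statuses),
    lambda result_id: f"content of result {result_id}",
    session_dir,
    poll_interval=0.0,
)
print(final, sorted(session_dir))
```

Keying the session table by the pair of tool tag and session key means that two tool hosters may hand out the same key without the sessions interfering on the server.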

5.3.1 Configuration

The Tuwin server can be configured through the server configuration file, similar to the tool hoster. The servlet context must contain the path to this file. It is JSON encoded and consists of the parameters in table 5. A documentation file is available for the server which contains more information about the configuration.

tools A list of tool hosters the server knows. Each entry in the list consists of a unique tool tag, the tool hoster’s base URL and the MD5 hash of a secret passphrase which has to be provided when reloading the tool (see section 6). Optionally, a list of secondary URLs can be defined, used for workload distribution, where each of them points to a different tool hoster hosting the same tool with the same options.

max file size The maximum allowed size for result files in bytes. Files larger than this value are not sent and a warning is included in the run-status instead.

db type The type of the connected database. The only type currently supported is “mysql”.

db url The full JDBC URL to the database schema, e.g., jdbc:mysql://localhost:3306/tuwin server.

db user The user name to access the database.

db password Optional. The database user’s password.

base dir Optional. A path to a directory where the server stores the result files. If not set a temporary directory is used.

admin secret The MD5 hash of a secret passphrase that should only be known to the administrator. This secret has to be entered whenever the administrator wants to access the server’s admin page.

Table 5: Server configuration parameters
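As an illustration of table 5, the following Python sketch assembles a configuration object and computes the MD5 hashes of the passphrases. The JSON key names, the hexadecimal encoding of the hashes, the URLs and all values are assumptions made for this example; the exact spelling expected by the server is described in its documentation file.

```python
import hashlib
import json

def md5_hex(passphrase):
    """MD5 hash of a passphrase as a hexadecimal string (encoding is an assumption)."""
    return hashlib.md5(passphrase.encode("utf-8")).hexdigest()

# All key names and values below are hypothetical.
server_config = {
    "tools": [
        {
            "tag": "foo",
            "url": "http://toolhoster-1.example/foo",
            "secret": md5_hex("reload passphrase"),
            "secondary_urls": ["http://toolhoster-2.example/foo"],
        }
    ],
    "max_file_size": 1048576,  # bytes
    "db_type": "mysql",
    "db_url": "jdbc:mysql://localhost:3306/tuwin",
    "db_user": "tuwin",
    "admin_secret": md5_hex("admin passphrase"),
}

print(json.dumps(server_config, indent=2))
```

The optional parameters “db password” and “base dir” are omitted here, in which case the server falls back to a temporary directory for the result files.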

5.3.2 Database

The server uses a database to log information about sessions, results and errors. It makes use of the C3PO library [16] for connection pooling, which relies on standard JDBC drivers [17] to connect to the database. The Squeryl library [18] is used to construct prepared statements and to abstract away from SQL. The database pool is initialized when a servlet is first started on the server.

The server uses four database tables. They are listed in the appendix in section A. The table “sessions” (table 6) stores information about sessions and the table “results” (table 8) stores information about their results. The table “session options” (table 7) stores the values for each of a session’s options. Error messages and the exceptions that caused them are logged in the table “errors” (table 9).

In the current version of the Tuwin server only MySQL databases [19] are supported. However, support for other types of RDBMS can easily be added if they are supported by Squeryl. Instructions to do so are available in the server documentation.

5.3.3 Permalinks

The main reason why the server logs all input and results is to allow permalinks. A permalink is a URL that allows users to restore a session that has already terminated. They can use it to bookmark a certain execution of a tool or to share it, e.g., in a forum. The permalink URL looks like the editor page’s normal URL, except that the session’s key is appended as a parameter (e.g. http://tuwin-server.com/tool/foo/?key=sessionkey). When this URL is accessed in the browser, a client-side script requests the logged session’s details by making an asynchronous call to the server’s GET /logged-session method (see section 6), followed by requests to GET /logged-result to restore all the results. The obtained information and results are inserted into the editor page.

Permalinks restore a session with the content it had when it terminated, including the results, but not the options. This is an atypical design choice, as permalinks usually do not restore the output. The reason why Tuwin does it is the assumption that, when users access permalinks, they are interested in the output and not just the input, so they will click the “Run” button anyway. By restoring the results automatically, the users can access them faster. The downside of this approach is that the restored results may be outdated because the tool might have been updated in the meantime. This is accounted for by displaying a warning message to the user as well as the date and time and the tool’s version number at the time of the session’s original execution.

5.4 Error handling

There are a number of situations where errors can occur in the Tuwin system: network failures, wrong configurations, a user not obeying the protocol, just to list a few. Tuwin differentiates between access errors, which occur when a user makes a request that is not allowed by the protocol, config errors, which occur when a component is not correctly configured, network errors, which occur when communication fails between the server and a tool hoster, and UI errors, which occur when the UI fails to communicate with the server. Config errors should be easy to fix by an administrator of the server or tool hoster, respectively, while access errors can only be fixed by the client. Since all components of the Tuwin system conform to the protocol, no access error should ever occur when Tuwin is used correctly.

On the tool hoster and the server, errors are handled in a similar way. Whenever an error occurs, the execution of the current HTTP method is aborted and an error message is produced. This message is returned in the form of a JSON object instead of the method’s regular response. To inform the requester that the response is an error message, the response’s HTTP status code is changed, conforming to the HTTP/1.1 status code definitions [20]. In the case of an access error the status code 400 Bad Request is used and in any other case a 500 Internal Server Error is returned.

5.4.1 Error messages and classes

Each error message contains an error code. This code uniquely identifies the error’s type. Furthermore, an error message consists of a name, a short description of the error and sometimes a list of details about the specific error. Unless they occur in the GUI servlet, error messages are returned as JSON objects because in this form they are easily readable both by humans and by the Tuwin server for logging. Listing 3 shows an example of an error, in this case an instance of error 201, which occurs when a request is made that does not include all required parameters.

{
  "class": "Tool Hoster Access Error",
  "code": 201,
  "name": "Missing Parameter",
  "description": "A necessary parameter was not provided.",
  "details": {
    "missing parameter": "input"
  }
}

Listing 3: Example of an error message encoded as a JSON object

The errors are classified into 6 error classes, depending on where they occurred and who is responsible for them. The classes are listed below. This classification serves no purpose within the Tuwin system and is only used to make it easier for users and administrators to determine the error’s cause and find a solution faster. In the same spirit, the first digit of the error code indicates which class the error belongs to.

1. Tool Hoster Config Errors occur on the tool hoster and indicate wrong configuration of the tool hoster or problems with the tool hoster’s tool or operating system.

2. Tool Hoster Access Errors occur on the tool hoster when a request is made that doesn’t meet the requirements of the protocol.

3. Server Config Errors occur on the server and indicate wrong configuration of the server, its database or operating system.

4. Server Access Errors occur on the server when a request is made that doesn’t meet the requirements of the protocol.

5. Network Errors occur when the server fails to communicate with a tool hoster for a particular reason.

6. UI Errors occur in the Javascript code on the user’s machine. Errors of this class occur when there are network problems between the user’s machine and the server, detected by HTTP timeouts or error responses.

See section B in the appendix for a complete list of all error messages.
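The relation between error codes, error classes and HTTP status codes described in section 5.4 can be summarized in a short Python sketch. The function names are hypothetical and the sketch assumes that the class is given by the first decimal digit of the code, as stated above; it is not taken from the Tuwin source code.

```python
ERROR_CLASSES = {
    1: "Tool Hoster Config Error",
    2: "Tool Hoster Access Error",
    3: "Server Config Error",
    4: "Server Access Error",
    5: "Network Error",
    6: "UI Error",
}
ACCESS_CLASSES = {2, 4}

def class_digit(code):
    """The first digit of the error code identifies the error class."""
    return int(str(code)[0])

def error_class(code):
    return ERROR_CLASSES[class_digit(code)]

def http_status(code):
    """Status code used by the tool hoster and the server for an error response."""
    return 400 if class_digit(code) in ACCESS_CLASSES else 500

print(error_class(201), http_status(201))
```

For the error of listing 3 this yields the class “Tool Hoster Access Error” and the status code 400 Bad Request.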

5.4.2 Logging and displaying errors

Error messages are important for the administrators of the system because they tell them what went wrong and how it can be fixed. Therefore, errors are logged whenever possible. On the tool hoster, error messages are logged in a log file. On the server, each error message is logged in the database along with a stack trace of the exception that caused it, if any. This includes errors that were received as part of a response from a tool hoster. Database errors, which obviously cannot be logged in the database, are logged in the servlet container’s log instead. Errors that occur in the UI are not logged.

The UI parses error messages returned by the server and displays them to the user as a pop-up dialog (see figure 4).

For security reasons, the server handles error messages slightly differently than the tool hoster. Access errors are handled the same way and are simply returned as the response. Any other error, however, is replaced by an anonymous error of the same class. This hides system details from the users which they do not need to know. E.g., when the tool hoster fails and returns an error message explaining that the tool was not found at “C:\tool.exe”, this information is important for the tool hoster’s administrator and may also be of interest to the server administrator. The users, however, do not need to know that the tool hoster’s tool is located at “C:\” and is called “tool.exe”. All they will see is a generic error message telling them that starting the session failed. The error code of an anonymous error always ends in the digits 00, e.g. 100 would be an anonymous tool hoster config error.
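The replacement of detailed errors by anonymous ones can be sketched as follows. This Python fragment is an illustration only: the error code 101, its name and the wording of the anonymous message are hypothetical, and the rule that the anonymous code is the class digit followed by two zeros is inferred from the example code 100 given above.

```python
def anonymize(error):
    """Return the error as the server would pass it on to the user."""
    class_digit = int(str(error["code"])[0])
    if class_digit in (2, 4):
        # Access errors are returned unchanged.
        return error
    # Any other error is replaced by an anonymous error of the same class.
    return {
        "class": error["class"],
        "code": class_digit * 100,
        "name": "Anonymous Error",                      # hypothetical wording
        "description": "Starting the session failed.",  # hypothetical wording
    }

detailed = {
    "class": "Tool Hoster Config Error",
    "code": 101,  # hypothetical code
    "name": "Tool Not Found",
    "description": "The tool could not be found.",
    "details": {"path": "C:\\tool.exe"},
}
print(anonymize(detailed))
```

The detailed error would still be written to the server's database before the anonymous version is returned to the user.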

6 Communication Protocol

All communication between the components of the Tuwin system is performed using standard HTTP/1.1. Messages are sent as JSON encoded objects, with the exception of result files, which are MIME encoded. For simplicity and to facilitate future maintenance, the server and the tool hoster provide very similar HTTP interfaces, and the protocol used between them is almost the same as the one used between the server and the UI. This use of standard technology and the simple protocol make it easy for a developer to add his/her own implementation of a tool hoster to the system, if need be.

The following list contains all HTTP methods that belong to the HTTP interfaces of the tool hoster and the Tuwin server. When using these methods on the server, the tool to which they should be applied has to be specified in the URL by prepending “/comm/tool” to the path, where “tool” needs to be replaced by the tool’s tag. E.g., to access the examples of the tool “Foo” one would use the URL “http://tuwin-server.com/comm/foo/examples”. Unless specified otherwise, the response is JSON encoded.

• GET /
Expected parameters: None
Returns basic information about a tool, i.e., a subset of the parameters specified in the tool hoster’s configuration file (see section 5.2.1). This includes at least the tool’s name, a short description and, if available, a version number, the author’s name, and the mode of the language used for syntax highlighting. There may also be a Boolean value indicating whether to show or hide the tool on the tool selection page.

• GET /examples
Expected parameters: None
Returns a list of examples that are available for the tool. Each example consists of a name and its content. See section 5.2.3.

• GET /options
Expected parameters: None
Returns a list of options that are available for the tool. Each option has a name, a type and a tag. If available, there is a default value and, in the case of a list option, a list of values. See section 5.2.4.

• GET /language
Expected parameters: None
Encoding: plain text (Javascript)
Returns the tool’s language file if there is one; otherwise an error is returned. A language file is available if and only if the basic information in the GET / response contains a “language” field.

• POST /run
Expected parameters: input, and a parameter option tag for each option, where tag is the option’s tag
Starts a new session with the provided input text and option values. Returns a unique session key which has to be sent as a parameter in future requests involving the session.

• GET /run-status
Expected parameters: key
Returns the run-status of a session. It includes the status of the session, which is either “running”, “finished”, “aborted”, “failed” or “timeout”, as well as a list of warning messages and a list of result files that are available. Depending on the status, it includes the tool’s running time and its exit value. If the status equals “failed”, it includes an error message explaining why the session failed. See section 4 for how the run-status is used.

• GET /result
Expected parameters: key, result id (optional)
Encoding: depends on result type
Returns the requested result file.

• POST /abort
Expected parameters: key
Aborts a running session.

• GET /logged-session
Expected parameters: key
Only implemented on the server, not available on the tool hoster. Returns information about a session that should be restored through a permalink. This information includes the input text, the time the session was started, a list of results that are available, a list of option values and, if available, the tool’s running time, exit value and the tool’s version number at the time of the execution.

• GET /logged-result (see footnote 1)
Expected parameters: key, result id (optional)
Encoding: depends on result type
Only implemented on the server, not available on the tool hoster. Returns the requested result file belonging to a session that should be restored through a permalink.

• GET /reload (see footnote 2)
Expected parameters: secret (server only)
This method is not used by any component of the Tuwin system but should be used by the tool hosters’ administrators to update their tools’ information. They should use it whenever they have made changes to the configuration file, first on the tool hoster, which forces it to reload the file, then on the server, which will fetch the new information from the tool hoster. It is important to note that on the server the user has to provide a secret, i.e., a passphrase which he/she obtained from the server’s administrator as part of the reload URL when the tool was added.

Footnote 1: The GET /logged-result method could be used instead of the GET /result method, but it is slower because it involves a query to the database. Doing it the other way round, i.e., calling the GET /result method instead of the GET /logged-result method, will typically cause a session key expiration error.

Footnote 2: One might ask why this method uses GET instead of POST. The reason is that, since it is only ever used by humans, it is more convenient to have it as a GET method, because this way the method can be called by simply opening the URL in a browser.
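A typical use of the protocol, starting a session and polling its run-status, can be sketched in Python as follows. The sketch is not part of Tuwin. The parameter name "option_<tag>", the "key" field of the POST /run response and the two callables that stand in for actual HTTP requests are assumptions made for illustration; only the URL scheme and the method names are taken from the list above.

```python
import time
from urllib.parse import urlencode

def tool_url(server, tool_tag, method, **params):
    """Build the server URL of a protocol method for the tool with the given tag."""
    url = f"{server}/comm/{tool_tag}/{method}"
    if params:
        url += "?" + urlencode(params)
    return url

def run_tool(post_json, get_json, input_text, options, poll_interval=1.0):
    """Start a session via POST /run and poll GET /run-status until it ends.

    post_json(method, params) and get_json(method, params) are placeholders for
    functions that send the HTTP request and return the decoded JSON response.
    """
    params = {"input": input_text}
    # Assumption: one parameter per option, named after the option's tag.
    params.update({f"option_{tag}": value for tag, value in options.items()})
    key = post_json("run", params)["key"]  # assumed name of the session key field
    while True:
        status = get_json("run-status", {"key": key})
        if status["status"] != "running":
            return key, status
        time.sleep(poll_interval)

print(tool_url("http://tuwin-server.com", "foo", "examples"))
```

Once the status is no longer “running”, the client would request each available result file through GET /result, passing the same session key.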

7 Quality Assurance

One of the goals of this Bachelor’s thesis was to develop a product of high quality. This section discusses how this goal was achieved.

7.1 Project management

The project management was done using SBT (simple build tool) [21], which manages all of the project’s dependencies and makes it easy to compile the project and run its test cases. The Javascript libraries were added manually and thus have to be updated manually if need be.

7.2 Documentation

All components of Tuwin are documented in detail. ScalaDocs [22] are available for both the server and the tool hoster, and the source code is commented throughout. The documentation for the Javascript code which is part of the server was generated by jsdoc-toolkit [23], using markup similar to ScalaDoc. There is also documentation in the form of HTML files, one for the server and one for the tool hoster. They contain detailed information for the administrators about setup, configuration and troubleshooting.

7.3 Testing

7.3.1 Unit tests

A number of unit tests have been written to test the HTTP interfaces of the server and the tool hoster. They use Scalatra’s integrated test framework, which builds on the ScalaTest framework [24]. In order for them to produce meaningful results, the server or the tool hoster, respectively, needs to be configured properly and some of the parameters in the test cases need to be adapted, e.g. the path to the configuration file.

7.3.2 Stress tests

The open source tool Gatling [25] was used to test the server and the tool hoster under increased workload. Gatling makes it possible to simulate a normal user trying out a tool in Tuwin by mimicking the HTTP requests the user would issue. Such a simulation is intended to be used as a stress test to examine how a web application behaves under increased workload, i.e., when many users make requests in a short amount of time. In the case of Tuwin, however, the limiting factor is not the web application but the tools themselves. Each tool hoster has a timeout value (see “timeout” in table 1) after which a running tool is automatically stopped if it did not finish. This is to ensure that tools don’t run indefinitely due to crashes or bad user input. The worst case scenario occurs when multiple users start multiple sessions with the same tool and the processes running in parallel slow each other down so much that all of them time out and are stopped, possibly before producing any results. To avoid this problem, the tool hoster has a built-in maximum number of sessions it allows (see “concurrency limit” in table 1). Once this number is reached, it rejects any further execution requests until a session terminates. Because of this limitation no large-scale stress tests were performed. Instead, four tests with the properties listed below were made, the first two (tests A and B) connecting directly to the tool hoster and the last two (tests C and D) using the server as the mediator.

• Tool: Chalice, concurrency limit of 15, timeout of 3 minutes

• Tool hoster: Windows Server 2008, 4GB RAM, Intel(R) Xeon(R) CPU E7-4870 @2.40GHz 2.39 GHz (2 processors)

• Server: same as tool hoster (only used in tests C and D)

• Connection: 5 MB/s down, 0.5 MB/s up, ∼24ms RTT

• Input: “Cell” example (2.85 kB, included in Chalice)

Tests A and C were used to simulate normal workload, i.e., 10 session requests over the course of 20 seconds, resulting in one new user every two seconds. Tests B and D tested heavy workload with 20 session requests within 10 seconds. Gatling produces plots in the form of an HTML file, which have to be interpreted manually. Parts of the result plots of our tests are shown in figure 7. The mean response time in all four tests was below 700 ms with a standard deviation below 2.7 seconds, which is acceptable for HTTP communication, and all test sessions that were started completed within the timeout and thus finished successfully, both under normal and under increased workload. The test results are not discussed in detail here because the interesting part is not the numbers but the fact that neither the server nor the tool hoster collapsed under increased workload. We would like to point out the drops, from 20 sessions down to 15, which can be seen in the plots of tests B and D. They are caused by the aforementioned concurrency limit, i.e., the first 15 sessions are started normally but the last 5 requests are rejected and do not start new sessions.
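The effect of the concurrency limit in tests B and D can be reproduced with a minimal Python sketch. The class below is a hypothetical illustration of such a limit and not the tool hoster's implementation. It assumes that none of the running sessions terminates while the 20 requests arrive.

```python
import threading

class SessionLimiter:
    """Admit new sessions only while fewer than `limit` sessions are active."""

    def __init__(self, limit):
        self._limit = limit
        self._active = 0
        self._lock = threading.Lock()

    def try_start(self):
        """Return True if a new session may start, False if it is rejected."""
        with self._lock:
            if self._active >= self._limit:
                return False
            self._active += 1
            return True

    def finish(self):
        """Free one slot when a session terminates."""
        with self._lock:
            self._active = max(0, self._active - 1)

limiter = SessionLimiter(limit=15)
accepted = sum(limiter.try_start() for _ in range(20))
print(accepted, 20 - accepted)
```

As soon as one of the running sessions terminates and `finish` is called, the next request is accepted again.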

Figure 7: Result plots of the four Gatling simulations

Because the Gatling simulations encompass most of the Tuwin protocol, they can also be used as additional unit tests, and they can serve as blueprints for future stress tests.

7.3.3 UI compatibility

The UI web pages were tested in all major browsers, including Firefox 13.0.1, Chrome 19.0.1084.56, Safari 5.1.4, Opera 11.62, Internet Explorer 9 and 6 and the default Android Browser on a Samsung Galaxy Tab 2. Apart from some minor visual differences the behavior was the same and all features were fully usable. Javascript had to be enabled, of course.

However, we noticed some problems with the CodeMirror plugin. Normally everything works fine, but in some rare cases strange behavior was observed, e.g., selecting text did not work properly or the scroll bar could not be clicked or was not visible. These issues occurred in different browsers, especially when zooming was used, but could not be reproduced systematically. This seems to be an open issue of CodeMirror and not of Tuwin. Once a new version of CodeMirror is available in which the bug is fixed, the version used in Tuwin should be updated.

7.4 Security

Security was not a key part of the project, but was considered in the design of the interfaces. The contents of a session, i.e., input and result files, are usually not secret, i.e., neither input nor output contains personal or confidential information, but access to sessions is nevertheless protected through the use of session keys. They are generated uniformly at random to make it hard for a third person, not knowing the session key, to manipulate the session, e.g., to abort it prematurely. The admin page and the server’s reload URLs use secret passphrases to control who has access to them. All of these passphrases are stored as MD5 hashes. By allowing only text input and by introducing the size limit for result files, we made sure that it is not possible for users to misuse Tuwin as a data storage service, e.g., by sending arbitrary binary data as input to a tool and restoring it through a permalink later.
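How random session keys protect a session from third parties can be illustrated with a few lines of Python. The key length and format, the in-memory session table and the function names are assumptions for this sketch; the thesis only states that keys are generated uniformly at random.

```python
import secrets

sessions = {}  # (tool tag, session key) -> status

def new_session_key():
    """A random key of 32 hexadecimal characters (length is an assumption)."""
    return secrets.token_hex(16)

def start_session(tool_tag):
    key = new_session_key()
    sessions[(tool_tag, key)] = "running"
    return key

def abort_session(tool_tag, key):
    """Abort a session; fails unless the caller knows the session key."""
    if sessions.get((tool_tag, key)) != "running":
        return False
    sessions[(tool_tag, key)] = "aborted"
    return True

key = start_session("foo")
print(abort_session("foo", "guessed-key"), abort_session("foo", key))
```

A caller who has to guess the key fails to abort the session, whereas the client that started it succeeds.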

8 Conclusions

At the time of this writing Tuwin has been up and running for two months on Boogiebox2, a server of the Chair of Programming Methodology at ETHZ. During this time the system has been tried out by some people of the group. Unfortunately we haven’t received much feedback, but we interpret that as no news being good news.

The Tuwin system works and is easy to use. All interfaces use only standard technology such as HTTP, JSON or JDBC, which makes Tuwin extensible and easy to maintain. The configuration file enables the tool hoster to be used in a generic and flexible way, while the admin page and the GET /reload method allow a developer to quickly and seamlessly deploy a new version of his/her tool without having to restart any component of the system. Because the server and the tool hoster are both web applications and the UI runs in the browser, there are hardly any OS restrictions. Scalability has been addressed by the use of timeouts and session limits, and security has been addressed by various techniques such as passphrases for the admin page, file size limits and random session keys. Future changes to the project can easily be made because the server’s and the tool hoster’s source code is available in Scala, with the exception of the client-side scripting, which is written in Javascript. All code is tested and well documented.

8.1 Future Work

There are a number of open issues and limitations that should be addressed in future work. The most pressing one is that the current version of Tuwin requires a high level of trust in the tool hosters, because with the current use of language files the tool hoster administrators have the possibility to execute arbitrary Javascript code in a client’s browser. This is a potentially serious security hole. In a future version of Tuwin this issue could be solved by only allowing language files provided by the server. Alternatively, a tool hoster’s administrator could be required to provide the language file when his/her tool is added, after which he/she cannot change it without the permission of the server’s administrator.

As mentioned in section 7.3.3, another issue is the CodeMirror plugin, which does not work properly in all browsers. It should definitely be updated once a new version is available. Alternatively, a button could be added to deactivate the plugin.

The problem of orphan processes when aborting a running process, discussed in section 5.2, is of low priority, but must be addressed if a tool hoster is to be deployed on a non-Windows system.

Naturally, future versions of Tuwin could also include new features. The most useful extension in our opinion would be to add more functionality to the server’s admin page. This could include features to view error and session logs or to temporarily hide registered tool hosters. All of these tasks can currently only be done by querying the database directly or by modifying the server configuration file.

A useful addition would also be to allow a user to contact the server administrator or the tool developers in case he/she has questions or encounters problems with a tool. For this, a web form could be provided which automatically includes details about the user’s operating system and browser, and a permalink to the session to facilitate debugging.

Another possible extension would be to allow staged or chained tool executions, i.e., the output from one tool could be used as the input to another tool, e.g. when the first tool is a contract inference tool which annotates the input code with contracts, and the second tool is a verifier.

In the current version, the workload distribution for a tool on the server is done uniformly at random over the set of all available tool hoster URLs (see table 5). In future versions, it could be extended to also take into account the current number of active sessions per tool hoster.

Finally, a convenient addition not for the users but for the developers would be an sbt plugin which runs the Gatling simulations with an sbt command instead of having to start Gatling manually.
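The two workload distribution strategies mentioned above can be contrasted in a short Python sketch. Both functions and the example URLs are hypothetical; in particular, the load-aware variant only illustrates the suggested extension and does not exist in the current version of Tuwin.

```python
import random

def pick_uniform(urls, rng=random):
    """Current behavior as described: choose a tool hoster uniformly at random."""
    return rng.choice(urls)

def pick_least_loaded(urls, active_sessions, rng=random):
    """Possible extension: prefer the tool hoster with the fewest active sessions.

    Ties are broken uniformly at random; hosters without an entry count as idle.
    """
    fewest = min(active_sessions.get(url, 0) for url in urls)
    candidates = [url for url in urls if active_sessions.get(url, 0) == fewest]
    return rng.choice(candidates)

# Hypothetical base URL and secondary URLs of one tool.
URLS = [
    "http://toolhoster-1.example/foo",
    "http://toolhoster-2.example/foo",
    "http://toolhoster-3.example/foo",
]
LOAD = {URLS[0]: 3, URLS[1]: 1, URLS[2]: 2}
print(pick_uniform(URLS), pick_least_loaded(URLS, LOAD))
```

The load-aware variant requires the server to keep a count of active sessions per tool hoster URL, which it could derive from the sessions it already tracks.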

8.2 Acknowledgments

I would like to thank my supervisor Malte Schwerhoff for his assistance and the many useful discussions throughout the course of this project. I would also like to thank Prof. Dr. Peter Müller for giving me the opportunity to work on an exciting software development project. Furthermore a big “Thank you” to Jesse Eichar for his “Daily Scala” blog [26], a lot of people at “Stack Overflow” [27] and Dustin Withers at 7sudos [28] for his tutorial on C3PO and Squeryl. Their answers helped me save hours of tiresome work. Finally I wish to thank Jolanda Müller for her motivating support and for trying to help me find a name for the project.

References

[1] rise4fun, Microsoft Research, rise4fun.com

[2] JSON, json.org

[3] Scala Programming Language, EPFL, scala-lang.org

[4] Scalate: Scala Template Engine, scalate.fusesource.org

[5] Scaml, scalate.fusesource.org/documentation/scaml-reference.html

[6] jQuery UI, jqueryui.com

[7] jQuery, .com

[8] CodeMirror, Marijn Haverbeke, .net

[9] Scalatra Web Framework, scalatra.org

[10] Java servlets, oracle.com/technetwork/java/index-jsp-135475.html

[11] Java Process library, docs.oracle.com/javase/1.4.2/docs/api/java/lang/Process.html

[12] Winp Library, java.net/projects/winp

[13] Chalice, research.microsoft.com/en-us/projects/chalice

[14] Dispatch Library, dispatch.databinder.net

[15] lift-json Library, .com/lift/framework/tree/master/core/json

[16] C3PO Library, mchange.com/projects/c3p0/index.html

[17] JDBC, oracle.com/technetwork/java/javase/jdbc/

[18] Squeryl Library, squeryl.org

[19] MySQL Open Source Database, mysql.com

[20] HTTP/1.1 Status Code Definitions, w3.org/Protocols/rfc2616/rfc2616-sec10.html

[21] SBT, scala-sbt.org

[22] ScalaDoc, docs.scala-lang.org/style/scaladoc.html

[23] jsdoc-toolkit, code.google.com/p/jsdoc-toolkit

[24] ScalaTest testing framework, scalatest.org

[25] Gatling Project, gatling-tool.org

[26] Daily Scala, Jesse Eichar, daily-scala.blogspot.ch

[27] Stack Overflow, stackoverflow.com/questions/tagged/scala

[28] 7sudos, Tutorial on Squeryl and C3PO, 7sudos.com/blog/scalatra-mysql-and-star-wars-droids

A Database tables

id The session’s unique id.

tool The tag of the tool the session belongs to.

session key The session’s key.

input The input text from the user for the session.

start time The time when the session was started.

duration The running time of the session’s tool in milliseconds, or NULL if not available.

status The session’s status, encoded as an integer³, i.e., 0 for “running”, 1 for “finished”, 2 for “aborted”, 3 for “failed” or 4 for “timeout”.

exit value The exit value returned from the session’s tool, set only if the session’s status equals “finished”, NULL otherwise.

tool version The version number of the tool at the time of the execution, if available.

Table 6: Database table “sessions”

³ One reason why this is an integer is that there is no SQL standard for enumeration types. The other reason is that integers are used by the Squeryl library to implement enumerations.
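The integer encoding of the status column can be written down as an enumeration. The following is a minimal sketch in Python and not the Squeryl-based enumeration used by the server; the names `SessionStatus` and `exit_value_expected` are hypothetical, while the values are those listed in Table 6.

```python
from enum import IntEnum


class SessionStatus(IntEnum):
    """Integer codes of the "status" column, as listed in Table 6."""
    RUNNING = 0
    FINISHED = 1
    ABORTED = 2
    FAILED = 3
    TIMEOUT = 4


def exit_value_expected(status):
    """Per Table 6, the exit value is set only for finished sessions."""
    return SessionStatus(status) is SessionStatus.FINISHED
```

Converting a stored integer with `SessionStatus(value)` raises a `ValueError` for any value outside 0 to 4, which makes corrupt rows visible early.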

id The session option’s unique id.

session id The id of the session the option value belongs to.

tag The option’s tag.

name The option’s name.

value The option’s value.

Table 7: Database table “session options”

id The result’s unique id.

session id The id of the session the result belongs to.

result id The result’s id within the scope of the session.

result type The result’s MIME type, e.g., “text/plain”, “text/tuwin-table”, “image/jpeg”, . . .

file name The result’s file name under which it is stored in the session directory.

name The name of the result as it is displayed to the user.

Table 8: Database table “results”

id The error message’s unique id.

tool The tag of the tool this error belongs to, or NULL if not applicable.

error code The error code which uniquely identifies the error’s type.

detail Detailed information about the specific error message, or NULL if not available.

stack trace The printed stacktrace of the exception that caused the error, or NULL if not applicable.

time The time when the error occurred.

Table 9: Database table “errors”

B List of Error Codes

This section contains lists of all errors that can occur within the Tuwin system.

Code Description

100 Tool Hoster Error A generic, so-called anonymous error. It is a replacement for a tool hoster config error and its purpose is to hide system details from the user.

101 Unknown Tool Hoster Configuration Error Indicates a bug or a problem with the tool hoster’s operating system, e.g. when the disk is full. The error message’s details contain the information about the exception that caused the error.

102 Configuration Path Not Set Occurs when the parameter named toolConfigFile was not added to the servlet context. It should contain the path to the configuration file.

103 Configuration File Not Found Occurs when the tool hoster cannot find the configuration at the location specified by the toolConfigFile parameter in the servlet context.

104 Malformed Configuration File Occurs when there is a syntax error in the JSON structure in the configuration file.

105 Parameter Missing In Configuration File Occurs when a required parameter is missing in the configuration file. See the error message’s details and the tool hoster’s documentation for more information.

106 Example File Not Found Occurs when an example file was not found at the location specified in the configuration file.

107 Language File Not Found Occurs when the language file specified in the configuration file was not found at the location specified. It has to be a JavaScript file and the .js extension has to be included in the path.

108 Result List Empty Occurs when the list of results in the configuration file is empty. It needs to have at least one entry since the first entry is used to store the output from the tool.

109 Illegal Default List Index Occurs when an option of type “list” was assigned an illegal value for its “default” parameter in the configuration file. A list option’s default value has to be the index (zero based) of an entry from the list of values of this option.

110 Command Error Occurs when the tool hoster fails to start the tool or command. This may be because the “command” parameter in the configuration file is badly configured, the tool does not exist, or the tool hoster does not have the user rights to execute it.

111 Session Creation Failed Occurs when the tool hoster tries to create a new session directory but the directory already exists. This error is highly unlikely and may indicate a bug or a problem with the operating system.

Table 10: Tool Hoster Config Errors
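The checks behind errors 108 and 109 can be sketched as follows. This is an illustrative Python sketch, not the tool hoster’s implementation. The configuration is assumed to be already parsed from JSON into a dictionary, and the key names `results`, `options` and `values` are assumptions; only the option type “list” and the “default” parameter are named in the table above.

```python
class ConfigError(Exception):
    """Carries one of the error codes from Table 10."""

    def __init__(self, code, detail):
        super().__init__(f"{code}: {detail}")
        self.code = code


def validate_config(config):
    """Check a parsed configuration for errors 108 and 109."""
    # 108: the first result entry stores the tool's output, so one is required.
    if not config.get("results"):
        raise ConfigError(108, "result list is empty")
    for option in config.get("options", []):
        # Missing parameters would be error 105 and are not handled here.
        if option.get("type") != "list" or "default" not in option:
            continue
        values = option.get("values", [])
        default = option["default"]
        # 109: the default must be a zero-based index into the list of values.
        # bool is excluded explicitly because it is a subclass of int in Python.
        is_index = isinstance(default, int) and not isinstance(default, bool)
        if not is_index or not 0 <= default < len(values):
            raise ConfigError(109, f"illegal default {default!r}")
```

For a list option with two values, the defaults 0 and 1 pass the check, whereas 2, -1 or a string are reported as error 109.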

Code Description

201 Missing Parameter Occurs when a user makes a request to the tool hoster but does not include all required parameters. The error message’s details contain information about what parameter is missing.

202 Session Not Found Occurs when a user tries to access a session that does not exist. This may be because he/she mistyped the session key or uses a session key from a different tool hoster.

203 Session Expired Occurs when a user makes a request to the tool hoster but provides an expired session key. A session key expires 6 minutes after the session’s tool finished.

204 Illegal Key Occurs when a user makes a request to the tool hoster but provides an illegal session key.

205 Result Not Available Occurs when a user requests a result file from the tool hoster which is not (yet) available. The user can avoid this error by only ever requesting the results that are listed in the response to the request to GET /run-status.

206 Result Too Large Occurs when a user requests a result file from the tool hoster which is larger than the maximum allowed file size. The user can avoid this error by only ever requesting the results that are listed in the response to the request to GET /run-status. For each result file that is too large a warning message is included in that run-status.

207 Unknown Method Occurs when a user tries to access an HTTP method on the tool hoster that does not exist.

208 Illegal Option Value Occurs when one of the option parameters in a user’s request contains a value which is not allowed by that option’s type (e.g. a string for an integer option).

209 Tool Hoster Busy Occurs when the tool hoster has reached the maximum number of concurrently running sessions and refuses to accept new requests. The user should wait a few minutes before repeating the request.

210 Language File Not Available Occurs when a user tries to access the tool hoster’s language file but the tool hoster does not use and therefore does not provide one.

Table 11: Tool Hoster Access Errors
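The expiry rule behind error 203 can be illustrated with a short sketch in Python. It is not the tool hoster’s implementation: the function name is hypothetical, and treating a session whose tool is still running as not expired is an assumption. Only the lifetime of 6 minutes after the tool finished is taken from the table above.

```python
from datetime import datetime, timedelta

# Lifetime of a session key after the tool finished (see error 203).
SESSION_KEY_LIFETIME = timedelta(minutes=6)


def session_key_expired(finished_at, now):
    """Return True if the session key has expired at time `now`.

    finished_at is the time at which the session's tool finished, or None
    while the tool is still running (assumed to mean "not expired").
    """
    if finished_at is None:
        return False
    return now - finished_at > SESSION_KEY_LIFETIME
```

Passing `now` as a parameter instead of reading the clock inside the function keeps the check easy to test.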

Code Description

300 Server Error A generic, so-called anonymous error. It is a replacement for a server config error and its purpose is to hide system details from the user.

301 Unknown Server Configuration Error Indicates a bug or a problem with the server’s operating system, e.g., when the disk is full. The error message’s details contain information about the exception that caused it.

302 Server Configuration Path Not Set Occurs when the parameter named serverConfigFile was not added to the servlet context. It should contain the path to the server configuration file. The path may be absolute or relative to the root directory of the servlet container.

303 Server Configuration File Not Found Occurs when the server cannot find the server configuration at the location specified by the serverConfigFile parameter in the servlet context.

304 Malformed Server Configuration File Occurs when there is a syntax error in the JSON structure in the server configuration file.

305 Parameter Missing In Server Configuration File Occurs when a required parameter is missing in the server configuration file. See the error message’s details and the server’s documentation for more information.

306 Unsupported Database Type Occurs when an unsupported value is given for the db type parameter in the server configuration file. This parameter indicates what type of database is used. The only database type currently supported by Tuwin is “mysql”.

307 Database Connection Failed Occurs when the server fails to connect to the database. The reason for this error could be bad access information (e.g., wrong password), or the database may be unreachable or not running.

308 Query Failed Occurs when a query to the database fails. This may occur when there is a connection problem with the database (e.g., timeouts), when the database is not properly configured, or it may indicate a bug. The error message’s details contain information about the query and the values used in it.

309 Result File Not Found Occurs when a result file was moved or deleted on the server before it could be sent to the user, or there may be problems with the server’s file system.

310 Logged Result File Not Found Similar to error 309. Occurs when the server tries to restore a session (through a permalink) but a result file was moved or deleted and thus cannot be returned.

311 Session Directory Not Found Similar to error 310. Occurs when the server tries to access the session directory containing the result files, but it was moved or deleted.

312 Session Creation Failed Occurs when creating the session directory fails because it already exists. Indicates a bug or a problem with the file system.

Table 12: Server Config Errors

Code Description

401 Missing Parameter Occurs when a user makes a request to the server but does not include all required parameters. The error message’s details contain information about what parameter is missing.

402 Illegal Parameter Value Occurs when a user makes a request to the server but provides an illegal value in one of the request’s parameters (e.g. a string instead of an integer). The error message’s details contain information about what parameter is wrong.

403 Session Not Found Occurs when a user tries to access a session that does not exist. The reason for this may be a wrong session key or the session was deleted on the server.

404 Tool Not Found Occurs when a user tries to use a tool that does not exist on the server. The reason for this may be a misspelled tool tag in the URL.

405 Page Not Found Occurs when a user tries to access a page (in the GUI servlet) that does not exist.

406 Unknown Method Occurs when a user tries to access an HTTP method (in the communication servlet) that does not exist.

407 Logged Session Not Found Occurs when the logged session that should be restored through a permalink does not exist. It may have been deleted by the server administrator or the permalink URL is wrong.

408 Logged Result Not Found Occurs when the logged session that should be restored through a permalink does not have a result with the result id that was provided with the request. Only the result ids listed in the response from the request to the GET /logged-session method are allowed.

409 Illegal Session Key Occurs when an illegal session key is provided in a request.

410 Language File Not Available Occurs when a user tries to access a tool’s language file, but the tool does not use and therefore does not provide one.

411 Wrong Secret Occurs when a user tries to access a tool’s GET /reload method but does not provide the correct secret. The secret is a string that was randomly generated when the tool was registered at the server. It should only be known to the tool hoster’s administrator and allows him/her to use the GET /reload method to update the information the server stores about the tool.

Table 13: Server Access Errors

Code Description

500 Network Error A generic, so-called anonymous error. It is a replacement for a network error and its purpose is to hide system details from the user.

501 Unknown Network Error Occurs when there is an unknown problem between the server and a tool hoster. There may be various reasons for this, e.g., closed ports or high network latency. The exception stacktrace holds detailed information.

502 Tool Hoster Not Found Occurs when the server fails to find a tool hoster. The reason may be an incorrectly set URL, badly configured firewalls, or a tool hoster that is not running.

503 Unexpected Tool Hoster Error Occurs when the server fails to parse an error message sent by the tool hoster. This is most likely caused by a crash on the tool hoster’s machine. The error message’s details contain the error message sent by the tool hoster.

504 Protocol Error Occurs when a response sent by a tool hoster does not conform to the protocol. This should not happen when the Tuwin tool hoster is used but may happen when a tool developer uses his own implementation.

505 Unexpected HTTP Code Occurs when the server receives an error response from a tool hoster with an HTTP status code other than 400 or 500, which indicates that the tool hoster crashed or encountered an unknown problem with the servlet container or the operating system. The error message’s details contain the entire response body.

506 Maximum Poll Count Exceeded Occurs when the server exceeds the maximum number of poll attempts to the tool hoster’s run-status. This limit serves as a backup measure in addition to the regular tool execution timeout to ensure that sessions eventually terminate. This error may occur when the tool hoster’s timeout is set too high (larger than 30 minutes) or when a proxy between the server and the tool hoster caches the run-status response.

Table 14: Network Errors
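The backup limit behind error 506 can be sketched as a bounded polling loop. The following Python sketch is illustrative only and does not reproduce the server’s code; `get_status` is a hypothetical callable standing in for the request to the tool hoster’s run-status, and the parameter names and the status string “running” are assumptions.

```python
import time


class MaxPollCountExceeded(Exception):
    """Corresponds to error 506."""
    code = 506


def poll_until_done(get_status, max_polls, interval_s=1.0, sleep=time.sleep):
    """Poll get_status() until it reports something other than "running".

    Raises MaxPollCountExceeded if the status is still "running" after
    max_polls attempts, so that a session cannot be polled forever.
    """
    for attempt in range(max_polls):
        status = get_status()
        if status != "running":
            return status
        if attempt < max_polls - 1:
            sleep(interval_s)  # wait before the next attempt
    raise MaxPollCountExceeded(f"no final status after {max_polls} polls")
```

A proxy that caches the run-status response corresponds to a `get_status` that returns “running” forever, which is exactly the case in which the limit ends the loop.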

Code Description

600 Unknown Error Occurs when the UI receives an empty (error) response from the server. This may be due to a server crash, network latency or high workload.

601 Failed To Start Session Occurs when the UI tries to start a new session with the server but receives no response. Similar to error 600, this may be due to a server crash, network latency or high workload.

602 Failed To Get Status Similar to error 601. Occurs when the UI tries to get the run-status of a session but receives no response.

603 Failed To Get Result Similar to error 601. Occurs when the UI tries to get the result of a session but receives no response.

604 Failed To Load Session Similar to error 601. Occurs when the UI tries to get information about a logged session but receives no response.

605 Failed To Load Session Result Similar to error 601. Occurs when the UI tries to get the result of a logged session but receives no response.

Table 15: UI Errors
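The numbering scheme of tables 10 to 15 can be summarized in a short sketch: the hundreds digit of an error code identifies its class, and the codes 100, 300 and 500 stand in for the errors of their class. The Python code below is illustrative only and not part of Tuwin. The function names are hypothetical, and it assumes that every config or network error is replaced by its anonymous counterpart before it reaches the user.

```python
ERROR_CLASSES = {
    1: "Tool Hoster Config Error",   # Table 10
    2: "Tool Hoster Access Error",   # Table 11
    3: "Server Config Error",        # Table 12
    4: "Server Access Error",        # Table 13
    5: "Network Error",              # Table 14
    6: "UI Error",                   # Table 15
}

# Classes whose details are hidden behind a generic code (100, 300, 500).
ANONYMIZED_CLASSES = {1, 3, 5}


def error_class(code):
    """Map an error code to the name of the class (table) it belongs to."""
    try:
        return ERROR_CLASSES[code // 100]
    except KeyError:
        raise ValueError(f"unknown error code {code}") from None


def code_shown_to_user(code):
    """Replace config and network errors by their anonymous counterpart."""
    error_class(code)  # rejects codes outside the documented ranges
    hundreds = code // 100
    return hundreds * 100 if hundreds in ANONYMIZED_CLASSES else code
```

Access errors and UI errors are passed through unchanged, since they describe problems the user can act on.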
