MASARYKOVA UNIVERZITA
FAKULTA INFORMATIKY

Development of benchmarking suite for various Ruby implementations

BACHELOR THESIS

Richard Ludvigh

Brno, Spring 2015

Declaration

I hereby declare that this thesis is my original authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during the elaboration of this work are properly cited and listed with a complete reference to the respective source.

Richard Ludvigh

Advisor: RNDr. Adam Rambousek

Acknowledgement

I would like to thank my supervisor RNDr. Adam Rambousek, who has supported me throughout my thesis. I would also like to thank my advisor Ing. Václav Tunka for advice and consultation regarding the technical aspects of the work, as well as for guidance and feedback throughout my thesis. Access to the CERIT-SC computing and storage facilities, provided under the programme Center CERIT Scientific Cloud, part of the Operational Program Research and Development for Innovations, reg. no. CZ.1.05/3.2.00/08.0144, is greatly appreciated.

Abstract

Ruby is a modern, dynamic, pure object-oriented programming language. It has multiple implementations which provide different performance. The aim of this bachelor thesis is to provide a container-based tool to benchmark and monitor those performance differences inside an isolated environment.

Keywords

Ruby, JRuby, Rubinius, MRI, Docker, benchmarking, performance

Contents

1 Introduction
2 State of the art
3 Ruby performance
   3.1 Introduction to Ruby
   3.2 Ruby implementations
       3.2.1 MRI or CRuby
       3.2.2 JRuby
       3.2.3 Rubinius
   3.3 Introduction to benchmarking
       3.3.1 Benchmarking suite and framework
4 Requirements and analysis
   4.1 Requirements
   4.2 RVM
   4.3 Rbenv
   4.4 RVM vs Rbenv
   4.5 Docker
       4.5.1 Container vs. Virtual Machine
       4.5.2 Container vs. Process
   4.6 Existing benchmark suites
       4.6.1 Ruby benchmark suite
       4.6.2 Bench9000
       4.6.3 RubyBench
5 Solution and implementation
   5.1 Docker integration
   5.2 Running benchmarks
       5.2.1 Benchmark code injection
       5.2.2 Warm up
       5.2.3 Timeout
   5.3 Storing and publishing results
   5.4 Presentation tool
       5.4.1 Ruby on Rails
       5.4.2 Highcharts
       5.4.3 Rubyfy.ME
6 Results
   6.1 Environment
   6.2 MRI Ruby Compilers
   6.3 Ruby Implementations
   6.4 MRI 2.2.0 Incremental garbage collection
7 Roadmap
8 Conclusion

Chapter 1 Introduction

Ruby is a modern, pure object-oriented programming language. It has multiple implementations, but this thesis concentrates on benchmarking three major Ruby implementations: MRI (also called CRuby), the original implementation written in C; JRuby, written in Java; and Rubinius, written in Ruby and C++. These implementations are often compared based on their performance capabilities. This thesis was not focused on developing new benchmarks for the Ruby language, but on creating a suitable and extendable benchmarking tool that would provide comparable results to determine the differences between these implementations. This tool should be able to take any existing benchmark and run it across different Ruby implementations. The developed benchmarking suite was already described in the paper Ruby Benchmark Suite using Docker [1]. However, the problem domain and the development process of the benchmarking suite are described in more detail in this thesis. We discuss the state of the art in Chapter 2, followed by a description of the Ruby language and the differences in its three major implementations. To ensure complete isolation of all tested Ruby versions, we used Docker (described in Section 4.5) to bundle each configuration inside a Docker container. In Chapter 5, the Docker integration and the hierarchy of created Docker images are described in detail, as well as the methods used to run benchmarks and collect data from them.

Chapter 2 State of the art

In December 2013, Sam Saffron published a call for an official long-running Ruby benchmark1. At that time there were multiple long-term benchmarks, like the PyPy Speed Center2 for the PyPy Python implementation or the Go Performance Dashboard3 for the Go programming language. When developing fast software, it is important to know all performance issues and improvements of the used platforms or programming languages. Small performance changes inside the core functions of a programming language can lead to massive performance changes in the developed software. [2] Finding performance regressions in late phases of development is often more expensive than in early stages. Fixing software problems after the release can be up to 100 times more expensive than finding them in the analysis and design stage (the relative cost to fix an error is displayed below in Figure 2.1) [3]. This is why long-term information about performance issues is important. At the beginning of the development of this thesis in November 2014, Ruby still did not have a long-term benchmark. At that time there was just one Ruby benchmarking suite used widely by the community. The Ruby Benchmark Suite4 was developed between 2008 and 2013 by Antonio Cangiano. His solution consisted of using a host with an already installed Ruby to perform the tests. At that time there were no lightweight virtualization solutions (virtual machines consisted of an entire guest operating system, bringing

1 http://samsaffron.com/archive/2013/12/11/call-to-action-long-running-ruby-benchmark
2 http://speed.pypy.org/
3 http://goperfd.appspot.com/perf
4 https://github.com/acangiano/ruby-benchmark-suite


unwanted overhead during benchmarking) that would allow running benchmarks in isolated environments. During the development of the benchmarking suite described below, Guo Xiang Tan5 presented his own benchmark suite, created in cooperation with Sam Saffron, which became the official long-term running Ruby benchmark6. We discuss the basics of his solution in Chapter 4.

Figure 2.1: Relative Cost to Fix Software Errors per Life Cycle phase. Cited from [4].

5 https://github.com/tgxworld
6 http://rubybench.org/

Chapter 3 Ruby performance

In this chapter we will introduce the Ruby programming language and its most used implementations. The differences in performance of various Ruby implementations will be discussed, followed by an introduction to benchmarking.

3.1 Introduction to Ruby

Ruby is still a very young language. It is an interpreted, object-oriented programming language which was designed and released by Yukihiro Matsumoto, known as Matz, in 1995. It was designed with Perl and Python capabilities in mind. He described some of his early ideas about the language: “I was talking with my colleague about the possibility of an object-oriented scripting language. I knew Perl (Perl4, not Perl5), but I didn’t like it really, because it had the smell of a toy language (it still has). The object-oriented language seemed very promising. I knew Python then. But I didn’t like it, because I didn’t think it was a true object-oriented language — OO features appeared to be add-on to the language. As a language maniac and OO fan for 15 years, I really wanted a genuine object-oriented, easy-to-use scripting language. I looked for but couldn’t find one. So I decided to make it.“ [5] Ruby was designed as an absolutely pure object-oriented scripting language, where everything is interpreted as an object, even primitive types and the values true, false and nil (nil indicates the absence of a value; it is Ruby‘s version of null). Ruby is also suitable for procedural and functional programming styles and it includes powerful metaprogramming capabilities. It is focused on simplicity. Simplicity and the pure object-oriented approach make it an easy-to-use scripting language. Matz’s guiding philosophy for the design of Ruby is summarized in an oft-quoted remark of his: “Ruby is designed to make programmers happy.“ [6]

3.2 Ruby implementations

Ruby has many different implementations. In this section we will talk about some of the most used ones. We will shortly discuss their advantages, disadvantages and main characteristics.

3.2.1 MRI or CRuby
The reference implementation, discussed above, is known as “Matz’s Ruby Interpreter” (MRI) or CRuby (since it is written in C). CRuby does support native threads, which in theory means that we can use threads like Java developers do. The problem is that CRuby uses a Global Interpreter Lock (known as GIL), which is meant to protect data integrity by allowing data to be modified by only one thread at a time. The GIL allows creating multiple OS-level threads, however it does not allow the system to schedule them simultaneously on multiple processors. This is why we cannot achieve (true) concurrency. It is important to mention that the GIL makes single-threaded programs faster, but there are people who still prefer taking data integrity into their own hands. This is where, for example, JRuby and Rubinius come into play for people who need Ruby implementations without a GIL.
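The effect of the GIL can be illustrated with a small sketch (the method name and workload size here are invented for illustration): CPU-bound work split across two threads produces correct results on any implementation, but on MRI the threads cannot execute Ruby code simultaneously, so no speedup over a sequential run is gained.

```ruby
# Illustrative sketch: CPU-bound work in two threads.
# On MRI the GIL serializes Ruby code execution, so the threaded
# version takes about as long as running both computations in sequence;
# JRuby and Rubinius can schedule the two threads on two cores.

def cpu_work(n)
  (1..n).reduce(0) { |sum, i| sum + i * i } # pure Ruby, CPU-bound
end

threads = 2.times.map { Thread.new { cpu_work(2_000_000) } }
results = threads.map(&:value) # Thread#value joins and returns the result

puts results.inspect
```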

3.2.2 JRuby
JRuby, as the title suggests, is a 100% Java implementation of Ruby. This allows developers to run Ruby applications using the Java Virtual Machine (JVM), utilizing the JVM’s optimizing just-in-time (JIT) compilers, garbage collectors and concurrent threads, which often means that your Ruby code runs faster and more reliably [7]. It also allows code to interoperate with any other library that is compatible with the JVM. JRuby is open source software, developed primarily at Red Hat.

7 3. RUBYPERFORMANCE

It is also important to mention the disadvantages of this implementation. The JIT compiler adds great power and speed to JRuby, however at the cost of loading, verifying and linking the resulting JVM bytecode. This rapidly slows down the execution of really short runs [8].

3.2.3 Rubinius
Rubinius is known as “Ruby in Ruby”. The majority (around 60%) of the Rubinius source is written in Ruby, which makes it easy for programmers to understand how internal methods work without knowledge of other languages like C or Java. Rubinius also includes a sophisticated virtual machine written in C++. This machine executes your Ruby program and, like JRuby, supports JIT, true concurrency, and uses a sophisticated garbage collection algorithm.

3.3 Introduction to benchmarking

As more Ruby versions and various implementations appear, it becomes difficult to compare their performance. This is when benchmarking comes in handy. "We define benchmarking as the act of measuring and evaluating computational performance, networking protocols, devices and networks, under reference conditions, relative to a reference evaluation. The goal of this benchmarking process is to enable fair comparison between different solutions, or between subsequent developments of a System Under Test (SUT)." [9] Various Ruby implementations and versions are understood as different solutions in this thesis. There are two important aspects that should be considered when defining or developing benchmarks:

∙ Comparability is fundamental for any benchmark. It means that two independently executed benchmarks can be meaningfully compared to each other.

∙ Repeatability is as important as comparability. Running the same benchmark on the same solution, for example using the


identical machine, environment and Ruby version, should result in a (close to) identical result.

It is possible to benchmark various language capabilities, like CPU performance, parallelism and the ability for concurrent CPU consumption, memory management and heap usage, etc. Benchmarks can be divided into the following categories:

∙ Real programs - These benchmarks contain standard operations like input, output and user preferences. They are often text processing programs.

∙ Kernel benchmarks - These benchmarks test the performance of kernel functions. They are abstracted from an actual program and the results are represented using MFLOPS.

∙ Toy benchmarks - Small benchmarks, typically between 10 and 100 lines of code. They produce output that the user already knows, for example the Sieve of Eratosthenes or the Quicksort algorithm.

∙ Synthetic benchmarks - Using the statistics of all types of operations from many application programs, the proportion of each operation is measured. This proportion is then used to create synthetic benchmarks.
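A toy benchmark of the kind described above can be sketched in a few lines of Ruby. The Sieve of Eratosthenes produces output that is known in advance, so only the run time is of interest; this sketch is an illustration, not one of the suite's benchmarks.

```ruby
require 'benchmark'

# Sieve of Eratosthenes: returns all primes up to limit.
def sieve(limit)
  is_prime = Array.new(limit + 1, true)
  is_prime[0] = is_prime[1] = false
  (2..Math.sqrt(limit)).each do |i|
    next unless is_prime[i]
    (i * i).step(limit, i) { |j| is_prime[j] = false } # mark multiples
  end
  (2..limit).select { |n| is_prime[n] }
end

# The expected output is known in advance; only the elapsed time matters.
elapsed = Benchmark.realtime { sieve(100_000) }
puts "sieve(100_000) took #{elapsed.round(4)} s"
```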

3.3.1 Benchmarking suite and framework
Successfully defining benchmarks, in our case short programs testing basic Ruby functions, is only half of the work required. There is still the need for experiment-friendly execution, easy data manipulation, and sharing with the research community. This is why benchmarking suites and frameworks are developed. They are responsible for easy benchmark execution on different solutions as well as for data processing. [9]

Chapter 4 Requirements and analysis

In this chapter we will look at various requirements for developing a benchmarking suite; later, available software that is useful when developing a Ruby benchmarking suite will be discussed and other solutions will be presented. When developing a benchmarking suite that should be usable on multiple versions or implementations, Ruby environment managers can be very useful. This chapter describes the two most used environment managers and a new Linux-based virtualization tool.

4.1 Requirements

The development of this benchmarking suite was driven by multiple parties, including the community and Red Hat. The author’s personal preferences were taken into account as well. The most important requirements that shaped the project during the development stage were:

∙ Long-term running benchmark - As we already mentioned, Ruby did not have any long-term benchmark. This community requirement was the initial step which started this project.

∙ Memory benchmarking - After the first results focusing mainly on time performance measurement were published, Sam Saffron (the author of Call to Action: Long running Ruby benchmark [2]) asked for memory benchmarks as well.

∙ JRuby parallelism performance - Red Hat, as a main contributor to JRuby, is deeply concerned about the parallel performance of JRuby and other Ruby implementations.


∙ Datacenter usage - Ruby is often used on virtual servers or in data centers. As these environments often provide a greater number of virtual CPUs, it is important to monitor the capabilities of Ruby in this domain as well. This requirement was specified by Red Hat.

∙ Isolation - To avoid the problem of results being affected by a shared library or resource, the isolation of various Ruby implementations and versions is very important. The author of this thesis took care to ensure the isolation of different Ruby implementations and versions.

∙ Extensibility - To make the benchmarking suite widely usable by the community, extensibility is an important property. The suite must be able to handle custom defined benchmarks and test user-provided Ruby versions.

4.2 RVM

Ruby enVironment (Version) Manager (RVM) is a command line tool which allows you to easily install, manage and work with multiple Ruby environments. It is a community supported and maintained project which was originally started in October 2007 by Wayne E. Seguin. [10]

4.3 Rbenv

Rbenv is another Ruby environment manager, but more lightweight than RVM. Rbenv intercepts Ruby commands using executables injected into the user’s PATH. Using environment variables, Rbenv determines which Ruby version has been specified by an application, and then passes commands along to the correct Ruby installation. [11]

4.4 RVM vs Rbenv

There are some small differences between RVM and Rbenv, which often separate users into two groups. Some of the main differences are described in Table 4.1.

Usage - Rbenv needs to be loaded into your PATH variable; RVM needs to be loaded as a script into the user shell.

Ruby installation - With Rbenv, Ruby is installed manually or using the sister gem ruby-build; RVM includes Ruby installation and a list of available Rubies.

Gemsets - Rbenv has no native support (bundler is preferred, but an installation is available using the extension rbenv-gemset); RVM has native support for named gemsets and bundler.

Other - RVM overrides commands like cd and gem.

Table 4.1: Differences between RVM and Rbenv.

4.5 Docker

Docker is an open source LinuX Container (LXC) based tool with a high level API providing a lightweight virtualization solution that runs any Unix application (process) in isolation. [12] "Docker is a tool that can package an application and its dependencies in a virtual container that can run on any Linux server. This helps enable flexibility and portability on where the application can run, whether on premise, public cloud, private cloud, bare metal, etc." [13]


4.5.1 Container vs. Virtual Machine

When deploying an application using virtual machines, not only the sources of the application are included, but also an entire guest operating system, which may consume tens of GB. With Docker containers, only the application and its dependencies are packaged. Such a container runs in userspace on the host operating system, providing the benefits of a virtual machine while being much more efficient. [14]

Figure 4.1: Docker containers compared to virtual machines.

4.5.2 Container vs. Process

A running process usually requires multiple files from the environment (the host operating system, typically libraries in /etc or /var/lib). This makes the process dependent on the filesystem environment. Containers encapsulate the traditional process together with its dependencies and the filesystem environment. This allows the encapsulated process to carry its own libraries as well as use its own configuration, for example when accessing root CA certs on an SSL connection. [15]


Figure 4.2: Docker containers compared to other processes.

4.6 Existing benchmark suites

Other solutions which are available in this problem domain will be presented in this section. The first two solutions were available before the work on this thesis started, while the third solution was released simultaneously.

4.6.1 Ruby benchmark suite
The Ruby benchmark suite1, a solution provided by Antonio Cangiano, was developed between 2008 and 2013. It consists of both micro and macro benchmarks to represent a variety of real common workloads. Users are able to benchmark various Ruby versions and implementations by installing them on their OS and then passing their paths to the benchmarking tool. This solution provides support for warm up, although it does not use any OS or container based virtualization. [16]

1 https://github.com/acangiano/ruby-benchmark-suite

4.6.2 Bench9000
Bench90002 is a benchmarking tool developed by the JRuby team. It is deeply focused on JRuby warm up benchmarking, however it provides benchmarks for other implementations too. This benchmark suite uses Rbenv to handle multiple Ruby implementations and versions. Functionality for graphing results is included as well. [17]

4.6.3 RubyBench
RubyBench3 is the official result of Sam Saffron’s article published online4. Guo Xiang Tan, in cooperation with Sam Saffron, officially published this benchmark during the development of this thesis. This benchmark is focused on testing each commit in the official Ruby repository, acting more like a continuous integration (CI) solution. It also provides support for older versions of MRI Ruby, which are defined inside a single Docker container.

2 https://github.com/jruby/bench9000
3 https://github.com/ruby-bench/ruby-bench
4 http://rubybench.org/

Chapter 5 Solution and implementation

In this chapter, the benchmarking suite developed by the author of this thesis is described. As the official long-term running benchmark for Ruby was published during development, the main priority of this suite was to cover the most used implementations. This suite was designed for general use, so extensibility is also important. In Figure 5.1 we illustrate the basic idea. The benchmarking suite accepts Ruby versions, benchmarks and user configuration as an input. Raw data are produced and stored on the filesystem after the benchmarking process. Using the presentation service, which processes the produced raw data, a user friendly interactive interface is provided to display the results and generate graphs. This service was part of the developed software and is described later in Section 5.4. To allow manipulation with different Ruby versions and implementations, Docker images were the best solution, as they provide the benefits of a virtual machine (like library and binary isolation) while being much more efficient. The suite covers three major Ruby implementations and their recent versions, as described in Table 5.1. Because the original Ruby (MRI) is written in C, the choice of compiler can have a positive or negative effect on its performance. This is why support for the GCC and Clang compilers is present. It is important to mention that this benchmarking suite was developed using the Ruby programming language. This is why the minimal Ruby version required to run this suite is 2.0.0. The benchmarking suite handles three main responsibilities: Docker integration, running benchmarks, and storing and processing data. The development was then split into stages based on these responsibilities.


Figure 5.1: A prototype of the developed software.

5.1 Docker integration

Due to the use of Docker, we need a part of the system that will automatically manipulate all included Docker images and created containers. First the suite downloads all provided Docker images from Docker Hub, as described in config.rb located inside the config directory. This config determines which Docker images will be downloaded. Also, when running benchmarks, only the listed Ruby versions will be tested (commenting out lines with an exact version during the test phase can be useful when the user does not need to test that version). After a successful download, all Docker images are validated. This is why the correct configuration syntax (as seen in Figure 5.2) is important. During the validation phase, the benchmarking suite runs each provided image and validates the presence of the correct Ruby version. The compiler, its version and compilation flags are also validated for


Implementation / Versions / Details:
JRuby: 9.0.0.0.pre1 (OpenJDK 64-Bit Server VM, version 1.7.0_75-b13)
JRuby: 1.7.12, 1.6.8 (OpenJDK 64-Bit Server VM, version 1.7.0_65)
Rubinius: 2.2.10, 2.3.0, 2.4.0, 2.4.1
MRI Ruby: 1.6.8 - 2.2.0 (13 versions), all compiled using GCC 4.8 -O3
MRI Ruby: 2.0.0, 2.1.0, 2.1.5, 2.2.0, also compiled with different compilers: GCC 4.8, GCC 4.9, Clang 3.3, Clang 3.4, Clang 3.5, all with both -O2 and -O3 flags

Table 5.1: Available Ruby versions and implementations.

MRI Ruby. All Docker images were first created and built manually to ensure that the build finishes correctly. Then the Docker Hub Automated build was used to automatically build all images from the project’s GitHub repository. The hierarchy of all created images is shown in Figure 5.3. All images start from Ubuntu 14.04. JRuby and Rubinius are extended from ryccoo/rvm, but only JRuby was installed using RVM, since Rubinius had some build problems and needed to be built manually. MRI implementations (in red ellipses) inherit from ryccoo/clang or ryccoo/gcc depending on the compiler used. Tags in the leaves specify the available versions for the selected implementation.
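The validation step can be sketched roughly as follows. This is a hedged illustration, assuming a helper that shells out to docker run and parses the output of ruby -v; the method names are invented here, while the ryccoo/* image naming follows the thesis.

```ruby
# Extract "2.2.0" from output such as
# "ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-linux]".
def parse_ruby_version(ruby_v_output)
  ruby_v_output[/\Aruby (\d+\.\d+\.\d+)/, 1]
end

# Run `ruby -v` inside the image and compare with the expected version.
# Requires a working Docker installation on the host.
def image_valid?(image, expected_version)
  output = `docker run --rm #{image} ruby -v`
  parse_ruby_version(output) == expected_version
end

# e.g. image_valid?('ryccoo/mri-gcc-4.8-o3:2.2.0', '2.2.0')
```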

5.2 Running benchmarks

After all images are downloaded and validated, the suite is able to proceed to the next step and start executing benchmarks. All benchmarks are located inside subfolders within the benchmarks directory. This hierarchy allows us to track the origin of a benchmark or to divide benchmarks into groups. The hierarchy is strictly prescribed because the suite behavior depends on it. Using this hierarchy separates benchmarks into two categories:

∙ Basic benchmarks - All benchmarks placed inside any subfolder except the custom subfolder. The code of these benchmarks is modified and injected before each run, allowing the benchmarking suite to gather the required data, as described below. By default, the official Ruby benchmarks1 are present in the ruby-official directory inside the benchmarks folder.

∙ Custom benchmarks - For custom purposes, the custom subfolder was created to hold benchmarks that are not supposed to undergo the code injection described below. These benchmarks are executed without any modification and only standard and error output is captured, allowing the benchmarking suite to store any kind of information.

    class BaseConfig
      AVAILABLE_DOCKER_IMAGES = {
        'ruby-2.2.0' => {
          'GCC 4.8 -O2' => 'ryccoo/mri-gcc-4.8-o2:2.2.0',
          'GCC 4.8 -O3' => 'ryccoo/mri-gcc-4.8-o3:2.2.0',
          ...
        },
        'ruby-2.1.5' => {
          'GCC 4.8 -O3' => 'ryccoo/mri-gcc-4.8-o3:2.1.5',
        },
        ...

Figure 5.2: Example of the config file containing information about the used Ruby versions.

5.2.1 Benchmark code injection
The Ruby benchmark module2 is used for time measurements. Because some benchmarks use the library exit() function, which would terminate the whole program before storing the results, a custom exception was made to replace these occurrences.

1 https://github.com/ruby/ruby/tree/trunk/benchmark
2 http://ruby-doc.org/stdlib-2.0/libdoc/benchmark/rdoc/Benchmark.html


Figure 5.3: All docker images and their hierarchy.

Figure 5.4 illustrates the process during the benchmark execution. Before the execution of the actual benchmark code, the injection service stores the current memory consumption per process using the Unix ps command. The code is then wrapped inside a Benchmark measure block (providing time performance information) and a Ruby block catching all thrown BenchExitException exceptions. This allows us to jump out of the benchmark code at the points where the original code contained an exit() function, in order to store the results. At the end, garbage collection is triggered manually and the memory consumption is stored again. After all data are gathered, at the end of each run, the suite prints them to the error output for further processing. This data consists of the memory and time needed to execute the provided code as well as the total executable memory. Because Docker isolates containers from the host operating system, we would not be able to reach the benchmarks inside a container. This is why we need to share some resources when starting a container. Only the benchmarks and results folders are exported from the host. The full file structure is described in Figure 5.5 at the end of this chapter.
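The wrapping just described can be sketched roughly as follows, assuming a BenchExitException class substituted for exit() calls; the helper names are illustrative, not the suite's actual code.

```ruby
require 'benchmark'

# Exception injected in place of exit() so results are not lost.
class BenchExitException < StandardError; end

# Resident set size of the current process in KB, via the Unix ps command.
def rss_kb
  `ps -o rss= -p #{Process.pid}`.to_i
end

def run_injected(&benchmark_code)
  memory_before = rss_kb
  time = Benchmark.measure do
    begin
      benchmark_code.call
    rescue BenchExitException
      # the original code called exit(); jump out but keep the results
    end
  end
  GC.start # trigger garbage collection manually before the final reading
  memory_after = rss_kb
  # the real suite prints these to the error output for later processing
  { real: time.real, memory_before: memory_before, memory_after: memory_after }
end
```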


Figure 5.4: Actions triggered inside injected benchmark.

The benchmark is injected inside the container. After a successful run, the standard and error output are stored inside the results folder (in files stdout and stderr) for further processing, which is done on the host OS using its installed Ruby. At this moment the benchmarking suite does not allow users to change command line flags when starting a benchmark inside the container without modifying the source code. It means that all benchmarks are executed using the selected Ruby implementation and version with its default configuration.

5.2.2 Warm up We can separate provided Ruby implementations into two groups:

∙ Interpreters - Implementations consisting only of an interpreter, in our case MRI Ruby (as it stands for Matz’s Ruby Interpreter).

∙ JITs - Implementations also containing a just-in-time compiler. This group consists of JRuby and Rubinius.

A just-in-time compiler results in a slower execution at program startup. Also, for example, JRuby does not compile Ruby code into

bytecode until it has been executed a few times [8]. For JITs it is important to warm up the virtual machine by running the code multiple times or for a short period of time before the actual benchmarking. So far, there is no support for this feature. All benchmarks are run without warm up, resulting in slower times for JITs. To enable this feature, benchmarks need to be adapted to run in loops. The author is planning to add support for warm up as soon as enough suitable benchmarks are ensured.
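A minimal sketch of such a warm-up scheme, an assumption about how the planned feature could look rather than existing suite code, would discard a few initial runs before measuring:

```ruby
require 'benchmark'

# Run the body several times so a JIT can compile the hot code,
# then measure a single timed run.
def benchmark_with_warmup(warmup_iterations = 5, &body)
  warmup_iterations.times { body.call } # warm-up runs, results discarded
  Benchmark.realtime { body.call }      # the measured run
end
```

On MRI the warm-up runs simply cost extra time, while on JRuby or Rubinius they give the JIT a chance to compile the benchmark body before the measurement starts.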

5.2.3 Timeout

Because this suite was designed to run any user benchmarks, we need to provide a solution for situations when a benchmark is stuck in a loop or takes too long to execute. This is why the author of this thesis developed a module that watches the benchmark execution and terminates it after a defined amount of time has passed. This time is set to 5 minutes by default, however the user can override this setting using a provided environment variable.
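Such a watchdog can be sketched with Ruby's standard Timeout module; the environment variable name BENCHMARK_TIMEOUT below is an invented example, not necessarily the one the suite uses.

```ruby
require 'timeout'

# Default limit of 5 minutes, overridable via an environment variable.
TIMEOUT_SECONDS = (ENV['BENCHMARK_TIMEOUT'] || 300).to_i

# Runs the given block, terminating it when the limit passes.
def run_with_timeout(seconds = TIMEOUT_SECONDS)
  Timeout.timeout(seconds) { yield }
  :finished
rescue Timeout::Error
  :timed_out # recorded as an unexpected exit in the results
end
```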

5.3 Storing and publishing results

Each run stores a record (one line) inside a CSV file named after the Ruby version used for the respective run. Each record contains information about the benchmark executable, Ruby version, compiler and current time. For successful benchmarks we store the data from code injection, or the full standard and error output for custom benchmarks. If the benchmark exits on an exception or a timeout, information about the unexpected exit is stored with the record, leaving the other data blank.
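The record format can be sketched as follows; the field order, file naming and helper name are illustrative assumptions based on the description above, not the suite's exact schema.

```ruby
require 'csv'

# Append one result record to a CSV file named after the Ruby version.
def store_result(ruby_version, benchmark:, compiler:, data: nil, error: nil)
  CSV.open("results-#{ruby_version}.csv", 'a') do |csv|
    csv << [benchmark, ruby_version, compiler, Time.now.to_s,
            data,  # injected measurements, or raw output for custom runs
            error] # e.g. "timeout" or "exception"; blank on success
  end
end
```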

5.4 Presentation tool

To present the results online, the author decided to build an online service that processes the collected data into graphs. Only the used technologies are described in this section, not the development process, since this thesis is aimed at the development of the benchmarking suite.


Figure 5.5: The project file structure with commentary.

5.4.1 Ruby on Rails
The Ruby on Rails [18] framework has been chosen because it brings great power to developers, making web development easier and faster. Rails is an open source web application model-view-controller (MVC) framework written in Ruby. It provides default structures for databases, web services and pages. Rails uses well-known software engineering patterns. It assumes there is a “best“ common way for web development and encourages programmers to follow it. [19]

5.4.2 Highcharts
To generate dynamic, user friendly graphs, the Highcharts [20] library has been used, which is available free of charge for non-commercial products. Highcharts is a pure JavaScript library which offers an easy way to add charts to web applications and provides numerous chart types. It is compatible with all modern browsers and devices. [20]

5.4.3 Rubyfy.ME
The developed web application was named Rubyfy.ME and made available online3. It uses a RESTful application programming interface (API) to both store and share the stored results. The developed benchmarking suite provides functionality to automatically push results to the Rubyfy.ME web application at an address specified by the user. It evaluates all stored benchmarks and results, and automatically populates comparisons in the following three categories: MRI Compilers, MRI Versions overview, and Ruby implementations comparison.

3 http://rubyfy.me

Chapter 6 Results

Using benchmarks from the official Ruby repository enables us to present results in the following categories:

∙ MRI Ruby Compilers - a comparison of Ruby compilers (Clang, GCC) tested on MRI Ruby version 2.2.0.

∙ Ruby Implementations - the difference between various implementations and between handling single-thread vs. multi-thread tasks.

∙ Ruby 2.2.0 Garbage collection - the progress of the new incremental garbage collection announced in Ruby 2.2.0.

Each benchmark was successfully run ten times to provide stable and usable results from both environments. The most recent versions of the selected implementations used during the development process are described below in Table 6.1.

Implementation: The latest version
MRI Ruby: 2.2.0
Rubinius: 2.4.1
JRuby: 9.0.0.0.pre1

Table 6.1: The latest versions of used Ruby implementations.

6.1 Environment

We performed benchmarking in two independent environments to provide results for different usages (a baremetal machine and a virtual server).


∙ Baremetal Ubuntu machine

Category          Specification
Operating system  Ubuntu 14.04, kernel 3.13.0-36-generic x86_64 GNU/Linux
CPU               Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
RAM               2x 4GB Samsung SODIMM DDR3 Synchronous 1333 MHz
Motherboard       Intel Emerald Lake

Table 6.2: Specifications of used baremetal machine.

∙ Virtual private server provided by CERIT-SC

Category      Specification
Virtual OS    Debian 3.16.7-ckt4-3 bpo70+1
Hardware CPU  2x Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (12 cores)
Virtual CPU   8 virtual CPUs
Hardware RAM  96 GiB DDR3 1333 MHz
Virtual RAM   16 GiB

Table 6.3: Specifications of used CERIT-SC virtual machine.

Center CERIT-SC (CERIT Scientific Cloud) offers computing resources and also participates in research and development activities. Each Ruby implementation was run with its default settings. We did not use any runtime command flags, as there is no support for this feature yet. The only exception was made during the parallelism test, when JRuby and Rubinius were run with their JIT compilers disabled.

6.2 MRI Ruby Compilers

As MRI Ruby (also called CRuby) is written in C, the choice of a C compiler and its compilation flags can affect the performance of the Ruby interpreter.

In December 2014, Peter Wilmott published an article [21] about MRI Ruby compilers. All of his tests were run on an AWS m3.medium EC2 instance with Ruby version 2.1. Using the benchmarking suite developed by Antonio Cangiano, his results showed a large performance increase from GCC 4.8 to GCC 4.9, and he also pointed out that optimization level 2 worked better than level 3.

Running all of the official Ruby benchmarks on Ruby version 2.2.0, on both the baremetal machine and the virtual server provided by the CERIT-SC group, gave us results different from those obtained by Wilmott. Both sets (baremetal and CERIT-SC) led to the same conclusion (Figure 6.1 and Figure 6.2): GCC 4.8 at optimization level 3 (the default shipped with Ubuntu 14.04) is still ahead by a small amount (1-2% faster than GCC 4.9 -O3). The results also showed that level 3 optimization is currently faster for Ruby 2.2.0 (an almost 4% speed increase from GCC 4.8 -O2 to GCC 4.8 -O3).
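Selecting the compiler and optimization level happens at build time, when MRI is configured and compiled from source. The following is a sketch under assumed paths and versions (the prefix directory and the `gcc-4.8` executable name are illustrative):

```shell
# Build MRI from source with an explicitly chosen compiler and
# optimization level, as done when comparing GCC and Clang builds.
CC=gcc-4.8 CFLAGS="-O3" ./configure --prefix="$HOME/.rubies/ruby-2.2.0-gcc48-O3"
make
make install
```

Repeating this per compiler/flag combination yields the set of interpreters that the benchmarks are then run against.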

Figure 6.1: Overall results for MRI compilers run on a baremetal Ubuntu machine.


Figure 6.2: Overall results for MRI compilers run on the CERIT-SC virtual server.

6.3 Ruby Implementations

As was mentioned before, MRI Ruby belongs to the interpreters group while JRuby and Rubinius belong to the JITs group. MRI Ruby also uses a GIL, which often makes single-threaded tasks run faster but does not permit Ruby threads to execute in parallel. It is also important to note that the benchmarking suite does not yet support warm-up for the JIT-based implementations. This is why MRI Ruby is significantly better in the overall results: the official Ruby benchmarks used for each category are simple single-threaded tests, often too short for the JIT compiler on JRuby and Rubinius to kick in, which makes their results even worse, as shown in Figure 6.3 and Figure 6.4.
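The GIL effect can be demonstrated with a minimal sketch: CPU-bound work split across threads gains nothing under MRI, while implementations without a GIL can use multiple cores (the helper names below are illustrative, not part of the suite):

```ruby
require "benchmark"

# CPU-bound work. Under MRI's GIL only one thread executes Ruby code
# at a time, so the threaded version takes roughly as long as the
# sequential one; JRuby and Rubinius can run the threads on separate cores.
def busy(n)
  s = 0
  n.times { |i| s += i }
  s
end

N = 100_000
sequential = Benchmark.realtime { 4.times { busy(N) } }
threaded   = Benchmark.realtime do
  Array.new(4) { Thread.new { busy(N) } }.each(&:join)
end
puts format("sequential: %.4fs, 4 threads: %.4fs", sequential, threaded)
```

On MRI the two timings are typically close; a speed-up in the threaded version only appears on implementations without a GIL.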


Figure 6.3: Overall time performance on different Ruby implementations tested on a baremetal Ubuntu machine.

Figure 6.4: Overall time performance on different Ruby implementations tested on a CERIT-SC virtual server.


We also used the parallelism benchmark provided in the Rubinius repository1 to determine the differences in handling parallel tasks. For this benchmark we manually edited the suite to pass command-line arguments disabling the JIT compilers for JRuby and Rubinius (-X-C for JRuby and -Xint for Rubinius), because this benchmark is built to run multiple times and report the best results gathered, which would otherwise allow the JIT compilers to activate and radically reduce the execution times. In Figure 6.5 we can see the true power of JRuby parallelism as well as its progression.

Figure 6.5: Rubinius parallelism benchmark - running 4 parallel threads.

The benchmark first computes the amount of work needed to keep a thread busy for two seconds, then runs the same amount of work on each of four threads. Because virtualization (the use of virtual CPUs) makes it harder to compute and calibrate the needed amount of work, we present only results from the baremetal machine.

1 https://github.com/rubinius/rubinius-benchmark/blob/master/parallelism.rb
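The calibrate-then-parallelize approach described above can be sketched as follows (a simplified sketch, not the Rubinius benchmark itself; all names are illustrative):

```ruby
require "benchmark"

# A unit of CPU-bound work.
def work(n)
  s = 0
  n.times { |i| s += i * i }
  s
end

# Double the amount of work until a single run takes at least
# `target` seconds -- the calibration step of the benchmark.
def calibrate(target = 2.0, start = 1_000)
  n = start
  n *= 2 while Benchmark.realtime { work(n) } < target
  n
end

# The real benchmark would then run the calibrated amount on each of
# four threads:
#   n = calibrate
#   Array.new(4) { Thread.new { work(n) } }.each(&:join)
```

The wall-clock time of the four-thread phase then shows how well the implementation exploits the available cores.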

6.4 MRI 2.2.0 Incremental garbage collection

MRI Ruby version 2.2.0 introduced new incremental garbage collection. From this version on, symbols are also garbage collectible. Symbols are now divided into two categories: mortal and immortal. Immortal symbols are defined in the source code, while mortal symbols are created dynamically during execution. MRI Ruby 2.2.0 now collects mortal symbols, allowing more memory to be freed. We were able to observe the decrease in memory usage compared to the previous version, as shown in Figure 6.6.
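The distinction can be illustrated in a few lines: symbols written literally in source are immortal, while symbols produced at run time (e.g. via `String#to_sym`) are mortal and, since MRI 2.2.0, collectible once unreferenced:

```ruby
# :immortal_example below is an immortal symbol (it appears in source);
# the symbols built from strings at run time are mortal.
_literal = :immortal_example

prefix  = "mortal_demo_#{Process.pid}"
before  = Symbol.all_symbols.size
mortals = Array.new(100) { |i| "#{prefix}_#{i}".to_sym }  # dynamic symbols
after   = Symbol.all_symbols.size
puts "symbol table grew by #{after - before}"
```

On MRI before 2.2.0 the dynamically created symbols would stay in the symbol table forever; from 2.2.0 they can be reclaimed by the GC once `mortals` goes out of scope.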

Figure 6.6: Memory usage difference from the average.

These tests were also run separately, aimed at the most recent Ruby versions, which confirmed the gap between Ruby 2.1.x and 2.2.0 (shown in Figure 6.7).


Figure 6.7: Difference from average memory usage on recent Ruby versions.

Chapter 7 Roadmap

This chapter summarizes various important functional requirements that were received during the development phase, together with the author’s ideas for new features. The roadmap also includes all missing features that were mentioned before. To describe the plans and to serve as a lead for future contributors, the following issues are presented:

∙ Warm-up support - Warm-up is crucial for JRuby and Rubinius, as we mentioned before. This most urgent issue consists of a few changes inside the core of the benchmarking suite as well as special benchmarks. To ensure that benchmarks can be used with warm-up, we must provide code that is repeatable within one execution (for example, it is impossible to read the full standard input twice in one program execution).

∙ Web error section - This feature concerns the web presentation service. Its goal is to provide answers to questions like “Why can’t I see the results of this Ruby version?” or “What error occurred during the execution of this code?”.

∙ User flags for Ruby commands - Sometimes it is important to change the behavior of the Ruby executable by passing command-line arguments (as we did in the parallelism benchmark). We would like to give developers the ability to modify the start-up options for every Ruby implementation or version.

∙ Live code testing - This is the hardest and most challenging issue presented. The idea is to create a separate web section that would allow users to benchmark their own code online. The solution would require a standalone server connected to the web presentation service. However, allowing users to run their own code brings a number of security risks, such as using the benchmarking server for DoS attacks.
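The warm-up requirement listed above can be sketched as a small measurement harness (all names are illustrative, not part of the suite): run the benchmark body a few times so a JIT can compile the hot paths, then measure only the subsequent iterations.

```ruby
require "benchmark"

# Run `warmup` discarded iterations, then measure `runs` iterations.
# The body must be repeatable within one process -- e.g. it cannot
# consume standard input.
def measure_with_warmup(warmup: 5, runs: 10, &body)
  warmup.times { body.call }                    # discarded warm-up runs
  Array.new(runs) { Benchmark.realtime(&body) } # measured runs
end

times = measure_with_warmup(warmup: 2, runs: 3) { 10_000.times { |i| i * i } }
puts "measured #{times.size} runs, best #{times.min.round(6)}s"
```

On MRI the warm-up iterations change little, but on JRuby and Rubinius they give the JIT compiler time to activate before measurement begins.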

The project is released under an open source license and all contributions are welcome.

Chapter 8 Conclusion

This thesis described the Ruby programming language and its three major implementations, as well as the main differences in their performance. We explained why benchmarking is important and what the role of a benchmarking suite is.

In Chapter 4 the requirements are described; afterwards, available software for managing Ruby environments and virtualization was discussed. We explained the advantages of using Docker compared to standard virtual machines or processes.

The development process of the new benchmarking suite was explained in Chapter 5. The process was split into three responsibilities. Firstly, the Docker integration was described. Secondly, the benchmarking process, including the methods of running benchmarks and collecting the required information, was discussed. The last responsibility described was data processing and storing the results.

Using the official Ruby benchmarks in the environments described in Section 6.1, we obtained the results shown in Chapter 6 and drew the following conclusions:

∙ When installing the newest version of MRI Ruby (version 2.2.0 during the benchmarking process) for standard uses, the choice of GCC 4.8 provides the best available performance. This version is still shipped as the default for a group of Linux distributions, allowing Ruby to be compiled without any changes.

∙ The new incremental garbage collection announced in MRI 2.2.0 has brought many changes as well. The ability to collect mortal symbols provides a solution to a frequently made programming mistake: converting user input (or other retrieved data) to symbols. These symbols were not garbage collectable before, which slowly drained all available system memory. The new GC introduced in MRI 2.2.0 collects these symbols, thus preventing the drain.

∙ We were able to observe the power of JRuby and Rubinius in parallel computations, where the GIL prohibits MRI from running its threads simultaneously. Figure 6.5 demonstrates the strength of JRuby in parallel computations. This comes in handy when developing applications for distributed solutions or environments offering multiple virtual CPUs, and this ability increases from version to version.

Taking into account the start-up delay and the time required to warm these implementations up, MRI is really useful when the program is required to run single-threaded or when the programmer is executing small snippets of code. This can be observed in Figures 6.3 and 6.4.

We described the roadmap with required features and open issues in Chapter 7. For example, the developed suite still does not support the functions required to provide more accurate data for implementations with JIT compilers. The author believes all official requirements for this thesis were fulfilled. The work on this thesis also resulted in a paper titled Ruby Benchmark Suite using Docker [1] and third place in the Winter of Code 2015 challenge.

Bibliography

[1] LUDVIGH, Richard, Tomáš REBOK, Václav TUNKA and Filip NGUYEN. Ruby Benchmark Suite using Docker. In peer review at FedCSIS. [cit. 2015-05-06].

[2] SAFFRON, Sam. 2013. Call to Action: Long running Ruby benchmark [online]. [cit. 2015-05-06]. Available online on URL: http://samsaffron.com/archive/2013/12/11/ call-to-action-long-running-ruby-benchmark.

[3] STECKLEIN, Jonette M, Jim DABNEY, Brandon DICK, Bill HASKINS, Randy LOVELL and Gregory MORONEY. 2004. Error Cost Escalation Through the Project Life Cycle [online]. [cit. 2015-04-24]. DOI: 20100036670. Available online on URL: http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20100036670.pdf.

[4] BOEHM, Barry W. 1981. Software engineering economics. Englewood Cliffs, N.J.: Prentice-Hall, xxvii, 767 p. ISBN 01-382-2122-7.

[5] MAEDA, Shugo and Dave THOMAS. 2002. The Ruby Language FAQ [online]. [cit. 2014-11-05]. Available online on URL: http://ruby-doc.org/docs/ruby-doc-bundle/FAQ/FAQ.html.

[6] FLANAGAN, David and Yukihiro MATSUMOTO. 2008. The Ruby programming language. 1st ed. Sebastopol, CA: O’Reilly, xi, 429 p. ISBN 05-965-1617-7.

[7] SHAUGHNESSY, Pat. 2013. Ruby under a microscope: an illustrated guide to Ruby internals. San Francisco: No Starch Press, xxii, 336 pages. ISBN 15-932-7527-7.


[8] Improving startup time [online]. 2014. [cit. 2015-11-05]. Available online on URL: https://github.com/jruby/jruby/wiki/Improving-startup-time.

[9] BOUCKAERT, Stefan, Jono Vanhie-Van GERWEN, Ingrid MOERMAN, Stephen C PHILLIPS, Jerker WILANDER, Shafqat Ur REHMAN, Walid DABBOUS and Thierry TURLETTI. 2010. Benchmarking computers and computer networks [online]. [cit. 2015-04-23]. Available online on URL: http://www.ict-fire.eu/uploads/media/Whitepaperonbenchmarking_V2.pdf.

[10] SEGUIN, Wayne E. and Michal PAPIS. 2014. RVM: [online]. [cit. 2015-05-06]. Available online on URL: https://rvm.io/.

[11] STEPHENSON, Sam. 2015. Groom your app’s Ruby environment with rbenv [online]. [cit. 2015-05-01]. Available online on URL: https://github.com/sstephenson/rbenv.

[12] AVRAM, Abel. 2013. Docker: Automated and Consistent Software Deployments [online]. [cit. 2015-05-01]. Available online on URL: http://www.infoq.com/news/2013/03/Docker.

[13] NOYES, Katherine. 2013. Docker: A ’Shipping Container’ for Linux Code [online]. [cit. 2015-05-01]. Available online on URL: http://www.linux.com/news/enterprise/cloud-computing/731454-docker-a-shipping-container-for-linux-code.

[14] What Is Docker? An open platform for distributed apps [online]. 2015. [cit. 2015-05-01]. Available online on URL: https://www.docker.com/whatisdocker.

[15] VON EICKEN, Thorsten. 2014. Docker vs. VMs? Combining Both for Cloud Portability Nirvana [online]. [cit. 2015-05-01]. Available online on URL: http://www.rightscale.com/blog/cloud-management-best-practices/docker-vs-vms-combining-both-cloud-portability-nirvana.


[16] CANGIANO, Antonio. 2013. Ruby benchmark suite [online]. [cit. 2015-05-06]. Available online on URL: https://github.com/acangiano/ruby-benchmark-suite.

[17] Bench9000 [online]. 2015. [cit. 2015-05-06]. Available online on URL: https://github.com/jruby/bench9000.

[18] Ruby on Rails [online]. 2015. [cit. 2015-05-02]. Available online on URL: http://rubyonrails.org.

[19] Getting Started with Rails [online]. 2015. [cit. 2015-05-02]. Available online on URL: http://guides.rubyonrails.org/getting_started.html#what-is-rails-questionmark.

[20] What is Highcharts? [online]. 2015. [cit. 2015-05-02]. Available online on URL: http://www.highcharts.com/products/highcharts.

[21] WILMOTT, Peter. 2014. Benchmarking Ruby with GCC (4.4, 4.7, 4.8, 4.9) and Clang (3.2, 3.3, 3.4, 3.5) [online]. [cit. 2015-05-03]. Available online on URL: https://www.p8952.info/ruby/2014/12/12/benchmarking-ruby-with-gcc-and-clang.html.

Glossary

API Application programming interface.

CI Continuous integration.

CPU Central processing unit.

DoS Denial of service.

GC Garbage collection.

GCC GNU Compiler Collection.

GIL Global Interpreter Lock.

JIT Just in time.

JVM Java Virtual Machine.

LXC LinuX Container.

MRI Matz’s Ruby Interpreter.

OS Operating system.

RVM Ruby enVironment (Version) Manager.
