Development of Benchmarking Suite for Various Ruby Implementations
MASARYKOVA UNIVERZITA
FAKULTA INFORMATIKY

Development of benchmarking suite for various Ruby implementations
BACHELOR THESIS
Richard Ludvigh
Brno, Spring 2015

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Richard Ludvigh

Advisor: RNDr. Adam Rambousek

Acknowledgement

I would like to thank my supervisor RNDr. Adam Rambousek, who has supported me throughout my thesis. I would also like to thank my advisor Ing. Václav Tunka for advice and consultation regarding the technical aspects of the work, as well as for guidance and feedback throughout my thesis.

Access to the CERIT-SC computing and storage facilities provided under the programme Center CERIT Scientific Cloud, part of the Operational Program Research and Development for Innovations, reg. no. CZ. 1.05/3.2.00/08.0144, is greatly appreciated.

Abstract

Ruby is a modern, dynamic, pure object-oriented programming language. It has multiple implementations which provide different performance. The aim of this bachelor thesis is to provide a container-based tool to benchmark and monitor those performance differences inside an isolated environment.

Keywords

Ruby, JRuby, Rubinius, MRI, Docker, benchmarking, performance

Contents

1 Introduction
2 State of the art
3 Ruby performance
  3.1 Introduction to Ruby
  3.2 Ruby implementations
    3.2.1 MRI or CRuby
    3.2.2 JRuby
    3.2.3 Rubinius
  3.3 Introduction to benchmarking
    3.3.1 Benchmarking suite and framework
4 Requirements and analysis
  4.1 Requirements
  4.2 RVM
  4.3 Rbenv
  4.4 RVM vs Rbenv
  4.5 Docker
    4.5.1 Container vs. Virtual Machine
    4.5.2 Container vs. Process
  4.6 Existing benchmark suites
    4.6.1 Ruby benchmark suite
    4.6.2 Bench9000
    4.6.3 RubyBench
5 Solution and implementation
  5.1 Docker integration
  5.2 Running benchmarks
    5.2.1 Benchmark code injection
    5.2.2 Warm up
    5.2.3 Timeout
  5.3 Storing and publishing results
  5.4 Presentation tool
    5.4.1 Ruby on Rails
    5.4.2 Highcharts
    5.4.3 Rubyfy.ME
6 Results
  6.1 Environment
  6.2 MRI Ruby Compilers
  6.3 Ruby Implementations
  6.4 MRI 2.2.0 Incremental garbage collection
7 Roadmap
8 Conclusion

Chapter 1 Introduction

Ruby is a modern, pure object-oriented programming language. It has multiple implementations, but this thesis concentrates on benchmarking the three major ones: MRI (also called CRuby), the original implementation written in C; JRuby, written in Java; and Rubinius, written in Ruby and C++.

These implementations are often compared based on their performance. The focus of this thesis was not to develop new benchmarks for the Ruby language, but to create a suitable and extensible benchmarking tool that provides comparable results for determining the differences between these implementations. This tool should be able to take any existing benchmark and run it across different Ruby implementations.

The developed benchmarking suite has already been described in the paper Ruby Benchmark Suite using Docker [1].
However, the problem area and the development process of the benchmarking suite are described in more detail in this thesis.

We discuss the state of the art in Chapter 2, followed by a description of the Ruby language and the differences between its three major implementations. To ensure complete isolation of all tested Ruby versions, we used Docker (described in Section 4.5) to bundle each configuration inside a Docker container.

Chapter 5 describes in detail the Docker integration and the hierarchy of the created Docker images, as well as the methods used to run benchmarks and collect data from them.

Chapter 2 State of the art

In December 2013 Sam Saffron published a call for an official long-running Ruby benchmark¹. At that time there were multiple long-term benchmarks for other languages, such as the PyPy speed center² for the PyPy Python implementation or the Go performance Dashboard³ for the Go programming language.

When developing fast software, it is important to know about the performance issues and improvements in the platforms and programming languages used. Small performance changes inside the core functions of a programming language can lead to massive performance changes in the developed software. [2]

Finding performance regressions in late phases of development is often more expensive than finding them in early stages. Fixing software problems after the release can be up to 100 times more expensive than finding them in the analysis and design stage (the relative cost to fix an error is displayed in Figure 2.1) [3]. This is why long-term information about performance issues is important.

When work on this thesis began in November 2014, Ruby still did not have a long-term benchmark. At that time there was just one Ruby benchmarking suite widely used by the community. Ruby Benchmark Suite⁴ was developed between 2008 and 2013 by Antonio Cangiano. His solution used the host operating system and an already installed Ruby to perform the tests.
At that time there were no lightweight virtualization solutions that would allow running benchmarks in isolated environments (virtual machines contained an entire guest operating system, which brought unwanted overhead during benchmarking).

During the development of the benchmarking suite described below, Guo Xiang Tan⁵, in cooperation with Sam Saffron, presented his own benchmark suite, which became the official long-running Ruby benchmark⁶. We discuss the basics of his solution in Chapter 4.

Figure 2.1: Relative Cost to Fix Software Errors per Life Cycle phase. Cited from [4].

¹ http://samsaffron.com/archive/2013/12/11/call-to-action-long-running-ruby-benchmark
² http://speed.pypy.org/
³ http://goperfd.appspot.com/perf
⁴ https://github.com/acangiano/ruby-benchmark-suite
⁵ https://github.com/tgxworld
⁶ http://rubybench.org/

Chapter 3 Ruby performance

In this chapter we introduce the Ruby programming language and its most widely used implementations. The performance differences between the various Ruby implementations are discussed, followed by an introduction to benchmarking.

3.1 Introduction to Ruby

Ruby is still a very young language. It is an interpreted, object-oriented programming language which was designed and released by Yukihiro Matsumoto, known as Matz, in 1995. It was designed with the capabilities of Perl and Python in mind. He described some of his early ideas about the language:

“I was talking with my colleague about the possibility of an object-oriented scripting language. I knew Perl (Perl4, not Perl5), but I didn’t like it really, because it had the smell of a toy language (it still has). The object-oriented language seemed very promising. I knew Python then. But I didn’t like it, because I didn’t think it was a true object-oriented language - OO features appeared to be add-on to the language. As a language maniac and OO fan for 15 years, I really wanted a genuine object-oriented, easy-to-use scripting language.
I looked for but couldn’t find one. So I decided to make it.” [5]

Ruby was designed as a pure object-oriented scripting language in which everything is an object, including values of primitive types and the values true, false and nil (nil indicates the absence of a value; it is Ruby’s version of null). Ruby is also suitable for procedural and functional programming styles, and it includes powerful metaprogramming capabilities.

It is focused on simplicity. Simplicity and the pure object-oriented approach make it an easy-to-use scripting language. Matz’s guiding philosophy for the design of Ruby is summarized in an oft-quoted remark of his: “Ruby is designed to make programmers happy.” [6]

3.2 Ruby implementations

Ruby has many different implementations. In this section we discuss some of the most widely used ones and briefly describe their advantages, disadvantages and main characteristics.

3.2.1 MRI or CRuby

The reference implementation is known as “Matz’s Ruby Interpreter” (MRI) or CRuby (since it is written in C).

CRuby supports native threads, which in theory means that we can use threads the way Java developers do. The problem is that CRuby uses a Global Interpreter Lock (GIL), which is meant to protect data integrity by allowing data to be modified by only one thread at a time. The GIL allows multiple OS-level threads to be created, but it does not allow the system to run them simultaneously on multiple processors. This is why true parallelism cannot be achieved. It is important to mention that the GIL makes single-threaded programs faster, but some people still prefer to take data integrity into their own hands. This is where implementations without a GIL, such as JRuby and Rubinius, come into play.

3.2.2 JRuby

JRuby, as the name suggests, is a 100% Java implementation of Ruby.
This allows developers to run their Ruby applications on the Java Virtual Machine (JVM), utilizing the JVM’s optimizing just-in-time (JIT) compilers, garbage collectors and concurrent threads, which often means that Ruby code runs faster and more reliably [7]. It also allows Ruby code to interoperate with any other library that is compatible with the JVM. JRuby is open source software, developed primarily at Red Hat.
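
The threading difference between CRuby (Section 3.2.1) and JRuby can be probed with a small experiment. The following is only a minimal sketch and is not part of the benchmarking suite described in this thesis: the method names cpu_work, run_sequential and run_threaded, as well as the workload sizes, are hypothetical choices made for illustration. It runs the same CPU-bound work once sequentially and once split across threads, using only Ruby's standard benchmark library. On an implementation with a GIL the two timings would be expected to be similar, while on an implementation without one the threaded run may be faster; the actual numbers depend on the machine and the Ruby version.

```ruby
require 'benchmark'

# Hypothetical CPU-bound workload: sum of the squares of 0...n.
def cpu_work(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

# Run the workload `jobs` times, one after another.
def run_sequential(jobs, n)
  Array.new(jobs) { cpu_work(n) }
end

# Run the workload `jobs` times, each in its own thread.
def run_threaded(jobs, n)
  threads = Array.new(jobs) { Thread.new { cpu_work(n) } }
  threads.map(&:value) # Thread#value joins the thread and returns its result
end

if __FILE__ == $PROGRAM_NAME
  jobs = 4           # assumed number of parallel jobs
  n    = 1_000_000   # assumed workload size per job

  sequential = Benchmark.realtime { run_sequential(jobs, n) }
  threaded   = Benchmark.realtime { run_threaded(jobs, n) }

  puts format('sequential: %.3f s', sequential)
  puts format('threaded:   %.3f s', threaded)
end
```

Saving the sketch under a hypothetical name such as gil_probe.rb and running it with the ruby and jruby commands gives a rough first impression of the difference. A single run like this does not account for JVM warm-up, so it should not be read as a benchmark result.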