
Bachelor Informatica

Benchmarking Akka

Dennis Kroeb

June 15, 2020

Informatica | Universiteit van Amsterdam

Supervisor(s): Ana-Lucia Varbanescu

Signed:

Abstract

In modern-day computing, concurrent programming is essential in high-performance systems. The Akka platform provides a programming model to meet this need. Systematic performance analysis studies for Akka do not exist. Therefore, this thesis proposes such a study. To this end, the performance of the Akka actor model is assessed by first comparing it to Java in a microbenchmarking experiment, which illustrates the overhead and the different threading models in Akka. Furthermore, to compare Akka with other models, we ported two applications from the Computer Language Benchmarks Game (CLBG) to Akka, and compared their performance against the original CLBG models, using CPU metrics, compressed code size, and sampled memory usage. Akka performed similarly to Java. Based on this analysis, we conclude that Akka can achieve performance similar to Java in non-blocking concurrent applications, but that Akka generally has a larger code size.

Contents

1 Introduction
1.1 Research question and approach
1.2 Ethical aspects
2 Background and related work
2.1 Benchmarking and CLBG
2.2 The Akka platform
2.2.1 The Akka actor model
2.2.2 Java, Akka and multithreading
2.3 Related work
3 Akka vs Java: a microbenchmark
3.1 Experiments setup
3.2 The Counter program
3.3 Scalability and overhead
3.4 Measuring different phases
3.5 Initialisation of the Akka actor model
3.6 ForkJoinPools and ThreadPoolExecutors
4 CLBG for Akka
4.1 Selecting relevant CLBG programs
4.2 Program 1: Binary trees
4.2.1 Porting to Akka from pseudocode
4.2.2 CLBG results - Binary Trees
4.3 Program 2: Reverse Complement
4.3.1 Porting to Akka from Java
4.3.2 CLBG results - Reverse Complement
5 Conclusion and future work
5.1 Main findings
5.2 Future work
Acronyms

CHAPTER 1
Introduction

The world we live in values (real-time) connectivity more with each day, and the demand for high-performance distributed applications keeps rising. Proper design and implementation are important to ensure good performance, and so is choosing a suitable programming model for these applications. Concurrency plays an important role in these distributed applications.

The Akka platform claims to perform well in distributed applications, by providing native concurrency through an actor-based model. The platform incorporates the Akka toolkit, which is a collection of modules built by Lightbend, a company specialised in real-time cloud-based services [1]. The toolkit offers support for two common programming languages: Scala and Java. In this project, we only evaluate the Java implementations of Akka, because of our familiarity with the language.

The problem is that there is currently no systematic performance analysis study of Akka compared to other models. Performance analysis of Akka can aid in choosing the right model when designing new applications. This thesis aims to solve this problem.

1.1 Research question and approach

To provide a systematic performance analysis of Akka compared to other models, we answer the following research question in this thesis:

How does the Akka actor model perform compared to other models?

This thesis focuses on assessing the performance of the Akka actor model, which is the core module of Akka. Akka uses the actor model programming principle, and our performance analysis gives insight into the performance impact of this abstraction compared to other programming principles, such as regular Object-oriented programming (OOP) in Java. To answer our research question we propose two different types of comparison, specifically addressing two sub-questions:

SQ1: How does Akka compare against Java for a basic application?
SQ2: How does Akka compare against the multiple models in the Computer Language Benchmarks Game?

To answer SQ1, we microbenchmark Java multithreading versus Akka actors through a simple synthetic program (Chapter 3). For the second sub-question (SQ2), we benchmark Akka versus many other models, using the Computer Language Benchmarks Game (CLBG). We further report on porting existing input programs included in CLBG to Akka, because CLBG does not yet support Akka natively. This porting process and the CLBG results can be found in Chapter 4. In Chapter 5, we conclude our report and discuss our findings.

If a new model like Akka does not perform much better than other models on existing programs, rewriting an existing program might not be worth the work. It should not be forgotten that there is (currently) no such thing as a 'best' model, because all programming models have different use cases and features, which vary in their relevance with respect to a given program. A fitting analogy: it is very hard to be an excellent sprinter and an excellent marathon runner at the same time.

1.2 Ethical aspects

This project stems from an application where a distributed system is used to monitor illegal activities (e.g. poaching) in national parks [2]. Because this surveillance software uses Akka, a proper performance analysis could help with catching poachers if our findings contribute to a better code base.

Furthermore, our work can enable users to make informed choices about the tools they use to program their applications, which is beneficial to the efficient use of (computational) resources. This could in turn lead to lower power consumption, which is better for the environment and also offers financial benefits, as long as the functionality is not compromised by the reduced power consumption.

The work in this thesis is open-source, which offers transparency and reproducibility. This allows others to perform additional research based on our findings.
This project does not touch on controversial topics (such as artificial intelligence), and we see this work as ethically responsible.

CHAPTER 2
Background and related work

In this chapter, we aim to provide the basic terms and notions required to understand the research done in this thesis. Thus, we discuss the Akka platform, benchmarking, CLBG, and we briefly present related work.

2.1 Benchmarking and CLBG

In the context of this thesis, we define a benchmark as one program that measures the execution of an input program to quantify performance; a benchmarking suite is a set of such programs, typically representative of real-life applications, whose combined performance measurements give a better understanding of the performance across different types of applications. The main challenges for any benchmark are the selection of the applications and the selection of the representative metrics. Both selections depend on the goal of the benchmark. In this work we focus on using the Computer Language Benchmarks Game (CLBG) as our benchmark suite.

CLBG is a benchmark suite that tests many programming models using various implementations of algorithms (programs). The suite is well-documented [3] and open-source [4]. The benchmark results are posted on the CLBG website, but the suite also provides the option to display your own benchmark results on a (local) webpage (Figure 2.1).

To assess the performance of a model, CLBG uses the following metrics: execution time, memory usage, compressed code size, total CPU time over all threads, and individual thread usage. An overview of these metrics is given in Table 2.1. Example measurements can be seen in Figure 2.1. Compressed code size is measured using the GZIP tool [5]. CPU information is gathered using the GTOP library [6]. Memory is measured by GTOP as well, and is sampled every 200 ms. As a result, programs that run for less than 1 second may not have accurate memory measurements [3].
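To make the compressed code size metric concrete, the following is a minimal sketch of how such a measurement could be reproduced in Java. It is not the procedure used by the CLBG suite: the class name `CompressedSizeSketch`, the method `compressedSize` and the sample source text are hypothetical, and the sketch uses the default compression level of `java.util.zip.GZIPOutputStream`, which may differ from the setting used by the suite.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of CLBG: measures the GZIP-compressed size of source text.
final class CompressedSizeSketch {

    // Returns the number of bytes the source text occupies after GZIP compression.
    // Assumption: default compression level; the suite's own setting may differ.
    static int compressedSize(String source) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(source.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return buffer.size();
    }

    public static void main(String[] args) {
        // Build a small, repetitive stand-in for a program's source code.
        StringBuilder source = new StringBuilder();
        for (int i = 0; i < 50; i++) {
            source.append("class Counter { int n; void inc() { n++; } }\n");
        }
        String text = source.toString();
        System.out.println("raw bytes:  " + text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("gzip bytes: " + compressedSize(text));
    }
}
```

Because compression removes repetition, this metric rewards concise programs more than short identifiers or dense formatting, which is one reason a compressed size is a fairer proxy for code volume than a raw byte count.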
Metric                Unit  Tool      Details
Execution time        s     GTOP 2.0  Measures the whole execution time, from start to finish
Total CPU time        s     GTOP 2.0  Total CPU non-idle time for all cores combined
CPU load per core     %     GTOP 2.0  Amount of non-idle CPU work performed per core with respect to the total time
Average peak memory   MB    GTOP 2.0  Peak RAM usage (sampled every 200 ms)
Compressed code size  B     GZIP 1.6  Compressed using minimal GZIP compression

Table 2.1: Metrics used by the CLBG suite

CLBG is transparent, due to its clear specifications and metrics, and offers a wide range of programs implemented using many models. This enables users to get performance insights while not overfitting to certain models, a mistake made by other Akka benchmarks [7][8].

Figure 2.1: The CLBG WebUI for displaying benchmark results. The number behind some source programs identifies the implementation of the algorithm for that model, as some models have multiple implementations [4].

CLBG obtains its measurements about models in a generic way and does not require much special action for a new model to be supported, assuming the implementation of the input program is valid with respect to the program's specifications. Because of this, CLBG is useful for benchmarking Akka as a new model for the suite. Input programs for CLBG can be extended to support other models, but care should be taken to follow the specifications listed by the suite, to ensure a fair comparison between models.

2.2 The Akka platform

The Akka platform is a toolkit which provides concurrency and distribution in message-driven applications using the actor model programming principle [1].

2.2.1 The Akka actor model

The Akka actor model is the core of the Akka platform, which incorporates the actor model programming principle. The actor model allows concurrent programming with the advantage of enforcing encapsulation without locks, boosting concurrent performance [9].
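To illustrate the idea of encapsulation without locks, the following is a minimal sketch of the actor principle written in plain Java. It deliberately does not use the Akka API: the class `CounterActorSketch` and its methods `tellIncrement`, `askCount` and `run` are hypothetical names, and a single-threaded executor stands in for an actor's mailbox. The state is private and is only ever touched by the mailbox thread, so concurrent senders need neither `synchronized` nor explicit locks.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical plain-Java sketch of the actor principle; this is NOT the Akka API.
final class CounterActorSketch {
    // Private state: only the mailbox thread reads or writes it, so no lock is needed.
    private long count = 0;

    // A single-threaded executor stands in for the mailbox:
    // messages are queued and processed strictly one at a time, in order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // "Send" an increment message; the caller does not wait for it to be processed.
    void tellIncrement() {
        mailbox.execute(() -> count++);
    }

    // "Ask" for the current value; the reply is delivered asynchronously.
    CompletableFuture<Long> askCount() {
        return CompletableFuture.supplyAsync(() -> count, mailbox);
    }

    void stop() throws InterruptedException {
        mailbox.shutdown();
        mailbox.awaitTermination(10, TimeUnit.SECONDS);
    }

    // Starts `senders` threads that each send `perSender` increments,
    // then asks the actor for the final count.
    static long run(int senders, int perSender) {
        CounterActorSketch actor = new CounterActorSketch();
        try {
            Thread[] threads = new Thread[senders];
            for (int i = 0; i < senders; i++) {
                threads[i] = new Thread(() -> {
                    for (int j = 0; j < perSender; j++) {
                        actor.tellIncrement();
                    }
                });
                threads[i].start();
            }
            for (Thread t : threads) {
                t.join();
            }
            // Queued after all increments, so it observes every one of them.
            long result = actor.askCount().join();
            actor.stop();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```

The design choice to note is that correctness comes from message ordering rather than mutual exclusion: senders never share the counter, they only enqueue messages for it.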