Workload Characterization of JVM Languages

View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by RERO DOC Digital Library Workload Characterization of JVM Languages Doctoral Dissertation submitted to the Faculty of Informatics of the Università della Svizzera Italiana in partial fulfillment of the requirements for the degree of Doctor of Philosophy presented by Aibek Sarimbekov under the supervision of Prof. Walter Binder May 2014 Dissertation Committee Prof. Matthias Hauswirth Università della Svizzera Italiana, Switzerland Prof. Cesare Pautasso Università della Svizzera Italiana, Switzerland Prof. Andreas Krall Technische Universität Wien, Vienna, Austria Prof. Petr Tuma˚ Charles University, Prague, Czech Republic Dissertation accepted on 9 May 2014 Prof. Walter Binder Research Advisor Università della Svizzera Italiana, Switzerland Prof. Stefan Wolf and Prof. Igor Pivkin PhD Program Director i I certify that except where due acknowledgement has been given, the work presented in this thesis is that of the author alone; the work has not been submitted previously, in whole or in part, to qualify for any other academic award; and the content of the thesis is the result of work which has been carried out since the official com- mencement date of the approved research program. Aibek Sarimbekov Lugano, 9 May 2014 ii To my family iii iv Everything should be made as simple as possible, but not simpler. Albert Einstein v vi Abstract Being developed with a single language in mind, namely Java, the Java Virtual Machine (JVM) nowadays is targeted by numerous programming languages. Automatic memory management, Just-In-Time (JIT) compilation, and adaptive optimizations provided by the JVM make it an attractive target for different language implementations. Even though being targeted by so many languages, the JVM has been tuned with respect to characteristics of Java programs only – different heuristics for the garbage collector or compiler optimizations are focused more on Java programs. In this dissertation, we aim at contributing to the understanding of the workloads imposed on the JVM by both dynamically-typed and statically-typed JVM languages. We introduce a new set of dynamic metrics and an easy-to-use toolchain for collecting the latter. We apply our toolchain to applications written in six JVM languages – Java, Scala, Clojure, Jython, JRuby, and JavaScript. We identify differences and commonalities between the examined languages and discuss their implications. Moreover, we have a close look at one of the most efficient compiler optimizations – method inlining. We present the decision tree of the HotSpot JVM’s JIT compiler and analyze how well the JVM performs in inlining the workloads written in different JVM languages. vii viii Acknowledgements First of all I would like to thank my research adviser Professor Walter Binder who guided me for the past four years through this interesting journey called Ph.D. I am also grateful for my committee members including Professor Andreas Krall, Professor Petr Tuma,˚ Professor Cesare Pautasso, and Professor Matthias Hauswirth, your valuable comments helped me a lot to strengthen this dissertation. I would like to thank all the past and present members of the Dynamic Analysis Group with whom I had a pleasure to work with during my Ph.D at University of Lugano: Philippe, Danilo, Akira, Alex, Achille, Stephen, Yudi, Lukáš, Lubomír, Andrej, Sebastiano, Andrea – you guys are awesome! I am thankful to my collaborators and co-authors from different parts of the world: Danilo Ansaloni, Walter Binder, Christoph Bockisch, Eric Bodden, Lubomír Bulej, Samuel Z. Guyer, Kardelen Hatun, Stephen Kell, Lukáš Marek, Mira Mezini, Philippe Moret, Zhengwei Qi, Nathan P. Ricci, Martin Schoeberl, Andreas Sewe, Lukas Stadler, Petr Tuma,˚ Alex Villazón, Yudi Zheng. In particular I am grateful to Andreas Sewe, whose energy and motivation still impresses me. Thank you Andreas for all your help and support throughout my Ph.D. Elisa, Janine, Danijela and Diana from the decanato office deserve special thanks for their support and help with numerous bookings and registration stuff. I cannot find the words to thank you. Maksym and Konstantin I will miss a lot our discussions during the lunches and coffee breaks, however, I am more than sure that our friendship will last for good. And finally I would like to thank my family that keeps supporting me in my every venture. This work is dedicated to every member of my family. This work was partially funded by Swiss National Science Foundation (project CRSII2_136225). ix x Contents Contents xi 1 Introduction1 1.1 Motivation......................................1 1.2 Contributions....................................3 1.2.1 Toolchain for Workload Characterization...............3 1.2.2 Rapid Development of Dynamic Program Analysis Tools......3 1.2.3 Workload Characterization of JVM Languages............4 1.3 Dissertation Outline................................4 1.4 Publications.....................................5 2 State of the Art9 2.1 Overview.......................................9 2.2 Feedback-Directed Optimizations........................9 2.3 Workload Characterization............................ 10 2.4 Programming Languages Comparison...................... 13 2.5 Instrumentation................................... 14 2.5.1 Binary Instrumentation.......................... 14 2.5.2 Bytecode Instrumentation........................ 15 3 Rapid Development of Dynamic Program Analysis Tools 19 3.1 Motivation...................................... 19 3.2 Background: DiSL Overview........................... 21 3.3 Quantifying the Impact of DiSL.......................... 23 3.3.1 Experiment Design............................. 23 3.3.2 Task Design................................. 26 3.3.3 Subjects and Experimental Procedure................. 26 3.3.4 Variables and Analysis........................... 27 3.3.5 Experimental Results........................... 28 3.3.6 Threats to Validity............................. 29 3.4 DiSL Tools Are Concise and Efficient...................... 31 3.4.1 Overview of Recasted Analysis Tools.................. 31 xi xii Contents 3.4.2 Instrumentation Conciseness Evaluation............... 34 3.4.3 Instrumentation Performance Evaluation............... 36 3.5 Summary....................................... 39 4 Toolchain for Workload Characterization 41 4.1 Dynamic Metrics................................... 41 4.1.1 Object access concerns.......................... 42 4.1.2 Object allocation concerns........................ 44 4.1.3 Code generation concerns........................ 46 4.2 Toolchain Description............................... 48 4.2.1 Deployment and Use........................... 48 4.2.2 Query-based Metrics............................ 50 4.2.3 Instrumentation.............................. 51 5 Workload Characterization of JVM Languages 53 5.1 Experimental Setup................................. 53 5.1.1 Examined Metrics............................. 53 5.1.2 Workloads.................................. 54 5.1.3 Measurement Context........................... 55 5.2 Experiment Results................................. 55 5.2.1 Call-site Polymorphism.......................... 55 5.2.2 Field, Object, and Class Immutability................. 64 5.2.3 Identity Hash-code Usage........................ 68 5.2.4 Unnecessary Zeroing........................... 70 5.2.5 Object Lifetimes.............................. 72 5.3 Inlining in the JVM................................. 77 5.3.1 Inlining Decision.............................. 77 5.3.2 Top Reasons for Inlining Methods................... 80 5.3.3 Top Reasons for Not Inlining Methods................. 83 5.3.4 Inlining Depths............................... 85 5.3.5 Fraction of Inlined Methods....................... 89 5.3.6 Fraction of Decompiled Methods.................... 91 5.3.7 Summary.................................. 93 6 Conclusions 95 6.1 Summary of Contributions............................ 96 6.2 Future Work..................................... 97 Bibliography 99 Chapter 1 Introduction 1.1 Motivation Modern applications have brought about a paradigm shift in software development. With different application parts calling for different levels of performance and produc- tivity, developers use different languages for the various tasks at hand. Very often, core application parts are written in a statically-typed languages, data management uses a suitable domain-specific language (DSL), and the glue code that binds all the pieces together is written in a dynamically-typed language. This development style is called polyglot programming, and promotes using the right language for a particular task [90]. This shift has gained support even in popular managed runtimes such as the .NET Common Language Runtime and the Java Virtual Machine (JVM), which are mainly known for automatic memory management, high performance through Just-In-Time (JIT) compilation and optimization, and a rich class library. In case of the JVM, the introduction of scripting support (JSR-223) and support for dynamically-typed languages (JSR-292) to the Java platform enables scripting in Java programs and simplifies the development of dynamic language runtimes. Consequently, developers of literally hun- dreds of programming languages target the JVM as the host for their language—both to avoid developing a new runtime from scratch, and to benefit from the JVM’s maturity, ubiquity, and performance. Today, programs written in popular dynamic languages such

Load more