Quantifying and Predicting the Influence of Execution Platform on Software Component Performance
Michael Kuperberg
Quantifying and Predicting the Influence of Execution Platform on Software Component Performance

The Karlsruhe Series on Software Design and Quality, Volume 5
Chair Software Design and Quality, Faculty of Computer Science, Karlsruhe Institute of Technology,
and Software Engineering Division, Research Center for Information Technology (FZI), Karlsruhe
Editor: Prof. Dr. Ralf Reussner

Dissertation, Karlsruher Institut für Technologie, Fakultät für Informatik
Date of the oral examination: 4 November 2010
Referees: Prof. Dr. Ralf Reussner, Prof. Dr. Walter F. Tichy

Imprint
Karlsruher Institut für Technologie (KIT)
KIT Scientific Publishing
Straße am Forum 2
D-76131 Karlsruhe
www.ksp.kit.edu
KIT – University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association
This publication is available on the Internet under the following Creative Commons licence:
http://creativecommons.org/licenses/by-nc-nd/3.0/de/
KIT Scientific Publishing 2011
Print on Demand
ISSN 1867-0067
ISBN 978-3-86644-741-7

Abstract

Software engineering is concerned with the cost-efficient construction of applications which behave as specified, are well-designed and of high quality. Among software quality attributes, performance is one of the most prominent and well-studied. Performance evaluation is concerned with explaining, predicting and preventing long waiting times, overloaded bottleneck resources and other performance problems.

However, performance remains hard to evaluate because it depends not only on the software implementation, but also on several other factors such as the workload and the execution platform on which the software runs. The execution platform comprises hardware resources (CPU, networks, hard disks) and software resources (operating system, middleware). In former approaches, the influence of the execution platform was a hard-wired part of the model, and not an adjustable parameter. This meant that to answer sizing and relocation questions, a performance model had to be recreated and quantified for each candidate execution platform.

The resulting challenge addressed by this thesis is to devise an effective approach for quantifying and predicting the influence of the execution platform on software performance, using Model-Based Performance Evaluation (MBPE) at the level of software architecture. The primary targeted benefit is a decrease of the effort needed for performance prediction, since answering sizing and relocation questions no longer requires deploying and measuring the considered application on every candidate execution platform.

The application of MBPE starts at design time, since delaying performance evaluation until the implementation of the software is not desirable: refactoring costs increase with the degree of completeness and deployment. To model the artefacts of the software application, MBPE builds upon the well-studied concept of software components and their required and provided services as exchangeable building blocks which facilitate recomposition and reuse. In most MBPE approaches, the atomic behaviour actions of components carry timing values. On the basis of these timing values, an analysis of the overall application behaviour (e.g. prediction of response times) is then performed.
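As a simplified illustration of this composition (the types AtomicAction and ResponseTimePredictor below are hypothetical and not part of the PCM or its solvers), the following Java sketch shows how per-action timing values, weighted by loop counts and branch probabilities, could be aggregated into a predicted mean response time:

```java
// Simplified illustration only: hypothetical types, not the PCM metamodel or its solvers.
// A service behaviour is a list of atomic actions, each annotated with a timing value
// (mean duration in microseconds) and an expected execution count derived from
// loop iteration counts and branch probabilities.
import java.util.List;

record AtomicAction(String name, double meanDurationUs, double expectedExecutions) {}

final class ResponseTimePredictor {

    // Predicted mean response time = sum over actions of (duration x expected executions).
    static double predictMeanResponseTimeUs(List<AtomicAction> behaviour) {
        return behaviour.stream()
                .mapToDouble(a -> a.meanDurationUs() * a.expectedExecutions())
                .sum();
    }

    public static void main(String[] args) {
        List<AtomicAction> uploadService = List.of(
                new AtomicAction("checkAuthorisation", 12.0, 1.0),
                new AtomicAction("compressChunk",      45.0, 8.0),   // loop over 8 chunks
                new AtomicAction("writeToDisk",       130.0, 0.9));  // branch taken with p = 0.9
        System.out.printf("Predicted mean response time: %.1f microseconds%n",
                predictMeanResponseTimeUs(uploadService));
    }
}
```

In this sketch the timing values are simply hard-coded constants; how to obtain such values, and how to keep them valid across different execution platforms, is exactly the problem discussed next.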
Unfortunately, such timing values are platform-specific, and the resulting architectural model is also platform-specific. Therefore, the model needs to be rebuilt for each considered execution platform and for each usage profile. Additionally, the durations of atomic component actions often amount to just a few nanoseconds, and measuring such fine-granular actions is challenging because conventional timer methods are too coarse for them.

The contribution of this thesis is a novel approach to quantify and to predict both platform-independent and platform-dependent resource demands on the basis of performance models. Using automated benchmarking of the execution platform, the approach is able to make precise, platform-specific performance predictions on the basis of these models, without manual effort. By separating the performance evaluation of the application from the performance evaluation of the execution platform, the effort to consider different platforms (e.g. for relocation or sizing scenarios) is significantly decreased, since it is no longer necessary to deploy the application on each candidate platform. To select the timer methods used in measurements, this thesis introduces a novel platform-independent algorithm which quantifies timer quality metrics (e.g. accuracy and overhead).
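As an illustration of the kind of measurement such a quantification involves (a minimal sketch assuming Java's System.nanoTime() as the timer under test; the class and method names are hypothetical, and this is not the thesis's actual algorithm), the following code estimates two of the mentioned quality metrics, accuracy and invocation overhead:

```java
// Minimal sketch of timer-quality quantification; hypothetical names, illustration only.
final class TimerQualitySketch {

    // Estimate the effective accuracy of System.nanoTime(): the smallest observable
    // difference between two successive, distinct readings.
    static long estimateAccuracyNs(int samples) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < samples; i++) {
            long t1 = System.nanoTime();
            long t2 = System.nanoTime();
            while (t2 == t1) {            // spin until the reported timer value changes
                t2 = System.nanoTime();
            }
            best = Math.min(best, t2 - t1);
        }
        return best;
    }

    // Estimate the per-call overhead of the timer method by timing a block of
    // back-to-back invocations and dividing by the invocation count.
    static double estimateInvocationCostNs(int invocations) {
        long sink = 0;                    // accumulate results so the JIT cannot drop the calls
        long start = System.nanoTime();
        for (int i = 0; i < invocations; i++) {
            sink += System.nanoTime();
        }
        long end = System.nanoTime();
        if (sink == 42) System.out.println(sink);   // keep 'sink' observable
        return (end - start) / (double) invocations;
    }

    public static void main(String[] args) {
        System.out.println("Estimated accuracy (ns):        " + estimateAccuracyNs(10_000));
        System.out.println("Estimated invocation cost (ns): " + estimateInvocationCostNs(1_000_000));
    }
}
```

Running such a sketch on different JVMs and operating systems typically yields widely varying accuracy and overhead values, which is why a systematic, platform-independent selection of timer methods is needed before nanosecond-scale actions can be measured reliably.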
Building on the Palladio Component Model (PCM) and its tooling, the implementation of the approach provides a convenient user interface and a validated theoretical foundation. The resource demands are parametrised over the usage (workload) of the considered components, and are expressed as annotations in the PCM-based behaviour model of the component. To integrate the presented approach into the PCM, new meta-model concepts and corresponding tooling have been added to the PCM. The enhanced PCM workbench allows for automated creation of PCM model instances from black-box bytecode components, and also includes concepts and tools to convert benchmarking results into PCM resource models.

The presented approach focuses on applications that will run as platform-independent bytecode on bytecode-executing virtual machines (BEVMs) such as the Java VM. It accounts for dynamic and static optimisations performed in modern BEVMs, e.g. just-in-time (JIT) compilation and inlining. To translate the platform-independent resource demands into timing values, this thesis introduces a benchmark suite for BEVMs. This benchmark suite addresses both fine-granular bytecode instructions (e.g. integer addition or array initialisation) and platform API methods provided by the BEVM's base libraries, e.g. by the Java Platform API. Unlike existing approaches, the contribution of this thesis
• does not require modification or instrumentation of the execution platform
• quantifies the performance speedups of the execution platform (e.g. just-in-time compilation) and reflects them during performance prediction
• deals with API and library methods in an atomic way, providing method-level benchmarking results which are more intuitive than per-instruction timings
• provides more detailed per-invocation performance results than conventional profilers, and supports stochastic distributions of performance values, which are more realistic and carry more information than conventional average or median metrics

An extensive validation of the performance prediction capabilities offered by the new approach was performed on a number of Java applications, such as the widely used SPECjvm2008, SPECjvm98, SPECjbb2005 and Linpack benchmarks. The validation demonstrated the prediction accuracy of bytecode-based cross-platform performance prediction, and showed that it yields significantly better results than prediction based on CPU cycles. The validation used one execution platform as a basis to obtain platform-independent resource demands, and predicted the performance of the application on other execution platforms (which differed significantly from the basis platform) without deploying and benchmarking the application on them. The validation also addressed individual parts of the presented approach: the precision and the overhead of the resource demand quantification were studied, and the heuristics-based approach for automated method benchmarking was evaluated with respect to its effectiveness, coverage and the precision of its benchmarking results. A large comparison of timer methods on the basis of quality attributes was performed on several Java and .NET platforms.

Zusammenfassung (Summary)

Software engineering is concerned with the cost-effective construction of high-quality software applications whose behaviour follows a given specification and which are based on a purposeful design. Among the quality attributes of software, performance plays a central role and is accordingly the subject of intensive research. The research field of performance analysis is concerned with measuring, modelling and predicting performance in order to explain and prevent performance problems such as overloaded resources.

Performance analysis offers numerous challenges and open research questions, since the performance of an application depends in a complex way on factors such as its implementation, the workload and the execution platform. The execution platform comprises hardware resources such as the CPU and hard disks, but also software resources such as the operating system or the middleware. In earlier modelling approaches, the influence of the execution platform was contained in the model as a constant, fixed factor, so that the model had to be rebuilt repeatedly for comparisons of execution platforms or for questions of resource sizing.

This leads to the challenge addressed in this doctoral thesis: to develop an effective approach for predicting the influence of the execution platform on software performance. The approach to be developed should work without installing and measuring the analysed application on each of the considered execution platforms, and is intended to operate as a part of model-based performance analysis.