Connecting Palladio with Multicore CPU Simulators
Institute of Software Technology
Reliable Software Systems
University of Stuttgart
Universitätsstraße 38
D-70569 Stuttgart

Bachelor's Thesis

Sebastian Graef

Course of Study: Softwaretechnik
Examiner: Prof. Dr.-Ing. Steffen Becker
Supervisor: Markus Frank, M.Sc.
Commenced: April 16, 2018
Completed: October 16, 2018
CR-Classification: I.7.2

Abstract

In software engineering, simulators are typically used for Software Performance Engineering (SPE). The simulations must be accurate so that engineers can predict performance in detail. Palladio is one such approach. Currently, Palladio supports only single-core CPU simulators, although an auxiliary approach for multicore simulation exists. The main problem of this auxiliary approach is its large inaccuracy, which is about 74% with 16 cores. This bachelor thesis aims to investigate and improve Palladio's performance in hardware CPU simulation and performance prediction.

This work presents a new approach for connecting a multicore CPU simulator to Palladio to improve the simulation accuracy. The result of this thesis is a conceptual implementation of an embedded multicore CPU simulator in Palladio to enable more accurate multicore performance predictions. The presented approach enables Palladio to connect to a multicore simulator called MaxSim via a Java prototype, but the predictions are not more accurate in general. With a mean speedup deviation of 67.81% at 16 cores, the simulation is only slightly more accurate for the tested system.

Kurzfassung (German Summary)

Software engineers typically use simulators for Software Performance Engineering (SPE). The simulation results must be accurate so that engineers can predict performance in detail. Palladio is one of the tools used for SPE. Currently, Palladio supports only single-core CPU simulators, although a workaround approach for multicore simulation also exists. The problem with that approach is its very large inaccuracy, which is about 74.48% at 16 cores. This bachelor thesis aims to investigate Palladio's capability to map complex architectures onto hardware models and the accuracy of its performance predictions.

This work presents a new approach for connecting a multicore CPU simulator to an existing Palladio component in order to improve the simulation accuracy of multicore performance predictions. The presented approach was implemented using MaxSim and ProtoCom, but the predictions are not more accurate in general. With a mean speedup deviation of −67.81% at 16 cores, the deviation of the performance prediction is only marginally lower for the tested case.

Contents

1. Introduction
1.1. Problem
1.2. Idea & Benefit
1.3. Method

2. Foundations and State of the Art
2.1. Multi-Core CPU – Architecture
2.2. Palladio
2.3. Performance Prediction using Palladio Bench
2.4. Benchmark: “Bank Account Estimation”
2.5. Simulators Foundations

3. Multi-Core CPU Simulation
3.1. Available Multi-Core Simulators
3.2. Required Data (Simulator Requires)
3.3. Simulator Evaluation

4. Available Infrastructure of Palladio
4.1. SimuCom (Palladio Analyzer)
4.2. ProtoCom (Palladio Analyzer)

5. Performance Prediction Approaches
5.1. Palladio Trace Simulation Approach
5.2. Prototype Simulation Approach
5.3. Vulnerability Discussion
5.4. Experiment

6. Evaluation
6.1. Evaluation Metrics
6.2. Evaluation
6.3. Discussion
6.4. Lessons Learned

7. Conclusion
7.1. Summary
7.2. Future Work

A. Simulator Build – Dockerfiles
A.1. ZSIM
A.2. Tejas
A.3. MaxSim
A.4. Multi2Sim
A.5. Sniper
A.6. Gem5

B. Conversations
B.1. Tejas
B.2. MaxSim

C. Issues
C.1. ProtoCom Issues
C.2. Simulation Errors

Bibliography

List of Figures

1.1. Processor Trend of 40 Years [10], [11]
1.2. Difference between Reality and Prediction (Speedup) [12, fig. 1]
1.3. Goal of this Approach
2.1. Multi-Core Memory Design Spectrum [17, fig. 9]
2.2. Haswell Octa-Core Microprocessor Cache (cf. [19])
2.3. Elements of the Palladio Component Model (PCM) [20, fig. 2.1] (cf. Figure 2.5)
2.4. Palladio Multi-Partial Models
2.5. PCM: Process [25, fig. 1]
2.6. Selection Process of Analyzing Approach [9]
2.7. Repository Diagram of Bank Account Estimation - Singlecore
2.8. Assembly and Resource Environment Diagram of Bank Account Estimation - Singlecore
2.9. Allocation and Usage Model Diagram of Bank Account Estimation - single threaded Palladio project
2.10. Repository Diagram of Bank Account Estimation - two core actors model - Palladio project
2.11. SEFF Diagram for executeTransaction() of Bank Account Estimation - multicore actors model - Palladio project
2.12. SEFF Diagram for simulateTransaction() of Bank Account Estimation - two core actors model - Palladio project
2.13. Bank Account Estimation - two core actors model - Palladio project
2.14. Simulation of target components
3.1. Sniper: Simulator Tradeoffs
3.2. ZSIM: Simulator Tradeoffs
3.3. GEM5: Simulator Tradeoffs
3.4. MARSS: Simulator Tradeoffs
3.5. MARSSx86: System Architecture [40]
3.6. Multi2Sim: Simulator Tradeoffs
3.7. Multi2Sim: System Architecture [41, fig. 1.2]
3.8. Multi2Sim: x86 Timing Simulator – Multicore and MultiThread [42, p. 20]
3.9. Tejas: Simulator Tradeoffs
3.10. Tejas: System Architecture (cf. [48])
3.11. Tejas-Java: System Architecture and Process (cf. [49])
3.12. Tejas-Java: Post Process (cf. [50])
3.13. MaxSim: Simulator Tradeoffs
3.14. MaxSim: Information Handling (cf. [54, p. 11])
3.15. Venn Diagram: Sets of Simulators (cf. Table 3.2)
3.16. Benchmark Results: Average Absolute Errors
4.1. Overview on SimuCom’s Transformation Structure
4.2. Overview on SimuCom’s Parts
4.3. Transformation of PCM instances to performance prototypes
4.4. ProtoCom: Transformations Overview [20, fig. 3.1]
4.5. ProtoCom: Components of Performance Prototypes (JavaSE) [20, fig. 3.2]
5.1. Process Overview – SimuCom Approach
5.2. Process Overview – ProtoCom Approach
5.3. Java SE RMI Prediction Prototype - PCM influences
5.4. Sequence Diagram for Initialisation and Assembly using RMI [62, fig. 1]
5.5. Sequence Diagram for Prototype usage
5.6. ProtoCom Approach: Calibration Process
5.7. ProtoCom Approach – Simulation Process
6.1. Palladio Analyzer SimuCom
6.2. MaxSim Simulator (cf. Table 6.2)
6.3. Speedup Comparison

List of Tables

3.1. Intel Xeon CPU Specification
3.2. Simulators Property Overview
3.3. Simulators Selection
4.1. Overview of the simulation mapping for SimuCom
6.1. SimuCom: Performance Prediction
6.2. MaxSim: Simulation Results (16 Core Hardware)
6.3. Execution Times
6.4. Simulation Speedup Comparison

List of Acronyms

ASLR     Address Space Layout Randomization
DESMO-J  Discrete-Event Simulation Modelling in Java
ELF      Executable and Linking Format
GCC      GNU Compiler Collection
ISA      Instruction Set Architecture
M2C      Model to Code Transformation
M2T      Model to Text Transformation
MTTF     Mean Time To Failure
MTTR     Mean Time To Recover
OOO      Out-Of-Order
PCM      Palladio Component Model
PDF      Probability Density Function
QoS      Quality of Service
RMI      Remote Method Invocation
SEFF     Service Effect Specification
SPE      Software Performance Engineering
SSJ      Stochastic