CERN-THESIS-2012-391 14/11/2013

Virtual Machine Scheduling in Dedicated Computing Clusters

Dissertation
submitted to the Faculty of Computer Science and Mathematics
of the Johann Wolfgang Goethe University
in Frankfurt am Main, Germany
for attaining the PhD degree of Natural Sciences

by Stefan Boettger
born in Torgau, Germany

Frankfurt 2012
(D30)

Accepted by the Faculty of Computer Science and Mathematics of the Johann Wolfgang Goethe University as a dissertation.

Dean: Prof. Dr. Thorsten Theobald

Expert assessor: Prof. Dr. Udo Kebschull

Prof. Dott. Ing. Roberto V. Zicari

Date of disputation:

Abstract

Time-critical applications process a continuous stream of input data and have to meet specific timing constraints. A common approach to ensure that such an application satisfies its constraints is over-provisioning: The application is deployed in a dedicated cluster environment with enough processing power to achieve the target performance for every specified data input rate. This approach comes with a drawback: At times of decreased data input rates, the cluster resources are not fully utilized. A typical use case is the HLT-Chain application that processes physics data at runtime of the ALICE experiment at CERN. From a perspective of cost and efficiency it is desirable to exploit temporarily unused cluster resources. Existing approaches aim for that goal by running additional applications. These approaches, however, a) lack the flexibility to dynamically grant the time-critical application the resources it needs, b) are insufficient for isolating the time-critical application from harmful side-effects introduced by additional applications or c) are not general because application-specific interfaces are used. In this thesis, a software framework is presented that makes it possible to exploit unused resources in a dedicated cluster without harming a time-critical application. Additional applications are hosted in Virtual Machines (VMs) and unused cluster resources are allocated to these VMs at runtime. In order to avoid resource bottlenecks, the resource usage of VMs is dynamically modified according to the needs of the time-critical application. For this purpose, a number of methods not previously combined is used. On a global level, appropriate VM manipulations like hot migration, suspend/resume and start/stop are determined by an informed search heuristic and applied at runtime. Locally on cluster nodes, a feedback-controlled adaption of VM resource usage is carried out in a decentralized manner.
The employment of this framework makes it possible to increase a cluster's usage by running additional applications, while at the same time preventing negative impact on a time-critical application. This capability of the framework is shown for the HLT-Chain application: In an empirical evaluation the cluster CPU usage is increased from 49% to 79%, additional results are computed and no negative effects on the HLT-Chain application are observed.

Acknowledgements

For inspiration and guidance on the road to this thesis I thank my professor Mr. Kebschull. The valuable feedback and daily companionship of my colleagues Timo Breitner and Jochen Ulrich also helped me in mastering this task. Special thanks goes to my colleague Camilo Lara, to whom I owe much for professional advice and fruitful discussions. My parents I thank for all their unconditional trust and support. Heinke, thank you for your understanding and sincerity.

Contents

1. Introduction 11

2. Thesis Goal 15

3. Context and Related Work 17
   3.1. Research Overview ...... 18
        3.1.1. Goals in Current Research ...... 19
        3.1.2. Resource Allocation Primitives ...... 22
        3.1.3. Data Used for Decision Making ...... 25
        3.1.4. Decision Techniques ...... 27
   3.2. Evaluation of Relevant Approaches ...... 30
   3.3. Summary ...... 34

4. Conceptual Work 35
   4.1. Scope Discussion ...... 35
   4.2. Basic Terminology ...... 36
   4.3. Conceptual Cornerstones ...... 39
        4.3.1. Resource Usage as Sensor Metric ...... 40
        4.3.2. Virtualization as Enabling Technology ...... 43
        4.3.3. Virtual Machine Life-Cycle ...... 45
        4.3.4. Secondary Application Abstraction ...... 47
        4.3.5. Constraints on Resource Usage: Policies ...... 49
        4.3.6. Decision Making Architecture ...... 52
        4.3.7. Summary ...... 55
   4.4. Resource Allocation Functionality ...... 56
        4.4.1. Feasibility Estimator ...... 56
        4.4.2. Global Provisioner ...... 60
        4.4.3. Global Reconfigurator ...... 61
               4.4.3.1. Recognition Phase and Triggering ...... 62
               4.4.3.2. Search Algorithm ...... 65
        4.4.4. Local Controller ...... 81
               4.4.4.1. Resource Usage Adaptor ...... 81
               4.4.4.2. Local Trigger Unit ...... 84
        4.4.5. Functional Unit Coordination ...... 85
               4.4.5.1. Execution and Concurrency ...... 85
               4.4.5.2. Enforcement of Policy Semantics ...... 87
        4.4.6. Further Considerations ...... 88
               4.4.6.1. Computational Complexity ...... 88
               4.4.6.2. Scalability ...... 91
               4.4.6.3. Self-Sustaining Behavior ...... 92


4.4.7. Summary ...... 93

5. Realization 95
   5.1. Environment, Products and Tools ...... 95
   5.2. Framework Design ...... 96
   5.3. Actors and Use Cases ...... 98
   5.4. Platform-Specific Implementation ...... 101

6. Experimental Results 105
   6.1. Test Environment ...... 105
   6.2. Experiment Overview ...... 105
   6.3. Policy Modification Experiment ...... 107
        6.3.1. Experiment Layout ...... 108
        6.3.2. Results ...... 110
        6.3.3. Summary ...... 115
   6.4. Efficiency Experiment ...... 116
        6.4.1. Metrics Discussion ...... 116
        6.4.2. Experiment Layout ...... 117
        6.4.3. Results ...... 119
        6.4.4. Summary ...... 130
   6.5. Interference Experiment ...... 130
        6.5.1. Experiment Layout ...... 131
        6.5.2. Results ...... 132
        6.5.3. Summary ...... 136
   6.6. HLT-Chain Experiment ...... 136
        6.6.1. HLT-Chain Application ...... 137
        6.6.2. Experiment Layout ...... 138
        6.6.3. Results ...... 139
        6.6.4. Summary ...... 143
   6.7. Summary of Empirical Results ...... 144

7. Summary and Conclusion 145
   7.1. Summary ...... 145
   7.2. Conclusion ...... 147

8. Future Work 149

A. Appendix 151
   A.1. HLT-Chain Application Setup ...... 151
   A.2. Derivation of Complexity Formulas ...... 151
   A.3. and EaaS Stack ...... 153
   A.4. Virtualization Technologies and Products ...... 155
   A.5. Feasibility Estimator Customization ...... 158

B. Abbreviations 167

List of Figures

4.1. Virtual Machine Life-Cycle ...... 45
4.2. Lease Life-Cycle ...... 47
4.3. Basic Framework Architecture ...... 54
4.4. Example Search Tree ...... 68
4.5. Search Example: Initial Cluster State ...... 70
4.6. Search Example: Intermediate Search Tree I ...... 70
4.7. Search Example: Intermediate Cluster State I ...... 71
4.8. Search Example: Intermediate Search Tree II ...... 71
4.9. Search Example: Intermediate Cluster State II ...... 72
4.10. Search Example: Final Search Tree ...... 73
4.11. Search Example: Final Cluster State ...... 73
4.12. Control Loop of RUA ...... 83
4.13. Concurrency of Reconfiguration Cycles ...... 86

5.1. Package Diagram of the VM-Scheduling Framework ...... 97
5.2. Use Cases for the VM-Scheduling Framework ...... 99

6.1. Test Environment KIP Cluster in Heidelberg, Germany ...... 106
6.2. Modification of CPU Core Availability for VMs in a Cluster ...... 109
6.3. Manipulations Chosen for Removing VMs from 6 Nodes ...... 110
6.4. Manipulations Chosen for Removing VMs from 12 Nodes ...... 111
6.5. Manipulations Chosen for Removing VMs from 24 Nodes ...... 111
6.6. Manipulations Chosen for Modifying Policies Every 60 Seconds ...... 112
6.7. Manipulations Chosen for Modifying Policies Every 40 Seconds ...... 113
6.8. Manipulations Chosen for Modifying Policies Every 20 Seconds ...... 113
6.9. SecApp Results for Different Policy Modification Intervals and Timelimits ...... 115
6.10. Patterns for Generating CPU Usage on a Physical Node ...... 118
6.11. Probability Densities for Generating a Mean CPU Usage ...... 119
6.12. Generated CPU Usage and Resulting VM CPU Usage for Different Thresholds ...... 120
6.13. VM CPU Usage and SecApp Results for Different Thresholds ...... 121
6.14. Efficiencies for Different Thresholds ...... 122
6.15. Threshold and Number of Manipulations ...... 123
6.16. Generated CPU Usage and Resulting VM CPU Usage for Different Timelimits ...... 124
6.17. VM CPU Usage and SecApp Results for Different Timelimits ...... 125
6.18. Efficiencies for Different Timelimits ...... 127
6.19. Effect of Trigger Sensitivity on Triggering and VM Manipulations ...... 128
6.20. Effect of Local Resource Usage Adaption on Triggering ...... 129
6.21. Additional Cluster CPU Usage and SecApp Results for Different Thresholds ...... 135
6.22. Additional Cluster CPU Usage and SecApp Results for Different Timelimits ...... 136
6.23. Dataflow of HLT-Chain Application ...... 137


6.24. Development of HLT-Chain Queue-Length ...... 139
6.25. Average Cluster and Node CPU Usage with and without VMs ...... 140
6.26. VM Manipulations during HLT-Chain Run ...... 140
6.27. HLT-Chain Queue-Length during Action Sequence Experiment ...... 141
6.28. Cluster CPU Usage and VM Manipulations during Action Sequence Experiment ...... 142

A.1. Resource Usage of Single VM Manipulations Suspend and Migration ...... 160
A.2. Maximum Duration of Single Suspend and Stop Manipulations ...... 162
A.3. Maximum Duration of a Single Migration Manipulation ...... 163

List of Tables

3.1. Computing Task Terminology ...... 18

4.1. Example Resources ...... 52
4.2. Additional Resource Usage of VM Manipulations ...... 58

5.1. User Interfaces for Lease Owners ...... 100
5.2. Interfaces of vWrapper Class ...... 102
5.3. Interfaces of Sensor Class ...... 103

6.1. Triggering and VM Manipulations for Different Timelimits ...... 126
6.2. VM Manipulations for Different Timelimits and Load Patterns ...... 126
6.3. Trigger Sensitivity Levels ...... 129
6.4. Mean and Deviation of Performance Metrics for No-VM Runs ...... 132
6.5. Two-sided t-Tests for Different Policy Configurations ...... 133
6.6. Amount of Interference for Different Policy Configurations ...... 134
6.7. Sequence of Actions in HLT-Chain Experiment ...... 141

A.1. HLT-Chain Topology for Experiments ...... 152
A.2. Naming of VM Manipulations for Different Products ...... 157
A.3. Additional Resource Usage of VM Manipulations ...... 161


1. Introduction

Today, computer clusters play an important role in scientific and commercial computing. A cluster is a confederation of interconnected computing devices ("nodes"). It represents a single computer system and is typically deployed in order to make use of the combined processing power of its constituting nodes. Most commonly, a cluster consists of commodity hardware nodes which are loosely connected by a Local Area Network (LAN) and spatially located within a single computing center. Clusters have a wide range of applicability that includes scientific simulations (climate research, fluid dynamics), modeling (computer aided design, solid modeling) and image processing (medical imaging, oil reservoir imaging). Computing tasks ("applications") which run in a cluster can be distinguished by the timing constraints that apply to their desired processing performance:

• A best effort application processes readily available, persistently stored input data and has no explicit timing constraint. Such an application can be re-run after a failure without suffering from degraded result or service quality. Simulations and modeling applications are typical examples.

• A time-critical application processes a stream of input data and has to satisfy a timing constraint concerning its processing performance. Typical timing constraints are: maximum response time per user request, number of data units processable per second or number of customers servable per day. If a timing constraint is not satisfied, the service or result quality is degraded. Examples are web based applications (web servers, online shops), applications for stock trading or applications that process data in high-energy physics experiments.

A typical problem in the context of time-critical applications is to ensure that an application's timing constraint is always satisfied. A widely accepted approach to this problem is "over-provisioning".
In simple terms this means buying more computing power than needed in the worst expectable case. This approach is adopted by hardcore gamers1 who buy a desktop computer according to the criterion that next year's version of their favourite game has to run smoothly. In cluster computing, this means tailoring the cluster environment explicitly to the application, i.e. designing the cluster in such a manner that a specific maximum application performance will be achieved. Such a specifically tailored cluster environment is called a "dedicated cluster". The stream of input data for an application in a dedicated cluster is not necessarily constant, i.e. both the frequency and the size of the data units to be processed may vary over time.2 For instance, a web server hosting a sports ticketing software will experience increased user request rates before a tournament and an instant decline thereafter. Such a variation in the input data rate translates to a variation in the resource usage of the dedicated cluster. Running a cluster is cost-intensive and a temporarily decreased resource usage means wasting precious computing power. Therefore, a nearly optimal resource usage is desirable. The question arises as to how unused resources in a dedicated cluster can be exploited without affecting the time-critical application.

1Hardcore gamer is a colloquial term for a person who spends significant time and money on playing specific computer games such as 3D shooters.
2For ease of understanding, the size of the data units to be processed is assumed to be constant. The metric indicating the amount of data to be processed within a specific time frame is called "data rate".


A use case which belongs to this category and inspired this thesis is the cluster based processing of detector data during a particle physics experiment: The Conseil Européen pour la Recherche Nucléaire (CERN) is a research facility that operates a particle accelerator and collider, the Large Hadron Collider (LHC) [67]. The LHC is an underground-built ring of 27 kilometers in circumference. It accelerates protons and heavy ions to 99.9999991% of the speed of light and brings these particles to collision at specific interaction points in the ring. A Large Ion Collider Experiment (ALICE) [1] is an experiment located at one of these interaction points and is optimized for studying heavy ion collisions. Its main purpose is to explore the Quark-Gluon-Plasma (QGP) that is believed to form under conditions of high energy density and temperature. The ALICE experiment consists of multiple detectors and processing elements that gather and process collision data. One of these processing elements is the ALICE High Level Trigger (HLT). Basically, the HLT is a dedicated cluster that runs a time-critical application, the HLT-Chain [104]. The HLT-Chain is a distributed and hierarchical application for the evaluation, selection and compression of collision data at runtime of the experiment. According to specification, the HLT-Chain has to be capable of processing a maximum input data rate of 25 GByte/s. The cluster hosting this application ("HLT-Cluster") comprises about 200 commodity hardware nodes and 25 infrastructure servers. The resource usage in the HLT-Cluster varies: There are alternating run / no-run phases for the ALICE experiment. While collision data has to be processed during a run phase, no processing occurs during a no-run phase.3 A run phase can last from minutes to hours, a no-run phase can last from minutes to months.
The data rate can differ between separate run phases due to a modified filling scheme of the LHC and a modified trigger scheme of the ALICE experiment. During a single run phase, the data rate also changes due to decreasing beam quality ("luminosity"). Additionally, the HLT-Chain application may be reconfigured during a run phase. This leads to a spatial reallocation of HLT-Chain processes in the HLT-Cluster and consequently to a changed resource usage on the concerned cluster nodes. Hence, there is variation in the resource usage of the HLT-Cluster and temporarily unused resources exist.

Unused resources in a dedicated cluster can be exploited by running additional applications. While this can lead to an increase in cluster resource usage and result output, it also poses several challenges:

Software Incompatibility: An additional application requires a suitable Operating System (OS) configuration. If the application was compiled against a specific kernel type that is not compatible with the cluster configuration (Windows application / Linux cluster, 64bit application / 32bit OS), considerable administrative overhead is required to ensure that such an application can run in the cluster.4 Incompatibilities may also occur on another level: Different versions of shared libraries might be required by the time-critical and additional applications. Such a version conflict may prevent an application from being started or cause it to fail at runtime. Since this is unacceptable for the time-critical application, special care has to be taken when trying to install an additional application in a dedicated cluster.

Security Issues: During installation of an additional application, new software packages are merged into a dedicated environment. This may introduce security risks which can be exploited by attackers. This not only concerns intentionally designed malicious code, but any sufficiently complex software with application and user interfaces. Exploits may compromise the dedicated cluster setup and lead to disclosure of confidential data, degraded application performance or even a break in the functionality

3The cluster is used for occasional testing and calibration during a no-run phase.
4For instance, this could be achieved by having a dual-boot node configuration, enabling the running OS to be switched upon reboot of a node.

of the time-critical application.

Software Bugs: An additional application may suffer from software bugs. Infamous bugs are programming errors like memory leaks, synchronization faults (deadlocks, race-conditions) and unintentionally created infinite loops. What bugs have in common is that they cause an application to behave differently than expected or specified. Some bugs only emerge occasionally or with subtle effects, e.g. a faulty calendar mapping in a leap year or a periodically occurring minor computation error. Other bugs are more severe: An application may crash or even affect the OS environment as a whole. For example, the memory leak of an application may consume a great share of a node's memory, which eventually affects the performance of other running applications.5 The risk of introducing software bugs needs to be taken into account when running an additional application.

Competition for Resources: In a dedicated cluster, all resources (Central Processing Unit (CPU), network, memory etc.) are at the sole disposal of the time-critical application and dimensioned such that the timing constraint can be satisfied. By running an additional application, resources will be shared between the additional application and the time-critical application. If a certain resource is fully utilized (e.g., the CPU on a node), there will be competition for this resource between these applications. This may affect the performance of the time-critical application. Two scenarios exemplify this effect:

• Suppose the input data rate for the time-critical application is below the expectable maximum data rate. Then there probably are unused resources, e.g. idle CPU cycles on a node. If an additional application is now started on this node, CPU shares are attributed to this application in a fair manner, i.e. the CPU usage is split between the time-critical and the additional application. This might cause the additional application to receive more CPU cycles than were idle before, hence the additional application will steal CPU cycles from the time-critical application.6 Such stealing may affect the time-critical application by decreasing its performance. Therefore, the question is how to give an additional application only that share of resources that is unlikely to decrease the performance of the time-critical application.

• Suppose an additional application is already running in the cluster and only uses otherwise idle CPU cycles. Now the input data rate for the time-critical application suddenly increases. Then the amount of resources needed by the time-critical application also rises, i.e. the amount of resources safely usable by the additional application decreases. If the additional application does not release a share of its used resources, the time-critical application might suffer from decreasing performance. The question is how to dynamically adjust the resource share given to an additional application according to the changing needs of a time-critical application.
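The first question raised above — granting an additional application only a "safe" share of a resource — can be illustrated with a small calculation. This is an illustrative sketch only, not the thesis framework itself; the function name and the headroom parameter are hypothetical:

```python
def safe_secapp_share(capacity, mainapp_usage, headroom=0.1):
    """Return the CPU share a secondary application may use without
    encroaching on the time-critical (main) application.

    capacity      -- total CPU capacity of the node (1.0 = 100%)
    mainapp_usage -- CPU share currently used by the main application
    headroom      -- reserve kept free so the main application can
                     absorb a sudden increase in its input data rate
    """
    free = capacity - mainapp_usage - headroom
    return max(0.0, free)            # never grant a negative share

# Lightly loaded node: about half the node is safe to use.
safe_secapp_share(1.0, 0.4)          # ≈ 0.5 (with 10% headroom)
# Heavily loaded node: the grantable share drops to zero.
safe_secapp_share(1.0, 0.95)         # 0.0
```

The headroom term mirrors the scenario above: without it, a fair-share scheduler would let the secondary application absorb every idle cycle and then "steal" cycles the moment the main application's load rises.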

These challenges not only apply to the CPU usage on nodes, but also to other resources, e.g. the network. Additionally, there may be more than one additional application. This eventually raises the question of how the competition for resources between a time-critical application and multiple additional applications can be coordinated such that the time-critical application still meets its timing constraint.

5A memory leak forces the OS to make heavy use of swap space or to run its out-of-memory emergency routine. This decreases the performance of running applications.
6Fair-share scheduling is the default policy of today's Linux and Windows OS process schedulers. Both may be influenced by manually setting a process priority. This setting, nevertheless, is not sufficient for ensuring that a process does not use more than a certain desired ratio of CPU cycles. For more information, please refer to the Linux documentation [71].


This thesis presents the conceptual approach and implementation of a framework which addresses these challenges and makes it possible to run additional applications in a dedicated cluster alongside a time-critical application. By using platform virtualization for additional applications, a sufficient isolation between the time-critical application and additional applications is provided. Thus, security issues are avoided and the effects of software bugs on the time-critical application are eliminated. Platform virtualization also broadens the scope of runnable additional applications by providing a specifically configured environment for each application. This eliminates software incompatibilities and library version conflicts. The core challenge of this thesis is how to run additional applications without causing resource competition that leads to a violation of the timing constraint of the time-critical application. In this thesis, a concept is developed that provides a dynamic resource allocation for every Virtual Machine (VM) hosting an additional application: Temporary resource bottlenecks are anticipated and counteracted by allowing VMs to use resources only for a limited time frame once the total usage of a resource exceeds a certain value. The share of resources used by VMs is modified at runtime. The dynamic resource allocation for VMs according to the needs of a time-critical application is realized on two levels: A Global Scheduler residing on a management node decides on using virtualization product features like VM hot migration, suspend/resume and start/stop in order to prevent or resolve resource bottlenecks. A Local Controller residing on each worker node uses a feedback controller to dynamically adjust the share of resources like CPU and network usable by local VMs. The presented approach is distinctive and novel in combining two features.
First, the framework relies only on application-agnostic, OS-provided metrics and thereby is not tied to a specific time-critical application. Second, it combines multiple methods for adapting the resource usage of additional applications: A developed search heuristic chooses between different coarse-grained, globally initiated VM state transitions to adapt the resource usage at runtime. In parallel, a fine-grained, decentralized adaption of VM resource usage is carried out. The feasibility of the approach is shown for the HLT-Chain application. However, the applicability of the framework is not tied to the HLT-Chain context. Configuration items (policies) will be presented that allow the framework to be customized for dedicated clusters running other time-critical applications. By employing the presented framework it is possible to exploit unused resources and to increase the result output of dedicated clusters by running additional applications without affecting the time-critical application. This framework, though explicitly designed for dedicated clusters with a time-critical application, is not restricted to employment in such environments. The conceptual approach and implementation are also applicable to exploiting unused resources and increasing the result output of clusters running multiple time-critical applications or applications without timing constraints.
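The fine-grained, decentralized adaption mentioned above can be sketched as a single step of a proportional feedback controller. This is a minimal toy version under assumed parameters; the actual control law, gains and enforcement mechanism of the Local Controller are described in Chapter 4 and differ from this sketch:

```python
def adapt_vm_cap(current_cap, node_usage, target=0.9, gain=0.5):
    """One step of a proportional feedback controller for a VM's CPU cap.

    current_cap -- CPU share currently granted to the VM (0.0 .. 1.0)
    node_usage  -- measured total CPU usage of the node (0.0 .. 1.0)
    target      -- desired upper bound on total node usage (assumed value)
    gain        -- controller gain; larger values react faster (assumed value)
    """
    error = target - node_usage          # positive: slack, negative: overload
    new_cap = current_cap + gain * error
    return min(1.0, max(0.0, new_cap))   # clamp to a valid share

# Overloaded node (usage above the target): the VM's cap is reduced.
# Idle node: the cap is raised step by step toward the target.
```

Running such a step periodically on each worker node lets the VM's resource share track the time-critical application's load without any central coordination, which is the point of performing this adaption locally.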

The thesis is structured as follows: Chapter 2 states the goal of this work. Chapter 3 discusses the context of this thesis and portrays methods used for dynamic resource allocation in clusters. Furthermore, a review of competing approaches is given. In Chapter 4 the conceptual work of this thesis is described: The basic terminology is introduced, conceptual decisions are explained and the functionality of the decision-making infrastructure is laid out in detail. Chapter 5 gives insight into implementation details and user interfaces. Empirical results are presented in Chapter 6. These include performance tests with simulated workload and tests for the HLT-Chain application. In Chapter 7 the particular approach and the achievements of this thesis are summarized. Additionally, an outlook on future work is presented. Finally, the Appendix details several marginal aspects and provides a description of the abbreviations used.

2. Thesis Goal

Over-provisioning is a paradigm for dimensioning a computing facility according to a required peak performance. For a cluster that runs a time-critical application, this requires designing the cluster in such a manner that the timing constraint of the application will be satisfied for every specified data input. Such a cluster explicitly tailored to the needs of a specific application is called a dedicated cluster. If the input data rate for the time-critical application varies over time, then the dedicated cluster is likely to be underutilized at times of low data rates. This thesis aims at exploiting unused resources in a dedicated cluster by running additional applications. For this purpose, a software framework needs to be developed that realizes this task. The following conditions constrain the behaviour of this framework:

1. The proper functionality of the time-critical application must not be affected by additional ap- plications.

2. The framework cannot control the time-critical application. That is, the time-critical application is a black-box and the framework can only passively react to the behavior of this application.1

3. The framework improves the efficiency of the cluster such that it increases a) the cluster usage2 and b) the number of computed results per unit of time.

The provided framework has to fulfil the stated goal for the HLT-Chain application. Throughout this thesis, the term Main Application (MainApp) will be used to refer to the time-critical application, for example the HLT-Chain. An additional application that exploits unused resources via the provided framework is called a Secondary Application (SecApp).

1This excludes contrasting approaches that are not valid for the HLT-Chain use case, e.g. steering the time-critical application using its own proprietary Application Programming Interface (API) or using hypervisor commands, e.g. for migrating virtualized components of the time-critical application.
2The term cluster usage refers to the share of CPU cycles used in a cluster averaged over time and nodes.


3. Context and Related Work

The goal introduced in Chapter 2 explicitly demands running additional applications in a dedicated cluster. The underlying general question is how to give resource consumers, e.g. additional applications, access to cluster resources such that certain criteria are met. This is called a "resource allocation problem". For a time-bound resource allocation problem, which involves both the question of where (space dimension) and when (time dimension) to allocate a resource consumer, the term "scheduling problem" is commonly used. Both terms "resource allocation problem" and "scheduling problem" are used interchangeably in this thesis.

Scheduling is a very general concept and encompasses a large range of problems, use cases and approaches. This range spans from hardware scheduling and Operating System (OS) process scheduling over cluster job scheduling to capacity planning in manufacturing. No exhaustive classification1 is given; instead, the focus is on resource allocation problems arising in the thesis' context: Multiple, potentially distributed applications request access to resources in a computing cluster. Since an application is made up of at least one OS process, two levels of scheduling exist: Global scheduling deals with the problem of where and when to allocate (run) a process in the cluster. Local scheduling is a task of the OS process scheduler on the node to which the process was allocated.2 The OS process scheduler usually is part of the OS kernel and follows the time-sharing paradigm: Local processes are given time-slices of the CPU, and the length of each time-slice is repeatedly recalculated according to a predefined scheduling policy (e.g. fair-share according to priorities in Linux kernel 2.6 [71]). This thesis is concerned with scheduling problems on the global level. Nevertheless, the thesis goal may require the modification of the amount of resources given to an application's processes locally on a node. This, however, will not be achieved by patching or replacing the policies employed by a native OS process scheduler. Such an approach would limit the generality of the to-be-developed framework and put into question its usability in dedicated clusters. For instance, in the HLT-Chain use case such a modification is not permitted. A discussion of OS process scheduling techniques is therefore omitted in this chapter. When scheduling applications, a common strategy is to allocate the application's processes to cluster nodes once and leave them running until they finish their computing tasks.
In certain cases, this approach is not sufficient: For example, if a low-prioritized application fully utilizes the CPU resources of a node and a high-prioritized application requests to be run, then it might be appropriate to stop the low-prioritized application in favour of starting the high-prioritized application. In this thesis, the phrase "dynamic resource allocation" will be used when the scheduling of (distributed) applications not only involves an initial decision on when and where to run an application's processes, but also involves the usage of additional methods ("primitives") to modify the access to resources for already running applications. As motivated earlier in the introduction, a framework that strives to meet the thesis goal needs to possess the capability of "dynamic resource allocation": If the resource needs of a time-critical application change due to a varying data input rate, the resource usage of already running

1 For a comprehensive overview on scheduling, please refer to Pinedo [87].
2 The distinction between global and local scheduling was introduced by Casavant and Kuhl [20]. In this paper, a brief but well-founded taxonomy for scheduling in distributed systems can be found.

additional applications might need modification such that no or only limited competition for resources occurs. This might be necessary to avoid violating the timing constraint of the time-critical application.

In research, multiple approaches to dynamic resource allocation have been proposed, and different aspects have been prioritized. Existing research work differs in pursued goals, methods of data acquisition, applied decision logic and proposed modification primitives. In the first section of this chapter, these goals and methods are briefly discussed to allow the embedding of this thesis in the current state-of-the-art research. After having introduced the context in which this thesis is situated, several significant approaches are discussed in the second section of the chapter.

3.1. Research Overview

This section gives an overview on how researchers propose to automate dynamic resource allocation in a cluster environment. The overview especially focuses on the usage of platform virtualization to host applications.3 Virtualization is considered an enabling technology towards the goal of this thesis; the reasons for this are discussed later in Section 4.3.2. Approaches for both native and virtualized applications are therefore considered.

Computing Task    Schedulable Constituent    Finished Computation
Application       OS process                 Result
Job               OS process                 Finished job
Application       VM                         Result

Table 3.1.: Differing terms used for scheduling of generic applications, jobs and VMs.

For the following overview, the terminology needs to be split to avoid confusion. The term job is a commonly used abstraction for a schedulable computing task in distributed environments. A job in its most simple representation is a specific application that consists of at least one OS process that needs to be scheduled to a physical node in a cluster. A job has an (often unknown) duration during which it consumes resources.4 Once the job finishes, a useful result has been computed. As an example, a job can be the simple calculation of all prime numbers between 1000 and 2000, but also the weather forecast for the next weekend using a sophisticated atmospheric model. A job therefore is the computation of a single "result" by a specific application. With the introduction of the job term, there are now competing terminologies for the same concept. This is illustrated in Table 3.1. Both the first and the second row represent an application that consists of schedulable OS processes which upon finished computation yield a result. In contrast to the concept represented by rows one and two in this table, the third row represents virtualized applications. These also consist of OS processes, yet these processes are wrapped inside of VMs (Virtual Machines), each having a dedicated OS environment. Distributed virtualized applications therefore consist of a set of VMs that need to be scheduled to physical nodes.5 In this chapter, the term job will be used only for describing existing products that label their own approach as job scheduling; otherwise the term application is chosen. The term process will be used to denote the schedulable constituents of a non-virtualized application/job, and

3 Details on virtualization technologies and products are given in Section A.4.
4 For a comprehensive overview on the job nomenclature, its derivatives like parallel jobs, job-shops and flow-shops and the corresponding mathematical models, please refer to Pinedo [87]. For a more practical approach to "parallel machine models with parallel jobs" (i.e. scheduling of distributed applications in clusters), please refer to [40, 41].
5 More precisely, a VM on a physical node is served by a local hypervisor that realizes the virtualization semantics.

the term VM is used to denote the schedulable constituents of a virtualized application.

The task of dynamic resource allocation involves decisions on a) which physical nodes are eligible to run constituents of a distributed application and b) to what extent local resources can be utilized. In this context, multiple goals and methods have been examined in research. These are now described along the following dimensions:

1. What goals do researchers and products pursue?
2. What are the basic primitives proposed for changing the resource allocation?
3. What kind of information is incorporated to make a resource allocation decision?
4. How do researchers decide when and how to modify resource allocation?

3.1.1. Goals in Current Research

Three basic directions have been identified in current research concerning dynamic resource allocation in clusters: First, researchers and industry focus on formalizing and satisfying runtime constraints for single applications. Second, it is examined how clusters can be exploited most efficiently in terms of the number of serviced customers, energy consumption and costs. The third direction is how to provide cluster resources to a multitude of users and applications in a uniform and abstracted manner.

Application Runtime Constraints Applications typically possess specific run conditions under which they are considered fully functional, or vice versa: there are conditions which may keep an application from fulfilling its purpose. Examples are real-time video conferencing or multi-tier web servers: Users only tolerate a certain delay; a non-tolerated delay makes the application unusable from the user's point of view. The term Quality of Service (QoS) [57], for instance, refers to constraints for traffic in telecommunication networks. Such constraints have always been an issue but have received additional attention in research over the last years: The shared and concurrent usage of computing clusters by multiple applications and users led to the need a) to formalize runtime constraints for applications and b) to find ways to make sure these constraints are met at runtime. The first point has gained massive focus by industry in order to provide accountability. In analogy to the QoS terminology for telecommunications, the Service-Level Agreement (SLA) concept has been proposed in the Information Technology Infrastructure Library (ITIL) [58]. An SLA is an agreement between a service provider and a customer; it describes a service, its targets, and defines responsibilities. An SLA can be used to describe the specific service "to host an application in a cluster". Part of the SLA are the Service-Level Targets (SLT) that describe the desired qualities of a service quantitatively. For this purpose the SLT consist of multiple Key Performance Indicators (KPI). An example KPI is "maximum response time of a web server per user request". The SLT therefore state under which circumstances an application is considered to be fully functional. While such targets enable legal treatment of software run conditions, they do little to describe how these goals can be met by a service provider.
In ITIL, the term Operational-Level Agreement (OLA) is proposed for guidelines and agreements used within the Information Technology (IT) department of the service provider. Such OLAs comprise low-level technical goals that, when pursued and achieved, will lead to meeting the SLT. For instance, if for a web server the maximum response time per user request is specified to be 5 s, a potential low-level goal to be pursued by cluster operators is to keep the CPU utilization under 80% on the physical nodes hosting the web server. SLT are not necessarily specified

using technical, IT-related terms but can also be composed using some abstract KPI like customer satisfaction. As of today, there is no agreed-upon general technique to find and specify OLAs for given SLT. Some researchers try to model the dependency between high-level goals (SLT) like response time, customer satisfaction or mean transaction rate and low-level goals (OLAs) like CPU usage mathematically [111, 133], using feedback control6 [134], or based on historical experience and feasibility [52]. Others [46, 126, 134, 26, 8, 54] only monitor low-level properties like CPU usage and memory consumption of a physical machine. If these properties of a physical machine meet certain criteria (e.g. average CPU usage over the last 5 minutes < 80%), hosted applications are considered to perform correctly. Such an approach is criticized by researchers [14, 115, 84, 132] for being too general. They propose to consider additional application-specific metrics, like errors in logfiles, when trying to meet SLT requirements.
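A minimal Python sketch of such a purely low-level criterion (class and parameter names are illustrative, not taken from any of the cited systems):

```python
from collections import deque

class NodeHealthCheck:
    """Rolling-window check of one low-level metric, e.g. node CPU usage.

    Hosted applications are *assumed* to perform correctly while the
    average usage over the window stays below the threshold -- exactly
    the coarse criterion criticized above as too general."""

    def __init__(self, threshold=0.80, window=5):
        self.threshold = threshold           # e.g. 80% CPU usage
        self.samples = deque(maxlen=window)  # e.g. one sample per minute

    def record(self, cpu_usage):
        self.samples.append(cpu_usage)

    def healthy(self):
        if not self.samples:
            return True                      # no data yet: assume compliant
        return sum(self.samples) / len(self.samples) < self.threshold

check = NodeHealthCheck(threshold=0.80, window=5)
for usage in (0.60, 0.70, 0.75, 0.95, 0.90):
    check.record(usage)
# average over the window is 0.78, i.e. still below the 0.80 threshold
```

An application-specific check in the spirit of [14, 115, 84, 132] would add further metrics, e.g. error counts from logfiles, to the same decision.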

Consolidation and Efficiency While the research work mentioned in the last paragraph deals with satisfying constraints and requirements of a single application, there is also the perspective of the service provider, e.g. the IT department running and providing the cluster environment for a multitude of applications. While any service provider wants to make sure that each customer is satisfied with the delivered service, it also aims at minimizing its own total costs or increasing revenue. This can be accomplished by e.g. reducing the number of physical machines that need to run, reduced power consumption or maximized resource exploitation. Again ITIL delivers a term to describe this aspect: Such goals are subsumed under the term Business-Level Objective (BLO). Researchers often cover both application SLT and provider BLOs and try to find a trade-off between these potentially contradicting objectives. Some researchers specifically focus on server consolidation, trying to minimize the number of physical machines needed to serve applications [46, 14, 116, 102, 65, 135]. This typically is attempted by packing VMs or applications as densely as possible onto physical machines. Others explicitly claim to achieve lower power consumption [118, 55, 89]. In contrast to cutting costs by using fewer servers or consuming less power, classical job schedulers like Load Sharing Facility (LSF) [88], Oracle Grid Engine (OGE) [81], Torque [108] or Maui [61] rather aim at improving the efficiency of a cluster by improving aggregated performance metrics: For instance, the "throughput" of jobs describes the number of finished jobs over time, the "walltime" (also: "turnaround") is the average time a job takes until its computation is finished, and the "queuetime" is the average time a job has to wait until being given computing resources. Another goal often pursued is to use existing CPU resources as optimally as possible, i.e. to strive for a high cluster CPU utilization.
Reducing power consumption, computing more results over time, increasing the exploitation of available resources, but also offering satisfactory service to many applications at the same time are thus accepted goals pursued by many researchers.
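The aggregated performance metrics named above can be computed from per-job timestamps; a small sketch (the tuple layout is an assumption for illustration, and "walltime" is measured here from start to finish):

```python
def cluster_metrics(jobs):
    """Aggregate scheduler metrics; each job is a (submit, start, finish)
    tuple of timestamps in arbitrary time units."""
    walltimes  = [finish - start  for submit, start, finish in jobs]
    queuetimes = [start  - submit for submit, start, finish in jobs]
    # observation window: from first submission to last completion
    makespan = max(f for _, _, f in jobs) - min(s for s, _, _ in jobs)
    return {
        "throughput": len(jobs) / makespan,        # finished jobs per time unit
        "walltime":   sum(walltimes)  / len(jobs), # average turnaround
        "queuetime":  sum(queuetimes) / len(jobs), # average wait for resources
    }

m = cluster_metrics([(0, 1, 5), (0, 2, 6), (1, 4, 9)])
```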

Encapsulation and Interfacing Applications running in distributed environments often have specific runtime constraints, e.g. SLT. Sometimes applications are distributed, have a multi-tier architecture or are based on parallel code (e.g. Message Passing Interface (MPI) applications). Between the constituents of a distributed application (processes/VMs) there may be interdependencies. The number of constituents may be variable for a single application.7 Such properties and other relevant

6 Control theory describes how a dynamical system, i.e. a system whose state evolves over time, reacts to input values. A closed-loop feedback controller uses sensor data, i.e. a system's output values, to continually adjust the input values for the system. A common goal of using a feedback controller is to make a system reach a specific target state. For more information, please refer to Franklin et al. [42].
7 In job scheduling the options of moldability and malleability exist. While the former means that the number of processes can freely be chosen upon starting of a job, the latter even allows for manipulating the number of processes for a job at

meta-information, e.g. the amount of memory an eligible physical node has to possess, the number of CPU cores needed by processes/VMs or which OS environment is required, need to be specified and taken into account when making decisions on when and where to run applications. In job scheduling, such meta-information is manually coded in job description files that are made available to a decision instance, the job scheduler. For a user who wants to run a job, an interface for submitting the job along with its description file exists. Upon submission, the job scheduler takes full responsibility to schedule the job according to the requirements stated in the description file. In general, after submitting the job no further possibilities exist for the user to influence the scheduling of the job.8 This approach is also extended to multi-cluster environments with intermediate decision instances (resource brokers) that decide in which local cluster a job is executed. Job scheduling with all its existing products (LSF, OGE, Torque, Maui, Condor) is a mature, highly configurable and fully established technology for running distributed applications in clusters. This is still different for the scheduling of virtualized applications, which, however, with the dawn of cloud computing9 now draws a lot of attention. Scheduling of virtualized applications is still an evolving technology, and the question of how to interface and instrument virtualized environments and applications is still under debate. While for plain job scheduling only a simple description file is needed and appropriate physical resources have to be found before a job is run, running a virtualized application requires the existence of properly configured VMs and environments. This means that not only meta-information on the application is needed, but also a description of the VMs, a configured virtual environment and properly set up VMs are required.
Research and product development have heavily focused on how to make distributed resources available to arbitrary applications in a virtualized manner. This spans:

• Tools like rPath [93] and Cloudzoom [28] for obtaining, creating and configuring specific VM images ("appliances").

• Unified, abstracted interfaces and APIs like libvirt [70] for managing virtual environments independently of the underlying virtualization technology and product.

• Cloud platforms providing standardized interfaces to create, operate and manage arbitrary virtualized environments (Amazon EC2 [4], Amazon S3 [5], OpenStack [86] and the Oracle Management API [80]). These solutions offer APIs or a Graphical User Interface (GUI) for setting up and managing a complete virtual infrastructure including VMs, storage and Virtual Local Area Networks (VLANs). However, management of the environment is still delegated to the user of these APIs. Basic automated resource allocation, like the initial allocation of VMs to physical nodes, is provided using a round-robin, first-fit algorithm10 for Amazon EC2.

• OpenNebula [101], Enomalism [37] and Eucalyptus [38] are software frameworks that instrument the above-mentioned or proprietary APIs and establish a convenience layer on top of these APIs for easing the interaction with cloud environments. For instance, they incorporate appliance building and VM life-cycle management, and add enhanced user security (authentication, authorization), monitoring or high-availability options. OpenNebula allows to define sets of VMs that form functional units, thereby starting either all or none of the respective VMs. Automated resource allocation for VMs does not go beyond choosing an initial placement based on round-robin, first-fit algorithms. However, the need to make resource allocation for VMs more adaptive and flexible is acknowledged by offering to plug in external VM scheduling algorithms.

runtime.
8 Job schedulers like Condor [72] also allow for pausing a job. All job schedulers allow to cancel a submitted job.
9 An overview on cloud computing and the Everything as a Service (EaaS) stack can be found in Section A.3.
10 Round-robin is a basic scheduling algorithm that assigns a resource to multiple resource consumers in a circular order, thereby implicitly giving each consumer the same priority. Each resource consumer receives the same portion of the resource. First-fit is a scheduling algorithm that assigns the first appropriate resource (found among multiple resources) to a resource consumer. A round-robin, first-fit approach in the respective context means that VMs are processed in a circular order and the first eligible node in the cluster is chosen as a host for a VM.
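The round-robin, first-fit placement described in footnote 10 can be sketched as follows; capacities and demands in abstract resource units are an assumption for illustration:

```python
def round_robin_first_fit(vms, free):
    """Initial VM placement: VMs are processed in order and each one is
    assigned to the first node with enough spare capacity ('free' maps
    node name to remaining capacity and is updated in place)."""
    placement = {}
    for vm, demand in vms:
        for node in free:                 # scan nodes in a fixed order
            if free[node] >= demand:      # first fit
                free[node] -= demand
                placement[vm] = node
                break
        else:
            placement[vm] = None          # no eligible node found
    return placement

free = {"node1": 4, "node2": 4}
vms = [("vm-a", 3), ("vm-b", 2), ("vm-c", 2), ("vm-d", 4)]
placement = round_robin_first_fit(vms, free)
# vm-a -> node1, vm-b and vm-c -> node2, vm-d cannot be placed
```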

The need to treat a set of VMs hosting a single distributed application as a functional unit and to automate resource allocation has first been addressed by Chase et al. [22] and Irwin et al. [49]. While Chase et al. only aim at modifying a set of VMs as a whole, Irwin et al. wrap a set of VMs with additional meta-information, calling it a lease. Such a lease belongs to a user, has a life-cycle and associated timing constraints. Researchers build on this early work and acknowledge the need to treat sets of VMs as functional units, each representing a single virtualized application. A lot of research effort has since gone into the evaluation of methods for dynamically manipulating resource allocation in distributed environments, especially for virtualized applications. Several such approaches are discussed in the next section.
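A lease in the sense of Irwin et al. might be modelled roughly as follows; the field names and life-cycle states are illustrative, not their actual schema:

```python
from dataclasses import dataclass

@dataclass
class Lease:
    """A set of VMs treated as one functional unit, wrapped with
    meta-information: owner, life-cycle state and timing constraints."""
    owner: str
    vms: list                       # identifiers of the VMs forming the unit
    not_before: float = 0.0         # earliest permitted start
    deadline: float = float("inf")  # latest permitted completion
    state: str = "pending"          # pending -> active -> done

    def activate(self, now):
        # all-or-nothing: the whole set of VMs is started together
        if self.state == "pending" and self.not_before <= now <= self.deadline:
            self.state = "active"
        return self.state == "active"

lease = Lease(owner="alice", vms=["vm1", "vm2"], not_before=10.0)
lease.activate(now=5.0)    # too early: the lease stays pending
lease.activate(now=12.0)   # within its window: the lease becomes active
```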

3.1.2. Resource Allocation Primitives

The basic operations ("primitives") for resource allocation in job scheduling are the starting and stopping of jobs on physical nodes. While preempting processes, i.e. pausing single processes for a limited time frame, is an important operation used by process schedulers in operating systems, it is less common in distributed computing, where gang scheduling prevails. Some job schedulers like Condor [72] offer the option to preempt jobs, but strictly limit its usage to jobs with single, non-communicating, Input/Output (I/O) restricted processes [31]. VMs offer more flexible methods to manipulate an application's resource allocation. This means that besides initially allocating and running VMs on physical nodes until their computing tasks are finished, additional modifications at runtime are possible. Researchers currently focus on using such modifications in order to meet goals like server consolidation, increased resource usage or SLT compliance. The following primitives have been proposed:

Run-to-Completion The basic paradigm to run an application is run-to-completion. In its initial context this means that a job/application is allocated to resources, put in the running state and kept running until its computing task completes. This approach is adopted by OpenNebula for VMs and by Maui, OGE and LSF for native applications. Such a basic strategy lacks the flexibility to adapt to changing resource availability.

Stopping and Restarting Stopping an application and restarting it at a later point in time is a very basic way to modify the resource usage in a cluster. For their In-Vigo system, Xu et al. [130] propose stopping VMs and restarting them on different physical machines. They regard this as a mechanism to cope with QoS violations, specifically deadline violations. Unless partial results are saved prior to stopping a VM or job, this method has the drawback that unsaved computational state is lost and computations need to be repeated upon restart.

Preemption This downside can be avoided if preemption is used instead of stopping/restarting jobs or applications. Preemption basically means that a computing task is temporarily stopped from consuming resources. Its computational state is preserved until the task is resumed at a later stage. The base technology for preempting distributed applications is called checkpointing. Checkpointing saves the state of an application to disk such that it can be continued at this very checkpoint at a later point in time. However, checkpointing does not necessarily stop an application from running and

consuming resources. Commonly, this technology is used to save specific computational milestones that can later be rolled back to, e.g. in case of failures. Still, this base technology can also be used to manipulate the resource consumption of a running application, via checkpointing and successively stopping this application. There are multiple techniques to implement checkpointing for native, i.e. non-virtualized applications: user-, kernel- and application-level checkpointing.11 Each of these has specific drawbacks, which eventually prevent the wide adoption of this technology in job scheduling. The most important ones are synchronization issues for distributed applications and the difficulty to guarantee that a later continuation occurs in a comparable, i.e. non-differing OS and kernel environment [9, 19]. Therefore, preemption for applications is applied by job schedulers only with strict limitations [31, 88] or if applications make use of specific checkpointing libraries [72], hence severely limiting the scope of runnable applications. For VMs, preemption ("suspend/resume") is a natively supported operation12 which causes few problems because not only the application state is saved, but also its whole operating system environment including the state of I/O transactions and inter-process communication. Nevertheless, a problem that also exists for VM preemption is the synchronization between multiple to-be-suspended VMs, e.g. hosting an MPI application. For this problem, Walters et al. [121] propose a VM-aware MPI library which allows the parallel suspending of multiple VMs without breaking the application's functionality. Agarwal et al. [2] introduce a distributed checkpointing library and Emeneker et al. [36] propose to synchronize clocks before preempting a set of VMs. Despite its native support by all virtualization products, VM preemption is not often proposed in research concerning the dynamic allocation of resources for VMs.
A possible explanation is that the overhead for the suspend/resume operations is still not sufficiently accounted for and coped with in current research. Also, suspending an interactive job or application will break its execution semantics, thereby only allowing best-effort jobs to be suspended. Zhao et al. [132] mention the usage of the suspend/resume feature but do not specifically propose when and how to make use of it. Sotomayor et al. [100] discuss the overhead involved with the suspending and resuming of VMs. Walters et al. [120] propose to use preemption of lower-prioritized VMs in favour of other VMs hosting interactive applications.
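The semantic difference between the stop/restart and suspend/resume primitives can be illustrated with a toy model (state handling only; a real hypervisor persists the full machine state):

```python
class ToyVM:
    """Toy model of the two primitives: stopping discards unsaved
    computational state, suspend/resume preserves it."""

    def __init__(self):
        self.progress = 0        # unsaved computational state
        self.running = True

    def compute(self, steps):
        if self.running:
            self.progress += steps

    def stop(self):              # stop/restart: state is lost
        self.running = False
        self.progress = 0

    def start(self):
        self.running = True

    def suspend(self):           # preemption: state is preserved
        self.running = False

    def resume(self):
        self.running = True

a, b = ToyVM(), ToyVM()
a.compute(100); b.compute(100)
a.stop(); a.start(); a.compute(10)       # restarted VM recomputes from zero
b.suspend(); b.resume(); b.compute(10)   # resumed VM continues at 100
```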

Migration Migration is an operation that changes the allocation of a resource consumer to a certain resource. For native applications this means moving a process from one physical node to another; for virtualized applications this means moving a VM accordingly. The purpose of using this operation depends on the goal pursued by the researcher. For instance, it can be used to free resources on a specific physical node, but also to improve application performance by moving a process or VM to a physical node with more computing power. A distinction between cold, warm and hot migration can be made. Cold migration refers to the fact that a process or VM that previously had been running on a certain physical node is stopped and restarted on a different physical node. Accordingly, warm migration means that a suspended single process or VM is resumed on a different physical node. The most interesting feature is hot migration. Using this operation, a process or VM can be moved from one physical node to another at runtime of the process or VM. In most cases, the migration act is transparent towards users or other processes communicating with the migrated process or VM [75, 27, 119]. Hot migration is a powerful tool for achieving different goals like server consolidation or application SLT compliance. Cold migration of jobs is a basic feature that job

11 User-level checkpointing requires an application to be compiled/linked with checkpointing library support. Application-level checkpointing requires an application to self-provide the mechanisms for saving its state. Kernel-level checkpointing makes use of the preemption techniques of the OS process scheduler, but requires additional kernel patches or modules. For more information, please refer to [19].
12 The naming of this feature differs for the specific VM products. See Table A.2 for nomenclature details. In this thesis, checkpointing to disk, stopping and later continuation of a VM is referred to as suspend/resume.

schedulers make use of. Warm migration requires previous checkpointing and therefore is only used by job schedulers that utilize the checkpointing feature (Condor, LSF). Hot migration is not used in job scheduling. Cold and warm migration is supported by all virtualization platforms (see Section A.4 for an overview on virtualization platforms). Hot migration is also offered by most vendors, but sometimes (VMware ESX [123]) fully supported only in commercial distributions or with respective licenses. The migration feature is the subject of extensive current studies. Xu et al. [130] propose cold migration: They suggest to stop VMs which violate a certain deadline three times in a row and to restart them somewhere else. They do not make remarks on when and on which physical machines these VMs should be restarted. Zhao et al. [132] also try to meet application deadlines by using warm migration. Sotomayor et al. [100] mention the benefit of warm migration for meeting time constraints of virtualized applications. Hot migration of VMs is focused on by a lot of researchers, e.g. for resolving hotspots [114, 126, 8], load balancing [94], consolidation [54, 55, 118, 102, 65, 47, 89] and SLT compliance [26, 98]. Others examine general properties and benefits of VM hot migration [111, 23]. The overhead of hot migration for VMs is explicitly tackled by Akoush et al. [3] and Voorsluys et al. [119]. From now on, the single term "migration" will be used to refer to VM hot migration.

Global Resizing A method for scaling an application at runtime, e.g. in order to meet SLT, is to modify the number of its processing components. In this thesis this approach is called "global resizing". PCM [73] and Dynaco [17] are initiatives that offer an API to MPI applications by which these can dynamically spawn new processes on new physical nodes if resources are available. Accordingly, processes can also be eliminated at runtime without affecting the functionality of the application.13 APM [56] offers such a dynamic adaption in Symmetric Multiprocessing (SMP) environments.14 Applications have to support this feature explicitly by either instrumenting a specific API or self-providing the methods needed for such functionality, like dynamic discovery, registration and integration of new processes. A similar approach called clustering is used by application servers like JBoss [60] and Websphere [122]. Applications hosted in such runtime environments are able to utilize additional application server instances such that application performance can be increased. Again, such applications need to make use of APIs and programming constructs (e.g. Enterprise Beans) provided by the application servers in order to benefit from this feature. Common job schedulers are also able to incorporate new physical nodes at runtime. However, these are then only available to newly started jobs; already running ones cannot profit from this modification. In the context of virtualization, adding (and removing) VMs to (from) a set of VMs hosting a specific application is no technical problem. However, in this case the same requirement exists: The hosted application has to support this up/down-scaling. Chase et al. [22] propose to modify the number of VMs dedicated to a specific application. In their work, VMs host a batch scheduling client (LSF) and the number of clients, i.e. the number of VMs, is dynamically adapted based on a metric provided by the job scheduling server (LSF).
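A resizing rule in the spirit of Chase et al., deriving the number of worker VMs from a metric reported by the scheduling server, might be sketched as follows (the metric choice and all parameters are illustrative):

```python
import math

def target_vm_count(queue_length, jobs_per_vm=4, min_vms=1, max_vms=32):
    """Derive the desired number of worker VMs from the scheduler's
    queue length, bounded by a minimum and maximum pool size."""
    wanted = math.ceil(queue_length / jobs_per_vm)
    return max(min_vms, min(max_vms, wanted))

def resize(current_vms, queue_length):
    """Number of VMs to add (positive) or remove (negative)."""
    return target_vm_count(queue_length) - current_vms

resize(current_vms=3, queue_length=20)   # 5 VMs wanted: grow by 2
resize(current_vms=3, queue_length=0)    # floor at min_vms: shrink by 2
```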

Local Resizing Contrary to global resizing, which means modifying the number of processing units for a distributed application, local resizing modifies the share of resources given to a single process or VM on a node. A typical example is the "nice" command in Linux used for manipulating

13 This means the application will not crash. It may, however, have an impact on the performance, i.e. the SLT compliance of the application.
14 SMP refers to a paradigm in computer architecture where multiple processing units equally access a common shared memory. A typical example is a CPU with multiple cores which share a common memory space.

the scheduling priority of a process at runtime. Such a change of scheduling priority may affect the share of CPU cycles given to that process. A comparable command is "chrt", which allows a runtime change of the scheduling policy and real-time priority the process scheduler applies to a specific OS process.15 The command "ionice" can be used to adapt the I/O scheduling priority of a process at runtime. Other Linux tools exist (iptables, niceload) that can be used to manipulate the amount of locally consumed resources of a single process. Common job schedulers (Maui, Condor, LSF, OGE, Torque) do not use such methods for modifying the resource share of running jobs. The virtualization platform XEN [11] offers runtime adaption of the memory size, the number of virtual CPU cores and the maximum share of CPU cycles given to single VMs. VMware ESX offers runtime adaption of VM memory size and number of virtual CPU cores. A runtime modification of the number of CPU cores attributed to a VM is proposed in several works [89, 133, 94, 64]. Padala et al. [84] use the "cap" parameter to manipulate the CPU share utilized by XEN VMs. They also change the disk I/O bandwidth available to a running VM using a modified XEN I/O driver.
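Such local resizing of a VM's CPU share, e.g. via XEN's cap, is often driven by a closed feedback loop (cf. footnote 6); a proportional controller in its simplest form might look as follows (gain, target and bounds are purely illustrative):

```python
def adjust_cap(cap, measured, target=0.80, gain=0.5, lo=0.05, hi=1.0):
    """One controller step: lower the VM's CPU cap while measured node
    usage exceeds the target, raise it again when headroom appears."""
    error = target - measured        # negative error: node is too busy
    cap += gain * error              # proportional correction
    return min(hi, max(lo, cap))     # clamp to the valid cap range

cap = 0.50
for measured in (0.95, 0.90, 0.70, 0.60):  # sampled node CPU usage
    cap = adjust_cap(cap, measured)        # cap shrinks, then recovers
```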

3.1.3. Data Used for Decision Making

Different primitives that researchers propose for dynamic resource allocation have been presented. However, it has not been mentioned yet how researchers make use of these primitives in order to achieve their goals. This question is split in two parts: First, it is examined what data is used by researchers to make decisions on dynamic resource allocation. Second, there is the question of how this information is incorporated and which algorithms and techniques are proposed for making resource allocation decisions. The first aspect, the type of data used for decision making, will be discussed next.

Static Data Available resources in the cluster and the requirements of jobs/VMs need to be considered when making resource allocation decisions. There are several "static" properties of a distributed environment which commonly do not change in the short run, like the number of physical machines available, their hard- and software specification (Random Access Memory (RAM), CPU, Instruction Set Architecture (ISA), OS) and the network topology. For a job, the description file offers the possibility to specify e.g. the memory and number of CPU cores needed by a job's processes, the swap space a physical node has to possess and which OS environment is required to run the job's processes. Accordingly, a VM also has specific configuration properties like its number of virtual CPU cores, amount of RAM and desired network configuration. A resource allocation decision has to match such requirements with the available cluster resources. For properties like the number of CPU cores and the amount of memory, a cumulative match-making is sensible, taking into account the amount of resources already attributed to other jobs or VMs. The XEN virtualization platform is an exception because it explicitly allows over-provisioning of CPU cores and RAM ("memory ballooning"). Most scheduling approaches take such static information into account. Regarding network topology, common job schedulers assume a homogeneous network such that full bisectional bandwidth is given for every node-to-node communication. However, this assumption does not always hold in real-life environments and may lead to decreased performance for distributed jobs. A network topology-aware scheduling algorithm is therefore proposed by Pascual et al. [85] using locality-aware policies. Meng et al. [74] consider network topology in the virtualization context.
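The cumulative match-making described above might be sketched as follows (the property names are illustrative):

```python
def eligible_nodes(vm, nodes, allocations):
    """Static match-making: a node qualifies if its remaining, i.e. not
    yet attributed, cores and RAM cover the VM's configuration and the
    required OS environment matches."""
    result = []
    for name, spec in nodes.items():
        used_cores = sum(v["cores"] for v in allocations.get(name, []))
        used_ram   = sum(v["ram"]   for v in allocations.get(name, []))
        if (spec["os"] == vm["os"]
                and spec["cores"] - used_cores >= vm["cores"]
                and spec["ram"]   - used_ram   >= vm["ram"]):
            result.append(name)
    return result

nodes = {"n1": {"cores": 8, "ram": 16, "os": "linux"},
         "n2": {"cores": 4, "ram": 8,  "os": "linux"}}
allocations = {"n1": [{"cores": 6, "ram": 8}]}   # already attributed to VMs
vm = {"cores": 4, "ram": 4, "os": "linux"}
candidates = eligible_nodes(vm, nodes, allocations)  # only n2 still fits
```

A XEN-style over-provisioning policy would relax these inequalities instead of enforcing them strictly.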

15The "nice" setting affects the length of a timeslice given to a process in the SCHED_OTHER fair-scheduling policy. The "chrt" command allows to change the scheduling policy for a process, e.g. from the common SCHED_OTHER fair- share scheduling to SCHED_FIFO real-time scheduling. "chrt" also allows to manipulate the static real-time priority of a process at runtime. For more information, please refer to the Linux documentation [71].

25 3. Context and Related Work

Dynamic Physical Node Data Static properties like the number of CPU cores offered by a physical node and required by jobs or VMs are not subject to unpredictable fluctuations. In stark contrast to such properties there are resource usage metrics, like the CPU usage on a node, that may exhibit volatile behavior, thus having an impact on the achievement of the resource allocation goals. Such "dynamic" data needs to be retrieved and evaluated at runtime. The current CPU usage or load16 of a physical machine is a criterion widely adopted by researchers for deciding where to allocate and run processes or VMs. Job schedulers like LSF, Condor and Maui allow thresholds to be defined on current CPU usage and load to specify which physical nodes are eligible for running additional jobs. LSF and Condor also use such thresholds for decisions on the preemption or stopping of jobs. In the context of VM scheduling, Wood et al. [126] use CPU load to detect physical nodes with CPU bottlenecks. Tesauro et al. [109] and Choi et al. [26] use current CPU usage. Gmach et al. [47], Zhu et al. [135] and Rolia et al. [92] use historical trace data about a physical node’s CPU usage and extrapolate it to predict future CPU "overload". Other physical node metrics are also considered. Condor and LSF offer to incorporate the current paging rate, currently available free RAM, available swap space and even the current disk I/O rate. For VM scheduling, the currently free memory on physical machines is considered by Xu et al. [130], Meng et al. [74] and Sonnek et al. [98]. The latter also take into account the current network traffic generated on a network interface of a physical machine. The disk I/O for a specific hard disk is used by Padala et al. [84]. Stoess et al. [106] use a physical node’s thermal sensor data to indicate thermal overload which needs to be resolved.
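
A threshold-based eligibility check of the kind offered by LSF and Condor might look as follows. This is a hedged sketch: the metric names, threshold values and the `eligible` helper are illustrative and do not correspond to the actual configuration syntax of either scheduler:

```python
# Illustrative sketch of threshold-based node eligibility: a node may
# receive additional work only while every monitored dynamic metric
# stays below its configured threshold. All values are invented.

THRESHOLDS = {"cpu_usage": 0.80,   # fraction of CPU cycles in use
              "load_avg": 6.0,     # run-queue length moving average
              "paging_rate": 500}  # pages per second

def eligible(node_metrics, thresholds=THRESHOLDS):
    """A node is eligible for additional jobs only if every monitored
    metric is below its threshold."""
    return all(node_metrics[m] < limit for m, limit in thresholds.items())

print(eligible({"cpu_usage": 0.35, "load_avg": 2.1, "paging_rate": 40}))  # True
print(eligible({"cpu_usage": 0.92, "load_avg": 2.1, "paging_rate": 40}))  # False
```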

Dynamic Process/VM Data Dynamic data for a physical node is queried using OS interfaces and may give hints whether a physical machine is overloaded or suited for hosting additional processes/VMs. More specific information on already running processes or VMs seems helpful to some researchers. Job schedulers do not take into account runtime metrics for single processes, e.g. the CPU usage caused by a process. While job schedulers allow the specification of constraints for jobs that determine when a job can be run or has to be stopped, such a specification is always based on metrics gathered for the whole physical node. In the context of VM scheduling, researchers propose to utilize VM-specific runtime information. Wood et al. [126] introduce the distinction between Black-Box and Grey-Box resource management. According to them, Black-Box resource management only uses VM-specific information that can be gathered via the physical node OS or the virtualization hypervisor.17 In contrast, Grey-Box management relies on data captured from the virtualized OS or application. While the latter allows application-specific metrics to be incorporated and can therefore be used to tightly control SLT compliance, the former has the advantage of being application-agnostic and is therefore suitable in general scenarios.

Black-Box Data Wood et al. [126] capture CPU load, swap activity and network traffic for every single VM. The CPU usage of single VMs is taken into account by several researchers [133, 89, 47, 111, 14, 8, 55, 65, 102, 54]. Meng et al. [74] and Sonnek et al. [98] measure network statistics for every VM, whereas Padala et al. [84] take into account not only overall disk I/O but also the disk I/O specifically generated by single VMs. Wood et al. [127] make an interesting proposal for improving resource allocation decisions. These researchers claim that many VMs have similar contents in the physical node’s RAM. If supported by the hypervisor, VMs with identical pages can be placed together on a physical node and those pages can be shared. Another paper by Verma et al. [118] discusses caching issues for VMs. They argue

16While CPU usage commonly refers to the shares of CPU cycles used by processes over a time interval, CPU load in most cases is the moving average for the CPU queue length over a time interval. 17Please see Section A.4 for details on virtualization technology.

26 3.1. Research Overview

that VMs dealing with data structures that fit into the L2 cache of a physical machine’s CPU are more efficient than those dealing with data structures not fitting into this cache. They propose to place VMs together on a physical node if the summed size of their data structures does not exceed the cache’s size. This latter proposal is not an explicit Black-Box approach because it makes certain assumptions about the virtualized applications and the nature of the virtualization technology used. However, in comparison to the following Grey-Box approaches, the researchers mentioned so far do not rely on monitoring the virtualized OS and applications.

Grey-Box Data Chase et al. [22] and Irwin et al. [64] propose to write application managers which process application log-files to decide upon the necessity of allocating additional resources. Meng et al. [74] propose to measure which VMs in a distributed environment communicate with each other in order to place them according to the physical network topology. They show that network bottlenecks can be circumvented. A similar approach taking into account current network traffic is chosen by Sonnek et al. [98]. They measure network traffic for VMs and place VMs with high mutual communication traffic together on a physical machine. As mentioned (see Section 3.1.1), meeting application SLT is one of the major goals in current research. A natural approach to achieving this goal is using application feedback. While adaptive control using application feedback is well established in engineering, in low-level network protocols and for specific applications like video conferencing systems and load-balanced web servers, it is less common for general-purpose resource allocation frameworks like job and VM schedulers. This is due to one reason: Since every application has its own distinct SLT, a general way of incorporating arbitrary, non-standardized application-level data into a decision logic is tedious if not impossible. Hasselmeyer et al. [52] suggest that a human expert, based on historically retrieved measurements, manually translates a desired web server throughput to low-level parameters like the number of web servers needed. Zhikui et al. [133], Turner et al. [111], Bobroff et al. [14] and Van et al. [115] propose to use the response times of a virtualized application to determine appropriate VM allocations and properties. The additional usage of request rates is proposed by Wood et al. [126] and Van et al. [116]. Padala et al. [84] argue that response time and request rate are not sufficient and claim a benefit from measuring application throughput over time. A similar argument is provided by Xiong et al. [129], who incorporate the variance and percentiles of the response time as well as the throughput rate.
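
The traffic-aware co-placement idea can be illustrated with a small sketch. The greedy pairing below is a simplification invented for this example and is not the actual algorithm of Meng et al. or Sonnek et al.; the traffic figures are likewise made up:

```python
# Rough sketch of traffic-aware co-placement: VM pairs with the highest
# measured mutual traffic are greedily placed on the same physical node,
# so that heavy communication stays off the inter-node network.

def coplace(traffic):
    """traffic: {(vm_a, vm_b): bytes_per_second}. Returns a list of node
    assignments, pairing the heaviest communicators first."""
    nodes, assigned = [], set()
    # Consider the heaviest-communicating pairs first.
    for (a, b), _ in sorted(traffic.items(), key=lambda kv: -kv[1]):
        if a not in assigned and b not in assigned:
            nodes.append([a, b])
            assigned.update((a, b))
    # Any leftover VM gets its own node.
    vms = {v for pair in traffic for v in pair}
    nodes += [[v] for v in vms - assigned]
    return nodes

traffic = {("vm1", "vm2"): 900, ("vm1", "vm3"): 50, ("vm3", "vm4"): 700}
print(coplace(traffic))  # vm1+vm2 and vm3+vm4 end up co-located
```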

Intrinsic Scheduler Data The types of data mentioned so far require access to external sources like the physical node OS or application log-files. However, any scheduler also has internal metrics that emerge from past scheduling behavior. Such metrics encompass the computing time already given to a specific job, the elapsed time since job submission, the time left to a potential deadline or averaged performance metrics like mean job throughput and walltime. Depending on the algorithmic approaches taken by a scheduler, such metrics can be incorporated into decision making or even used for improving future scheduling.

3.1.4. Decision Techniques It has been shown what data is incorporated into decision making and what primitives are used to modify resource allocation. The still missing link is the decision logic that evaluates the data and decides which primitives to apply at which point in time. A distinction can be made between the scheduling of jobs or applications that are not yet running ("initial allocation") and the modification of already running jobs or applications ("adaptive allocation").


Initial Allocation Mature job schedulers like Maui, Condor and LSF commonly prefer the run-to-completion paradigm. For this paradigm the questions a) which job to provision next and b) where to provision the processes of the job are important. All mentioned job schedulers can be configured with different scheduling policies. For the question which job to consider next, algorithms like first-come, first-served18, fixed-priority scheduling19 and fair-share with dynamic-priority scheduling20 can be chosen as scheduling strategies. For finding appropriate target nodes, first-fit is the most commonly used technique. Advanced reservation, i.e. keeping specific nodes unused until all requirements of a high-priority job are met (instead of giving them to lower-prioritized jobs), is another option offered by mainstream job schedulers. In the context of virtualization the same questions arise, i.e. which set of VMs hosting a specific application to provision next and where to put them in the cluster. For deciding which application to provision next, OpenNebula and Walters et al. [120] both use a first-come, first-served strategy. Walters et al. additionally incorporate the backfill strategy that allows unused resources to be used for lower-prioritized or later submitted applications. First-fit decreasing21 is also proposed by researchers [126, 118, 14, 63]. For finding an appropriate allocation of VMs to physical nodes the proposed techniques range from plain first-fit [120, 101] over the aforementioned first-fit decreasing approaches to more elaborate bin-packing search algorithms [135]. Gmach et al. [46] propose to use a genetic algorithm22 to find a suitable allocation scheme. Van et al. [115] and Hermenier et al. [54] claim to have superior heuristics for finding an optimized allocation using a "Constraint Satisfaction Problem" solver. Van et al. [116] suggest a search heuristic with application-specific utility functions for finding an appropriate allocation scheme for a set of VMs.
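
The difference between plain first-fit and first-fit decreasing can be illustrated with a small sketch; node capacities and CPU-core demands are invented for the example:

```python
# Minimal sketch contrasting first-fit and first-fit decreasing for
# placing VMs (by CPU-core demand) onto identical nodes.

def first_fit(demands, node_capacity):
    """Place each demand on the first node with enough free capacity,
    opening a new node when none fits."""
    nodes = []
    for d in demands:
        for node in nodes:
            if sum(node) + d <= node_capacity:
                node.append(d)
                break
        else:
            nodes.append([d])
    return nodes

def first_fit_decreasing(demands, node_capacity):
    # Identical, but the largest consumers are placed first.
    return first_fit(sorted(demands, reverse=True), node_capacity)

demands = [2, 5, 4, 7, 1, 3, 8]
print(len(first_fit(demands, 10)))             # 4 nodes
print(len(first_fit_decreasing(demands, 10)))  # 3 nodes
```

Sorting by decreasing size often yields a denser packing, which is why first-fit decreasing is popular for server consolidation.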

Adaptive Allocation The aforementioned approaches tackle the question when and where to run a job or set of VMs. Finer-grained resource allocation control, e.g. the preemption of running jobs, the migration of a running VM or giving a running process more CPU shares, is not considered in these proposals. For run-to-completion approaches this question is irrelevant: Upon submission of a job or virtualized application, a potential allocation is calculated and applied if feasible. Running applications or jobs are not manipulated. For approaches which claim to dynamically adapt resource allocation by manipulating running VMs or jobs, basically two questions arise: When to modify which running job or application (and its constituent processes and VMs) and how to modify it. Researchers focus on the first aspect; the approaches are therefore distinguished according to their method of deciding which running job or application to modify, i.e. how a decision to modify a specific job or application is triggered.

18First-come, first-served (FCFS) is a scheduling algorithm that assigns a resource to resource consumers in the order in which they have arrived, e.g. by submission date. 19Fixed-priority scheduling is a scheduling algorithm that assigns a resource to resource consumers in order of their priorities. Priorities are given to a resource consumer only once and are fixed. For instance, a priority can be calculated depending on a resource consumer’s temporal requirements, its importance for the correct functionality of a system or the credentials of the submitting user. 20Fair-share scheduling is a scheduling approach that assigns each resource consumer an appropriate portion of a resource. One method to achieve this is dynamic-priority scheduling, for which a resource consumer’s priority is periodically updated in accordance with the time it has already consumed a resource. 21First-fit decreasing is a variant of the first-fit algorithm. In contrast to the latter, the eligible resource consumers are ordered and processed by decreasing size, i.e. by decreasing amount of required resources. That is, this algorithm also implies which resource consumer (application) to provision next. 22A genetic algorithm is a meta-heuristic that evolves a set of sub-optimal solutions toward an improved solution by applying techniques inspired by nature, such as selection, mutation and inheritance, to a set of candidate solutions.


Cooperative Decision Triggering For scheduling of VMs, Kiyanclar et al. [64] and Chase et al. [22] propose that every set of VMs hosting a specific application needs its own manager. This manager observes application statistics and decides when the set of VMs should be resized by starting an additional VM or when a single VM should be attributed more CPU cores. These decisions are communicated to a central instance that applies the requested modifications. This is problematic: By delegating the responsibility for such decisions to third-party application managers, a central instance cannot find out about the priority and urgency of these decisions, relying instead on the cooperative behavior of the managers. This problem is also known from cooperative scheduling of OS processes and has proven to be inefficient [103]. Later approaches therefore propose that decisions on dynamic resource allocation have to be made by a unified decision-making instance, a scheduler, that bases its decisions on a comparable metric like application priority.

Statically Triggered Decisions In order to decide about modifications to running jobs, LSF and Condor offer a language based on boolean and algebraic operators. Using this language, conditions on dynamic physical node data, e.g. CPU usage, can be specified together with a resulting modification like the preemption or stopping of a job. For virtualized applications, multiple researchers propose rule- or policy-based techniques to trigger modifications. Approaches mentioned in this paragraph exclusively use warm and hot migration as the only manipulation primitive. Also, the focus of researchers is on when a migration has to occur, rather than where to migrate a VM. For this latter aspect, initial allocation algorithms (see above, Section 3.1.4) are used. The proposed policies or rules have in common that they contain a threshold, which triggers a decision upon its exceedance. Gmach et al. [46, 47], Rolia et al. [92] and Khanna et al. [63] propose fixed thresholds for CPU and memory usage, which trigger migrations when exceeded. Wood et al. [126] use a fixed threshold, but base it on a polynomial combination of multiple parameters like memory, network and CPU usage. They refine their decision by only triggering a migration if k out of n past measurements have already exceeded the threshold and the next projected measurement will exceed it as well. Bobroff et al. [14] and Kochut et al. [65] do not state when a decision has to be made and simply assume a periodic check whether a better VM allocation is available. They stress the fact that a decision has to take into account the specifics of hosted applications, e.g. whether they exhibit periodic fluctuations in CPU usage or application request rate. Both researchers model the CPU usage of an application with periodograms and autocorrelation coefficients23 to predict future overload on a server and to act proactively. Choi et al. [26] emphasize that fixed thresholds are not sufficient; it is also important to know which VM (of multiple VMs running on a node) to migrate and where to migrate it. They propose to keep a history of past migrations and record their benefit in decreasing the CPU load. Migrations are triggered if a threshold is exceeded and if migrations could resolve the overload in the past. Andreolini et al. [8] distinguish between the decision which server is overloaded and which VM needs to be migrated. For the first part they use a cumulative sum model, and for identifying a VM to be migrated they use a load-trend model based on linear regression.24
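
The k-out-of-n trigger refinement can be sketched as follows; the linear projection, the parameter values and the `should_migrate` helper are illustrative simplifications, not the actual method of Wood et al.:

```python
# Sketch of a k-out-of-n migration trigger: a migration is signalled only
# if k of the last n utilization samples exceeded the threshold AND the
# next projected value exceeds it too. All parameters are invented.

def should_migrate(samples, threshold, k, n):
    """samples: chronologically ordered utilization measurements."""
    window = samples[-n:]
    exceeded = sum(1 for s in window if s > threshold)
    # Naive linear extrapolation from the last two samples.
    projected = 2 * window[-1] - window[-2]
    return exceeded >= k and projected > threshold

load = [0.55, 0.81, 0.86, 0.84, 0.9]
print(should_migrate(load, threshold=0.8, k=3, n=5))  # True
```

Requiring several past exceedances plus a projected one damps reactions to short utilization spikes.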

Dynamically Triggered Decisions Tesauro et al. [109] criticize the usage of fixed thresholds to trigger a modification. They propose an initial threshold which is incrementally improved using a neural network25. The aforementioned proposal of Choi et al. [26] to include a history of the success of past migrations can also be considered a simple dynamic trigger model. Other researchers make decisions using mathematical models or Proportional-Integral-Derivative (PID) controllers26. Turner et al. [111] propose a general-purpose mathematical model which combines resource allocation, CPU usage and response times of applications. They do not state how to use the model, i.e. when to make a decision, but emphasize its strength for matching high-level response time constraints (SLT) to low-level parameters like CPU usage. Zhikui et al. [133] propose a model based on queueing theory27 which facilitates the assignment of CPU shares to a VM based on CPU load and response times of an application. Zhu et al. [134] propose nested feedback controllers for minimizing resource usage while also meeting response time constraints for a hosted web server. Their controllers dynamically adapt the CPU shares given to a specific VM. Padala et al. [84] claim that multiple parameters need to be adapted when aiming for server consolidation and SLT compliance. They modify the CPU shares and disk bandwidth given to a single VM by using a feedback controller. Xiong et al. [129] also use a feedback controller to govern the CPU shares a VM is given on a physical node. They stress that their controller processes multiple input parameters like response time, response time variance, application throughput and resource usage.

23Periodograms and autocorrelation are mathematical tools for estimating the spectral density of a time series or a stochastic process. Both reveal contained frequencies, i.e. periodicities and patterns. For more information, please refer to Box et al. [16]. 24Linear regression is a statistical method for fitting a linear function to measurement data. It thereby models a relationship inherent to the data.
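
The general idea of such feedback-controlled local resizing can be sketched with a simple proportional-integral controller. This is an invented illustration, not the controller design of any of the cited works; the gains, actuator limits and share scale are arbitrary:

```python
# Hedged sketch of feedback-controlled CPU-share resizing: a
# proportional-integral (PI) controller adjusts the CPU shares of a VM
# so that a measured response time tracks a target. All constants are
# invented for illustration.

def make_pi_controller(target, kp=50.0, ki=10.0, lo=100, hi=1000):
    integral = 0.0

    def step(measured_response, current_shares):
        nonlocal integral
        error = measured_response - target  # positive: application too slow
        integral += error
        new_shares = current_shares + kp * error + ki * integral
        return int(min(hi, max(lo, new_shares)))  # clamp to actuator limits

    return step

controller = make_pi_controller(target=0.2)  # 200 ms response-time target
shares = controller(0.35, 500)  # response too slow, so shares increase
print(shares > 500)  # True
```

A full PID controller would add a derivative term on the error; the clamping models the bounded range of shares a hypervisor scheduler accepts.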

3.2. Evaluation of Relevant Approaches

As a reminder, it is the thesis goal to develop the concept and implementation of a software framework for exploiting unused resources in dedicated clusters running a time-critical application. Constraints are that by running additional applications a) the time-critical application must not be affected in its proper functionality, b) this has to be achieved without controlling the time-critical application and c) cluster resource (CPU) usage has to be increased and additional computational results have to be generated. The term Main Application (MainApp) is used to refer to the time-critical application, for example the HLT-Chain. An additional application that exploits unused resources via the provided framework is called a Secondary Application (SecApp). By comparing the thesis goal to the research goals presented in Section 3.1.1, the following can be stated: A goal adopted by researchers, the satisfaction of an application’s runtime constraints, corresponds to the thesis sub-goal of preventing that a MainApp is affected in its proper functionality. Another goal adopted by researchers, to increase the efficiency of a cluster, corresponds to the thesis sub-goal of increasing cluster resource usage and increasing the result output generated by SecApps28. Three issues emerged while evaluating the state-of-the-art research with respect to meeting the thesis goal: First, approaches using only the run-to-completion paradigm were discarded because they do not offer the flexibility to manipulate running applications such that Service-Level Targets (SLT) for a MainApp can always be met in case of changing resource needs29. Second, most of the research works do not provide ready-to-use implementations but rather present mere proposals, theoretical models, simulation results or empirical results gathered with prototypes in a cluster with few (≤ 5) physical nodes. Also, in most cases only a single aspect is discussed, either the efficient initial allocation of submitted jobs/applications or the manipulation of running jobs/applications. In the latter case only very few approaches propose to apply more than one primitive, e.g. using both migration and local resizing like Zhu et al. [135]. Another aspect is that many approaches explicitly deal with a specific time-critical application, like a web server, and incorporate application metrics to meet the SLT of this application. Using such Grey-Box data is an acceptable approach, but cannot be used for a general-purpose scheduling framework. Third, except for one approach (Condor, see Section 3.2), it is always assumed that cluster resources are under full control of a decision instance and any running application can be manipulated by using one of the mentioned primitives. However, according to the stated thesis goal the MainApp is a black box and cannot be controlled by a scheduler30. Even when researchers did not acknowledge this specific constraint, their approach was not rejected per se. Instead it was evaluated whether the methods applied could be used such that the SLT of a MainApp can be met without actively manipulating this application.

25A neural network is a computational model for processing data. It consists of a number of interconnected artificial neurons and maps input data to a desired output. Its bottom-up design and learning capabilities make this technique appropriate for processing unstructured data that is difficult to treat using deductive, symbol-manipulation-based top-down approaches. Visual and audio pattern recognition are prominent examples of neural network applications. Please see Fausett [39] for more information. 26PID controllers are a special type of feedback controllers. 27Queueing theory is a mathematical model of waiting lines (queues). It uses probability distributions to model items that enter a queue (e.g. customer calls) and items that leave a queue (e.g. customers served). For instance, it allows statements about the average number of items in a queue or the average time an item spends in a queue. For more information, please refer to Gross et al. [51]. 28Several approaches focus on server consolidation by densely packing applications on physical nodes. Though not precisely fitting the stated thesis goal, such approaches were also considered. Both the thesis goal and server consolidation require efficient scheduling to optimize cluster usage.

In this section ready-to-use solutions that aim for satisfying application runtime constraints in a gen- eral way are presented. It is discussed whether they can be used to fulfil the goal of this thesis.

OpenNebula SVMSched Smart Virtual Machine Scheduler (SVMSched) [21] is a scheduling extension to OpenNebula [101]. It makes use of the interfaces offered by OpenNebula for creating, deploying and managing VMs. Its goal is to provide fair and efficient access to cluster resources for a number of virtualized applications. The product allows customers to submit requests to run an application hosted by a set of VMs. The needed VMs are then instantiated and handed over to a scheduler that decides when to provision which application. The developers distinguish between two classes of applications: best effort and production jobs. Production jobs are time-critical and should be given sufficient resources to perform correctly. For the scheduling of production jobs, priority scheduling with backfill is applied. Once provisioned, a production job is subject to the run-to-completion paradigm. With respect to the thesis goal a comparison can be drawn: A production job can be regarded as a MainApp that cannot be manipulated. Best effort jobs are scheduled using a first-come, first-served strategy with backfill and can be preempted. These jobs correspond to the SecApps of this thesis. However, there are two main aspects which render the conceptual approach taken by SVMSched unusable for the thesis goal. First, the preemption of best effort jobs only takes place when new production jobs are submitted, not when a currently running production job requires more resources. A dynamic resource allocation according to the changing needs of a running production job is therefore not possible. Such an approach is only feasible if a constant resource consumption of running production jobs is assumed. This assumption cannot be made for a MainApp in the thesis scenario. Second, only a single manipulation primitive, preemption, is used for best effort jobs. This lacks flexibility, especially when taking into account that there is considerable network overhead31 associated with using this primitive. The network usage, which is a potential bottleneck that could affect a production job’s performance, is not considered by SVMSched when deciding on the preemption of best effort jobs.

29If a time-critical application has a varying data rate to process, such resource needs will vary as well. 30The primitives for modifying resource allocation cannot be applied to this application. 31Suspending a VM means saving its memory state to a shared storage facility. This causes network traffic.


The SVMSched extension for OpenNebula can therefore not be used to meet the thesis goal.

VMware vSphere VMware vSphere [114] is a software framework for managing a distributed environment running the VMware ESX [123] virtualization platform. The product is mature and has a strong market penetration [113]. It provides an API, a GUI and a single point of access to deploy, manage and survey virtualized environments and applications. Distinctive features are its ability to provide fault tolerance and high availability to customer applications and easy Infrastructure as a Service (IaaS) cloud32 integration. The resource allocation of virtualized applications (sets of VMs) can be done manually. However, certain features that aim for dynamic resource allocation are provided. The modules of interest are the Distributed Resource Scheduler (DRS) and the vFabric Application Performance Manager. While the precise algorithms are not disclosed, the following can be stated: The second module specifically claims to make virtualized applications meet their SLT. However, to date the vFabric Application Performance Manager only offers the specification and monitoring of Key Performance Indicators (KPIs), sending out customer notifications for critical application states. Upon notification, manual interaction for tracing and resolving potential SLT violations (e.g. via adding a CPU core to a VM) is required. This module is therefore not sufficient for automated dynamic resource allocation. The other module (DRS) aims at providing automated server consolidation and CPU load balancing for physical nodes. This is accomplished by migrating VMs. The load balancing feature of DRS can thus be used to strive for application SLT compliance by avoiding CPU bottlenecks. It is possible to prioritize a specific application hosted by a set of VMs, thereby making it comparable to a MainApp of this thesis. However, vSphere cannot be configured such that migrations are only applied to lower-prioritized applications. Dynamic resource allocation for all applications in the cluster is assumed, making this product inappropriate for achieving the thesis goal. In addition, the current network usage of physical nodes is not considered when deciding on migrations. Since migrations put a strain on the network, this could lead to a violation of the SLT of a high-priority application.

1000 Islands In their research work, Zhu et al. [135] provide a software framework that aims for both server consolidation and SLT compliance for virtualized applications. They propose to use both local resizing, i.e. dynamically attributing CPU shares to VMs, and VM migrations. More explicitly, they propose a three-layered architecture. On local nodes they propose to use feedback controllers to match high-level goals (e.g. the response time of an application) to low-level actuators like the CPU shares given to a VM. On a second layer they propose to identify overloaded servers based on historic CPU load trace data and to find potential VM migration target nodes via a simulated annealing based search algorithm. On a third layer they revise past allocation decisions for their benefit in order to improve future scheduling. With respect to the thesis goals the following can be stated: These researchers strive for SLT compliance of applications. They assume control over all applications, explicitly modifying their resource consumption and node allocation. The question is whether the framework can be used in a way that a highly-prioritized application is monitored for SLT compliance while resource allocation decisions are executed on lower-prioritized applications only. The second and third layer, where VM placements and migrations are calculated, cannot be operated in this manner. The first layer could in principle be configured such that lower-prioritized applications are given fewer CPU shares, thereby automatically increasing the share available to a high-priority application. The approach used for the last layer could be used to strive for SLT compliance without actively manipulating

32For an overview on clouds and the EaaS stack please refer to Section A.3.

a high-priority application. The framework as a whole, nevertheless, is not appropriate: First, on local nodes (first layer) only the CPU shares given to VMs are modified. This is not sufficient because bottlenecks for other resources like the network can also arise and harm an application’s performance. Second, the methods applied on layers two and three do not support the exclusive migration of VMs of lower-prioritized applications, but assume control over all applications. Third, the framework does not describe whether additional applications can be submitted at runtime. The researchers assume that the number of applications running in a cluster is fixed, so no initial allocation decisions for not-yet-running applications are supported. For these reasons the framework is not suited for meeting the thesis goal.

Condor Condor is a job scheduling framework. Started in 1988, this project has since reached a mature status and its implementation is used in about 3000 clusters worldwide [32]. According to the project’s own description [72] it focuses on high-throughput computing, i.e. an optimization of job throughput over weeks, months and years. Though initially targeted at desktop environments, the developers claim it is usable in high-performance computing clusters as well. They explicitly address the problem of under-utilized computing nodes, aiming at exploiting unused CPU cycles by running additional jobs, a task they refer to as "CPU stealing". They specifically mention that Condor can be used in dedicated environments, i.e. where applications run that Condor has no control over, e.g. desktop environments. They thus formulate a goal comparable to the thesis goal: Exploitation of unused CPU cycles without affecting natively running applications. In order to specify when a physical node is eligible to run Condor jobs, the ClassAd configuration mechanism is used. ClassAd offers a language to specify constraints on physical nodes, jobs and timing. These constraints steer the scheduling of jobs to physical nodes. Thereby criteria can be defined for when CPU resources on a node have to be freed in order to satisfy MainApp requirements. In Condor this is called satisfying the wishes of a resource owner - in contrast to satisfying the needs of resource users (Condor jobs). While this ClassAd language is powerful and detailed requirements can be specified, native support only takes CPU consumption (CPU load and usage) into account for the evaluation whether a MainApp is affected. Via extensions like Hawkeye, other resources can be monitored and taken into account for job scheduling, like available network bandwidth or disk space.

So, Condor is a framework that, via its extensions, can be used to run additional jobs in a distributed environment with dedicated applications. However, there are several drawbacks that make its usage for the HLT-Chain application difficult: First, the primitives usable for dynamic resource allocation are starting/stopping and preemption. However, jobs have to comply with strict limitations [31] or need to implement a specific checkpointing library in order to be preemptable. Because starting/stopping is inflexible and leads to a low job throughput (i.e. SecApp result output), the scope of jobs that can be scheduled efficiently is severely limited. Other primitives for dynamic resource allocation, such as migration or local resizing, are not considered. Second, only jobs utilizing the same OS platform configuration as the MainApp can be run. This additionally restricts the number of runnable jobs due to potential library conflicts or conflicting OS requirements. Third, there is no guarantee that buggy or vulnerable job code33 will not compromise a running MainApp. The provided level of isolation between SecApps and MainApp is not sufficient.

Condor aims at similar goals to this thesis. It provides a mature, accepted and highly configurable framework to exploit unused CPU resources in a dedicated environment. It does however lack

33A job can have infinite loops, cause memory leaks or be vulnerable to hacking attacks.

isolation for MainApps, generality considering the type of runnable jobs and flexibility in modifying the resource consumption of running jobs.

3.3. Summary

Two main sub-goals of this thesis, meeting the performance targets of an application and improving cluster usage and result output, are subject to existing and ongoing research. Essentially, two main scheduling paradigms are adopted for running distributed applications in a cluster: run-to-completion and dynamic resource allocation. While the former only involves an initial decision on when and where to run an application, the latter also allows for manipulating running applications. For dynamic resource allocation, multiple manipulation primitives like preemption, migration and local resizing and different decision-making techniques like search algorithms and feedback controllers have been proposed by researchers. No existing solution meets the goals stated for this thesis. Run-to-completion approaches are not sufficient because they lack the flexibility to manipulate running SecApps such that a MainApp's Service-Level Targets (SLT) can always be met. Among the dynamic resource allocation approaches, most do not provide a mature implementation or only discuss the use of a single primitive, thereby lacking the flexibility to adapt the resource usage of SecApps at runtime. Most proposals do not meet the requirement that a running MainApp cannot be actively controlled. Only a single product does so and only manipulates SecApps in order to meet a MainApp's SLT. However, this product, Condor, does not provide the level of isolation and flexibility needed in this thesis.

4. Conceptual Work

In this chapter the developed conceptual approach will be presented. The chapter is structured in four sections that build on each other. The scope of the to-be-developed framework is briefly discussed in the first section. The second section follows with an introduction to basic terms like resource, resource usage and interference. Using this terminology, the basic conceptual cornerstones will be pinpointed in the third section: The usage of OS-based metrics as the basis for making resource allocation decisions is motivated, leases and VMs as feasible SecApp containers are introduced, the role of policies as constraints for SecApp resource usage is discussed and finally a basic proposal for the functional design of the framework is given. The application logic of the framework and its constituents is discussed and algorithmic approaches are detailed in the fourth section.

4.1. Scope Discussion

As a reminder, the goal (see Chapter 2) is to develop the concept and implementation of a software framework for running additional applications ("SecApps") in a dedicated cluster with a time-critical application ("MainApp"). This goal is constrained by: a) The MainApp must not be affected in its proper functionality, b) this has to be achieved without controlling the MainApp and c) cluster usage has to be increased and additional computational results have to be generated. Before turning to the conceptual work, basic explanations are given to refine the scope of the thesis.

MainApp Scope For any MainApp, there is a precondition to being in the target scope of this thesis: A MainApp has to have a performance specification, i.e. a quantitative timing constraint. Without such an explicit constraint, the question whether SecApps affect a MainApp cannot be answered. This does not exclude such MainApps from being used with the to-be-developed framework, yet achievement of the thesis goal cannot be evaluated for these. In the previous chapter, SLTs were introduced as a means to describe such constraints for applications. Other, non-ITIL-conformant approaches are also valid, as long as they provide an evaluable functionality criterion. Proper functionality of a MainApp is assessed using this criterion.

SecApp Scope Since the amount of unused resources in a dedicated cluster is unknown, may vary or may tend to zero over prolonged time phases, SLT-constrained applications do not qualify as SecApps. In this thesis the set of applications that can be run as SecApps is therefore restricted to non-interactive, non-deadlined applications that have no explicit SLT requirements. Still, there is a great range of applications that conform to this requirement and can be scheduled using a best-effort approach. This range encompasses most applications runnable in Volunteer Computing (Seti@Home [7], Boinc [6]) and Grid Computing (AliEn [10]) as well as typical job scheduling applications like scientific simulations, data analysis and code compilation.

Generality It is not reasonable to claim that the provided solution allows running SecApps in an arbitrary dedicated cluster with a time-critical application such that the thesis goal can always be met:


Imagine a MainApp has to process a constant data rate. The application is network-bound and most network links are nearly fully saturated throughout the runtime of the MainApp. The average cluster CPU usage is at 50%. The MainApp has a functionality criterion, say a maximum response time per request of 100 ms. This criterion is met, yet the actual response time is permanently close to that upper limit, e.g. the mean response time is 90 ms with minimal variance. In such a scenario, an additionally run SecApp will most likely cause the MainApp to violate its functionality criterion because an additional application will under most circumstances put a strain on at least one network link.1 So, there are scenarios where running SecApps while preserving proper MainApp functionality is hardly achievable.

Therefore, it is not claimed that the framework will be capable of increasing the usage and result output of any dedicated cluster without compromising an arbitrary MainApp. Instead, a practical, inductive approach is taken: The framework has to satisfy the thesis goal for the HLT-Chain application, but should be applicable in other scenarios as well. From a functional perspective, this requires using generic concepts and methods such that the framework can be deployed in alien environments. Whether using the framework in such an environment is indeed beneficial depends on three main aspects: a) the amount, scale and temporal/spatial distribution of unused resources (when, where and how much resources are free), b) the sensitivity of the MainApp towards a lack of computing resources (how does a short-term resource contention influence the performance of the MainApp) and c) the tightness of the performance constraint (is there a considerable gap between worst observed performance and tolerated performance). Depending on these aspects, a trade-off between the benefit of running SecApps2 and the negative impact towards a MainApp needs to be made. In this thesis, the aim is to find "configuration items" that allow making such a trade-off. It will be evaluated to what extent these configuration items (policies) are suitable for making a trade-off in different scenarios.

4.2. Basic Terminology

A problem in achieving the thesis goal is that criteria and decision logic have to be provided that allow controlling the access to cluster resources for SecApps such that the MainApp is not affected in its functionality. At its core this is a resource allocation problem, i.e. how to allocate potential resource consumers (SecApps) to resource producers like cluster hardware and software. In this section the terms necessary to describe such an allocation are introduced. The phrase "a MainApp is affected in its proper functionality" will be defined by introducing the concept of interference.

Resources and Resource Usage A computing cluster comprises interdependent hardware and software components and offers resources which can be used by resource consumers. Rather than giving a precise, from-scratch formalization of a cluster, the focus is on those components and their

1It is assumed that any non-trivial application in a cluster requires network communication. However, from a theoretical point of view this is a matter of scale. If a very fine-grained control over the network usage of this application can be exercised, then it is possible to make use of the few percent of unused network bandwidth without producing a bottleneck. According to the explanation given in the first paragraph of Chapter 3, local scheduling in the form of intervening with kernel-level process and I/O scheduling is not allowed as a method for achieving the thesis goal. Without this possibility, it is difficult to exercise the fine-grained control over network communication needed to exploit the few percent of unused network bandwidth without incidentally stealing bandwidth from the MainApp. Even if a fine-grained control could be exercised, it is questionable whether this would yield a significant increase in cluster usage and SecApp result output.
2The benefits of running SecApps are increased cluster CPU usage and computed SecApp results.

properties that are regarded as sensible for describing resource consumption in a cluster. Differing formalizations of distributed environments for specific purposes can be found in [24, 33]. A cluster comprises at least three sets of architectural items:

$$cluster = \left(\{node_i \mid 1 \le i < n\},\ \{switch_j \mid n \le j < m\},\ \{storage_k \mid m \le k < p\}\right)$$

Every $node_i$, $switch_j$ and $storage_k$ is a resource producer, also called a cluster item. A cluster item provides resources, e.g.

$$node_i = (CPU_i, NetOut_i, DiskRead_i, Mem_i, \ldots)$$

These resources may correspond to physical hardware, like a CPU or an Ethernet interface, but need not. Also a resource like $Users_i$, representing slots for user logins, is possible. For every resource, a metric is defined that denotes the usage of the resource:

Let $\Gamma = \{t_k \mid t_k = t_0 + k\Delta t,\ k \in \mathbb{N}\}$ be an equidistant set of points in time. It is assumed that $\Delta t = 1\,\mathrm{s}$ unless stated otherwise. A function $f: \Gamma \mapsto \mathbb{R}$ maps these points in time to a set of measurements denoted by $x(t_k) = \{x_1, x_2, x_3, \ldots, x_k\}$, with $x_k$ given in units of e.g. $[\mathrm{GBit}/\Delta t]$. The absolute usage of a resource $res$ at point in time $t_0 + k\Delta t = t_k$ is then defined by:

$$currentUsage_{res}(k, N) = \frac{1}{N} \sum_{i=0}^{N-1} x_{k-i}$$

According to a resource's specification or empirical knowledge, there is also a known maximum for its usage:

$$maxUsage_{res}(k, 1) = x_{max} = const\,,\quad \text{with } x_{max} > x_k\ \forall k \in \mathbb{N}$$

As an example, for an Ethernet interface the maximum throughput achievable for sending data is given by $maxUsage_{NetOut}(k, 1) = 1\,\mathrm{GBit/s}$. Based on $currentUsage$ and $maxUsage$, a normalized (relative) usage function can be defined:

$$Usage_{res}(k, N) = \frac{currentUsage_{res}(k, N) \cdot 100}{maxUsage_{res}(k, 1)}$$

By definition, $0 \le Usage_{res}(k, N) \le 100$ holds. Thereby the free, i.e. non-used, share of a resource is easily defined by $freeRes_{res}(k, N) = 100 - Usage_{res}(k, N)$. $Usage_{res}$ and $freeRes_{res}$ can be interpreted as a percentage, which is why in this thesis the terminology "a resource usage of x%" is used.
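The definitions above translate directly into a few lines of code. The following sketch computes $currentUsage$, $Usage$ and $freeRes$ over a window of $N$ samples; the sample values and the 1 GBit/s maximum are illustrative assumptions:

```python
def current_usage(samples, k, n):
    """Mean absolute usage over the last n samples ending at index k."""
    return sum(samples[k - i] for i in range(n)) / n

def usage(samples, k, n, max_usage):
    """Normalized usage in percent, 0 <= usage <= 100 by construction."""
    return current_usage(samples, k, n) * 100.0 / max_usage

def free_res(samples, k, n, max_usage):
    """Free, i.e. non-used, share of the resource in percent."""
    return 100.0 - usage(samples, k, n, max_usage)

# Example: NetOut samples in GBit per measurement interval, 1 GBit/s maximum.
net_out = [0.0, 0.25, 0.75]
print(usage(net_out, k=2, n=2, max_usage=1.0))  # mean of 0.75 and 0.25 -> 50.0
```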

Applications and Processes As mentioned, resources are used by resource consumers. An application is such a resource consumer. A running application is a set of processes $app = \{proc_i \mid 1 \le i \le n\}$. Any process $proc_i$ can use resources:

$$currentProcUsage_{res}(k, N, proc_i) = \frac{1}{N} \sum_{j=0}^{N-1} x_{k-j}$$

with $x_{k-j}$ denoting the measurements for the usage of a resource $res$ generated by process $proc_i$. The normalized resource usage generated by this process for the resource is defined accordingly:


$$ProcUsage_{res}(k, N, proc_i) = \frac{currentProcUsage_{res}(k, N, proc_i) \cdot 100}{maxUsage_{res}(k, 1)}$$

Like for Usageres, ProcUsageres can be interpreted as a percentage. In this thesis the terminology of "process y uses x% of a resource" is therefore used. The normalized usage of a resource can also be written as:

$$Usage_{res}(k, N) = \epsilon_1 + \sum_{i=1}^{P} ProcUsage_{res}(k, N, proc_i)$$

where $P$ denotes the number of processes using this resource, i.e. those with $ProcUsage_{res}(k, N, proc_i) > 0$, and $\epsilon_1$ represents the residuals not captured by the modelled processes, e.g. the overhead of the OS. The definition is highly idealized and ignores both potential OS process inheritance and the specific OS process scheduling techniques used in a modern OS. The term process used in this thesis refers to the entirety of an application's parent and (temporarily) created child operating system processes on a node.3 A process needs to be allocated to a node in order to use resources provided by that node. The allocation relation $alloc(proc_j, node_i)$ is defined upon a single measurement interval $\Delta t$ and forms the basic relation a resource consumer can have to a resource producer at a given point in time. A process can be allocated to at most one node at a given point in time:

$$ProcUsage_{res_i}(k, 1, proc_j) > 0 \Rightarrow alloc(proc_j, node_i)^4$$
$$alloc(proc_j, node_i) \Rightarrow \nexists m : node_m \ne node_i \wedge alloc(proc_j, node_m)$$

The most intuitive interpretation of an allocation is probably that of a non-distributed application that is started on a cluster node. Once the application is started, the allocation relation between the process (representing the application's local OS processes) and the node is established. Once the application is stopped, the allocation relation is dissolved.
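The decomposition of $Usage_{res}$ and the allocation relation can be sketched as follows. The function names, the example processes and the measured values are illustrative; `residual` recovers $\epsilon_1$ from a measured total, and `alloc_pairs` derives the relation $alloc$ from nonzero per-node process usage while checking its uniqueness constraint:

```python
def residual(total_usage, proc_usages):
    """epsilon_1: the share of Usage_res not attributed to the modelled
    processes, e.g. OS overhead, per the decomposition above."""
    return total_usage - sum(proc_usages)

def alloc_pairs(proc_usage_by_node):
    """Derive alloc(proc_j, node_i) from nonzero ProcUsage values and
    verify that each process is allocated to at most one node."""
    seen = {}
    for (proc, node), u in proc_usage_by_node.items():
        if u > 0:
            assert seen.setdefault(proc, node) == node, "proc on two nodes"
    return sorted(seen.items())

usage_by_node = {("main", "node1"): 40.0, ("sec", "node1"): 20.0,
                 ("sec2", "node2"): 10.0}
print(alloc_pairs(usage_by_node))
# [('main', 'node1'), ('sec', 'node1'), ('sec2', 'node2')]
print(residual(63.5, [40.0, 20.0]))  # 3.5 percentage points of OS overhead
```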

Interference It is part of the thesis goal to allow SecApps to exploit cluster resources without affecting the proper functionality of a MainApp. The concept of interference is now introduced to capture this notion more precisely. Let app1 and app2 be two applications. For app1 there is a (quantitative) performance metric such as transactions per second, response time or latency. In scenario (A), app1 is run and its performance metric is measured, yielding a value v1. Scenario (B) extends scenario (A) such that, ceteris paribus, app2 is also run and a performance metric value v2 is measured for app1.

Definition 4.1 If there is a statistically significant difference between v1 and v2, then app1 is said to be interfered with by app2, i.e. app2 causes interference towards app1.

This definition allows a statement on whether a SecApp has an impact on (the behavior of) a MainApp. However, not every statistically significant difference for an observed performance metric is relevant. Relevance is tied to predefined thresholds (e.g. SLT) for such performance metrics. Let us assume there is an upper limit vmax for a performance metric that defines a reference value that can be used to assess whether an application is fully functional. Then vmax can also be used to assess whether a statistically significant difference for an observed performance metric is relevant.

3If the common interpretation of process as an operating system process is referred to, then the term "OS process" is explicitly used.
4This statement assumes that $res_i$ is a resource provided by $node_i$.


Definition 4.2 If v1 < vmax for scenario (A), app2 causes interference towards app1, and v2 > vmax for scenario (B), then app1 is said to be strongly interfered with by app2, i.e. app2 causes strong interference towards app1.

So Definition 4.1 describes statistically significant interference while Definition 4.2 describes relevant interference. Additionally, the term "amount of interference" is now introduced.

Definition 4.3 A statistically derived absolute difference between v1 and v2 is called amount of inter- ference.
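Under simplifying assumptions, Definitions 4.1 to 4.3 can be operationalized with a standard two-sample test. The sketch below uses Welch's t-statistic compared against a fixed critical value instead of a full p-value computation; the critical value of 2.0 and all measurement numbers are illustrative assumptions, not part of the thesis:

```python
from statistics import mean, variance

def t_statistic(a, b):
    """Welch's t-statistic for two independent samples."""
    va, vb = variance(a), variance(b)
    return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

def interference(v1_samples, v2_samples, t_crit=2.0):
    """Definition 4.1: statistically significant difference (simplified)."""
    return abs(t_statistic(v1_samples, v2_samples)) > t_crit

def strong_interference(v1_samples, v2_samples, v_max, t_crit=2.0):
    """Definition 4.2: a significant difference that crosses the limit v_max."""
    return (interference(v1_samples, v2_samples, t_crit)
            and mean(v1_samples) < v_max < mean(v2_samples))

def amount_of_interference(v1_samples, v2_samples):
    """Definition 4.3: absolute difference of the observed means."""
    return abs(mean(v1_samples) - mean(v2_samples))

# Response times in ms without (A) and with (B) a SecApp; limit 100 ms.
a = [88, 90, 89, 91, 90, 92]
b = [103, 106, 104, 107, 105, 104]
print(strong_interference(a, b, v_max=100))  # True: significant and beyond vmax
```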

Such an empirical approach to assessing interference has specific drawbacks:

• From a practical point of view, this definition requires a certain number of measurement samples to derive statistical statements about interference. This can be problematic in productive environments.
• This definition uses a white-box approach for assessing an application's performance. This means that application-specific metrics rather than OS-specific metrics are used. So depending on the application and its performance metric, e.g. logfiles have to be examined, APIs have to be queried or even customer feedback is needed to assess application performance. This makes the development of a general approach difficult.
• Most importantly, the definition does not make any statements about a general sensor metric that can be used at runtime of an application to detect and remedy potential causes of interference5.

So these definitions serve the purpose of stating whether interference has occurred, but not how to control the probability of occurring interference at runtime.

To sum up: A cluster consists of cluster items, e.g. nodes, each of which offers resources that can be used by resource consumers. Applications were introduced as prototypical resource consumers. An application consists of at least one process.6 A process runs on at most one node. This correspondence is represented by the relation $alloc(proc_j, node_i)$. A process may use resources; the usage of a resource is described by the function $Usage_{res}(k, N)$. Applications of interest are the Main Application (MainApp) and the Secondary Application (SecApp). When both applications run concurrently, the performance of the MainApp may be compromised. This aspect is acknowledged by introducing the concept of interference. The metric "amount of interference" enables a quantification of the performance impairment. The concept of "strong interference" allows assessing whether SecApps affect a MainApp's performance such that it performs beyond its specification. Finally, it was concluded that interference and application-specific performance metrics are not a sufficient basis for making runtime decisions on dynamic resource allocation.

4.3. Conceptual Cornerstones

As shown in the last section, the concept of interference helps to assess whether a MainApp's proper functionality is affected by additionally run SecApps. However, it was also stated that the mere observation of interference provides insufficient information for making resource allocation decisions at runtime. In this section it is explained what is regarded as reasonable information to base such decisions

5If a performance metric is defined on a bigger time scale, e.g. customers served per day, the detection of strong interference can easily lag behind the occurrence time of its potential cause.
6A process is the entirety of all OS (child) processes and threads of an application that run on the same node.

on. Also, an introduction to the basic design principles of the framework will be given. The general term "decision instance" is tentatively used in this section to represent the (not-yet-elaborated) decision logic of the framework. This section is structured as follows:

1. As a first step, it is discussed how resource allocation decisions can be based on a sensor metric $Usage_{res}(k, N)$.
2. Next, the usage of platform virtualization as an enabling technology for the thesis goal is explained. The life-cycle of a VM will be defined.
3. Then the "lease" concept is described. A lease is a container to encapsulate arbitrary SecApps and to standardize their access methods.
4. At this stage, "policies" as configuration items for the decision instance are introduced. The actual configuration of policies determines how the values provided by the sensor metric are interpreted by the decision instance and influences the decision making.
5. Finally, the term decision instance is elaborated. A distinction between multiple functional units will be made and a tentative description of their purpose and interplay is given. This last step is the basis for the next section, which exclusively deals with the functional, architectural and algorithmic design of the decision logic.

4.3.1. Resource Usage as Sensor Metric

As a reminder, the core challenge of this thesis is to prevent strong interference towards a MainApp when running additional SecApps. The argument was made that application-specific performance metrics can be used to assess interference, but are not sufficient as the only source of information when having to decide on resource allocation at runtime (see Section 4.2): First, a certain number of measurements is needed to make sure that e.g. observed deviations from a target performance value are significant and require an immediate reaction. Second, application-specific performance metrics can vary considerably between applications (e.g. customer satisfaction vs. transactions per second) and make a general approach for decision making difficult. Third, there is no reason to assume that the observation of (strong) interference coincides with its cause. For instance, there can be a substantial lag between the detection of a problem and its cause: If for an application a target performance value of "serve 1M customers in August" is given, then a failure in achieving this goal can be due to a malfunctioning node on 08/01 but is only observed on 08/31. This leads to the following conclusions:

• The interference Definitions 4.1, 4.2 and 4.3 can be used for the ex-post assessment of interference. In this thesis they are used to assess whether and to what extent interference occurred in experimental scenarios.

• A sensor metric is needed which can be used by a decision instance to decide when and where to allocate which resource consumers such that the amount of interference is kept under control, i.e. such that no strong interference occurs. This metric should possess several qualities. It should be:
– quantitatively measurable,
– updated at a (relatively) high frequency, i.e. a 1/s or 1/5s readout frequency should be supported and reflect a current state,


– general, i.e. not dependent on the used applications, hence based on common OS-provided system data,
– actable, i.e. a feedback loop between dynamic resource allocation and the sensed metric has to exist,
– relevant, i.e. the metric has to reflect potential causes of interference.

To obtain such a metric, potential causes for interference in the sense of Definition 4.2 have to be considered:

• An application app2 can introduce library conflicts or incompatible kernel parameter settings that cause runtime errors of app1 or even prevent app1 from starting.

• An application app2 can introduce security risks that can be exploited by attackers. This may compromise the whole OS, e.g. leading to decreased performance. This will also affect app1.

• An application app2 can be buggy or badly coded, e.g. exhibiting a memory leak. This can severely decrease the amount of free memory available in the system. In order to resolve this issue, the OS will take counter-measures (e.g. the "out-of-memory killer" on Linux) which are known to reduce system performance temporarily. This will also affect the performance of application app1.

• An application app2 competes with app1 for sparse resources. This may affect the performance of app1.

The first three points can be avoided by using platform virtualization. Virtualization is an accepted technology to isolate applications and their processing environments from each other. More details are given later in Section 4.3.2. For the fourth point, competition for sparse resources, there is a metric that reflects this competition. Such competition is equivalent to the temporary existence of a resource shortage (lack of free resources):

$$Usage_{res}(k, N) + \epsilon_2 > 100 \quad\text{(4.1)}$$

where $\epsilon_2$ represents a safety buffer.

The previously defined function $Usage_{res}(k, N)$ satisfies the criteria for a sensor metric. It is a quantitative metric and has an update frequency of 1/s. It does not depend on any specific application and can be retrieved by means of the OS. The development of metric values over time can be influenced by dynamically granting and revoking access to resources for SecApps. It also reflects a potential cause of interference, the competition for sparse resources.
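That $Usage_{res}$ can be retrieved purely by OS means can be illustrated for NetOut on Linux, where /proc/net/dev exposes per-interface byte counters. The sketch below assumes the standard /proc/net/dev column layout (TX bytes in the ninth counter column) and a 1 GBit/s link; the two readouts are synthetic sample strings rather than a live readout:

```python
def tx_bytes(proc_net_dev, iface):
    """Extract the transmitted-bytes counter for iface from /proc/net/dev text."""
    for line in proc_net_dev.splitlines():
        if line.strip().startswith(iface + ":"):
            fields = line.split(":", 1)[1].split()
            return int(fields[8])  # ninth counter column: TX bytes
    raise KeyError(iface)

def net_out_usage(bytes_before, bytes_after, dt=1.0, max_bit_s=1e9):
    """Usage_NetOut(k, 1) in percent for one measurement interval of dt seconds."""
    bits_per_s = (bytes_after - bytes_before) * 8 / dt
    return bits_per_s * 100.0 / max_bit_s

# Two consecutive (synthetic) readouts, one second apart:
sample = "  eth0: 1000 10 0 0 0 0 0 0 125000000 10 0 0 0 0 0 0\n"
later  = "  eth0: 2000 20 0 0 0 0 0 0 250000000 20 0 0 0 0 0 0\n"
print(net_out_usage(tx_bytes(sample, "eth0"), tx_bytes(later, "eth0")))  # 100.0
```

In a deployed monitor, the two strings would be replaced by reading /proc/net/dev once per second, matching the 1/s update frequency claimed above.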

Preventing resource competition is difficult: In the context of this thesis, preventing resource competition on a single node means giving the locally running MainApp absolute precedence over a locally running SecApp. That means, if both a MainApp OS process and a SecApp OS process request access to the same resource at the same time, the MainApp OS process is always granted access. The decision which OS process is given which resource share at which time is the responsibility of the kernel. The kernel coordinates the access to devices and resources and therefore employs several subsystems like a process scheduler for granting access to the CPU, an I/O scheduler for controlling disk access and a packet scheduler for managing network communication. For ensuring that these subsystems service requests from the SecApp only if there is no request from the MainApp, one needs to configure the scheduling used by these subsystems. It was mentioned in the first paragraph of Chapter 3 that modifying or patching kernel strategies (such as OS process scheduling policies) is not appropriate for achieving the thesis goal because this approach lacks generality and may influence the

system behavior in a way not permitted by the operator of the dedicated cluster. Nevertheless, there are Linux user space tools which to some extent help to realize that a MainApp precedes a SecApp: The tool "nice" can be used for tuning the priority of an OS process. By giving a SecApp OS process the lowest possible priority, the process scheduler will assign this process only the smallest possible time slice (e.g. 20 ms, depending on the OS) and make it receive only a small share of the CPU resource over time. Unfortunately, for the standard OS process scheduling policy (SCHED_OTHER in Linux), the actually applied priority is periodically recalculated (usually every 20 ms). If an OS process does network or disk I/O, its process priority will improve over time. This gives a SecApp a bigger share of the CPU than originally intended by setting the lowest possible OS process priority. For disk access, the tool "ionice" exists that may be used for modifying the way the I/O scheduler grants access to disk resources for specific OS processes. Again, a low priority can be set for an OS process, making it less likely to receive read or write access to the disk. As with its CPU counterpart "nice", this tool can be used to influence the share of disk I/O attributed to an OS process, but cannot guarantee that the MainApp always precedes a SecApp. For network communication, things are more complicated: Here, tools like "iproute2" and "tc" allow classifying certain packets, e.g. those originating from a SecApp, into specific Quality of Service (QoS) classes and modifying the scheduling policy for these classes ("traffic control").
This, however, is non-trivial to configure, not recommended for production systems and might require changes to the system not allowed by the dedicated cluster operators.7 Since SecApps will be hosted by VMs, it depends on the hypervisor of the virtualization product to what extent the application of these tools will also have an effect on the resource shares used by VMs. For distributed SecApps, hosted by multiple VMs on different physical nodes, an additional problem arises: If on one node a VM is given few or no resources, this VM may appear as an unavailable or crashed communication partner to other VMs. This may break the semantics of the distributed SecApp hosted by these VMs.
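The effect of "nice" can be reproduced from user space without kernel changes. The sketch below lowers the niceness of a child process via the standard library call os.nice, which maps to the same mechanism as the nice tool; the SecApp command is a placeholder, and the weak-precedence caveat from the discussion above applies unchanged:

```python
import os
import subprocess
import sys

def start_secapp(cmd, niceness=19):
    """Start a SecApp command at the weakest CPU scheduling priority.

    With niceness 19 the process scheduler favors every MainApp OS
    process running at the default niceness 0. This only influences
    CPU shares; it gives no strict precedence guarantee.
    """
    # preexec_fn runs in the child after fork() and before exec(),
    # so only the SecApp process is affected, not the caller.
    return subprocess.Popen(cmd, preexec_fn=lambda: os.nice(niceness))

# Placeholder command standing in for a real SecApp binary: the child
# just reports its own niceness (os.nice(0) reads without modifying).
child = start_secapp([sys.executable, "-c", "import os; print(os.nice(0))"])
child.wait()
```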

The bottom line of this discussion is: There are tools that can be used for giving SecApps fewer resources than the MainApp. Nevertheless, no guarantee can be given that no resource competition occurs. Since a proprietary approach (e.g. patching the kernel) is not allowed and the existing tools do not suffice to avoid resource competition entirely, the strategy for dealing with resource competition is phrased as follows:

Instead of trying to prevent any resource competition, it is the goal to reduce the probability that resource shortage occurs.8 That means that potential resource shortage should be anticipated and counter-measures have to be taken. If resource shortage occurs, then it has to be resolved within a specific time frame.9 The following proposal is made: The metric $Usage_{res}(k, N)$ is used to monitor the current resource usage. Based on this metric, policies are defined that constrain to what extent a SecApp is allowed to use resources, i.e. policies cap the share of resources usable by SecApps. If the total usage of a resource exceeds a certain threshold, a policy defines for how long any SecApp is still allowed to use that resource. Once that time has elapsed, any SecApp must stop using this resource. The to-be-developed framework has to realize the semantics of these policies. In this thesis, it will be

7For more information on kernel-level scheduling please refer to the Linux documentation [71]. Information on the mentioned tools can be found on their corresponding Linux man pages.
8For a scenario of duration t seconds, a retrieval of $Usage_{res}(k, N)$ is executed every second. If n is the number of times Equation 4.1 evaluates to true, then n/t should be small.
9For a scenario of duration t seconds, a retrieval of $Usage_{res}(k, N)$ is executed every second. If Equation 4.1 evaluates to true at a point in time k, then the number i of successive points in time (k + 1, k + 2, ..., k + i) for which Equation 4.1 also evaluates to true should be small.


shown a) how the metric $Usage_{res}(k, N)$ can be used for policies, b) how policies are evaluated and decisions are based on them and c) how the amount of interference can be controlled by tuning policy parameters.
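The policy semantics described above, a usage threshold plus a tolerated exceedance duration, can be captured in a few lines. The threshold, grace period and sample trace below are illustrative assumptions, not values prescribed by the thesis:

```python
def evaluate_policy(usage_trace, threshold=90.0, grace=3):
    """Return, per time step, whether SecApps must stop using the resource.

    A SecApp may keep using the resource while Usage_res exceeds the
    threshold for at most `grace` consecutive seconds; afterwards it
    must release the resource until the usage drops below the threshold
    again (resetting the exceedance counter).
    """
    over, verdict = 0, []
    for u in usage_trace:
        over = over + 1 if u > threshold else 0
        verdict.append(over > grace)
    return verdict

# Usage_res(k, 1) samples, one per second:
trace = [80, 95, 96, 97, 98, 99, 85]
print(evaluate_policy(trace))
# [False, False, False, False, True, True, False]
```

The shortage condition of Equation 4.1 corresponds to the threshold test, with the threshold playing the role of $100 - \epsilon_2$; the grace period bounds the duration of a shortage as demanded in footnote 9.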

To avoid over-simplification, some final remarks need to be made: First, not every resource that processes compete for can be modelled or have its usage influenced in a controlled manner. Depending on the granularity taken into account, even a software mutex, a CPU cache or a CPU register can be regarded as a resource. So, there are hidden, non-modelled resources which OS processes compete for. Second, there are interdependencies between resources. For instance, while a single-threaded OS process is waiting for its currently processed I/O request to be completed, it will not use CPU resources. So the usage of a disk resource is related to the usage of the CPU resource. Another example is that of OS processes located on different nodes in the cluster that concurrently access a shared storage facility, e.g. a Storage Area Network (SAN). Here a bottleneck at the SAN will impact how these OS processes make use of their local CPU resources. A third example is that of a parallel MPI application that has barriers.10 Such barriers synchronize the processing on one node with the processing on other nodes, which means that the resource usage on these nodes has an interdependency. Since not every possible resource and no interdependencies between resources are modelled and accounted for, it is virtually impossible to fully control the amount of interference by basing decisions on the usage of modelled resources only.

Nonetheless, it is assumed that the metric Usage_res(k, N) is a reasonable sensor metric even if only applied to a few resources like CPU, NetOut and NetIn. Further resources of interest are mentioned later in Section 4.3.5. Empirical tests will be made to evaluate whether controlling the SecApp usage of these resources using policies indeed allows the amount of interference to be controlled.
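Footnotes 8 and 9 define two smallness criteria for the interference measurements. As a minimal sketch (the helper name and the representation of threshold violations as a per-second boolean list are illustrative assumptions, not part of the framework), both criteria can be computed as:

```python
def interference_metrics(violations):
    """Given a per-second list of booleans (True = Equation 4.1 held,
    i.e. the usage threshold was exceeded), return the fraction of
    violating seconds (footnote 8) and the longest run of consecutive
    violations (footnote 9). Hypothetical helper, not from the thesis."""
    t = len(violations)
    n = sum(violations)                 # number of violating seconds
    longest = run = 0
    for v in violations:
        run = run + 1 if v else 0       # length of the current violation run
        longest = max(longest, run)
    return (n / t if t else 0.0), longest
```

The first returned value corresponds to n/t from footnote 8, the second to the length i of successive violating points in time from footnote 9; both should remain small.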

4.3.2. Virtualization as Enabling Technology

Instead of running SecApps natively on the cluster nodes, in this thesis it is proposed to run SecApps using platform virtualization, i.e. SecApps run inside of VMs. This has specific advantages:

• Virtual machines act as sandboxed containers encapsulating SecApps. This cleanly separates a MainApp running on the native cluster node from SecApps running inside of VMs. This isolation is provided by the virtualization platform, and the effects of buggy or insecure SecApps are thereby contained to the VM and cannot cause interference towards a MainApp. Security and isolation using VMs are subject to continuous research and improvement; please refer to Chen et al. [25] for more information on current achievements and challenges of this topic.
• VMs can host a multitude of different operating systems and applications. This broadens the spectrum of SecApps that can be run in a cluster. This aspect goes far beyond the trivial example of running Windows-based software in a Linux cluster. As an example, in the HLT-Cluster a potential candidate for a SecApp (AliEn [10]) could not be run due to library conflicts.11 Even though such an issue can be resolved with sufficient manpower and effort, in practice such seemingly minor issues generate unacceptable additional overhead. Virtualization therefore

10 A barrier is an MPI primitive to synchronize distributed MPI processes. This will make MPI processes wait for each other.
11 This observation was made for a previous Ubuntu-OS and AliEn version in 2009 and may not be relevant for current versions.


not only enables running different operating systems but also different incarnations of the same operating system that are explicitly tailored to a single SecApp.
• VMs allow defining a maximum on the amount of certain resources used by a VM. For instance, prior to running a VM, the amount of RAM and the number of CPU cores usable by this VM can be defined. At runtime, the VM will consume no more than the specified maximum for these resources. This allows statically capping the resource usage of hosted SecApps. Thereby even a memory leak in the virtualized application is contained to the VM only.
• VMs offer an abstracted interface12 for modifying hosted SecApps, and thereby also their resource usage. Most VM products offer the complete set of manipulations like start, stop, suspend, resume and cold, warm and hot migration. This enables a flexible manipulation of SecApps without relying on application-specific support and interfaces.
• In addition to this abstraction, which allows for a set of standardized manipulations of SecApps, another aspect is important. Any such manipulation has advantages and disadvantages regarding:
– Their effect towards SecApp state. Stopping a VM will cause the stopping of the hosted application and a loss of not yet finished or saved computations. Suspending a VM, however, allows for a flawless later continuation of computations which were still ongoing when the VM was suspended.
– Their overhead or effect towards resource usage while being applied. Suspending or migrating a VM requires the transfer of state information over network and therefore stresses this specific resource. On the contrary, stopping a VM only uses few CPU and network resources. Modifying the CPU usage of a single VM by adding a CPU core or by using Linux signal sequences (see Section 4.4.4) practically has no overhead at all.
– Their duration. Suspending or migrating a VM may take several seconds while hard-stopping a VM requires only some milliseconds.
It is therefore vital to have a range of different manipulation primitives to choose from according to the circumstances. VMs provide exactly this flexibility.

There are multiple virtualization technologies, products and vendors. Common technologies and products are briefly outlined in Section A.4. These products considerably differ in nomenclature and functionality. In this thesis an abstracted and simplified naming convention is used: A software entity governing states and properties of single VMs on a physical node is called "hypervisor". Via its API a hypervisor offers primitives to manipulate VMs. The main primitives used in this thesis are {start, stop, suspend, resume, migrate}. The correspondence of this naming scheme with the different vendor terminologies is given in Section A.4. Regarding the properties of a VM, it is assumed that a VM has a number of virtual CPU cores, an amount of allocated RAM, a permanent storage (virtual disk), network interfaces and a specific operating system setup. The mentioning of product-specific techniques and implementations of these properties is avoided whenever possible13, because the design principles of the framework are independent from the used virtualization product.

12 Abstracted interface means that no application-specific commands need to be used when modifying e.g. the state of an application.
13 For example, the permanent storage of a VM can be on a local or remote disk; it can be realized via file or block-level access; it can be pre-allocated or allocated on-demand; its size can be modifiable or fixed throughout the life-time of a VM.
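The abstracted hypervisor interface with its primitive set {start, stop, suspend, resume, migrate} could, for instance, be sketched as an abstract base class; the class and method signatures here are illustrative assumptions and not taken from any particular vendor API:

```python
from abc import ABC, abstractmethod

class Hypervisor(ABC):
    """Abstract hypervisor: a software entity governing states and
    properties of single VMs on a physical node. Concrete subclasses
    would map these primitives to vendor-specific API calls."""

    @abstractmethod
    def start(self, vm_id: str) -> None: ...

    @abstractmethod
    def stop(self, vm_id: str) -> None: ...

    @abstractmethod
    def suspend(self, vm_id: str) -> None: ...

    @abstractmethod
    def resume(self, vm_id: str) -> None: ...

    @abstractmethod
    def migrate(self, vm_id: str, target_node: str) -> None: ...
```

A decision instance would then only depend on this interface, keeping the framework's design independent from the used virtualization product.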


4.3.3. Virtual Machine Life-Cycle

With respect to the definitions made in Section 4.2, a VM is a process.14 A VM has a state and can be allocated to at most one node at a time.15 The decision instance decides on allocations of VMs to physical nodes and puts the decisions into practice by means of the local hypervisors. A VM furthermore has a specific life-cycle that describes the states a VM can be in and the state transitions between them. This life-cycle is shown in Figure 4.1 and described in the following. The life-cycle is administrated by the decision instance. The decision instance uses hypervisor calls to initiate most state transitions.16

Figure 4.1.: Life-cycle of a VM: States and transitions.

A VM is created by means of a hypervisor, i.e. using its API. Along with a VM's creation its initial properties (CPU, RAM, storage, network) are defined and the VM is registered with the decision instance. Created VMs, i.e. their storage (virtual disk) and meta-data, are physically located on a shared storage medium and are therefore instrumentable by all hypervisors on nodes in the cluster.

A created VM can be started. Starting a VM involves the allocation of the VM to a specific node by the decision instance and the actual start up of the VM via a call to the local hypervisor. This changes the state of the VM to "running". Now the VM uses resources of that node. A running VM can be stopped, suspended or hot migrated.

14 As a reminder: A process subsumes all local operating system processes and threads that realize a common functionality on a single node. For the virtualization product VirtualBox such a VM process indeed corresponds to a single OS process.
15 The allocation relation alloc(VM, node) was defined in Section 4.2.
16 The state transition from "running" to "stopped" can also be realized without calling the hypervisor, e.g. by using Linux signals.


Stopping a running VM puts the VM in state "stopped" and dissolves the allocation relation between the VM and its allocated node. The stopping can be done by a) using a hypervisor call "soft-stop", which equals a warm shutdown of a computer system, b) using a hypervisor call "stop", which equals a cold shutdown of a computer system, or c) using OS signals. The latter can be compared to unplugging a running computer system from the mains. In any case the current state of the hosted SecApp is lost, meaning that computations which had not been finished, saved or checkpointed are wasted.

Suspending a running VM puts the VM in state "suspended" and dissolves the allocation relation between the VM and its allocated node. A hypervisor call "suspend", which can be compared to a suspend-to-disk for computer systems, will cause the state17 of the VM to be saved. So a flawless later resumption is possible without spoiling ongoing computations.

Migrating a running VM will not change the state of the VM. This transition does however dissolve the allocation relation between the VM and its current node and establishes an allocation relation between the VM and a different node. Thus, the VM keeps on computing but changes the host it is allocated to and whose resources it consumes.18

A suspended VM can be resumed. Resuming a suspended VM involves the allocation of the VM to a specific node and a "resume" call to the local hypervisor. It can be compared to opening a notebook that was previously closed and suspended to disk. The VM and its hosted SecApp continue their processing as if no previous suspend/resume had taken place.

A suspended VM can be reset. This causes the VM to change its state to "stopped". The saved state19 of the previously suspended VM is discarded.

A stopped VM can be started. Starting a VM involves the allocation of the VM to a specific node and the actual start up of the VM via a call to the local hypervisor. It can be compared to pressing the power button of a switched-off computer system, causing the operating system to boot from scratch. This changes the state of the VM to "running". Now the VM uses resources of that node.

A stopped VM can be destroyed. This involves the deregistration of the VM from the decision instance and the deletion of its virtual disk and meta-data.

Summing it up: Virtual machines provide an isolated environment for SecApps. The life-cycle of VMs is administrated by the decision instance. VMs offer standardized and flexible ways of manipulating hosted SecApps and, accordingly, their resource usage. State changes are primarily put into practice using API calls to the hypervisor(s) of a virtualization platform. The different VM state transitions have specific advantages and disadvantages regarding overhead, duration and influence on the computational state of hosted SecApps. Apart from VM state transitions there are other methods to modify the resource usage of SecApps. Such methods are either product-specific (like changing the number of CPU cores at runtime for Xen) or general OS means (like capping the CPU usage of a VM). The potential usage of such additional methods is discussed in Section 4.4.4.

17 Not to be confused with a VM's state in the life-cycle. Here all information needed for a later resume, e.g. the memory contents, CPU queues and registers, I/O queues, et cetera, is meant.
18 Technically this involves minimal interruptions in the computation of SecApps, please see [119, 3] for details.
19 Again, this is not the VM's state in the life-cycle. Instead a file containing information on how to resume the VM is deleted.
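The VM life-cycle of Figure 4.1 can be condensed into a transition table. The following is a minimal sketch in which state and transition names follow the text, while the table representation and helper function are assumptions:

```python
# Allowed VM state transitions from Section 4.3.3; note that "migrate"
# keeps the VM in state "running" while changing its allocated node.
VM_TRANSITIONS = {
    ("created",   "start"):   "running",
    ("running",   "stop"):    "stopped",
    ("running",   "suspend"): "suspended",
    ("running",   "migrate"): "running",
    ("suspended", "resume"):  "running",
    ("suspended", "reset"):   "stopped",
    ("stopped",   "start"):   "running",
    ("stopped",   "destroy"): "destroyed",
}

def apply_transition(state: str, action: str) -> str:
    """Return the follow-up state, or raise if the life-cycle forbids
    the requested transition in the current state."""
    try:
        return VM_TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"illegal transition: {action} in state {state}")
```

A decision instance could use such a table to validate a planned manipulation before issuing the corresponding hypervisor call.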


4.3.4. Secondary Application Abstraction

VMs offer flexible and uniform methods for encapsulating arbitrary non-distributed SecApps and manipulating their resource usage. They are, nonetheless, not sufficient for fully abstracting general SecApps. First, VMs merely provide a hosting environment similar to a physical node with an operating system setup. No meta-information about the hosted SecApp like priorities, user privileges, credentials, runtime conditions or even SLT requirements comes with a VM's description. Second, a VM represents only a single virtualized node. No distributed applications, e.g. an MPI application, can be abstracted using the VM concept only. The "lease" concept, inspired by Grit et al. [49], is therefore used in this thesis. A lease is a container for a set of VMs running a single, potentially distributed SecApp.20 The relation assign(lease, {VM}) defines a temporary correspondence between a lease and a set of assigned VMs. At a specific point in time a VM can be assigned to at most one lease. A lease has multiple properties. For now these are restricted to a few basic properties, which are the (lease) owner, the (lease) priority and the number and type21 of assignable VMs. The decision instance then regards the set of assigned VMs of a lease as a unit, i.e. it makes sure that all assigned VMs are in the same state for any given point in the lifetime of a lease and accord with the requirements given by the lease's properties. The life-cycle of a lease is shown in Figure 4.2.

Figure 4.2.: Life-cycle of a lease: States and transitions.

There is a correspondence between the life-cycles of leases and VMs. The life-cycle of a lease is now explained; corresponding VM state transitions are mentioned:

20 Like a VM can be considered a process according to the description given in Section 4.2, a lease corresponds to an application as mentioned ibidem.
21 The type of a VM simply denotes which SecApp is run by the VM and what configuration (CPU, RAM, etc.) the VM possesses.


When a user wants to run a SecApp, he submits a request to the decision instance. The request is evaluated and a lease is created. The lease is equipped with meta-information: the priority, the owner22 and information about the to-be-assigned VMs. The lease is now in state "created". Upon triggering of an event internal to the decision instance, e.g. a certain date, the lease is transferred to state "queued".

Leases in queued state are those leases that are waiting to be provisioned to the cluster. Once the decision instance decides that for a queued lease a) there are appropriate usable resources in the cluster and b) there is a sufficient number of appropriate VMs available, the lease will be provisioned. The transition to state "provisioned" involves the assignment of VMs to the lease, the allocation of these VMs to nodes and the starting of these VMs. Throughout this thesis, the phrase "starting a lease" is repeatedly used to refer to this procedure.

It is possible that a lease in queued state already has VMs assigned to it. These VMs are in suspended state.23 In this case the transition to state provisioned only involves the allocation of VMs to nodes and the resuming of these VMs. Again, the phrase "resuming a lease" is used to refer to this procedure. The transition of a queued lease to provisioned state will cause the assigned VMs to attain running state.

A lease in queued state can be terminated. This can be due to the owner’s explicit request or upon an event internally triggered by the decision instance (e.g. a date). If the lease has assigned VMs, then this relation is dissolved and these VMs are reset. Then the lease is destroyed, i.e. its data residuals are deleted.

A lease in provisioned state can be preempted. This is done by the decision instance and transfers the lease to queued state. Depending on the type of preemption, one of two things happens: Either all assigned VMs are stopped and both their allocation relations to nodes and the assignment relation to the lease are dissolved. Or all assigned VMs are suspended, their allocation relations to nodes are dissolved but the assignment relation to the lease is maintained. The former is called "stopping a lease", the latter is called "suspending a lease". The phrase "preempting a lease" is used when referring to either of these procedures.

A lease in state "provisioned" can be terminated. Again, this can be due to the owner's explicit request or upon an event internally triggered by the decision instance (e.g. a date). In both cases all assigned VMs are stopped, their allocation relations to nodes and the assignment relation to the lease are dissolved and the lease is destroyed, i.e. its data residuals are deleted.

A lease in provisioned state can be paused. This only happens upon an owner's request. Such a request means that the owner wants a provisioned lease to be paused temporarily, i.e. to stop the SecApp from running in the cluster, but without terminating the lease. Pausing a provisioned lease will put the lease in state "heaped". All assigned VMs are suspended, their allocation relations to nodes are dissolved but the assignment relation to the lease is maintained.

A lease in heaped state can be unpaused. This only happens upon an owner’s request. Such a request

22The user that submitted the request becomes the owner of the lease. 23This is the case if a previously provisioned lease was preempted or paused and later unpaused.

means that the owner wants a currently paused lease to be provisioned in the cluster again. The lease will attain queued state, which means that it is put under control of the decision instance, which again decides on its provisioning.

A lease in heaped state can be terminated. This only happens upon an owner’s request. All assigned VMs are reset, their allocation relation to nodes and the assignment relation to the lease are dissolved and the lease is destroyed, i.e. its data residuals are deleted.

The lease concept will not be defined in a more formal way. A lease merely represents a container of VMs, runs a specific SecApp, can be in different states and carries meta-information. Leases are administrated by the decision instance. Their state transitions are triggered by the owner of a lease or by the decision instance. It is assumed that the VMs appropriate for assignment to a lease exist throughout the runtime of the decision instance, i.e. they had been created beforehand. That is, the creation of VMs is not captured by any phase of the lease life-cycle. Existing approaches [93, 82, 28, 110] regarding runtime creation of appropriate VMs ("appliance automation") are not considered relevant in the context of this thesis.
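The lease container and the state transitions most relevant to the decision instance can be sketched as follows. The class, its attribute names and the simplifications (pause/unpause and termination omitted, resuming treated like provisioning) are illustrative assumptions:

```python
class Lease:
    """Minimal lease container (Section 4.3.4 sketch): meta-information
    plus the set of currently assigned VM identifiers."""

    def __init__(self, owner, priority, vm_count):
        self.owner = owner
        self.priority = priority
        self.vm_count = vm_count        # number of assignable VMs
        self.state = "created"
        self.assigned_vms = set()

    def enqueue(self):
        # created -> queued, triggered by an event internal to the
        # decision instance (e.g. a certain date)
        assert self.state == "created"
        self.state = "queued"

    def provision(self, vms):
        # queued -> provisioned: assign VMs, allocate them to nodes
        # and start (or resume) them
        assert self.state == "queued" and len(vms) == self.vm_count
        self.assigned_vms = set(vms)
        self.state = "provisioned"

    def preempt(self, suspend=True):
        # provisioned -> queued: "suspending a lease" keeps the
        # assignment relation, "stopping a lease" dissolves it
        assert self.state == "provisioned"
        if not suspend:
            self.assigned_vms.clear()
        self.state = "queued"
```

The invariant that all assigned VMs are in the same state at any point in the lease's lifetime would be enforced by the decision instance around these transitions.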

4.3.5. Constraints on Resource Usage: Policies

So far resource producers (e.g. nodes), resources (e.g. CPU) and resource consumers (VMs, leases) have been introduced. Their relations have been described in a basic manner: leases are made up of VMs, VMs are allocated to nodes. The allocation of a VM to a node causes the VM to use resources of that node. Decisions on establishing or dissolving assignment and allocation relations are made by a decision instance. This decision instance incorporates resource usage data to make such decisions. Policies are statements about the desired usage of resources that guide the interpretation of current resource usage values and drive the decisions of the decision instance. Before formalizing the policy concept, the terms "allocation pattern" and "cluster model" are introduced.

Allocation Pattern and Cluster Model

It was mentioned before (see Section 4.3.1) that in order to control the amount of interference, resource usage in the cluster has to be controlled such that competition for sparse resources can be avoided or temporally limited. The only way to control resource usage in a dedicated cluster without control over the MainApp is to modify the resource usage generated by SecApps. VM life-cycle transitions have been introduced as a method to modify the resource usage of SecApps. A decision instance will therefore make use of the VM manipulations {start, stop, suspend, resume, migrate}. To describe a current state of leases, VMs, nodes and their relations, the term allocation pattern is introduced. An allocation pattern is a snapshot of the cluster's nodes, leases and VMs taken for a single measurement interval ∆t. Such an allocation pattern AP is defined as

Definition 4.4 AP := { (lease, {VM}) | assign(lease, {VM}) } ∪ { (VM, node) | alloc(VM, node) }

The state of all leases and VMs contained in an allocation pattern is implicitly given and can be derived from the VM and lease life-cycle definitions stated above. Since an allocation pattern is defined as a snapshot of the cluster with its nodes, leases and VMs for a specific measurement interval, an order relation (<) can be imposed on allocation patterns, defining a timeline: AP_k < AP_{k+1} with ∆(t_k, t_{k+1}) = 1s. AP_k is said to be equal to AP_{k+i} if there is a bijective mapping f such that for every assignment and allocation tuple in AP_k there is a tuple in AP_{k+i}, with f being the identity function.
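Under Definition 4.4 an allocation pattern is just a set of assignment and allocation tuples, so equality of two patterns (with the identity as bijection f) reduces to set equality. A minimal sketch, with hypothetical input representations:

```python
def allocation_pattern(assignments, allocations):
    """Build an allocation pattern AP (Definition 4.4) as a set of tuples.
    assignments: iterable of (lease, frozenset_of_vm_ids) tuples, i.e. the
    assign relation; allocations: iterable of (vm_id, node) tuples, i.e.
    the alloc relation. Representation chosen for illustration only."""
    return set(assignments) | set(allocations)

# Two patterns AP_k and AP_{k+i} are equal exactly when their tuple sets
# coincide, since the bijection f in the definition is the identity.
```

Because sets are unordered, the same cluster snapshot always yields the same pattern regardless of enumeration order.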


Definition 4.5 The transition of one allocation pattern to another, AP_k ↦ AP_{k+i} with AP_k ≠ AP_{k+i}, is called change of an allocation pattern. It is achieved by establishing, modifying or dissolving allocation and assignment relations.

The change of an allocation pattern is merely the execution of lease and VM state transitions. It may involve VM or lease state changes (e.g. for stopping a lease), but not necessarily needs to (e.g. migration of a VM). In this thesis, the terms "manipulating a lease", "manipulating a VM" or simply "manipulation" are often used to denote single state transitions of VMs and leases. An allocation pattern describes the relation between leases24, VMs and nodes and therefore represents a state directly manipulable by a decision instance. There are other interesting properties of a cluster environment which are not represented by an allocation pattern, like modelled resources and their usage. A more comprehensive description of a cluster, its resources, existing leases and VMs is the cluster model. Again, a cluster model is a snapshot of the cluster taken for a single measurement interval ∆t.

Definition 4.6 CM := ({cluster item}, {(res, Usage_res)}, {VM}, {lease}, AP)

In this definition, {cluster item} is the set of all known cluster items (e.g. nodes) and {(res, Usage_res)} is the set of all modelled resources and their usage. AP denotes an allocation pattern. In analogy to the allocation pattern, the cluster model is situated in time, CM_k. A specific cluster model CM_k at a point in time t_k is also called a cluster model incarnation.

Policies

The policy concept has been developed in order to equip the decision instance with criteria to assess whether further SecApps can be run and whether the share of resource usage generated by SecApps is appropriate (with respect to resource competition) or needs to be modified. A policy is a statement on the usage of a resource and specifies for how long SecApps are allowed to contribute25 to a certain level of resource usage. The policy P_CPU = (CPU ≤ 80, 10) × (Node_4, Node_5) represents a typical example. Roughly it means that the CPU utilization on Node_4 and Node_5 must not exceed 80% for more than 10 seconds while SecApps are using this resource. The decision instance has to realize these semantics by adapting the CPU usage generated by SecApps (VMs) running on these nodes. The value 80 is called a threshold on the usage of resource CPU, the value 10 is called timelimit. The threshold and timelimit parameters are reconfigurable. This can be done prior to running SecApps or at runtime of the framework. A policy therefore is a configuration item which can be used to specify constraints on desired resource usage in a cluster.

More formally a policy P is characterized as follows: Let S be a predicate-logical statement that is interpreted in the cluster model. S has the form S := ∀x ∈ CI : A(x). CI denotes a family of cluster item sets, e.g. CI = ({Node_1}, {Node_2}).26 Every x ∈ CI provides a resource. A(x) is a statement about the usage of these resources. The evaluation of both A(x) and S is bound to a specific cluster model incarnation CM_k, i.e. eval : (S, CM_k) ↦ {True, False}. If there is no VM process using the resource provided by a specific x, then A(x) is True. If a VM uses this resource, then the truth value of A(x) depends on the actual usage of the resource given by CM_k. An evaluation eval(S, CM_k) is called a model-check. Assuming a measurement interval of ∆t = 1s, a model-check occurs every second. A

24 According to definition, stopped (queued), created and destroyed leases are not part of the allocation pattern. The same holds for non-assigned VMs.
25 A VM contributes to the usage of a resource res at point in time t_k iff Usage_res(k, N, VM) > 0.
26 CI is a family of sets (instead of a set) because a resource can also be defined as a "cumulated" resource of multiple cluster items. An example is the total number of VMs running in a subcluster.

policy P is now defined as a tuple P = (S, CI, n), with S being the statement over the cluster model incarnations, CI being the family of cluster item sets for which the model-check is done and n ∈ ℕ representing a number of model-checks. A policy can either be valid or invalid. A policy is valid unless it has been found invalid at any point in time. A policy is invalid if

∃m : m ∈ ℕ ∧ ⋀_{i=m}^{m+n} eval(S, CM_i) = False.

The goal of the decision instance is to keep policies valid. The formalism is now explained using an example. Let P = (S, CI, n) with CI = ({Node_1}, {Node_2}), n = 5 and S := ∀x ∈ CI : Usage_CPU_x(k, 1) ≤ 80. The statement S itself can be interpreted as: The usage of the CPU resource on Node_1 and Node_2 within the last measurement interval is below 80%. At a point in time t_{k+i} the policy P itself is valid if, including the first model-check at k = 1, no six consecutive model-checks returning False have occurred. This practically means that no node had a CPU utilization over 80% for more than 5 seconds while some VM was using the CPU resource. In simple words, a policy makes a statement on the resource usage in a cluster and additionally specifies "how long" this statement can be False until the policy is considered invalid. A less complicated terminology to describe policies is used from now on:

P_CPU = (CPU ≤ 80, 10) × (Node_4, Node_5) means that the CPU utilization on Node_4 and Node_5 must not exceed 80% for more than 10 seconds while a SecApp is using this resource. The statement A(x) (here: CPU ≤ 80) defines the threshold27 and n (here: 10) defines the timelimit. Since timelimits were introduced as a number of consecutive model-checks and a model-check is done every second, the timelimit value is interpreted as a time interval in this thesis, e.g. the timelimit in P_CPU : (80, 10) is 10 seconds. When emphasizing the fact that the policy specification contains specific elements x ∈ CI, the phrase "the policy is associated with x" is used. In this particular example the policy is associated with Node_4 and Node_5. At times an even shorter terminology P_CPU : (80, 10) is utilized when discussing policies without mentioning associated cluster items. The operator "≤" is implicitly assumed for A(x) when such a simplified form is used.
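The validity criterion can be sketched as a scan over the chronological model-check results: a policy with timelimit n becomes invalid once n + 1 consecutive model-checks return False. The function name and the input representation are illustrative assumptions:

```python
def policy_valid(model_checks, n):
    """Check validity of a policy P = (S, CI, n) over a chronological
    list of model-check results eval(S, CM_i). The policy is invalid
    once n + 1 consecutive checks returned False (Section 4.3.5)."""
    run = 0
    for ok in model_checks:
        run = 0 if ok else run + 1      # length of the current False run
        if run > n:
            return False                # found n + 1 consecutive Falses
    return True
```

For the example above with n = 5, five consecutive violations keep the policy valid while a sixth in a row invalidates it.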

Example Resources

Policies are defined over resources and their usage. Throughout the previous sections the normalized function Usage_res was used to refer to the usage of a resource. When applying the normalized function to an inhomogeneous cluster, e.g. consisting of nodes with a different number of CPU cores, this leads to information loss: 80% CPU usage on a 16-core node is different from 80% CPU usage on a 2-core node. When specifying policies there needs to be an agreement for every resource type on whether the normalized (relative) Usage_res or the absolute resource usage currentUsage_res is used. In this thesis, the normalized, unitless function Usage_res is used for the resources CPU, NetIn, NetOut, DiskIn and DiskOut when specifying policies. That is, thresholds are interpreted as percentage values for these resources. For any other resource, the absolute functions currentUsage_res and currentProcUsage_res(k, N, proc_i) are used instead. In this case thresholds are interpreted using their native units, e.g. MByte for a threshold on memory usage. Table 4.1 shows several exemplary resources with their producers and native units for the absolute functions currentUsage_res and currentProcUsage_res(k, N, proc_i). For all resources the measurement interval of ∆t = 1s is used and their usage functions are defined using the average over the last measurement interval, e.g. currentUsage_res(k, 1).

27 Despite the choice of the term threshold, this does not mean that A(x) needs to be defined upon metrically scaled variables.


Resource   Producer   Explanation                      Unit
CPU        Node       CPU utilization                  Jiffies/s
Mem        Node       Allocated memory                 MByte
NetIn      Node       Incoming traffic on eth0         MByte/s
NetOut     Node       Outgoing traffic on eth0         MByte/s
DiskIn     Node       Disk0 write rate                 MByte/s
DiskOut    Node       Disk0 read rate                  MByte/s
AllocVM    Node       # VMs allocated                  None
UsedCore   Node       # Cores used by allocated VMs    None

Table 4.1.: Example resources, their producer and intended meaning.
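As an illustration of the CPU resource's native unit, the jiffy counters can be read from the aggregate "cpu" line of Linux procfs; sampling twice with ∆t = 1s and differencing yields jiffies/s. A sketch assuming the /proc/stat format, with a hypothetical helper name:

```python
def cpu_jiffies(proc_stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat and return
    (total, idle) jiffy counters. The fields after 'cpu' are jiffies
    spent in user, nice, system, idle, iowait, ... modes; differencing
    two samples taken one second apart gives jiffies/s (Table 4.1)."""
    fields = proc_stat_line.split()
    assert fields[0] == "cpu", "expected the aggregate cpu line"
    values = [int(v) for v in fields[1:]]
    # count iowait (field 5, if present) towards idle time
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return sum(values), idle
```

In a running system one would read the line from /proc/stat once per measurement interval; here the line is passed in as a string to keep the sketch self-contained.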

The resources CPU28, Mem29, NetIn, NetOut, DiskIn and DiskOut refer to architectural components of a computer system. Their maximum usage is restricted mainly by the properties of the hardware.30 On the contrary, the resources AllocVM and UsedCore are (artificially composed) synthetic resources. AllocVM can be interpreted as allocation slots for VMs on a single node. Likewise, UsedCore denotes the number of CPU cores usable by (virtual CPU cores of) VMs on a single node. Of particular use is a policy on the AllocVM resource: Since policies can be modified at runtime, such a modification, e.g. from P_AllocVM : (2, 10) to P_AllocVM : (0, 10), will enforce the removal of all VMs on the associated nodes within 10 seconds.31 A closer look at the relation between policies and their procedural semantics, i.e. how they trigger and influence the choice of manipulations, will be taken in Section 4.4.

4.3.6. Decision Making Architecture

So far the term decision instance was used for denoting an entity that makes decisions on when to grant or revoke access for SecApps to cluster resources. This term will now be refined. To start with, there are two differing scenarios:

Scenario 1: There are queued leases. A decision instance strives to provision these leases, i.e. giving them access to computational resources. This is done by repeatedly choosing a queued lease and finding allocations for its assigned VMs. Knowledge of the cluster model incarnation and specified policies is required to make such decisions. No automated preemption of running leases is used, i.e. a run-to-completion scheme is employed. This scenario is a classical job scheduling scenario. A functional unit performing this task is called Global Provisioner (GP). The goal of this functional unit GP is to make provisioning decisions in order to use free resources in the cluster and to allow SecApps to compute results. Policy semantics have to be obeyed. The GP is more closely described in the next Section 4.4.2.

28 A jiffy is the duration of one tick of the system timer interrupt and can be used to measure the CPU utilization by calculating how many jiffies/s the CPU spent in various execution modes.
29 The meaning of "allocated memory" depends on the OS, virtual memory subsystem and the programmer's intention. Here the meaning is the Linux-specific value retrievable by querying procfs memory statistics: used − (buffers + cached).
30 Properties of the specific operating system may also play a role, for instance the 4 GByte memory limit of a 32-bit Windows XP OS.
31 Such a manual policy modification will trigger a change of the allocation pattern and is very handy when an administrator wants to free a subcluster from VMs.


Scenario 2: There are running leases. Unless paused upon (lease) owner request, these leases run till termination. However, policies impose restrictions towards the usage of resources. If resource usage increases over the threshold level, the contribution of SecApps (VMs) to this usage needs to be decreased. Basically there are two different ways of achieving policy compliance, i.e. of keeping them valid in a critical scenario (e.g. high CPU usage):

• changing the current allocation pattern or
• modifying the resource usage of single VMs directly

The change of an allocation pattern has been defined before (see Definition 4.5). Feasible lease and VM manipulations that can lead to a decrease in resource usage are the stopping and suspending of leases and the migration of single VMs. Choosing appropriate lease and VM manipulations that keep policies valid requires knowledge of the current cluster model incarnation and the specified policies. The functional unit that decides how to change an allocation pattern is the Global Reconfigurator (GR). This unit recognizes when a change is needed, decides which one to choose and applies it. The GR is described in more detail in Section 4.4.3. The estimation of the effects of VM manipulations on resource usage plays a vital role in this scenario. This estimation is discussed separately in Section 4.4.1.

Changing an allocation pattern in order to modify resource usage has disadvantages: First, it takes considerable time until its effects kick in (e.g. a migration may take several seconds); second, the manipulation itself consumes resources (e.g. a migration uses network resources); and third, such a manipulation involves communication between separate cluster nodes and may therefore fail, e.g. due to network, OS or hardware failures. There are other options to modify the resource consumption of running VMs: The attribution of resource shares can be adapted locally on the node which the VM is running on. This removes the dependency on a remote decision instance and provides greater flexibility. Means provided by the hypervisor (e.g. changing the number of virtual CPU cores) and the OS (e.g. Linux signals) can be used to modify the resource usage of a VM locally. Such an adaption does not rely on knowledge about the current cluster model incarnation (e.g. the resource usage on other nodes); only knowledge of the policies that concern the local node is required. In contrast to the coarse-grained manipulations available to the GR, a fine-grained adaption of resource usage (e.g. the CPU usage of a VM) is possible. A closed-loop adaption of local resource usage is based on feedback control theory and can therefore react flexibly to changes in the total usage of local resources at runtime. The functional unit implementing such a local resource usage adaption is called Resource Usage Adaptor (RUA).
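As an illustration of such a closed-loop adaption, the following sketch shows one iteration of a simple proportional controller; it is not the RUA implementation, and all names and the gain value are assumptions for illustration.

```python
# Illustrative sketch (NOT the thesis implementation): one iteration of a
# proportional feedback loop that a local Resource Usage Adaptor could run.
# vm_caps maps VM names to their current CPU caps in %; total_cpu is the
# measured total CPU usage of the node in %.

def rua_step(vm_caps, total_cpu, threshold=80.0, gain=0.5):
    """Shrink or grow VM CPU caps so that the total node CPU usage
    converges toward the policy threshold."""
    error = total_cpu - threshold          # > 0: node is over the threshold
    new_caps = {}
    for vm, cap in vm_caps.items():
        # distribute the correction equally over all local VMs
        adjusted = cap - gain * error / len(vm_caps)
        new_caps[vm] = max(0.0, min(100.0, adjusted))
    return new_caps

caps = {"vm1": 40.0, "vm2": 40.0}
caps = rua_step(caps, total_cpu=95.0)   # node over threshold -> caps shrink
```

Run periodically, such a loop reacts to usage changes without any global knowledge, which is exactly the property the text attributes to the RUA.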

Three functional units have been introduced: GP, GR and RUA. These units constitute what has been called "decision instance" up to this point. Having described these units from a functional perspective, a tentative introduction32 is now given for the architectural design of the software framework. The software framework will from now on be referred to by the terms VM-Scheduler, VM-Scheduling Framework or simply framework.

32A more detailed coverage of the software architecture will be given in Section 5.2.


Figure 4.3.: Basic framework architecture with focus on distinction between Global Scheduler and Local Controller components.

Figure 4.3 shows the basic architectural components of the framework. GP and GR are part of the architectural component "Global Scheduler". This component is located on a global management server. Both units therefore possess a global view of the cluster and make decisions involving more than a single node. For instance, when a lease is about to be provisioned, target nodes (allocations) for all of its assigned VMs need to be found; a global instance (GP) with knowledge about the cluster state is therefore appropriate. The RUA is part of the architectural component "Local Controller" depicted in Figure 4.3. An instance of the Local Controller is located on each node governed by the VM-Scheduling Framework. The knowledge scope of the RUA encompasses the policies for the specific node and the locally running VMs. Global Scheduler and Local Controller (i.e. GP, GR and RUA) interact by exchanging information about the current resource usage, the state of VMs/leases and the currently active policies.

The architectural approach therefore is a hierarchical, two-layered one. The provisioning and preemption of leases is decided globally. The resource usage of running VMs is adapted locally as long as a local adaption is sufficient. This principle, subsidiarity, is considered a key factor for building an efficient framework. If local adaption does not suffice to achieve policy compliance, the GR steps in and overrides local adaptions via allocation pattern changes. Regarding the functional units hosted by the Global Scheduler, the GP is subordinated to the GR, which means that decisions by the GP (to provision a lease) can be blocked or canceled by the GR for the sake of policy compliance, i.e. in order to prevent strong interference. In the next section, the algorithms and coordination of these functional units (GP, GR, RUA) are discussed. Since the identified main challenge for this thesis is to prevent a MainApp from being affected by running SecApps, that section especially focuses on the dynamic resource allocation realized by GR and RUA.


4.3.7. Summary

The basic concepts, terms and entities of this thesis were explained in this section and are now summarized. The main challenge of this work is to avoid the potential impact of SecApps on a MainApp. This requirement for a valid solution was formalized using the interference concept. The argument was made that assessing the impact of SecApps on a MainApp requires the statistical evaluation of a performance metric of the MainApp. The developed interference terminology then allows to make statements on whether a MainApp is affected by additionally run SecApps, to what extent it is affected and whether this impairment leads to incorrect functionality. It was shown that the interference concept is feasible for ex-post assessments of past scenarios, but is insufficient as a basis for runtime decisions on how to exploit unused resources. Potential causes of interference were considered and it was concluded that two necessary conditions have to be met in order to prevent strong interference: isolation of the MainApp from SecApps and avoidance of resource competition. To meet the first condition, the usage of platform virtualization was proposed for running SecApps: A SecApp runs in VMs, and multiple VMs are assembled into a lease. A lease therefore represents a single, potentially distributed SecApp. A lease is equipped with additional meta-data that details the requirements of the SecApp. Besides the fact that virtualization efficiently isolates MainApp and SecApps and thereby confines the negative effects of buggy and insecure software, the usage of VMs and leases also provides uniform and standardized interfaces for manipulating arbitrary types of SecApps.

To meet the second condition, the sensor metric Usage_res(k, N) was introduced. This metric, i.e. the knowledge of the past and current resource usage, serves as the basis for making decisions on how, where and when to give SecApps access to resources. This metric has the advantage that it is MainApp-independent, i.e. of generic nature, but at the same time indicates when there is a potential risk of interference. It is desirable to prevent any kind of resource competition, i.e. when a SecApp is running on a node, the resource usage should be below 100% at all times. In this case, a SecApp does not steal resource shares, e.g. CPU cycles, from a MainApp. However, without intrusive kernel-level techniques that are not permitted in a dedicated cluster, this is difficult to achieve for some resources (e.g. network). Resource competition is therefore counteracted by limiting the time span for which a SecApp is allowed to use a resource if the total resource usage exceeds a certain "threshold", e.g. 90%: If the threshold is exceeded, the SecApp needs to reduce its used share of that resource. If the overall resource usage stays above that threshold for a certain time ("timelimit"), the SecApp needs to reduce its share to 0%. This requirement is formalized using the configuration item "policy". For every resource, there is a policy with configurable threshold and timelimit parameters. By customizing these parameters, the impact of SecApps on a MainApp (the amount of interference) can be controlled. This assumption is subject to later empirical evaluation (see Chapter 6). A two-layered architecture has been proposed for realizing the policy semantics and making runtime decisions on when to grant or revoke access to resources for SecApps. The functional units GP and GR are part of a "Global Scheduler". They decide on changes to the allocation pattern, i.e. they initiate lease state transitions and VM manipulations like start, stop, suspend, resume and migrate.
The functional unit RUA is part of the "Local Controller" that runs on each cluster node. The RUA adapts the resource usage of locally running VMs. The existence and cooperation of these functional units serves the purpose of keeping policies valid, and hence controlling interference, while at the same time allowing SecApps to use cluster resources and to generate additional results. The following section describes the functional units and their cooperation.
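The policy semantics summarized above (threshold plus timelimit) can be illustrated with a minimal, hypothetical model-check; the class layout and the one-check-per-second granularity are assumptions, not the framework's implementation.

```python
# Sketch of the policy semantics: a policy (threshold, timelimit) demands
# that SecApps reduce their share once the usage exceeds the threshold,
# and reduce it to 0% once the usage has stayed above the threshold for
# more than `timelimit` consecutive checks. Hypothetical minimal model.

class Policy:
    def __init__(self, threshold, timelimit):
        self.threshold = threshold      # e.g. 90 (%)
        self.timelimit = timelimit      # e.g. 10 (consecutive checks)
        self._failed_checks = 0

    def check(self, usage):
        """Return 'ok', 'reduce' (SecApps must shrink their share) or
        'revoke' (SecApps must go to 0%)."""
        if usage <= self.threshold:
            self._failed_checks = 0
            return "ok"
        self._failed_checks += 1
        if self._failed_checks > self.timelimit:
            return "revoke"
        return "reduce"

p = Policy(threshold=90, timelimit=3)
states = [p.check(u) for u in (85, 95, 95, 95, 95)]
# one passing check, three failing checks within the grace period, then revoke
```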


4.4. Resource Allocation Functionality

The VM-Scheduling Framework consists of two major architectural components: the Global Scheduler and the Local Controller. These host the functional units, which are the Global Reconfigurator (GR) and the Global Provisioner (GP) on the side of the Global Scheduler and the Resource Usage Adaptor (RUA) on the side of the Local Controller. The algorithmic approaches chosen for these units and their coordination are portrayed in this section.

4.4.1. Feasibility Estimator

In the last section it was stated that in order to control interference, appropriate policies need to be specified and their compliance must be enforced. Both GP and GR make use of lease state transitions and corresponding VM manipulations ∈ {start, stop, suspend, resume, migrate} for manipulating leases and realizing policy compliance. However, the execution of such manipulations carries a certain overhead in terms of resource usage. Based on this consideration, the Feasibility Estimator (FE) is introduced. This facility is part of the architectural component Global Scheduler and is used by both globally acting units GP and GR to assess the appropriateness of VM and lease manipulations with respect to policy compliance. In order to find changes of the allocation pattern that allow for the provisioning of leases or the prevention of policy invalidations, it is necessary to estimate the resource usage in the cluster model at future points in time. The FE takes as input a proposed set of VM manipulations and predicts the resulting future resource usage under the assumption that these manipulations are instantly applied and no other changes of the allocation pattern occur. For a resource like AllocVM33 this estimation is simple: If a single VM is to be suspended on a node, the usage of AllocVM is decremented by 1 once the manipulation is finished. That is, the FE only needs to estimate the duration of manipulations to know the usage of the resource AllocVM at a future point in time. For resources like CPU, NetIn and NetOut this is more complicated: The manipulation itself, not just its outcome, also impacts the usage of these resources. In the following, a simple model is given for calculating the duration of manipulations and their overhead on the resources CPU, NetIn and NetOut, thereby enabling the prediction of future resource usage on nodes.
This model was not developed as a general-purpose model for estimating these factors. It is a prototypical approach for estimating these effects specifically for the test environment used in this thesis to evaluate the framework (see Section 6.3.1). A best-practice proposal of how to adapt the FE settings for a different cluster environment is given in Section A.5.

The FE is queried by the functional units GP and GR at time t_k. These pass a set of proposed VM manipulations, a future point in time t_{k+i} and a resource as parameters to the FE. The FE returns the predicted usage of that resource at t_{k+i}. The FE knows about the current cluster model incarnation and the currently executed ("ongoing") VM manipulations at time t_k. The starting point for describing the functionality of the FE is to capture the concept of VM manipulations using the process and resource usage notion introduced in Section 4.2. A VM manipulation is a state transition in the life-cycle of a VM. Such a transition has a duration. The duration of a single VM manipulation is described by its starting point t_a and its ending point t_b with duration = t_b − t_a. As long as a VM manipulation is not finished, the VM is considered to be in its old state. A manipulation ∈ {start, stop, suspend, resume, migrate} is ongoing at t_k if t_a ≤ t_k ≤ t_b.34 During a manipulation, there exists a manipulation process on the concerned

33The resource AllocVM represents a number of slots for running VMs on a node. See Table 4.1 for reference.
34For ease of understanding, points in time and manipulations are not indexed in order to refer to each other.


node, manipProc ∈ {startProc, stopProc, suspProc, resumeProc, migOutProc, migInProc}, that realizes the primitive. These manipulation processes are abstractions and do not necessarily exist as separate OS processes. For a migration there are two such processes: migOutProc on the source node and migInProc on the target node. While all manipulation processes cause a certain CPU usage, some processes like migInProc or resumeProc generate usage of the resource NetIn and others like migOutProc and suspProc generate usage of the resource NetOut on a node.
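The manipulation notion just introduced (start point t_a, end point t_b, the "ongoing" predicate) can be captured in a few lines; the class and attribute names are illustrative, not taken from the framework.

```python
from dataclasses import dataclass

# Illustrative sketch of the manipulation notion: a manipulation has a
# type, a start t_a and an end t_b; it is ongoing at t_k iff
# t_a <= t_k <= t_b. Names are hypothetical.

@dataclass
class Manipulation:
    kind: str      # e.g. "suspend", "migrate"
    t_a: float     # starting point
    t_b: float     # ending point

    @property
    def duration(self):
        return self.t_b - self.t_a

    def is_ongoing(self, t_k):
        return self.t_a <= t_k <= self.t_b

m = Manipulation("suspend", t_a=10.0, t_b=16.0)
m.duration        # -> 6.0
m.is_ongoing(12)  # -> True
```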

Equation 4.2 is used by the FE at a point in time t_k to calculate the usage of a resource at a future point in time t_{k+i}.

$$
\begin{aligned}
\mathit{Usage}_{res}(k+i,\,1) = \min\Big(100,\; & \mathit{ProcUsage}_{res}(k+i,\,1,\,\mathit{MainApp}) \\
& + \sum_{n=1}^{N} \mathit{ProcUsage}_{res}(k+i,\,1,\,VM_n) \\
& + \sum_{r=1}^{R} \mathit{ProcUsage}_{res}(k+i,\,1,\,VM_r) \\
& + \sum_{m=1}^{M} \mathit{ProcUsage}_{res}(k+i,\,1,\,manipProc_m) \Big)
\end{aligned} \tag{4.2}
$$

This equation consists of four ProcUsage_res terms whose sum (capped at 100%) represents the usage of a resource res at a future point in time t_{k+i}. These terms are now described. A specific term is referenced by the line number in which it occurs in the equation, e.g. line 1 references ProcUsage_res(k + i, 1, MainApp).

Line 1 This term represents the resource usage of the running MainApp at t_{k+i}.

Line 2 This term represents the summed resource usage of every VM_n that is in running state and contributes to resource usage at t_k and that still runs or is subject to an ongoing manipulation at t_{k+i}.

Line 3 This term represents the summed resource usage of every VM_r that does not contribute to resource usage at t_k, but is expected to be running and contributing to resource usage at t_{k+i} (e.g. newly started, resumed or migrated VMs).

Line 4 This term represents the summed resource usage of every ongoing manipulation manipProc_m at t_{k+i}.

At point in time t_k the FE only knows the current resource usage values of the MainApp and the running VMs. Furthermore, the FE knows about ongoing and proposed manipulations.35 For calculating the future resource usage using Equation 4.2, it is therefore required to estimate the future resource usage values represented by the four ProcUsage_res terms. It is now stated how these estimations are made.

• The future resource usage of the MainApp at t_{k+i} (represented by the term in line 1) is projected by using the MainApp's resource usage at t_k averaged over the last 5 measurement intervals: ProcUsage_res(k+i, 1, MainApp) = ProcUsage_res(k, 5, MainApp)

• The future resource usage of any VM that is in running state at t_k and is still running or being manipulated at t_{k+i} (represented by the term in line 2) is projected by using the VM's resource usage at t_k averaged over the last 5 measurement intervals: ProcUsage_res(k+i, 1, VM_n) = ProcUsage_res(k, 5, VM_n)

35Proposed manipulations are manipulations transmitted as parameters by the caller of the FE.


• The future resource usage of a VM that does not run at t_k but is expected to be running at t_{k+i} (represented by the term in line 3) was, for a resource like Mem, defined upon creation of that VM, e.g. 1 GB. For resources like NetIn, NetOut and CPU, where no maximum usage can be defined upon creation, the future resource usage of a VM is set to 20%: ProcUsage_res(k+i, 1, VM_r) = 20. Every such newly running VM is subject to resource usage adaption by the functional unit RUA. The value of 20% has shown to be a feasible estimate for the initial resource usage of newly started, resumed or immigrated VMs in empirical tests.
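A minimal sketch of the prediction in Equation 4.2, with the estimation rules above reduced to plain numeric inputs; the function name and structure are illustrative, not the thesis code.

```python
# Hedged sketch of the Feasibility Estimator prediction (Equation 4.2):
# the predicted usage of one resource at t_{k+i} is the capped sum of the
# MainApp term, the running VMs, the newly running VMs and the ongoing
# manipulation processes. In the thesis the inputs come from 5-interval
# averages, the 20% default and Table 4.2; here they are plain numbers.

def predict_usage(mainapp_avg, running_vm_avgs, new_vm_estimates,
                  manip_proc_usages):
    total = (mainapp_avg                 # line 1: MainApp
             + sum(running_vm_avgs)      # line 2: VMs running at t_k
             + sum(new_vm_estimates)     # line 3: VMs newly running at t_{k+i}
             + sum(manip_proc_usages))   # line 4: ongoing manipulations
    return min(100.0, total)

# e.g. MainApp at 50%, one running VM at 15%, one resuming VM estimated
# at 20%, plus a resumeProc costing 15% CPU (cf. Table 4.2):
predict_usage(50, [15], [20], [15])   # -> 100.0 (capped)
```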

Since the last two items incorporate the state of a VM, e.g. whether it is running or currently being manipulated at t_{k+i}, knowledge about the duration of manipulations is required. Additionally, the term ProcUsage_res(k+i, 1, manipProc_m) in the fourth line of Equation 4.2 requires knowledge about the resource usage generated by an ongoing manipulation. The duration and resource usage of a single manipulation are estimated in the following manner: It is assumed that a certain resource usage value can be attributed to any type of manipulation process.36 This value is retrieved by measuring the resource usage while conducting a single manipulation in an idle environment, i.e. nodes, storage and network are idle except for the single VM and its manipulation. For instance, if a suspend manipulation takes 10 seconds, the CPU usage is measured and the highest value of the 10 measurement intervals is taken. The difference between this value and the CPU usage generated by the VM prior to suspending it is considered as the CPU usage of the suspend process in an idle environment. Since the situation is commonly more complicated in real life, i.e. multiple concurrent manipulations occur, the MainApp uses resources and multiple VMs are running, there will most probably be competition for resources. This is assumed to prolong the duration of the manipulation and to decrease its resource usage, both averaged over the whole duration and for every single measurement interval. Therefore, the value obtained for a specific manipulation of a single VM in an idle environment is considered to be a maximum value for the resource usage of the respective manipulation process. This value serves as an estimate for the resource usage of a specific manipulation process at t_{k+i}. The values obtained for different manipulation processes and resources in the experimental environment (see Section 6.1) are given in Table 4.2.

manipProc    CPU(%)   NetIn(%)   NetOut(%)
suspProc        2        0         31
migOutProc     30        0         31
stopProc        0        0          0
migInProc      12       30          0
resumeProc     15       25          0
startProc      20       20          0

Table 4.2.: Additional resource usage caused by different manipulation processes. These values are specific for the KIP cluster and a VirtualBox VM with one virtual CPU core, 512 MB RAM and running GNU C Compiler (GCC).

In Table 4.2 it can be seen that multiple values are equal to zero. This either means that the specific manipulation does not use a resource (e.g. NetIn for suspProc) or that starting the manipulation immediately causes a drop in resource usage (e.g. CPU for stopProc).37 Please refer to Section A.5 for a more detailed description of how these values were retrieved and how they can be interpreted.

36Such values are specific for a cluster and VM configuration. Here they refer to the test environment used in this thesis (see Sections 6.3.1, 6.1).
37For instance, in the case of stopping a VM, the VM instantly stops consuming CPU cycles and the overall usage of
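For illustration, the values of Table 4.2 could be held by the FE as a simple lookup table of conservative maxima; this representation is hypothetical, not the framework's data structure.

```python
# Table 4.2 as a lookup of maximum (conservative) overhead values in %,
# measured for the KIP cluster / VirtualBox setup described in the text.
# The (CPU, NetIn, NetOut) tuple layout is an illustrative choice.
MANIP_OVERHEAD = {
    #              CPU  NetIn  NetOut
    "suspProc":   (2,    0,    31),
    "migOutProc": (30,   0,    31),
    "stopProc":   (0,    0,     0),
    "migInProc":  (12,  30,     0),
    "resumeProc": (15,  25,     0),
    "startProc":  (20,  20,     0),
}
```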

While the values in Table 4.2 are considered as maximum values for the resource usage of manipulations and can be used for a conservative estimation, the durations found for manipulating a single VM in an idle environment rather represent minimum values, which probably grow if multiple manipulations occur at the same time. In order to retrieve a reasonable estimate for the duration of single manipulations, multiple experiments with concurrent manipulations have been executed. These are described in Section A.5. For interpreting the resulting estimation, several variables are introduced first:

StartsAndResumes = <total number of ongoing and proposed starts and resumes>
SuspAndStops = <total number of ongoing and proposed suspends and stops>
ManipsSrcNode = <number of ongoing and proposed suspends, stops and outgoing migrations on the source node>
ManipsTrgNode = <number of ongoing and proposed resumes, starts and incoming migrations on the target node>

The variable StartsAndResumes is the summed number of manipulations with read access to the shared storage that are already ongoing at t_k or are proposed by the caller of the FE. The variable SuspAndStops is constructed accordingly for the manipulations with write access to the shared storage. The variables ManipsSrcNode and ManipsTrgNode represent the number of ongoing and proposed manipulations on a node that is subject to a VM migration.38 Based on linear regression using these variables, a very simple model was established that enables the FE to estimate the duration of a single manipulation:

duration(suspProc) = 5.3 + SuspAndStops [s] (4.3)
duration(stopProc) = 0.35 + SuspAndStops/12.5 [s] (4.4)
duration(resumeProc) = 9.2 + StartsAndResumes [s] (4.5)
duration(startProc) = 21 + StartsAndResumes [s] (4.6)
duration(migOutProc) = 0.85 + 3.75 · max(ManipsSrcNode, ManipsTrgNode) [s] (4.7)
duration(migInProc) = 0.85 + 3.75 · max(ManipsSrcNode, ManipsTrgNode) [s] (4.8)

The variable SuspAndStops is used to estimate the duration of a suspend and a stop manipulation (see Equations 4.3, 4.4). The variable StartsAndResumes is used to estimate the duration of a resume or start manipulation (see Equations 4.5, 4.6).39 The proposed estimations for these manipulations assume that the network route and shared storage access are bottlenecks, that read and write accesses do not interfere and that an increased number of concurrent manipulations increases the duration of a single manipulation. For a migration, the estimation only depends on the number of manipulations that are concurrently being executed on the source and target nodes of the migration, which is reflected by the variables ManipsSrcNode and ManipsTrgNode (see Equations 4.7, 4.8). The effects of staged manipulations, e.g. a single suspend that has already been running for 5 seconds at t_k when the FE tries to estimate the duration of another suspend, are not taken into account. Every manipulation already running at t_k is considered to have the same impact on the duration as a newly started one.
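Equations 4.3-4.8 translate directly into a small helper; this sketch only restates the regression model, with variable spelling following the definitions above.

```python
# Sketch of the duration model (Equations 4.3-4.8); the linear
# coefficients are those given in the text, all results in seconds.

def durations(starts_and_resumes, susp_and_stops,
              manips_src_node, manips_trg_node):
    mig = 0.85 + 3.75 * max(manips_src_node, manips_trg_node)
    return {
        "suspProc":   5.3 + susp_and_stops,            # Eq. 4.3
        "stopProc":   0.35 + susp_and_stops / 12.5,    # Eq. 4.4
        "resumeProc": 9.2 + starts_and_resumes,        # Eq. 4.5
        "startProc":  21 + starts_and_resumes,         # Eq. 4.6
        "migOutProc": mig,                             # Eq. 4.7
        "migInProc":  mig,                             # Eq. 4.8
    }

d = durations(starts_and_resumes=2, susp_and_stops=4,
              manips_src_node=1, manips_trg_node=2)
# e.g. d["suspProc"] == 9.3 and d["migInProc"] == 8.35
```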

the CPU drops. Even though the stopping of the VM itself also uses CPU cycles, this amount is relatively small in comparison to the CPU usage of a running VM. Therefore, the CPU overhead of a stopProc is considered to be zero.
38The variables ManipsSrcNode and ManipsTrgNode are node-specific. So for every node that is the target or the source node of a VM migration, these variables need to be calculated separately.
39The duration of a start manipulation is determined by t_a = start of the manipulation and t_b = start of the SecApp inside the VM.


Using both the estimated duration and the estimated resource usage of single manipulations, the future usage of a resource at a point in time t_{k+i} can be predicted using Equation 4.2. This model and the used estimations are fitted to the experimental setup used in the framework evaluation (see Chapter 6) and estimate future resource usage sufficiently well. For a more detailed description and a proposal on how to adapt the estimations to other cluster environments, please refer to Section A.5 in the Appendix.

4.4.2. Global Provisioner

The Global Provisioner (GP) has two main purposes: First, it processes requests of users ("lease owners") to submit, pause, unpause and terminate leases. The basic semantics of these lease primitives have been introduced in Section 4.3.3 and are not repeated here. Implementational details can be found in Section 5.3. Second, the GP is the instance that makes provisioning decisions, i.e. it decides when to assign which VMs to what lease and which nodes to allocate these VMs to. Once such a decision is made, it is applied to the cluster environment. For this purpose, i.e. the finding and applying of a change of the allocation pattern, the GP initially calculates the priority of a lease upon submission of the lease request. The priority is calculated such that it incorporates the known properties of a lease, such as its urgency, its size (number of to-be-assigned VMs), the owner's credentials and so forth. In this proof-of-concept approach, the formula Prio(lease) = ownerPrio + 0.5 × size was used to calculate the priority of a lease. It incorporates the owner's importance ownerPrio ∈ [1, 10] and the number of VMs that are to be assigned to this particular lease, thereby giving big leases (with many VMs) and leases of important lease owners an advantage over other leases. No updating of priorities during a lease's lifetime is considered, e.g. by incorporating the past queue time of a lease ("priority aging") into the priority calculation. While this is a desirable feature in order to avoid the starvation of leases, it is assumed to be unnecessary for the current proof-of-concept because the main focus of this thesis lies on the avoidance of strong interference. In future work such improvements should be considered. Once a priority is calculated, the lease acquires queued state. This literally puts the lease in a queue.
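The proof-of-concept priority formula can be stated in two lines (illustrative sketch; the function name is hypothetical):

```python
# Sketch of the proof-of-concept formula Prio(lease) = ownerPrio + 0.5 * size,
# with ownerPrio in [1, 10] and size = number of to-be-assigned VMs.

def lease_priority(owner_prio, num_vms):
    assert 1 <= owner_prio <= 10
    return owner_prio + 0.5 * num_vms

lease_priority(7, 4)   # -> 9.0
```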
This queue consists of leases in queued state, ordered by descending priority, and is therefore called the priority queue. It is processed periodically using priority scheduling, which means that the leases in the queue are processed in descending priority order. Leases with the same priority are considered in first-come, first-served order. The whole queue is processed lease by lease. For each lease, the GP tries to find assignable VMs and nodes which these VMs can be allocated to. The whole queue is processed regardless of whether a specific lease is found to be provisionable.40 The processing of lower-prioritized leases, even if a higher-prioritized lease is not provisionable, is known under the term backfill. Since a MainApp with an unknown resource usage behavior exists, thereby preventing the long-term prediction of resources available in the future, any advance reservation or plan-ahead techniques are pointless. During each processing of the priority queue, appropriate lease manipulations are collected. Once the whole queue is processed, the set of found lease manipulations is executed, i.e. applied to the cluster environment. Accordingly, each such periodic processing is called a "provisioning cycle". A provisioning cycle is divided into a decision phase, during which the whole queue is processed and applicable manipulations are collected, and an execution phase, during which the found manipulations are applied. For every lease in the queue, a first-fit search heuristic is used during the decision phase. This search traverses the solution space, i.e. the set of possible allocations of VMs to nodes for the current queued lease, and evaluates the appropriateness of the manipulation. The appropriateness of a lease manipulation is checked by using the previously introduced Feasibility Estimator (see Section 4.4.1), taking into account the overhead induced by the manipulations themselves and the assumed resource usage caused by to-be-run VMs. A set of lease manipulations is appropriate if, according to the FE, it will not cause policies to be invalidated after applying the lease manipulations. The first appropriate manipulation of a lease is chosen, independently of whether this allows for the additional provisioning of lower-prioritized leases during the very same provisioning cycle. The method applied is therefore a first-fit heuristic with backfill. If during a provisioning cycle a manipulation for a high-priority lease has already been found appropriate, lower-prioritized lease manipulations are checked for appropriateness taking these already found manipulations into account. That means that for lower-prioritized leases a set of manipulations including those of higher-priority leases is checked, rather than only the single manipulation of the current lease. A run-to-completion scheme is applied, i.e. no manipulations of running leases are considered by the GP. This means no preemptive actions are taken in order to provision queued leases. It also means a lease manipulation considered by the GP only uses the start and resume primitives for manipulating the VMs assigned to a lease.

40A lease is provisionable if a change of the allocation pattern could be found that transfers this lease into provisioned state.

Once the whole priority queue is processed and a set of manipulations has been found, the decision phase ends and the GP enters the execution phase. During this phase, the GP tries to apply the manipulations to the cluster environment. For any lease manipulation, the execution may succeed or fail. Moreover, the execution itself can take some time until the outcome is known. Experiments showed that e.g. resuming a single VM can take 10 or more seconds. As elaborated later in Section 4.4.5.1, there is a global lock on the cluster model during the decision phase and a VM-level lock during the execution phase. These locks in practice ensure that until the execution of a manipulation of a specific lease is finished, the lease remains in queued state but is not considered by later provisioning cycles. Once the execution of a lease manipulation is finished, the lease enters provisioned state if the execution succeeded, or remains in queued state otherwise. In the latter case, the lease is considered for provisioning again by future provisioning cycles.41 The GP is subordinated to the functional unit described next, the GR. While the GP only takes care of provisioning queued leases and therefore employs a rather simple algorithmic scheme, the GR is responsible for avoiding policy invalidations potentially caused by provisioned leases. The subordination therefore refers to the importance of the functional unit with respect to the thesis goals, but also to the fact that provisioning cycles of the GP can be interrupted during their decision phase by the GR.
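The decision phase described above, i.e. first-fit with backfill over a priority-ordered queue, can be sketched as follows; find_manipulation and fe_approves are hypothetical stand-ins for the first-fit search over allocations and the Feasibility Estimator check.

```python
# Hedged sketch of a GP decision phase: process the queued leases in
# descending priority order and collect appropriate manipulations
# (first-fit with backfill). Not the thesis implementation.

def decision_phase(queued_leases, find_manipulation, fe_approves):
    chosen = []   # manipulations already found for higher-prioritized leases
    for lease in sorted(queued_leases,
                        key=lambda l: l["prio"], reverse=True):
        manip = find_manipulation(lease)
        # a lower-prioritized lease is checked together with the already
        # chosen manipulations of higher-priority leases (backfill)
        if manip is not None and fe_approves(chosen + [manip]):
            chosen.append(manip)
        # a non-provisionable lease is skipped; the queue continues
    return chosen

leases = [{"prio": 5.0, "id": "b"}, {"prio": 9.0, "id": "a"}]
find = lambda l: {"lease": l["id"], "ops": ["start"]}
approve = lambda manips: len(manips) <= 1   # pretend only one lease fits
decision_phase(leases, find, approve)       # -> only lease "a" is chosen
```

Note how the queue is always traversed completely: a lease that does not fit is skipped rather than blocking the queue, which is the backfill behavior described in the text.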

4.4.3. Global Reconfigurator

The Global Reconfigurator (GR) is responsible for making sure that no policy invalidation occurs. It is supported in this goal by the GP, which tries to provision only queued leases that are unlikely to cause policy invalidations, and by the Resource Usage Adaptor (RUA), which continuously adapts the resource usage caused by running VMs on the nodes. However, there are cases where the local adaption of VM resource usage does not suffice to maintain policy validity.

First, policies can be adapted at runtime. If an administrator changes a policy PAllocVM : (2, 10) to PAllocVM : (0, 10) for a specific node, trying to enforce the number of running VMs to decrease to 0 within 10 seconds, a local resource usage adaption is not feasible. In such a case, the GR has to migrate the concerned VMs to other nodes or preempt the lease(s) the VMs are assigned to.

41If a lease manipulation fails potential residuals of this failed manipulation are removed first. Then the lease is considered for being provisioned again. See Section 5.4 for more details.

61 4. Conceptual Work

Second, for a given policy PCPU : (80, 10) the RUA continuously tries to adapt the CPU usage of VMs such that the total usage of the CPU resource on a node does not exceed 80% for more than 10 seconds (i.e. avoiding 10 consecutively failing model-checks). If the MainApp now constantly demands a very high share of CPU resources, local adaption of VM-generated CPU usage might not suffice to keep the CPU usage below 80%. Based on the policy definition, the CPU usage of all running VMs would then have to be set to 0%. This, however, is not considered feasible, e.g. because it breaks inter-VM communication for a lease. Hence, the GR has to step in and take responsibility for policy validity by migrating VMs or preempting concerned leases.

Multiple questions arise in this context:

1. How and when does the GR recognize a potential policy invalidation which it has to take care of?

2. How does the GR prevent a policy invalidation? If multiple conflicting solutions exist, what is the criterion for their goodness and how is a choice made?

3. The execution of manipulations incurs considerable overhead in terms of time and resource usage. How is a deadline (timelimit) miss avoided? How is transactionality provided and to what extent can computational parallelism be used? How is it ensured that manipulations do not cause subsequent invalidations of other policies?

Items 1 and 2 are covered in this subsection detailing the GR: First, the recognition of potential policy invalidations is addressed. Second, the prevention of policy invalidations is explained by sketching the algorithm used by the GR. The questions of item 3 can only be answered after all functional units of the framework have been introduced and their coordination has been explained. They are therefore answered at the end of this section (see 4.4.5, 4.4.6.3).

4.4.3.1. Recognition Phase and Triggering

Before starting with the recognition of potential policy invalidations, the terms for describing the basic behavior of the GR are introduced: In accordance with commonly accepted terminology, the computational behavior of the GR is called scheduling. Like the phases of a provisioning cycle of the GP (decision + execution phase), the GR and its scheduling can be described along three phases: (a) During the recognition phase, periodic or externally triggered model-checks are performed. When certain criteria are met, the scheduling enters the decision phase. (b) During the decision phase, the GR searches for an appropriate change of the current allocation pattern which is likely to prevent policy invalidation. Once such a manipulation is found, the execution phase is entered. (c) During the execution phase, the found change of the allocation pattern is applied to the actual cluster environment. These three phases together are called a "reconfiguration cycle". When referring to any of the introduced cycles (provisioning cycle of the GP and reconfiguration cycle of the GR) without specifying which one is meant, the term "manipulation cycle" is used.

The recognition of potential policy invalidations is situated in the recognition phase. To know about potential policy invalidity, the cluster model is checked periodically, i.e. it is evaluated whether the statement made by a policy on resource usage (threshold) holds and whether VMs contribute to that usage. Policy invalidity occurs if a number (timelimit) of consecutive model-checks fail. As an example, for the policy PCPU : (80, 10) the CPU usage has to remain below or equal to 80%, and policy invalidation occurs if this requirement is violated for 11 consecutive measurements.42 It is this policy invalidation that the GR has to anticipate, taking measures before it occurs. Therefore, an additional criterion is introduced for every policy. This criterion is called a trigger. When this criterion evaluates to True ("the trigger fires"), the decision phase of a reconfiguration cycle is started. In this phase the GR tries to find manipulations that prevent the respective policy from being invalidated.
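The invalidation semantics of a policy P_res : (threshold, timelimit) described above can be made concrete with a small sketch. The class below is illustrative only (names are invented, not the framework's API); it counts consecutive failed model-checks and reports invalidation once timelimit + 1 of them have occurred, e.g. 11 for PCPU : (80, 10) checked once per second.

```python
class Policy:
    """Sketch of P_res : (threshold, timelimit) invalidation semantics.

    A model-check fails when usage exceeds the threshold while at least one
    VM contributes to that usage (cf. footnote 42). The policy is invalidated
    once timelimit + 1 consecutive model-checks have failed.
    """
    def __init__(self, threshold, timelimit):
        self.threshold = threshold
        self.timelimit = timelimit
        self.consecutive_failures = 0

    def model_check(self, usage, vm_contributes):
        # The check passes if usage is within the threshold, or if no VM
        # contributes to the usage (then the policy cannot be invalidated).
        ok = usage <= self.threshold or not vm_contributes
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return ok

    @property
    def invalidated(self):
        return self.consecutive_failures > self.timelimit

p = Policy(threshold=80, timelimit=10)
for _ in range(11):                      # 11 consecutive failed checks
    p.model_check(usage=95, vm_contributes=True)
print(p.invalidated)  # True
```

A single passing model-check resets the failure counter, which is exactly why the GR only needs to revalidate the policy once before the timelimit runs out.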

Every policy possesses two triggers: a global and a local trigger. The local trigger acts as a last resort before a policy is invalidated, i.e. shortly before the timelimit is exceeded and the thereby given deadline is missed. Its firing enforces a hard stop of any VMs contributing to a potential policy invalidation. Architecture-wise, the local trigger is not part of the GR. It is, as the name hints, realized directly on the local node to which the policy is associated. A functional unit, called Local Trigger Unit, takes care of all local triggers defined for a single node. The Local Trigger Unit is part of the architectural component Local Controller. Local triggers are discussed separately in Section 4.4.4.2, where the Local Controller is portrayed.

Global Triggering The checking of a global trigger is a task of the GR. The purpose of a global trigger is the anticipation of a potential policy invalidity. The global trigger reflects the probability that after encountering a number n of consecutive failed model-checks for a policy at points in time (tk+1, tk+2, ..., tk+n) the policy will be invalidated at a point in time tk+timelimit+1 if the GR does not step in. Without giving a calculation for this probability, it is possible to assess by examining different policies whether this probability is high or low for a specific policy. This is now done for two example policies PCPU : (80, 10) and PAllocVM : (0, 10). For a policy on the CPU resource, e.g. PCPU : (80, 10), the result of a model-check (UsageCPU ≤ 80) may easily oscillate between True and False for successive, second-by-second model-checks. This is due to the unknown, potentially changing resource usage of the MainApp, the process scheduling techniques used in the OS and the influence of the RUA, which continuously adapts the resource share used by VMs. So if a single failed model-check is encountered for such a policy at tk+1, the probability that this policy is invalidated at tk+11 if the GR does not step in is relatively low. Now imagine a policy on the resource AllocVM, e.g. PAllocVM : (0, 10), which is associated with a node that currently runs two VMs. This situation occurs if the policy was configured as PAllocVM : (2, 10) before and two VMs were allocated to that node. Now the administrator decides to remove these VMs from this node and reconfigures the policy at runtime from PAllocVM : (2, 10) to PAllocVM : (0, 10). In this case a single failed model-check at tk+1 is a very strong indicator that this policy will be invalidated at tk+11 if the GR does not step in.
Unless one of the improbable cases occurs (a lease owner incidentally pauses or terminates the concerned leases and VMs at just that time, the concerned VMs crash, or the policy is modified again), that policy will be invalid at tk+11. This is due to the fact that VM allocations do not simply change unless explicitly commanded by the GP or GR. Since the GP does not preempt running leases, the policy will be invalidated if the GR does not step in.

Based on these contrasting examples, a basic distinction is made: A policy Pres :(threshold, timelimit) is called a static policy if the truth value of a model-check for this policy can only be modified by

42Policy invalidation only occurs if a VM contributes to the CPU usage of all 11 measurements.

changing the current allocation pattern. Any other policy is called a dynamic policy. In this thesis static policies are those that use the resources AllocVM and UsedCore. For these policies the global trigger fires once a single failed model-check occurs. For dynamic policies the global trigger is defined slightly differently. Since e.g. the usage of the resource CPU can be highly volatile and the RUA is able to influence the resource usage, a single failed model-check is not a sufficient indicator of an imminent policy invalidity that has to be taken care of by the GR. A global trigger for dynamic policies as used in this thesis therefore fires when s subsequent model-checks have failed. The variable s depends on the timelimit and lies within (1, timelimit − 1).43 The variable s is called "triggerSensitivity". Appropriate values for the triggerSensitivity of a policy were retrieved empirically, which is discussed in Section 6.4.3. For instance, the global trigger of a dynamic policy PCPU : (80, 10) has a triggerSensitivity=4. The empirical retrieval of the triggerSensitivity was subject to a compromise: Low values (for the example: triggerSensitivity < 4) lead to an increased number of potentially unnecessary manipulations. High values (for the example: 4 < triggerSensitivity < 9) limit the solution space of applicable manipulations because less time is left until the policy will potentially be invalidated.
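The distinction between static and dynamic policies translates into a simple firing rule for the global trigger, sketched below. The function is illustrative (its name and signature are invented); the value 4 is the empirically retrieved triggerSensitivity quoted above for PCPU : (80, 10).

```python
def global_trigger_fires(consecutive_failed_checks, is_static, trigger_sensitivity=4):
    """Sketch of the global-trigger rule for a policy.

    Static policies (resources AllocVM, UsedCore) fire on a single failed
    model-check, since only an allocation change can flip the result.
    Dynamic policies (e.g. CPU) fire only after trigger_sensitivity
    consecutive failed model-checks, because their usage may oscillate.
    """
    needed = 1 if is_static else trigger_sensitivity
    return consecutive_failed_checks >= needed

print(global_trigger_fires(1, is_static=True))    # True
print(global_trigger_fires(3, is_static=False))   # False
print(global_trigger_fires(4, is_static=False))   # True
```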

A mathematical analysis of triggerSensitivity values, an automated adaption of triggerSensitivity based on historical feasibility or even the consideration of differing trigger semantics using forecasting techniques like ARIMA44 is a target for future work.

Policy Violation In the following, several phrases are introduced that will help to describe how the GR reacts to a fired global trigger. If a global trigger has fired for a policy Pres at time tk+i, then the policy is called "violated". Please take note of the distinction between a policy violation, which indicates that the GR has to act because a global trigger has fired, and a policy invalidation, which means that the framework failed to comply with the policy semantics, something that must be avoided at all costs. Once a global trigger fires, i.e. a policy is violated, the GR needs to take measures to prevent policy invalidation. Because a policy can be associated with e.g. multiple single nodes and a model-check in this case is the evaluation of the threshold statement for each of these nodes, a firing global trigger means that for at least one such node the threshold was exceeded for a number (triggerSensitivity) of subsequent model-checks. This is denoted by saying "the policy is violated for <node(s)>". For instance, the statement "the policy PCPU : (80, 10) is violated for node1 and node4" means that on these two nodes the CPU usage was higher than 80% for at least 4 seconds (4 is the triggerSensitivity of this example policy). By definition a policy can only be violated if a VM uses the resource the policy was defined on. In this case any such resource-using VM is said to violate the policy: "<VM> violates policy <P>". A lease which a policy-violating VM is assigned to is said to be contributing to the policy violation: "<lease> contributes to violation of policy <P>". For any violated policy there may be multiple violating VMs and thereby also multiple leases contributing to this violation.

43To be more precise, the value for s depends on multiple factors: a) the timelimit of the policy, b) the probability distribution for obtaining a further timelimit − s + 1 consecutive failed model-checks after having already encountered s failed model-checks if the GR does not step in and c) the estimated duration of potential changes of the allocation pattern that could avoid a policy invalidation. The mentioned probability distribution in turn depends on the resource type, i.e. how volatile the usage of that resource is, and on whether the RUA has an influence on the usage of that resource. 44The AR(I)MA model family is commonly used in engineering, finance and economics to discover trends in time series data.

64 4.4. Resource Allocation Functionality

It is the goal of the GR to prevent policy invalidation once a policy is violated. What exactly does "preventing a policy invalidation" mean? According to the policy semantics defined in Section 4.3.5, a specific number (timelimit) of consecutive model-checks must not fail. Since upon firing of the global trigger already a number (triggerSensitivity) of model-checks have failed, only (timelimit − triggerSensitivity) further consecutive model-checks are allowed to fail. After that, a single subsequent failed model-check renders the policy invalid. Hence, the invalidation of a violated policy (whose global trigger fired at tk+1) is successfully prevented if: ∃ i : 1 < i ≤ timelimit − triggerSensitivity ∧ eval(S, CMtk+1+i) = True. So, at least one model-check needs to be true before the timelimit runs out. If the invalidation of a violated policy is successfully prevented, this is called "the policy violation is resolved". Alternatively, also the phrase "the violated policy is revalidated" is used.45
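The revalidation condition can be checked mechanically over the model-check outcomes that follow the trigger firing. The sketch below is a simplified illustration under the assumption that outcomes are given as a list of booleans (name and signature are invented): the violation is resolved if at least one of the remaining (timelimit − triggerSensitivity) checks succeeds.

```python
def violation_resolved(check_results, timelimit, trigger_sensitivity):
    """Simplified sketch of the revalidation condition.

    check_results: outcomes (True = passed) of the model-checks following
    the firing of the global trigger. After the trigger fired, only
    timelimit - trigger_sensitivity further checks may fail before a single
    additional failure invalidates the policy; one passing check within
    that window resolves the violation.
    """
    window = check_results[: timelimit - trigger_sensitivity]
    return any(window)

# P_CPU : (80, 10) with triggerSensitivity 4: 6 checks remain after firing.
print(violation_resolved([False, False, True], timelimit=10, trigger_sensitivity=4))   # True
print(violation_resolved([False] * 6, timelimit=10, trigger_sensitivity=4))            # False
```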

Summing it up: The global trigger of a policy is defined by a number of consecutively failed model- checks. This number is called triggerSensitivity. The evaluation of global triggers is done during the recognition phase of the GR’s reconfiguration cycle. The firing of a global trigger indicates that the GR needs to change the allocation pattern. Once a single global trigger fires for a policy, this policy is called violated and the GR enters the decision phase. In this phase, the GR aims to find a set of lease and VM manipulations that will revalidate the violated policy. In the following subsection the algorithmic approach used by the GR in the decision phase is described.

4.4.3.2. Search Algorithm

Upon firing of a trigger, i.e. once a policy is violated, the GR enters the decision phase, searching for an appropriate change of the current allocation pattern. Potential manipulations only concern provisioned leases and their assigned running VMs. However, the manipulation is not restricted to violating VMs and violation-contributing leases; non-violating VMs and non-contributing leases may also be manipulated. A change of the allocation pattern initiated by the GR upon policy violation consists of manipulations ∈ {migrate(VM), suspend(VM), stop(VM), suspend(lease), stop(lease)}. Any suspend(lease) and stop(lease) manipulation implicitly requires at least one suspend(VM) or stop(VM). Any violated policy specifies a timing constraint (timelimit). The GR therefore needs to take into account a deadline for the decision and execution phase of any reconfiguration cycle. The algorithm developed for the GR's decision phase is a depth-first search with backtracking for traversing the solution space of potential manipulation sets. The choice of target nodes for VM migrations is based on a first-fit approach. The search algorithm utilizes a highly informative help function that excludes theoretically possible, yet impractical solutions. The algorithm therefore can be regarded as a guided search heuristic. In the following the algorithmic approach will be explained: First, the driving factors that determine the goodness of a chosen set of manipulations are given. Second, the search heuristic is explained using an example. Third, the algorithm is expressed using a more abstract pseudocode notation.
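The skeleton of such a depth-first search with backtracking can be sketched as follows. This is a strongly simplified illustration, not the thesis' actual algorithm: the feasibility oracle check_branch stands in for the checkBranch function introduced later, and the handling of the RESOURCE case (preempting one low-priority lease tuple entry) abstracts away pointer movement through the help lists.

```python
OK, RESOURCE, TIME = "OK", "RESOURCE", "TIME"

def search(high, low, check_branch, branch=None):
    """Depth-first search with backtracking over manipulation sets (sketch).

    high: violation-contributing leases, highest priority first.
    low:  preemptable leases/lease tuples, cheapest (lowest priority) first.
    check_branch: feasibility oracle returning OK, RESOURCE or TIME.
    Returns the first branch (list of (lease, action) nodes) that handles
    every violating lease, or None if no branch is found.
    """
    branch = branch or []
    if not high:                                # every violating lease handled
        return branch
    lease, rest = high[0], high[1:]
    for action in ("Mig", "Susp", "Stop"):      # prefer keeping the lease running
        candidate = branch + [(lease, action)]
        verdict = check_branch(candidate)
        if verdict == OK:
            result = search(rest, low, check_branch, candidate)
            if result is not None:
                return result                   # commit this branch
        elif verdict == RESOURCE and low:
            # free resources by preempting a low-priority lease first
            preempt, low_rest = low[0], low[1:]
            candidate = branch + [(preempt, "Susp"), (lease, action)]
            if check_branch(candidate) == OK:
                result = search(rest, low_rest, check_branch, candidate)
                if result is not None:
                    return result
        # TIME (or failed retry): abandon this branch, try the next action
    return None

def toy_check(branch):
    # Toy oracle mimicking the thesis example: a migration only succeeds
    # once some lease has been suspended to free target slots.
    if any(a == "Mig" for _, a in branch) and not any(a == "Susp" for _, a in branch):
        return RESOURCE
    return OK

print(search(["L5"], ["L1"], toy_check))  # [('L1', 'Susp'), ('L5', 'Mig')]
```

Since all manipulations in the decision phase are only applied to the in-memory model, backtracking simply means discarding the candidate branch; nothing has to be undone in the actual cluster.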

Goodness of Manipulations Manipulations initiated by the Global Reconfigurator (GR) are those of type ∈ {migrate(VM), suspend(VM), stop(VM), suspend(lease), stop(lease)}. The change of an 45Not every policy violation requires a change of the allocation pattern in order to resolve the violation. Depending on external disturbance, e.g. an instant decrease in resource usage of the MainApp, a policy revalidation could also occur without GR intervention. Nevertheless, upon firing of the global trigger the GR assumes that a policy invalidation is imminent and will take action.

allocation pattern therefore consists of at least a single VM manipulation. In most cases manipulating a single VM is not sufficient to modify resource usage such that policy violations can be resolved. More commonly, the change of an allocation pattern is a set of multiple manipulations, e.g. consisting of migrations of multiple VMs or even the suspending or stopping of leases. In order to choose an appropriate set of manipulations, the GR has to assess the goodness of potential manipulations. The driving factors for such an assessment are now discussed separately.

Prevention of Strong Interference It is assumed that enforcing the semantics of appropriately configured46 policies makes it possible to avoid resource shortage, which is one of the reasons for interference47. So, a set of manipulations has to avoid policy invalidations by resolving policy violations. Whether a set of manipulations is appropriate for resolving policy violations depends on the type of the manipulations and on the impact they have on the usage of the critical resource(s) for which the policy is violated. Any manipulation migrate(VM), suspend(VM) or stop(VM), once it is finished, will stop the respective VM from using resources of the node it was allocated to. This might suffice to resolve a policy violation. However, a manipulation itself takes time and uses resources on its own. So choosing a manipulation means a) taking into account whether the manipulation itself will succeed in decreasing the usage of the critical resource within the given timelimit and b) considering whether other policies might be violated by such a manipulation. The answer to this non-trivial issue is delegated to the secondary unit Feasibility Estimator (FE), which was described above (see Section 4.4.1). This unit estimates the future usage of specific resources and thereby allows for statements on whether a policy violation will be resolved by a set of manipulations or whether future policy violations will occur. To exemplify the impact of manipulations on resource usage, it can be stated that stopping is the shortest and least resource-demanding manipulation, while suspending a VM comes with a considerable overhead on the NetOut resource and takes considerable time to complete. Migrating a VM carries considerable overhead on the NetOut and CPU resources, but under most circumstances is faster than suspending a VM. For more information please refer to Section 4.4.1, which contains a description of the FE.
No matter how precise such an estimation can be, it is very important to note that it is an estimation only and therefore valid with a probability of less than 100%. The GR will and must incorporate the FE's feedback, yet relying purely on it is not sufficient to avoid policy invalidations. This is one of the reasons48 for the local trigger's existence. If the chosen set of manipulations fails to resolve policy violations, this local trigger will eventually prevent policy invalidations by forcefully removing violating VMs. The local trigger functionality is explained later in this section (see 4.4.4.2).

SecApp Result Output Apart from the GR's function to realize policy compliance49, other goals also drive the scheduling of VMs. Manipulated VMs host SecApps, and the effects of manipulations on these have to be taken into account. Regarding result output there are considerable differences between the available manipulations. The best choice from the perspective of the virtualized SecApp is the migration of VMs. Such a migration practically leads to a minimal

46Configuring policies is a task of the administrator. An appropriate configuration needs to be found by testing multiple setups and choosing one that does not cause strong interference to a MainApp 47The other identified reasons, security issues, library conflicts and buggy SecApps, are counteracted by using platform virtualization. If interference can be avoided then also strong interference is avoided. 48For instance, the local trigger also is required if the node hosting the GR has crashed and no decision could be made, or if due to a network failure the decision of the GR could not be executed. 49To realize policy compliance means avoiding the invalidation of policies.


application downtime, and inter-VM communication within a lease is not affected in most cases. To a SecApp such a manipulation is nearly transparent.50 Relative to other manipulations like stopping or suspending a VM, the migration of a VM is considered the best choice with respect to the SecApp result computation. This statement becomes more obvious when looking at the alternatives. Stopping VMs means shutting down the virtual computing system, either by powering it off or cleanly shutting down the OS. In either case this leads to an immediate stop of the hosted SecApp. Assuming that no advanced techniques like checkpointing or saving of partial results are used by the SecApp, the ongoing computation will stop and has to be repeated once the VM is started again. For long-running jobs, e.g. an ALiEn [10] job computation taking two hours on average, this causes a severe drop in computing efficiency.51 Suspending a VM is a better choice. This manipulation also stops the SecApp from computing, but saves its current state and therefore precisely avoids the result loss incurred by stopping a VM. A later resume of the VM will cause the SecApp to continue processing as if no suspending had occurred. This leads to a better ratio of computed results per time. When comparing the suspending and stopping of a VM to its migration, another aspect needs to be considered. Since a lease that multiple VMs are assigned to represents a distributed application, the suspending/stopping of only a single VM of such a lease corresponds to the loss/failure of a physical node within a cluster environment. There are applications (e.g. job schedulers) which can tolerate the failure of a node at runtime, but for the sake of generality this scenario is not considered in this thesis. This means that when a single VM needs to be stopped or suspended, the whole lease will have to be stopped or suspended.
Looking solely at the effects on the result computation of virtualized SecApps, there is a definite preference for VM migrations over lease suspends and for lease suspends over lease stops. This aspect is referred to later on as the impact of a manipulation on a lease, which is smallest for a migration and biggest for a stop.
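This impact ordering can be encoded as a simple ordinal ranking, as the following sketch illustrates (the mapping and function are invented for this example; the numeric values carry no meaning beyond their order).

```python
# Impact of a manipulation on a lease's result output, as argued above:
# migration is nearly transparent, suspending preserves state but pauses
# computation, stopping loses in-progress results. Values are ordinal only.
IMPACT = {"Mig": 0, "Susp": 1, "Stop": 2}

def prefer(manipulations):
    """Order candidate manipulations by increasing lease impact (sketch)."""
    return sorted(manipulations, key=IMPACT.__getitem__)

print(prefer(["Stop", "Mig", "Susp"]))  # ['Mig', 'Susp', 'Stop']
```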

Fairness and Prioritization of Leases Apart from these two aspects, the effects of manipulations on policy compliance and on SecApp result output, a third factor needs to be considered when choosing an appropriate set of manipulations. A goal of the thesis is to provide the capability to run multiple (competing) SecApps. So, a decision has to be made as to which of multiple provisioned leases to manipulate in case of a policy violation. In this thesis all the aspects relevant for assessing a lease's claim to use free resources are boiled down to a single number, the lease's priority. Fairness between leases is therefore realized by taking into account the priority of leases, e.g. by preferring to stop a low-prioritized lease over stopping a higher-prioritized one.

So, the GR makes a choice for a set of manipulations based on these three aspects: the manipulation set's capability to maintain policy compliance, to balance resource usage between multiple leases and its impact on the result output of SecApps.

Search Heuristic Introduction Rather than taking the common approach of weighting these factors using a utility function and searching for (sub)optimal solutions within the solution space, a

50Researchers [119, 27, 75] have evaluated this aspect for different environments and have shown that VM migrations allow to satisfy runtime constraints of virtualized applications under most circumstances. However, Voorsluys et al. [119] report that they observed a considerable downtime of up to 3 seconds for their virtualized applications in a specific cluster environment. 51Here the ratio of computed results per time is meant.

search heuristic based purely on the priority of leases is employed. The algorithm strives to keep higher-prioritized leases running, e.g. by migrating their VMs at the expense of preempting lower-prioritized leases. The decision between suspending and stopping a lease always tends toward suspending a lease if the feasibility estimation allows for such a manipulation. According to these basic principles, a search tree is constructed. The first appropriate set of manipulations that, according to the FE, will resolve policy violations is chosen. The main reason for utilizing such a heuristic approach is the timing constraint of violated policies (timelimit), which limits the time usable for finding a good solution. A complexity discussion for the portrayed approach is given in Section 4.4.6.1. While portraying the search algorithm, phrases like "the lease is suspended" are used for readability. Nevertheless, in the decision phase these manipulations are only executed in memory. The manipulations are not applied to the cluster until the decision phase ends and the execution phase is entered.

Figure 4.4.: Example search tree (left) and help data structures (right) for the search algorithm.

The algorithm is a depth-first search with backtracking. It operates on a search tree and traverses this tree in a depth-first manner. Backtracking may occur, which means that a current branch is abandoned and a jump is made to a node closer to the root of the tree. Beginning at this backtrack node, alternative branches are explored. A new backtrack node is set by committing a branch. The tree consists of nodes labeled (lease, manipulation), i.e. (lease, Susp), (lease, Stop) and (lease, Mig). While (lease, Susp) and (lease, Stop) represent the manipulation (suspending, stopping) of all the VMs of the particular lease, (lease, Mig) represents the migration of all violating VMs of the lease. An example of the search tree is given on the left side of Figure 4.4. In this example, the migration of violating VMs of lease L5 was tried and failed, e.g. because no targets in the cluster were available. The rejection of this manipulation is reflected by the crossed white node (L5, Mig). Instead, the lease L1 is suspended and afterwards the migration of violating VMs of lease L5 is carried out. Both manipulations have been accepted (committed) by the algorithm, which is indicated in the figure by the green, non-crossed nodes (L1, Susp) and (L5, Mig).52 The generation of nodes in the search tree depends on highly informative help data structures (ordered lists). An example for these lists is shown on the right side of Figure 4.4: Here the lists HIGH, LOW, HIGHACTION and LOWACTION are depicted. These lists

52The set of violating VMs of L5 which need to be migrated by the green node (L5, Mig) depends on the position of the node (L5, Mig) in the search tree. In the very example, the green node (L5, Mig) represents the migration of VMs of L5 that still violate a policy after (L1, Susp) has been applied. This is not necessarily the same set of VMs that would have been migrated if the white node (L5, Mig) had been accepted by the algorithm.


contain leases and applicable manipulations. The figure also shows the pointers H, L, HA and LA. These reference the currently active entry in the respective list. These currently active entries are used to generate new nodes in the search tree. The lists consist of the following (see Section 4.4.3.2 for more details):

HIGH contains all leases contributing to policy violation(s) of the current reconfiguration cycle.
HIGHACTION contains the manipulations applicable for a lease in HIGH, i.e. Mig, Susp and Stop.
LOW contains lease tuples consisting of provisioned leases.
LOWACTION contains tuples of manipulations that can be applied to lease tuples in LOW.
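The construction of the ordered lists can be sketched as follows. This is an illustrative approximation (function names and the exact tie-breaking are invented): HIGH sorts violation-contributing leases by descending priority, and LOW enumerates the non-empty subsets of the provisioned leases ordered so that cheap preemptions (low total priority, larger tuples before equally priced single leases) come first.

```python
from itertools import combinations

def build_high(contributing_leases, priority):
    """HIGH: violation-contributing leases, highest priority first (sketch)."""
    return sorted(contributing_leases, key=priority, reverse=True)

def build_low(provisioned_leases, priority):
    """LOW: ordered power set of the provisioned leases (sketch).

    The empty set is omitted here. Tuples are ordered by ascending total
    priority so that cheap preemptions are tried first; the exact ordering
    used in the thesis may differ in detail.
    """
    leases = sorted(provisioned_leases, key=priority)
    subsets = []
    for r in range(1, len(leases) + 1):
        subsets.extend(combinations(leases, r))
    return sorted(subsets, key=lambda t: (sum(priority(l) for l in t), -len(t)))

prio = lambda name: int(name[1:])          # L1 -> priority 1, ..., L5 -> 5
print(build_high(["L1", "L3", "L4", "L5"], prio))   # ['L5', 'L4', 'L3', 'L1']
print(build_low(["L1", "L2", "L3"], prio)[:4])
# [('L1',), ('L2',), ('L1', 'L2'), ('L3',)]
```

With this ordering the example sequence from the search walkthrough (L1, L2, (L1, L2), L3, (L1, L3), ...) is reproduced for the five-lease setup.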

Using entries in these lists, nodes in the search tree are generated. Generated nodes are checked for appropriateness by a function called checkBranch(branch). This function takes as input the currently active branch of the search tree and returns a value ∈ {RESOURCE, TIME, OK}. The output of the function determines which tree nodes are generated next. The meaning of the return values roughly corresponds to (see Section 4.4.3.2 for a more precise definition):

RESOURCE: There are not enough suitable targets for migrating VMs.
TIME: The set of manipulations represented by the branch is likely to invalidate a currently violated policy (e.g. a suspend will not decrease the resource usage below the threshold on time).
OK: Neither of the above applies.

Having introduced the basic algorithmic entities (the search tree, the lists and a function assessing a current branch, checkBranch(branch)), the algorithm is now introduced by executing an example search first. Afterwards, the algorithm is described in a more abstract fashion using pseudocode, by explaining how the lists are generated and how the checking of an active tree branch is done.

Example Search An example search is now carried out: The setup used is shown in Figure 4.5. It consists of nine cluster nodes (Node1, Node2 ... Node9) that are equipped with five provisioned leases. The leases (L1, L2, L3, L4, L5) and their assigned VMs (VM1, VM2, ..., VM22) are identifiable by their distinct colors. VM1, VM2, VM3 are assigned to lease L1, VM4, ..., VM7 are assigned to lease L2, VM8, ..., VM12 are assigned to lease L3, VM13 ... VM17 are assigned to lease L4 and VM18 ... VM22 are assigned to lease L5. The numbering of the leases also represents their priority, i.e. L5 has the highest priority and L1 the lowest one. For the sake of simplicity, each physical node is said to have the same physical resources (CPU, RAM, ...); likewise, the 22 VMs are identical regarding their requirements for CPU cores, RAM and so forth. The cluster is in the state shown in Figure 4.5 when the administrator decides to change the policy configuration. After this change, each physical node is equipped with a policy PAllocVM, defining the maximum number of VMs that are allowed to run on a single node. The policy threshold setting for a specific node is exemplified by white boxes on the right side of each node.53 The number of non-crossed white boxes equals the number of VMs allowed to run on a node, e.g. PAllocVM : (0, 10) for Node6 and PAllocVM : (3, 10) for Node4. According to this explanation, the policy is violated for Node1, Node3 and Node6. Node8 has a free slot for an additional VM and the rest of the nodes are optimally filled with running VMs. With a single model-check the policy violation for Node1, Node3 and Node6 is recognized and the global triggers fire. Upon triggering, the decision phase starts by calculating the data structures LOW and HIGH.

53 The timelimit of each policy PAllocVM is set to 10 seconds.


Figure 4.5.: Initial cluster state in example search. Node1, Node3 and Node6 have violated policies.

Figure 4.6.: Search tree (left) and help data structures (right) after lease L1 was suspended and VMs of lease L5 were migrated.

HIGH is an ordered list of leases contributing to a violation: HIGH = (L5, L4, L3, L1). LOW consists of preemptable lease tuples: LOW = (L1, L2, (L1, L2), L3, (L1, L3), ...).54 The algorithm tries to keep leases in HIGH running at the expense of leases in LOW and proceeds as shown on the left in Figure 4.6. First, a migration of violating VMs of L5 is tried in order to keep L5 running. There are four violating VMs for lease L5: VM18, VM19, VM20, VM21. However, only three VMs of this lease need to be migrated to make L5 no longer contribute to a violation: The migration of either VM18 or VM19 on Node1 suffices; it is not necessary to migrate both of them. The migration of three VMs of lease L5 does not work out because not enough migration target slots are available in the cluster. This is indicated by checkBranch(branch) = RESOURCE. Therefore, this tree branch is skipped and instead the lowest-prioritized lease L1 from list LOW is suspended and violating VMs of L5 are

54The list LOW is an ordered power set of all provisioned leases. It has 2^5 entries and therefore is not fully detailed. Please see Section 4.4.3.2 for more information.

70 4.4. Resource Allocation Functionality

Figure 4.7.: Cluster state with suspend of lease L1 and migration of VMs belonging to lease L5.

Figure 4.8.: Search tree (left) and help data structures (right) while evaluating the stop of lease L2.

migrated. The corresponding LOW and HIGH lists with their currently active entries are shown on the right in Figure 4.6. The effects of the suspending and migrating can be seen in Figure 4.7. Here the suspending is indicated by shortened VM bars and the migrations are indicated by arrows: All VMs of L1 are suspended (VM2 on Node1, VM3 on Node7 and VM1 on Node9). The migrated VMs of lease L5 are: VM19 on Node1 (to Node7), VM21 on Node3 (to Node8) and VM20 on Node6 (to Node9). The choice, which of the violating VMs of lease L5 on Node1 to migrate, is made using a first-fit decreasing approach; in this example this leads to the random choice of VM19.55 Once these manipulations have been committed (checkBranch(branch) = OK), the algorithm proceeds with the next

55The violating VMs, here VM18 and VM19, are chosen in order of decreasing number of virtual CPU cores assigned to a VM. If these are equal, a random choice is made. In this example, the VM is chosen randomly because every VM has the same properties.

lease in HIGH: L4. The resulting search tree is given on the left side of Figure 4.8. A following migration of L4 does not succeed because not enough resources for the two to-be-migrated VMs (VM13 on Node3, VM15 on Node6) are available (checkBranch(branch) = RESOURCE). Therefore, L2 in LOW is tried to be suspended first before migrating violating VMs of L4. For this attempt, the checkBranch(branch) function returns TIME. The interpretation is the following: The suspending of all VMs of L2 (VM4 on Node2, VM5 on Node3, VM7 on Node7, VM6 on Node9) and a subsequent migration of violating VMs of lease L4 would be feasible from a resource perspective; enough migration target slots are freed by suspending L2.56 However, at least one of the involved manipulations

Figure 4.9.: Cluster state with stop of lease L2 and migration of VM15 belonging to lease L4.

will not finish within the timelimit of 10 seconds. This could, e.g., be the suspending of VM5 on Node3. There are, according to the committed branch, already planned suspends (for lease L1). Further suspends might take too long, thus exceeding the timelimit. Therefore, the currently active sub-branch ((L2, Susp), (L4, Mig)) is rejected. Instead, the stopping of lease L2 is tested. The corresponding LOW and HIGH lists are shown on the right in Figure 4.8, the search tree is given on the left of this figure. The algorithm proceeds as shown in Figure 4.10: VMs of L2 (VM4 on Node2, VM5 on Node3, VM7 on Node7, VM6 on Node9) are stopped and violating VMs of lease L4 are migrated (VM15 on Node6 to Node9). A check yields checkBranch(branch) = OK, so the branch is committed. This step is shown in Figure 4.9, where the stopping of VMs of L2 and a following migration of VM15 from Node6 to Node9 are indicated. At this point, only a single VM is left that violates a policy: VM12 on Node6. With H pointing to L3 in HIGH, the migration of this VM is tried. The corresponding tree is shown on the left in Figure 4.10. At this point, the migration of VM12 would be possible resource-wise (e.g. to Node2, which has an open resource slot). This does however fail because checkBranch(branch) = TIME. This is reasonable, because on Node6 there are already two planned migrations (of VMs assigned to leases L4 and L5). Any further migration might take too long. Consequently, the next action on a lease in LOW is tried. Interestingly the lease in LOW is also L3. This lease is therefore tried to be suspended, which again fails because of checkBranch(branch) = TIME which is due to several, already planned suspends. Therefore, L3 is chosen to be stopped. The corresponding lists are given in Figure 4.10 and the resulting cluster is shown in Figure 4.11. Now H

56Please take note that a suspending of VM5 on Node3 would have made the migration of VM13 of L4 obsolete. Only VM15 of L4 on Node6 would still need to be migrated.


Figure 4.10.: Search tree (left) and help data structures (right) at the end of GR’s decision phase. The branch in green is the found set of manipulations.

Figure 4.11.: Cluster state with the final stop of lease L3.

points to L1 as the last lease in HIGH which initially contributed to a policy violation. Since L1 had been suspended at an earlier stage of processing, the pointer H is incremented and therefore points to NULL now. This finishes the search algorithm and returns the committed branch, which is indicated in Figure 4.10 by the green-ovaled branch: ((L1, Susp), (L5, Mig), (L2, Stop), (L4, Mig), (L3, Stop)). Now a set of manipulations has been found, which most probably will revalidate the violated policies. The next step is the execution of the found manipulation pattern. The GR therefore enters the execution phase. This phase is explained later in Section 4.4.5.1.


Search Heuristic Abstraction Now that the algorithm has been introduced by example, it is described in a more abstract way using a pseudocode notation: As was shown in the example search, the algorithm operates on a search tree. Tree nodes are generated according to the currently selected entries in the ordered lists HIGH, HIGHACTION, LOW and LOWACTION. Branches of the tree are checked using a function checkBranch(branch). A sketch of the search is given in Algorithm 4.4.1:

Algorithm 4.4.1: (Search for appropriate lease and VM manipulations)

H, HA, L, LA ← 0                                         (1)
BACK ← START
CURRENT ← generateNode(H, HA)                            (2)
while H ≠ NULL                                           (3)
    branchState ← checkBranch(START, CURRENT)            (4)
    if branchState = OK                                  (5)
        then  comment: H does not contribute to any violation
              BACK ← CURRENT                             (6)
              HA ← 0
              H ← H + 1
              while L has already been manipulated
                  do L ← L + 1
              LA ← 0
              if H ≠ NULL
                  then CURRENT ← generateNode(H, HA)     (7)
    else if branchState = RESOURCE                       (8)
        then  CURRENT ← BACK                             (9)
              LA ← 0
              CURRENT ← generateNode(L, LA)              (10)
              L ← L + 1
    else if branchState = TIME                           (11)
        then  CURRENT ← BACK                             (12)
              LA ← LA + 1
              if LA = NULL
                  then  L ← L + 1
                        LA ← 0
              CURRENT ← generateNode(L, LA)              (13)
return (START, CURRENT)                                  (14)

The algorithm starts at a node (START) representing the root of the tree and iteratively generates child nodes in the search tree. For this generation it uses a function generateNode(lease, manipulation) (see breakpoints 2, 7, 10, 13), which is parameterized with entries of the mentioned lists: The parameter lease is taken from either list HIGH or LOW, the parameter manipulation is taken from either list HIGHACTION or LOWACTION. For example, at breakpoint 2 a node is generated with the active entry in HIGH (a lease pointed at by H) and the active entry in HIGHACTION (a manipulation pointed at by HA).


The syntactical construct H ← H + 1 is used to denote that the next entry in the respective list (here: HIGH) is about to be processed. Accordingly, H ← 0 means that the first element in HIGH is about to be processed. H = NULL occurs if H pointed to the last element in HIGH and a subsequent H ← H + 1 was issued. At start (see breakpoint 1), the pointers point at the first entry in each list. BACK and CURRENT are pointers to nodes in the search tree. While CURRENT references the latest generated node, BACK is a pointer to a node in the tree which can be jumped to using backtracking. Using such a jump abandons the branch which was generated and explored below the node referenced by BACK and resets CURRENT to BACK. This basically means that the current branch explored by the depth-first search provides no appropriate set of manipulations, and therefore some alternative branches need to be explored. A short remark needs to be made at this point: While the code snippet shows the main algorithmic approach taken by the GR, it still is an abstraction of the actually employed algorithm. For brevity and ease of understanding the data structures representing the lists and the search tree themselves are omitted. The layout of the tree derives from the usage of pointers START, BACK and CURRENT and the list entries are sufficiently referenced using pointers H, HA, L and LA. The currently active tree branch is therefore represented by (START, CURRENT) and corresponds to the shortest path in the search tree between the nodes referenced by START and CURRENT. The basic procedure of the algorithm is the following: Breakpoint 3 starts a while loop that iterates over all entries in HIGH. At breakpoint 4 the current tree branch (referenced by (START, CURRENT)) is checked for appropriateness by the checkBranch(branch) function. Depending on the return value (RESOURCE, TIME, OK), one of the continuations at breakpoints 5, 8 or 11 is used.
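The pointer mechanics described above can be condensed into a small executable sketch. The following Python sketch is illustrative only: the names (guided_search, low_active, preempted), the reduction of lease tuples in LOW to single leases, and the scripted checkBranch stub (which simply replays the return values observed in the example search) are assumptions made for brevity, not the thesis' actual implementation.

```python
from collections import deque

OK, RESOURCE, TIME = "OK", "RESOURCE", "TIME"

def guided_search(HIGH, LOW, LOWACTION, check_branch):
    # HIGH: violating leases, descending priority; LOW: preemptable leases,
    # ascending priority (lease *tuples* are reduced to single leases here).
    committed = []       # manipulations committed so far (START .. BACK)
    preempted = set()    # leases already suspended or stopped
    for lease in HIGH:
        if lease in preempted:           # an earlier preemption resolved it
            continue
        trial, L, LA, low_active = [(lease, "Mig")], 0, 0, False
        while True:
            state = check_branch(committed + trial)
            if state == OK:              # commit the branch (breakpoints 5-7)
                committed += trial
                preempted.update(l for l, a in trial if a != "Mig")
                break
            if low_active:
                if state == TIME:        # vary preemption method (bp 11-13)
                    LA += 1
                    if LA == len(LOWACTION):
                        L, LA = L + 1, 0
                else:                    # RESOURCE: next LOW entry (bp 8-10)
                    L, LA = L + 1, 0
            low_active = True
            while LOW[L] in preempted:   # skip already-manipulated leases
                L, LA = L + 1, 0
            if LOW[L] == lease:          # the HIGH lease itself reached in LOW
                trial = [(lease, LOWACTION[LA])]
            else:
                trial = [(LOW[L], LOWACTION[LA]), (lease, "Mig")]
    return committed

# Scripted checkBranch replaying the return values of the example search:
responses = deque([RESOURCE, OK, RESOURCE, TIME, OK, TIME, TIME, OK])
result = guided_search(
    HIGH=["L5", "L4", "L3", "L1"],
    LOW=["L1", "L2", "L3", "L4", "L5"],
    LOWACTION=["Susp", "Stop"],
    check_branch=lambda branch: responses.popleft(),
)
```

With this stub, the committed branch reproduces the result of the example search, ((L1, Susp), (L5, Mig), (L2, Stop), (L4, Mig), (L3, Stop)); the sketch omits the handling of an exhausted LOW list.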
An interpretation of the algorithm is now given: The algorithm iterates over the high-prioritized leases (entries in HIGH) and tries to migrate their violating VMs. To determine the targets of such migrations, a first-fit approach is used, i.e. the first appropriate target node in the cluster is chosen. Appropriateness is determined using the aforementioned function checkBranch(branch), which in turn makes use of the Feasibility Estimator (FE). The return value provided by this function is used to decide whether the migrations are appropriate or if free slots for VMs have to be provided first. In the latter case (breakpoints 8, 11), the low-prioritized lease (tuples) in LOW are iterated over to find appropriate Susp or Stop manipulations which in turn could provide these slots, or even render the successive migration of VMs assigned to the high-priority lease (in HIGH) obsolete. The difference between breakpoints 8 and 11 (i.e. the difference between checkBranch(branch) return values RESOURCE and TIME) is that for RESOURCE no further (susp | stop) variations for the current tuple in LOW are tried. The outcome of these manipulations does not differ with respect to the slots providable to a lease in HIGH. Since RESOURCE indicates that more suitable slots are needed, these variations are not tested. Backtracking (see breakpoints 9, 12) occurs if a sub-branch (BACK, CURRENT) was found to be not appropriate (checkBranch(branch) returns RESOURCE or TIME). Then pointers L and/or LA are incremented and alternative nodes are generated to suspend or stop low-priority leases. For a specific lease in HIGH this continues until either a branch is found appropriate (checkBranch(branch) = OK, breakpoint 5) or that lease itself is reached in the LOW list.57 In the latter case, the high-priority lease is tried to be suspended or even stopped. This eventually leads to checkBranch(branch) = OK (breakpoint 5) as well.
A branch is committed (see breakpoint 6), once a set of manipulations was found that makes this lease not contribute to a violation anymore. At best, previously committed manipulations already resolve the policy violations and the lease does not contribute to a violation anymore. At worst, the lease needs to be stopped. Then the pointer H is incremented and the next lease in HIGH

57Every lease in HIGH also appears in LOW.

is active. Once all leases in HIGH are iterated over, the algorithm finishes and returns the committed search tree represented by (START, CURRENT).

The set of manipulations generated by this algorithm is mainly driven by two aspects: First, there is the checkBranch(branch) function deciding on the appropriateness of manipulations. If a manipulation is found to be not appropriate for resolving policy violations, the function yields a return value that gives additional information on why this is the case (RES OURCE or TIME). Based on this additional feedback the search tree is constructed. Second, there are the mentioned lists HIGH, HIGHACTION, LOW and LOWACTION. These data structures, i.e. their entries and their ordering are crucial for the algorithm’s behavior. The lists and the checkBranch(branch) function are now explained.

The Lists Four lists are used by the algorithm to create new nodes in the search tree:

HIGH This list contains all leases that contribute to policy violations triggered for the current reconfiguration cycle. All policy violations are resolved if all leases of this list are manipulated. However, not every lease in HIGH needs to be manipulated: Manipulations of other leases in LOW or HIGH can render the manipulation of a lease in HIGH obsolete. The order within this list is defined by: Prio(L1) > Prio(L2) → Rank(L1) < Rank(L2), i.e. the leases in HIGH are descendingly ordered by priority. Leases with the same priority are ordered by the first-come, first-served principle. This list is generated every time the GR enters the decision phase.
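Since this ordering is a plain two-key sort, it can be sketched in a few lines of Python; the triple layout (name, priority, submission time) is an illustrative data model, not the thesis' actual lease representation:

```python
def build_high(violating_leases):
    # Sort descending by priority; ties are broken first-come, first-served
    # via the submission time. Each lease: (name, priority, submit_time).
    return sorted(violating_leases, key=lambda l: (-l[1], l[2]))

leases = [("L3", 3, 40), ("L5", 5, 10), ("L4a", 4, 20), ("L4b", 4, 30)]
high = [name for name, _, _ in build_high(leases)]
# highest priority first; L4a precedes L4b because it was submitted earlier
```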

HIGHACTION This list contains the manipulations applicable to leases in HIGH. The list is ordered by ascending impact of the manipulation towards the SecApp, i.e. (Mig, Susp, Stop). This data structure is static and does not change.

LOW All leases in provisioned state are used to establish this data structure. The algorithm strives to keep leases in HIGH running. It does so even at the expense of suspending or stopping leases in LOW if this resolves the policy violations which a lease in HIGH was contributing to or if this allows for migrating the violating VMs of a lease in HIGH. This even may involve suspending multiple low-prioritized leases. As an example, leases L6, L5, L4 with Prio(L6) > Prio(L5) > Prio(L4) could be suspended in order to keep lease L9 running. Suspending lease L4 neither resolves all policy violations which lease L9 was contributing to nor does it provide enough free slots in order to migrate the violating VMs of lease L9. The same applies to suspending lease L5. At this point two options exist: Suspending lease L6 or suspending both L4 and L5. According to the algorithm's basic principle (keep high-priority leases running), suspending of L4 and L5 is tried next. More formally this approach is described as follows: The list LOW is created as the power set of all provisioned leases. Based on example leases L6, L5, L4, the resulting data structure is LOW = (L4, L5, L6, (L4, L5), (L4, L6), (L5, L6), (L4, L5, L6)). The entries in LOW represent sets of leases that may be manipulated. This list is now quick-sort ordered using the order-establishing comparison procedure given in Algorithm 4.4.2. This algorithm recursively determines which of two lease tuples is lower-prioritized, i.e. is ranked lower in the LOW list. The resulting list is LOW = (L4, L5, (L4, L5), L6, (L4, L6), (L5, L6), (L4, L5, L6)), which precisely follows the principle of keeping higher-prioritized leases alive at the expense of lower-prioritized ones.


Algorithm 4.4.2: (Establishes an order relation (>) for two lease tuples a, b)

comment: |a| = 0 ⟺ a has no elements
comment: max(a) is the lease in a with the highest priority
comment: a − max(a) removes the lease in a with the highest priority

procedure Compare(a, b)
  if |a| = 0
    then return (FALSE)
  if |b| = 0
    then return (TRUE)
  if max(a) = max(b)
    then  NEWa ← a − max(a)
          NEWb ← b − max(b)
          return (Compare(NEWa, NEWb))
    else return (Prio(max(a)) > Prio(max(b)))

Other ordering principles are possible. However, it is assumed that priority represents all the relevant aspects of a lease (e.g. number of VMs, urgency, owner credentials etc.) and that no aggregate metrics over lease tuples are needed. In such a case, the presented technique is sufficient for a proof-of-concept approach. This list LOW is generated every time the GR enters the decision phase. For a large number of provisioned leases, the complexity of establishing and sorting the list is not acceptable.58 Therefore, LOW is not calculated from scratch every time the GR enters the decision phase. When the VM-Scheduling Framework is started, a preliminary LOW list is created for the (reconfigurable) number of 10 dummy leases. When the GR enters the decision phase, the actually provisioned leases are mapped to these dummy leases, making the computation of LOW for the lowest-prioritized n leases unnecessary.59
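The construction of LOW can be sketched in Python under one assumption: the comparison strips common highest-priority leases and otherwise compares the remaining maxima by priority, which is the reading of Algorithm 4.4.2 that reproduces the ordering quoted above. Function names (build_low, later) and the priority dictionary are illustrative:

```python
from functools import cmp_to_key
from itertools import chain, combinations

def build_low(leases, prio):
    """Ordered (non-empty) power set of provisioned leases (sketch).

    leases: lease names; prio: name -> priority. Within a tuple, leases
    are kept in ascending priority, as in the thesis' examples."""
    ordered = sorted(leases, key=lambda l: prio[l])
    tuples = list(chain.from_iterable(
        combinations(ordered, k) for k in range(1, len(ordered) + 1)))

    def later(a, b):
        # True iff tuple a is ranked later in LOW than b: compare the
        # highest-priority members, stripping them recursively while
        # they are equal (cf. Algorithm 4.4.2).
        if not a:
            return False
        if not b:
            return True
        ma = max(a, key=lambda l: prio[l])
        mb = max(b, key=lambda l: prio[l])
        if ma != mb:
            return prio[ma] > prio[mb]
        return later(tuple(l for l in a if l != ma),
                     tuple(l for l in b if l != mb))

    return sorted(tuples, key=cmp_to_key(lambda a, b: 1 if later(a, b) else -1))

low = build_low(["L6", "L4", "L5"], {"L4": 1, "L5": 2, "L6": 3})
```

For the example leases the sketch reproduces LOW = (L4, L5, (L4, L5), L6, (L4, L6), (L5, L6), (L4, L5, L6)).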

LOWACTION The data structure LOWACTION contains the manipulations applicable for lease tuples in LOW. These manipulations are only Susp and Stop. Migrating VMs of a lease in LOW is considered not relevant.60 Since LOW contains tuples of leases, LOWACTION contains tuples of manipulations. So for, e.g., the tuple (L4, L5) the following manipulations are possible:

(Susp(L4), Susp(L5)), (Stop(L4), Susp(L5)), (Susp(L4), Stop(L5)), (Stop(L4), Stop(L5)). LOWACTION is created as follows: For a specific entry (tuple) in LOW containing n leases, the corresponding LOWACTION consists of tuples (Susp | Stop)^n, so there are 2^n entries in LOWACTION. To decide which manipulations are tried first, an order relation upon LOWACTION is defined. Suspending a lease is considered to be preferable over stopping it and if stopping is needed, then lower-prioritized leases are stopped rather than higher-prioritized ones. The order is therefore defined as:

58The power set of a set of cardinality n has cardinality 2^n. 59When the number of leases known to the framework exceeds n, an extension to the dummy list is easily computed in background upon submission of an additional lease. 60There may be cases where the migration of a VM assigned to a lower-prioritized lease is beneficial for freeing resource slots for VMs of higher-prioritized leases. Such an optimization, which merely is a shuffling of VM allocations at runtime, is assumed to be rarely appropriate for resolving policy violations.


1. The leases of a lease tuple in LOW are ordered by ascending priority, e.g. a lease tuple in LOW is (L4, L5), not (L5, L4).

2. Any manipulation tuple entry_i in LOWACTION can be coded as a sequence of binary numbers with Susp ≡ 1 and Stop ≡ 0. Thus, (Susp, Stop) is coded as "10". The order relation imposed on LOWACTION is: Rank(entry1) > Rank(entry2) ⟷ binaryCode(entry1) < binaryCode(entry2). For any to-be-processed tuple in LOW, the corresponding LOWACTION list is computed at runtime.
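Read literally, this order relation sorts the manipulation tuples by descending binary code, so that the all-suspend combination is tried first and the all-stop combination last. A short Python sketch of this literal reading (build_lowaction is an illustrative name, not from the thesis):

```python
from itertools import product

def build_lowaction(lease_tuple):
    # All Susp/Stop combinations for a LOW tuple, ordered by descending
    # binary code with Susp = 1 and Stop = 0: "11...1" (all suspend) is
    # ranked lowest, i.e. tried first; "00...0" (all stop) is tried last.
    combos = list(product(("Susp", "Stop"), repeat=len(lease_tuple)))
    code = lambda c: int("".join("1" if m == "Susp" else "0" for m in c), 2)
    return sorted(combos, key=code, reverse=True)

actions = build_lowaction(("L4", "L5"))  # 2^2 = 4 entries
```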

Entries and order of the lists HIGH, HIGHACTION, LOW and LOWACTION affect the set of manipulations produced by the algorithm. The other important factor is the checkBranch(branch) function, that decides on the appropriateness of a given set of manipulations and also returns additional information used for further processing.

The checkBranch(branch) Function This function assesses a given set of manipulations for its effect on resource usage in the cluster. It associates this usage with configured policies and returns a value that helps the caller to decide on how to proceed in selecting lease and VM manipulations. In order to find out about the resource usage that comes with a given set of manipulations, the function makes use of the Feasibility Estimator (FE). Please refer to Section 4.4.1 for more information on the FE. The search algorithm processes the lists as given in Algorithm 4.4.1. It generates nodes in a search tree representing lease manipulations. At times (see breakpoint 4), the checkBranch(branch) function is called. The set of manipulations represented by the branch (START, CURRENT) is handed over as parameter. The function checkBranch(branch) assumes that these manipulations are about to be executed instantly and in parallel. For assessing their effects on resource usage, the explicit constraints on resource usage (policies) are taken into account. For every involved node (e.g. a node that a to-be-suspended VM is allocated to, target and source node of a migration), its associated policies are determined. This yields a set of policies which is examined further: For every policy which is violated in the current reconfiguration cycle, the FE is called to find out whether the set of manipulations could invalidate policies. The FE is given two parameters: a) The resource on which that policy operates and b) the date of the first model-check after the timelimit of the policy runs out.61 The set of manipulations is not appropriate if the value returned by FE, i.e. the resource usage at this future point in time, exceeds the threshold of the violated policy. For every policy that is not violated, the Feasibility Estimator (FE) is called to find out whether the set of manipulations could violate this not-yet-violated policy.
Therefore, the FE is given two parameters: a) The resource on which that policy operates and b) the point in time at which the global trigger for this policy could earliest fire if the manipulations were instantly applied.62 The set of manipulations is not appropriate if the value returned by FE, i.e. the resource usage at this future point in time, exceeds the threshold of the policy.

61 More precisely: The global trigger of the policy has fired at t_k. That is, at t_k already triggerSensitivity failed model-checks have occured. It needs to be checked whether at least one of the remaining model-checks at [t_(k+1), t_(k+1−triggerSensitivity+timelimit)] will yield True if the set of manipulations is applied. Otherwise the policy is invalidated, thus the proposed set of manipulations is inappropriate. In the prototypical implementation only the usage of the resource at t_(k+1−triggerSensitivity+timelimit) (i.e. the first measurement after timelimit elapses) is taken into account: The set of manipulations is not appropriate if this value exceeds the threshold of the policy. 62 This is at latest t_(now+triggerSensitivity), but can also be an earlier point in time if several subsequent model-checks have already failed, but not yet led to a policy violation.


Depending on the outcome of these checks, the checkBranch(branch) function returns a value to the caller, i.e. to the search algorithm. The basic procedure for determining the return value is given in Algorithm 4.4.3.

Algorithm 4.4.3: (Evaluates a tree branch)

procedure checkBranch(START, CURRENT)
  if at least one not yet violated policy will be violated
    then return (RESOURCE)
  if at least one currently violated policy will be invalidated
    then return (TIME)
  return (OK)
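The decision procedure above can be sketched as follows. The policy dictionaries, the t_check field (first model-check after the timelimit for violated policies, earliest possible trigger time otherwise) and the fe_estimate callback standing in for the Feasibility Estimator are all illustrative assumptions, not the thesis' data model:

```python
RESOURCE, TIME, OK = "RESOURCE", "TIME", "OK"

def check_branch(manipulations, policies, fe_estimate):
    # Mirrors Algorithm 4.4.3: the RESOURCE condition is checked before
    # the TIME condition. fe_estimate(resource, t, manipulations) returns
    # the estimated usage of `resource` at time t if the manipulations
    # were applied instantly and in parallel.
    for p in policies:
        if not p["violated"]:
            usage = fe_estimate(p["resource"], p["t_check"], manipulations)
            if usage > p["threshold"]:
                return RESOURCE   # a not-yet-violated policy would be violated
    for p in policies:
        if p["violated"]:
            usage = fe_estimate(p["resource"], p["t_check"], manipulations)
            if usage > p["threshold"]:
                return TIME       # a violated policy would be invalidated
    return OK

# Toy estimator: each planned manipulation adds some load on the CPU.
fe = lambda resource, t, manips: {"CPU": 70 + 10 * len(manips), "AllocVM": 3}[resource]
policies = [
    {"resource": "CPU", "threshold": 95, "violated": True, "t_check": 1},
    {"resource": "AllocVM", "threshold": 4, "violated": False, "t_check": 1},
]
ok = check_branch([("VM12", "Mig")], policies, fe)            # CPU estimate 80
too_slow = check_branch([("VM12", "Mig")] * 3, policies, fe)  # CPU estimate 100
blocked = check_branch(
    [("VM12", "Mig")],
    [{"resource": "AllocVM", "threshold": 2, "violated": False, "t_check": 1}],
    fe,
)
```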

If the checkBranch(branch) function returns OK, then the current set of manipulations neither violates new policies nor invalidates already violated policies. In that case, the branch is committed (breakpoint 7) because the current high-priority lease does not contribute to a violation anymore. The return values RESOURCE and TIME indicate that the current set of manipulations is not sufficient to keep the high-priority lease running while at the same time avoiding new policy violations or invalidations of already violated policies. One may wonder why a distinction between return values RESOURCE and TIME is made. Essentially, this distinction is not needed to find appropriate manipulations. However, by giving the caller (the search algorithm) a hint on why a set of manipulations is not appropriate, the search algorithm can adapt the generation of new tree nodes according to this hint. This actually is done at breakpoints 8 and 11 in Algorithm 4.4.1, where the search proceeds depending on the return value of the checkBranch(branch) function. In both cases backtracking occurs (breakpoints 9, 12), i.e. the current non-committed branch is abandoned and processing continues at the backtracked node (BACK). An informal interpretation of the return values of Algorithm 4.4.3 is now given: The return value of RESOURCE can be interpreted such that no sufficiently dimensioned migration targets for all violating VMs of the current lease in HIGH exist. In this case, the algorithm will choose leases in LOW to be preempted to provide such migration targets. As long as RESOURCE is returned, there is no need to vary the method of preemption (suspend/stop), since both will lead to the same outcome, only differing in short-term resource usage and duration. Instead, the LOW list is traversed to find low-priority leases that either provide migration targets or, upon preemption, will resolve the policy violation without having to migrate VMs of the high-priority lease.
In contrast, the return value of TIME means that the current set of manipulations will likely lead to policy invalidation, i.e. the selected manipulations will not succeed in decreasing the resource usage on time. In this case, before trying to preempt other leases in LOW, it is tried to choose shorter-lasting manipulations, e.g. to stop a lease instead of suspending it. So, the distinction between RESOURCE and TIME is a heuristic optimization that keeps the algorithm from traversing a sub-set of the solution space that is unlikely to provide appropriate results. A remark concerning the computational effort for computing this function needs to be made at this point. The description of its functionality may suggest a high time complexity. However, many resource usage estimations do not require the full involvement of the FE. For instance, in order to find out whether the migration of a VM to a target node will violate a policy on the resource AllocVM, no estimation on the overhead and duration of such a manipulation needs to be made. In most cases, a simple addition on the number of available CPU cores is sufficient to know whether a node is a target

for a migration. Also, in any non-trivial case there is more than one configured policy. Since some policy types are easier to evaluate (e.g. those for AllocVM) or are more likely to exhibit volatile model-check results (CPU), it is reasonable to prioritize the checking of some policies over others such that a return value is obtainable without having to check every resource for which a policy is defined. Furthermore, the checking task is parallelized on a per-resource level. These implementation-level optimizations are not detailed in this thesis, but eventually lead to a worst-case time complexity for this function of O(|Policies|), i.e. the time-complexity scales linearly with the number of policies. This function checkBranch(branch) therefore is not crucial for determining the overall complexity of the proposed search algorithm. A discussion of this complexity is given later in Section 4.4.6.1.

As a summary for the checkBranch(branch) function it can be stated: This function is used by the search algorithm to assess the effect of a set of manipulations on the compliance with the policy semantics. Sets of manipulations that neither violate new policies nor invalidate already violated policies are considered appropriate. The assessment is based on future resource usage estimations made by the FE. A heuristic distinction between return values RESOURCE and TIME is used to shrink the solution space to be traversed.

Summary GR In this section the functional unit Global Reconfigurator (GR) was introduced. The GR is responsible for making sure that no policy invalidation occurs. It is supported in this goal by the Global Provisioner (GP), which tries to provision only queued leases that are unlikely to cause policy invalidations and by the Resource Usage Adaptor (RUA) which continuously aims to adapt resource usage caused by VMs on nodes. Two questions were posed at the beginning of this subsection:

1. How and when does the GR recognize a potential policy invalidation which it has to take care of? 2. How does the GR prevent a policy invalidation? If multiple conflicting solutions exist, what is the criterion for their goodness and how is a choice made?

These questions were answered: The GR operates in cycles, the so-called reconfiguration cycles. A reconfiguration cycle comprises three parts: The recognition phase, the decision phase and the execution phase. The first question is subject to the recognition phase. For every configured policy there is a global trigger. This trigger is a criterion on model-checks. Once a trigger fires, i.e. the criterion is evaluated to True, a potential policy invalidation is indicated. This means that a policy is likely to be invalidated if the GR does not take measures. The global trigger was defined as a number of subsequently failed model-checks. This number is called triggerSensitivity. The triggerSensitivity of a global trigger for a specific policy depends on the type of resource which the policy is defined on. Reasonable values for the triggerSensitivity of policies were retrieved empirically. Once a global trigger fires for a policy, the policy is called violated. Upon policy violation, the GR enters the decision phase. In the decision phase the GR tries to find VM and lease manipulations that are likely to resolve a policy violation, i.e. to prevent the policy from being invalidated. Three aspects steer the choice of VM and lease manipulations: a) The assumed capability of a set of manipulations to decrease the resource consumption such that policy invalidation is prevented, b) the fairness between leases, i.e. every lease receives a share of computing resources according to its priority and c) the effect of a set of manipulations on the overall result output of SecApps represented by leases. These three choice criteria culminate in the basic approach that the search algorithm uses to find and assess manipulations: A search heuristic purely based on the priority of leases is employed. The algorithm strives to keep higher-prioritized leases running, e.g. by migrating

their VMs, even at the expense of preempting lower-prioritized leases. The choice of targets for migrating a VM is first-fit. First-fit decreasing is used if multiple VMs of a lease have to be migrated from the same physical cluster node. The decision between suspending and stopping a lease always tends toward suspending a lease if this resolves policy violations and does not cause new policy violations. The search algorithm is a tree-based, depth-first search heuristic with backtracking. Backtracking, i.e. abandoning a branch and jumping to a preferred node in the tree, is based on highly-informative data structures. Their design is important for the search results. Therefore the approach is also called a guided search. Once a set of manipulations has been found by the search algorithm, the GR enters the execution phase. During the execution phase, the GR tries to apply the found manipulations to the cluster. However, the GR is not the only player trying to make changes to the allocation pattern. Also the Resource Usage Adaptor (RUA) and the Global Provisioner (GP) interact with the cluster. Furthermore, there can be multiple concurrent reconfiguration cycles of the GR. The questions of how to realize transactionality and concurrency between these players and how to enforce policy semantics in the execution phase are open issues still to be discussed. Before turning to these points, the decentralized part of the VM-Scheduler is introduced, the Local Controller.

4.4.4. Local Controller The Local Controller is a component (agent) that runs on every node managed by the VM-Scheduling Framework. It hosts two functional units, the Resource Usage Adaptor (RUA) and the Local Trigger Unit. The purpose of the RUA is to adapt the resource usage of locally running VMs according to the policies associated with that node. By using a local adaption of resource usage, the probability that the Global Reconfigurator (GR) needs to intervene in order to enforce policy compliance can be reduced. The Local Trigger Unit realizes the local trigger functionality for all policies associated with the node. It therefore is an actor-of-last-resort that, in emergency cases, makes sure that no policy invalidation occurs if both the RUA and the GR fail to do so.

4.4.4.1. Resource Usage Adaptor

The Resource Usage Adaptor (RUA) operates on policies defined for resources of the local node. It adapts the share of a resource (e.g. CPU) that is used by locally running VMs. It therefore is one of the functional units responsible for ensuring policy compliance. The Local Trigger Unit also realizes the policy semantics, i.e. prevents policy invalidations, but only as an actor-of-last-resort; it is described in the next paragraph. Another unit responsible for ensuring policy compliance is the GR, which was described in Section 4.4.3. In contrast to the GR, the RUA has distinctive advantages. First, it is located directly on the node. If, for instance, the management server hosting the Global Scheduler (the GR is part of the Global Scheduler) crashes, then no manipulations of VMs or leases can be found and executed by the GR. In this case it is beneficial to have a locally acting instance that independently tries to adapt the resource usage of VMs according to the configured policies. Also, it is possible that the network connection is unstable, which might as well be an obstacle to executing manipulations found by the GR: Decisions are not propagated to the target nodes or executions may fail. An unstable network connection can also lead to potential policy invalidations being recognized very late, i.e. there is only little time left to find and execute manipulations in order to prevent policies from being invalidated. So, complementing a remote decision instance, which cannot be relied on absolutely, with an independent, locally acting decision instance is reasonable.

81 4. Conceptual Work

A further aspect motivates the existence of the locally acting RUA: The RUA not only acts locally and independently from the GR, but it also uses different primitives to modify the resource usage of VMs. The GR manipulates VMs using coarse-grained primitives like migration, suspend and stop: Such manipulations have an all-or-nothing effect on the resource usage of VMs; either the VM uses resources on a node or not. GR-induced manipulations carry considerable overhead concerning the resource usage of the manipulation itself and they take considerable time (on a scale of 10^0 to 10^1 seconds) to finish. Due to their duration, the frequency of applying them is limited (a currently migrating VM can only be suspended once the migration is finished). So, complementing low-frequency, coarse-grained VM manipulations with high-frequency, fine-scaled local methods of modifying VM resource usage is sensible.

The methods used by the RUA are OS-provided means to interact with running OS processes and the kernel from user space.63 As an example, in most cases an OS process can be stopped from using any resources within tenths of a second by using the Linux signal SIGSTOP.64 However, the RUA complements but does not replace the GR. The primitives of the GR are powerful and some resource usage problems cannot be solved locally. First, a local adaption is appropriate only for specific resources. Such an adaption needs to be transparent towards VMs allocated to different nodes but assigned to the same lease as the to-be-managed VM. Hence a local adaption of the usage of the resource AllocVM is not appropriate because changing the number of locally running VMs, e.g. shutting down a single VM, would break the communication with other VMs of the same lease. Also, for the sake of generality, the adaption should be supported by any common hypervisor or provided by means of the OS. Specialized features, like changing the number of virtual CPU cores attributed to a single VM (resource UsedCore) at runtime, are only supported by XEN and VMware ESX and therefore not considered in this thesis. Second, local adaption is not sufficient in many cases: Let there be a node with 40% MainApp CPU usage. A policy PCPU : (70, 10) is associated with that node. A single VM hosting a SecApp runs on this node, using 20% CPU. The VM is assigned to a lease consisting of five VMs. The other four VMs run on other nodes. If the CPU usage of the MainApp rises to 75% and is bound to stay at or above that level for an hour, then a policy invalidation will occur: The CPU usage is above 70% and a VM is running on that node. By locally adapting the CPU usage of the VM, its share of CPU usage can be decreased to, e.g., 1%. This, on the one hand, may make this VM look crashed from the perspective of the other VMs of that lease. This will most likely crash the SecApp as well.
On the other hand, the policy is invalidated nonetheless, and the existence of the VM on this node might interfere with the MainApp if the MainApp's resource usage rises even further. This needs to be prevented. In such cases the VM needs to be migrated or the whole lease needs to be suspended. This can only be done by the GR. In this thesis the functionality of the RUA is realized for three specific resources: CPU, NetIn and NetOut. An extension to other resources can be done if transparency towards other VMs is given (i.e. inter-VM communication of a lease is not affected qualitatively) and hypervisor- or OS-provided interfaces for this adaption exist. The RUA in this proof-of-concept approach is a classical integral controller. An integral controller ("I-Controller") is a feedback control mechanism that sets the input value of a dynamical system as a sum of the differences between a target value and observed past output values of the system.65 In this particular case, the dynamical system is the resource, its usage

63User space is a privilege domain in Linux based OS. The term refers to a part of the memory usable by user space OS processes and contrasts the area used by the kernel. Processes running in user space are restricted in directly accessing devices, memory regions or manipulating process and I/O scheduling. 64Linux signals are asynchronous and therefore no upper limit can be given. 65Please refer to Franklin et al. [42] for a comprehensive introduction to control theory.


caused by the VMs is the input value and the total usage of the resource is the measured output value. So, the RUA adapts the amount of resources usable by VMs on a single host. For each resource that is controllable (CPU, NetIn and NetOut), there is a separate control loop with resource usage being the sensor metric (output value) and an OS-provided interface being used as the actuator (for setting the input value). For the resource CPU, the actuator is realized by capping the CPU usage of a specific VM; for the NetIn and NetOut resources, the amount of traffic receivable or sendable per time frame by a specific VM is capped. A control loop for a policy on an arbitrary resource is depicted in Figure 4.12. This control loop is now described using the terminology of control theory:

Figure 4.12.: Single control loop used by RUA for an arbitrary resource.

Let SI be the sampling interval, e.g. 1s. Then the kth sampling interval is defined as the time interval [(k − 1)·SI, k·SI). The controlled variable y represents the actual resource usage, y(k) being its actual value for the kth interval. The setpoint or reference value is w(k), which is determined by the respective policy threshold for the resource: w(k) = threshold − ε. The ε term represents a buffer (of up to 5% utilization) which is introduced to avoid accidental policy violations caused by a too ambitious reference value. The manipulated variable, i.e. the controller output, is called u(k). The error at the kth sampling point is e(k) = w(k) − y(k). An integral controller is employed which calculates the new output as u(k + 1) = u(k) + K · e(k). K is the integral gain and is dimensioned empirically. From the actual resource usage y(k), the share of resource usage caused by VMs can be discriminated: y(k) = vm(k) + sys(k). sys(k) represents the disturbance variable (denoted z(k) in Figure 4.12 in accordance with common control theory notation), which can be identified as the MainApp and OS overhead. vm(k) represents the summed actual resource usage of all n running VMs vm_i: vm(k) = Σ_{i=1}^{n} vm_i(k). Therefore u(k + 1) = u(k) + K(w(k) − vm(k) − sys(k)) holds. The resulting controller output (i.e. the share of the resource usable by VMs) still needs to be applied by the actuator to the VMs, i.e. the controller output needs to be distributed over several running VMs. This is done by calculating u_i(k + 1) = u(k + 1) · vm_i(k)/vm(k), which actually means that the controller output is distributed proportionally to the resource usage of the VMs in the current sampling interval. It has to be mentioned that the controller does not directly set the resource usage share in the next sampling interval for a specific VM, but

rather sets an upper limit for this share. This is referred to by the phrase "capping the resource usage". If the disturbance is high, i.e. the resource usage of the MainApp is close to or above the threshold, the calculated resource usage shares for single VMs can be close to or below zero. This needs to be avoided because it can break inter-VM communication. Therefore a minimum resource share of 5% is given to any running VM:

u′_i(k + 1) = u_i(k + 1), if u_i(k + 1) ≥ 5; u′_i(k + 1) = 5, else.

The actually applied controller output for all VMs is u′(k + 1) = Σ_{i=1}^{n} u′_i(k + 1). Such an adaption may cause the integral controller to remain in the transient state longer than an I-Controller without such an adaption. Also, the threshold of the policy might be exceeded more often, which eventually could cause an increased rate of global or local triggering. This, however, is the trade-off to be accepted in order to assure proper functionality for every lease by preventing unresponsive VMs, i.e. VMs with an assigned resource share close to 0%. More implementation-level information on how the resource usage adaption is realized for specific resources can be found in Section 5.4.
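The control law and the distribution rule above can be condensed into a short sketch. The following Python function is purely illustrative: the name `controller_step`, the parameter names and the 5% floor passed as `floor` are chosen for this sketch and do not stem from the framework's implementation.

```python
def controller_step(u_prev, threshold, epsilon, vm_usages, sys_usage, K, floor=5.0):
    """One sampling step of the integral controller sketched above.

    u_prev    -- previous controller output u(k), total share usable by VMs (%)
    threshold -- policy threshold for the resource (%)
    epsilon   -- safety buffer subtracted from the threshold (up to 5%)
    vm_usages -- per-VM usages vm_i(k) measured in the last interval (%)
    sys_usage -- disturbance sys(k): MainApp plus OS overhead (%)
    K         -- empirically dimensioned integral gain
    floor     -- minimum share granted to every VM (here 5%)
    Returns the per-VM caps u'_i(k+1).
    """
    vm_total = sum(vm_usages)               # vm(k)
    y = vm_total + sys_usage                # measured output y(k)
    w = threshold - epsilon                 # setpoint w(k)
    u = u_prev + K * (w - y)                # u(k+1) = u(k) + K * e(k)
    # Distribute the output proportionally to current per-VM usage,
    # but never cap a VM below the minimum share.
    if vm_total > 0:
        shares = [u * vi / vm_total for vi in vm_usages]
    else:
        shares = [u / len(vm_usages)] * len(vm_usages)
    return [max(s, floor) for s in shares]
```

For instance, with a threshold of 70%, ε = 5, MainApp and OS at 40% and two VMs using 20% and 10%, the measured output equals the setpoint region and the caps shrink moderately; if the MainApp surges, all computed shares fall below the floor and every VM is clamped to 5%.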

4.4.4.2. Local Trigger Unit

Besides its task of adapting the local resource usage of VMs, the Local Controller also hosts the Local Trigger Unit, which realizes the local trigger functionality for all policies associated with the local node. Both the local adaption of resource usage and the manipulations chosen by the GR are meant to avoid policy invalidations, yet there is no guarantee they will do so on time. The local trigger has the purpose of making sure the policy semantics are realized in any case. Like the global trigger, it is based on model-checks, yet with the difference that only local information is evaluated, which comprises local policies, the respective resource usage and locally running VMs. There is no need to know about the complete cluster model and allocation pattern. The local trigger is defined to fire after timelimit − 1 consecutive failed model-checks66 (∆t = 1s) if a violating VM is running on the local host. Thereby the trigger fires one second before a policy becomes invalidated. The firing of the trigger causes the instant (< 1s) stopping of all violating VMs on this very node such that VM processes stop generating resource usage. Ongoing manipulations are aborted as well. This behavior is called "killing of VMs". There are two potential problems associated with the killing of VMs: First, forcing VM processes to stop using resources within one second requires a forceful shutdown of the VM67 and the abortion of potentially ongoing VM manipulations, e.g. those commanded by the GR to resolve a policy violation. This may cause data loss. Implementation-wise this has to be handled such that either this data loss is avoided or affected VMs can be re-used after such a procedure. Section 5.4 discusses how the actual implementation of the VM-Scheduling Framework manages the avoidance of data loss and the recovery of affected VMs. Second, the local trigger fires independently from potentially fired global triggers.
This may cause manipulations initiated by the local trigger to overlap with an ongoing decision or execution phase of the GR or the GP. This in turn may lead to inconsistencies in the cluster model maintained by the Global Scheduler. This aspect is discussed in Section 4.4.5.1.
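The counting behavior of the local trigger can be illustrated with a minimal sketch. The class below is hypothetical (the thesis does not prescribe this structure); it merely demonstrates firing after timelimit − 1 consecutive failed model-checks with ∆t = 1s, i.e. one second before the policy would be invalidated.

```python
class LocalTrigger:
    """Illustrative sketch of the local trigger counter (names hypothetical)."""

    def __init__(self, timelimit):
        self.timelimit = timelimit      # policy timelimit in seconds
        self.failed_checks = 0          # consecutive failed model-checks

    def check(self, threshold_exceeded, violating_vm_running):
        """Called once per second (delta_t = 1s).

        A model-check fails if the threshold is exceeded while a violating
        VM runs on the local host. Any passing check resets the counter.
        Returns True when the trigger fires (timelimit - 1 failures).
        """
        if threshold_exceeded and violating_vm_running:
            self.failed_checks += 1
        else:
            self.failed_checks = 0
        return self.failed_checks >= self.timelimit - 1
```

With a timelimit of 10 seconds, the trigger fires on the ninth consecutive failed check, leaving one second for the killing of the violating VMs before the policy would be invalidated.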

66The Local Trigger Unit does not perform model-checks in the sense that the cluster model is queried. Only resource usage on the local node is examined. 67A comparable scenario is pulling the power plug of a physical computer system.


4.4.5. Functional Unit Coordination

Several functional units aiming at exploiting cluster resources (GP) and realizing policy semantics (GR, RUA and Local Trigger Unit) have been detailed. Now it will be discussed how these separate units are aligned in their functionality. This also includes the execution phases of the Global Reconfigurator (GR) and the Global Provisioner (GP). Afterwards it is discussed how policy semantics are enforced with the proposed concept.

4.4.5.1. Execution and Concurrency

Once the GR or GP end their decision phase they enter the execution phase. During this phase the found manipulations are applied to the cluster, i.e. the calculated change of the current allocation pattern is carried out. The Local Trigger Unit, as part of the Local Controller, may also cause a manipulation of running VMs. Because multiple independent units of the VM-Scheduling Framework can change allocation patterns, a strategy for avoiding transactional inconsistencies was developed. Both functional units GR and GP operate on the same cluster model. That means they both use the same knowledge base to decide on changes of the current allocation pattern and need to update the cluster model upon successful execution. The most basic strategy to avoid concurrency and conflicts is strictly sequential processing, i.e. any functional unit trying to change the allocation pattern acquires a global lock which, throughout all phases of a manipulation cycle, blocks concurrent access to the cluster model until the end of that cycle. This is no viable general strategy: Based on empirical data, the potential duration of execution phases is known to be significant (e.g. resuming a VM can last more than 10 seconds). A strictly sequential processing will therefore cause functional units to wait for each other for a considerable amount of time. For the GP it is acceptable to wait several seconds, because this merely keeps the GP from provisioning leases, i.e. potentially usable resources remain unused and the queue times of queued leases are increased. Concerning the GR such a global lock is no sensible option. A global lock on the whole cluster model (spanning even the recognition phase) would lead to potential policy invalidations not being recognized, so global triggers would not fire.
In this case, such potential policy invalidations would have to be treated locally, eventually leading to excessive killing of VMs by the Local Trigger Unit. A global lock which does not span the recognition phase, but only the decision and execution phases, would lead to similar behavior. In this case potential policy invalidations would be recognized and global triggers would fire. However, due to the lock these policy violations could not be handled (i.e. the decision phase could not be entered). This means that these policy violations would queue up at the GR, reducing the time available to resolve them up to the point that the Local Trigger Unit decides to kill VMs.

Therefore, the following decisions to enable a partial parallelization have been made:

1. There is at most one provisioning cycle and an unlimited number of reconfiguration cycles active at a point in time.68 2. There are two types of locks, one for the decision and one for the execution phase. There is no lock for the recognition phase. The lock for the decision phase is a global, pessimistic lock for the whole cluster model. This means that at any point in time only a single decision phase of any unit can be active. The lock for the execution phase is not a global lock, but a pessimistic

68Implementation-wise this means that the GR runs multi-threaded and the GP is represented by a single thread.


lock on VM-level. This means that an ongoing manipulation of a VM will put a lock on this VM, excluding it from subsequent decision and execution phases. This also means that in a subsequent manipulation cycle the lease which that VM is assigned to cannot be manipulated. 3. The Local Trigger Unit is independent from the globally acting units GR and GP. This means that the mentioned locks do not apply to local triggering and the subsequent killing of VMs and abortion of ongoing manipulations. Any successful killing of VMs is immediately propagated to the Global Scheduler, the allocation pattern (represented in the cluster model) is updated and this change is taken into account by following manipulation cycles. The abortion of an ongoing manipulation is automatically detected by the GR or GP as a failed manipulation, i.e. these units update the allocation pattern accordingly. 4. Not every single policy violation makes the GR enter a decision phase. Policy violations are collected over a reconfigurable interval (e.g. 2s) and processed together in a single decision phase. This may limit the time available for the decision and execution phases, but reduces the number of concurrently active reconfiguration cycles. 5. The GP is subordinate to the GR. This means that while the GP holds the global lock for the decision phase, the GR can step in, acquire this lock and abort the ongoing decision phase of the GP. The VM-level locks for the execution phase that are held by the GP cannot be captured by the GR, however.
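The two lock types (a global, pessimistic decision-phase lock and pessimistic VM-level execution locks) can be sketched as follows. This is an illustrative sketch only; the class `SchedulerLocks` and its methods are hypothetical names, not taken from the framework.

```python
import threading

class SchedulerLocks:
    """Sketch of the two lock types described above (names hypothetical)."""

    def __init__(self):
        # Global, pessimistic lock: at most one active decision phase.
        self.decision_lock = threading.Lock()
        self.vm_locks = {}                      # VM name -> execution lock
        self._registry_lock = threading.Lock()  # protects the registry itself

    def _vm_lock(self, vm):
        with self._registry_lock:
            return self.vm_locks.setdefault(vm, threading.Lock())

    def try_lock_vms(self, vms):
        """Acquire execution locks for all VMs of a manipulation, or none.

        Returns the acquired locks, or None if any VM is already being
        manipulated by another cycle (all partial acquisitions are rolled
        back, so the manipulation is excluded as a whole).
        """
        acquired = []
        for vm in vms:
            lock = self._vm_lock(vm)
            if lock.acquire(blocking=False):
                acquired.append(lock)
            else:                               # VM busy: roll back
                for held in acquired:
                    held.release()
                return None
        return acquired
```

In the scenario of Figure 4.13, Cycle1 would hold the lock for VM1 during its execution phase; a concurrent attempt by Cycle2 to lock a set containing VM1 fails as a whole, so Cycle2 must restrict its decision phase to manipulations excluding VM1 and its lease.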

Figure 4.13.: Two concurrent reconfiguration cycles. Cycle1 blocks the decision phase of Cycle2, because Cycle1 holds a global lock. Cycle2 enters the decision phase at tc, but excludes solutions involving VM1 since Cycle1 still holds a VM-level lock on VM1.

An example for the concurrency of multiple reconfiguration cycles is shown in Figure 4.13. Here the GR periodically (e.g. once per second) performs model-checks. This is indicated by the boxes labeled "Recognition" that belong to the upper and the lower reconfiguration cycles (Cycle1 and Cycle2) shown in this figure. The dashed line on the left of these boxes symbolizes that there is no specific starting point


for this checking, but it is performed as long as the framework is running.69 At ta a policy violation is detected. A decision phase is started and the corresponding reconfiguration cycle (Cycle1) acquires a global lock on the cluster model. No active executions are going on, i.e. no VM-level lock is held by other manipulation cycles, therefore the whole model is processed for finding potential manipulations. During the search another policy violation is detected at a point in time tb. This violation is taken care of by Cycle2. Since the global lock is held by Cycle1, this second cycle Cycle2 has to wait. Cycle1 finds an appropriate manipulation, e.g. the migration of VM1. At tc it releases the global lock, acquires a lock on VM1 and starts executing the manipulation. Once the global lock has been released by Cycle1 at tc, Cycle2 grabs the global lock and enters the decision phase. Since a VM-level lock is held by Cycle1 for VM1, this VM and its lease cannot be manipulated, therefore only manipulations excluding these are considered.70 A manipulation for VM2 is found as a potential solution. At td Cycle2 releases the global lock, acquires the VM-level lock for VM2 and enters the execution phase. Both cycles successfully execute their manipulations and release their VM-level locks.

4.4.5.2. Enforcement of Policy Semantics

According to the defined policy semantics there is a strict upper limit for the interval (timelimit) during which VMs are allowed to contribute to the exceedance of policy thresholds. The functional units responsible for maintaining policy compliance (GR, RUA and Local Trigger Unit) have been introduced, yet no explicit discussion of how the timelimit of policies is obeyed has been given so far. Four aspects contribute to meeting the deadline set by a policy timelimit, i.e. to preventing policy invalidation:

1. First there is the RUA. By modifying the resource usage of VMs locally, many situations that would otherwise cause globally triggered changes of the allocation pattern can be handled locally. This is shown in an experiment (see Section 6.4.3). By diminishing the number of policy violations treated by the GR, the number of GR decisions that could invalidate policies due to a misestimation by the FE is also reduced. Therefore the RUA decreases the occurrence probability of invalidated policies. 2. Once a global trigger fires for a policy, the GR has to find a suitable change of the allocation pattern. It therefore uses the Feasibility Estimator to assess potential manipulations. Only appropriate manipulations will be chosen that presumably will not invalidate violated policies, e.g. by exceeding the timelimit when being executed during the execution phase. Given that a realistic estimation is made by the FE, this approach reduces the probability of policy invalidations. 3. The former aspect dealt with the duration and overhead of manipulations during the execution phase of a reconfiguration cycle. However, it is also important that the finding of manipulations during the decision phase does not waste too much time at the expense of the time available for executing manipulations. Two strategies have been adopted for limiting the duration of the decision phase: a) The algorithm used by the GR for finding a change of the allocation pattern is a guided heuristic search. The guidance is provided by incorporating information returned by the checkBranch(branch) function and by having multiple lists that steer the search for appropriate

69Because there is no locking involved in the recognition phase, a single recognition phase is shared by all reconfiguration cycles. A new decision phase (reconfiguration cycle) is spawned upon detection of a policy violation. 70Implementationwise this also excludes manipulations involving the physical node to which VM1 is being migrated by Cycle1.


manipulations. This allows for an acceptable computational complexity, which is shown later in Section 4.4.6.1. This spares time for the execution phase, which in turn decreases the probability that the chosen manipulations will invalidate violated policies. b) A reduced time spent in the decision phase may decrease the probability of policy invalidations. However, any timespan needed to find appropriate manipulations reduces the amount of time available to the execution phase. Therefore this timespan needs to be taken into account during the decision phase. This aspect was not explicitly mentioned when detailing the GR algorithm, yet was considered during development. First, when checkBranch(branch) is called, the time left until a violated policy is potentially invalidated (timelimit − (triggerSensitivity + time spent in the decision phase)) is taken into account when assessing the appropriateness of a set of manipulations. Second, the ongoing search for appropriate manipulations may, due to elapsing time, render already committed manipulations in the search tree inappropriate. This problem is solved by applying a meta-check to the search algorithm. During the decision phase, it is continually checked whether an already committed branch (a set of appropriate manipulations) can still be executed safely without invalidating violated policies if time advances further. If this check fails, the search is immediately stopped. All leases which still contribute to a policy violation and are not part of the committed set of manipulations are set to be stopped in addition to the already committed manipulations. Thereby the duration of the decision phase is supervised and limited such that any chosen change of the allocation pattern can be executed within timelimit. This decreases the probability of invalidated policies. 4. The previous points dealt with how to reduce the probability of invalidated policies.
The Local Trigger Unit on each node makes sure that this probability tends to zero. In case chosen manipulations take too long and a policy might get invalidated, the local trigger of a policy will prevent this from happening by killing VMs and aborting manipulations. Also, if no manipulation decision by the GR was made on time (e.g. due to a server crash) or failed to be executed (e.g. due to a network failure), the local trigger will avoid the policy invalidation. Nevertheless, the measures mentioned above to reduce the probability of policy invalidation are still needed. Even though the local trigger makes sure that no policy invalidations occur, the primitive it uses has a negative impact on the result output of the SecApp or even requires recovering an inconsistent VM afterwards. Therefore the mechanisms mentioned above not only contribute to reducing the probability of policy invalidation, but also decrease the relative number of cases where the Local Trigger Unit needs to step in.
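The time-budget accounting used in point 3 can be sketched with a pair of small helper functions. Both function names and the shape of the meta-check are illustrative assumptions for this sketch, not the framework's actual interfaces.

```python
import time

def remaining_budget(timelimit, trigger_sensitivity, decision_started_at, now=None):
    """Seconds left before a violated policy would be invalidated:
    timelimit - (triggerSensitivity + time spent in the decision phase).

    trigger_sensitivity -- delay between threshold exceedance and the
                           global trigger firing (seconds)
    """
    now = time.monotonic() if now is None else now
    spent_in_decision = now - decision_started_at
    return timelimit - (trigger_sensitivity + spent_in_decision)

def meta_check(timelimit, trigger_sensitivity, decision_started_at,
               estimated_execution_time, now=None):
    """True while a committed branch can still be executed safely.

    Once this fails, the search is stopped and the remaining
    violation-contributing leases are stopped outright.
    """
    budget = remaining_budget(timelimit, trigger_sensitivity,
                              decision_started_at, now)
    return budget >= estimated_execution_time
```

For a policy with a 10 s timelimit, a 2 s trigger sensitivity and a committed branch estimated to take 5 s to execute, the meta-check passes after 2 s of searching but fails after 4 s, at which point the remaining violating leases would be stopped.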

4.4.6. Further Considerations

The basic functional units and their interplay have been presented. Before describing implementation details in the next chapter, some additional considerations concerning the presented conceptual approach are made. First the computational complexity of the GR algorithm is analyzed. Then the scalability of the framework is discussed and finally the possibility of self-sustaining VM manipulation behavior is considered.

4.4.6.1. Computational Complexity

In order to maintain policy compliance it is required to obey the timelimit of specified policies. Unless satisfied by the RUA, this requirement particularly implies that the duration of the decision and execution phases of a reconfiguration cycle is a relevant factor for avoiding the invalidation of violated


policies. For the decision phase a measure can be given to describe the time properties of that phase: The (computational) time complexity of an algorithm is a measure to describe its runtime behavior with respect to the number of computational steps carried out for a given input. It can be used as an abstracted quantification of the algorithm’s duration. The time complexity of the search heuristic employed by the GR in the decision phase is now determined.

Let there be a cluster environment with a set of nodes {node} and a current allocation pattern with provisioned leases {lease} and running VMs {VM}. The variables N = |{node}|, L = |{lease}| and V = |{VM}| denote the cardinalities of these sets. There are no implicit restrictions on which VM can run on which physical node, i.e. any possible allocation pattern is only constrained by the set of specified policies. Let there be a reconfiguration cycle which entered the decision phase upon the violation of at least one policy. For ease of mathematical handling it is assumed that any lease has the same number of assigned VMs, thereby V/L ∈ ℕ holds as well. For making a statement on the worst-case time complexity of the developed search heuristic it is assumed that every single VM is contributing to a policy violation.

The basic unit (operation) on which the derivation of the time complexity is based is the check of a proposed set of manipulations for appropriateness using the checkBranch(branch) function.71 This set consists of elements ∈ {suspend(lease), stop(lease), migrate(VM)}. Any suspend(lease) represents the suspending of all assigned VMs, i.e. it merely abbreviates the {suspend(VM) | VM ∈ Σ ∧ assign(Σ, lease)} notation.72 For stop(lease) this statement holds accordingly. For any VM at most one VM manipulation (migrate, suspend, stop) is allowed in a proposed set of manipulations in a single reconfiguration cycle. It is debatable whether this basic unit (checking a set of manipulations for appropriateness) is sufficient for determining the algorithm's time complexity. It is clear that using a finer granularity also increases the plausibility of the time complexity calculation. However, this will necessarily also require increasing the degrees of freedom of the time complexity equation, thereby requiring the consideration of more input parameters like the number and type (configuration) of nodes, policies and VMs as well as any policy's association with cluster item sets (nodes) and violation state. It is assumed that the proposed basic unit merely comprises a number of condition checks that can be implemented efficiently and even be parallelized due to the atomic, independent nature of the single conditions. This is also the reason why the algorithm for performing such a check has not been detailed in this thesis and is not regarded as a computational challenge. Therefore, the time complexity of the proposed search heuristic is calculated with that basic unit (appropriateness check). The complexity then depends on the input parameters N, V and L. For comparison, first the time complexity of finding appropriate manipulations using a brute-force search is determined.
Afterwards the time complexity of the search heuristic proposed in this thesis is discussed.

Brute-Force Search

Using a brute-force search, the complexity is calculated as given below. The worst-case time complexity is given by:

T(V, L, N) = 1 + Σ_{i=0}^{L−1} 2^i · C(L, i) · (N−1)^{V − Vi/L}    (4.9)

71This means that for a set of manipulations it is tested whether these will suffice to resolve a policy violation and do not trigger violations of other policies. 72Σ represents a set of VMs which is assigned to a lease.


The derivation of Equation 4.9 can be found in Section A.2. T(V, L, N) can be transformed to

T(V, L, N) = 1 + (N−1)^V · Σ_{i=0}^{L−1} C(L, i) · (2 / (N−1)^{V/L})^i.

By applying the binomial theorem this gives

T(V, L, N) = 1 + (N−1)^V · [(1 + 2/(N−1)^{V/L})^L − (2/(N−1)^{V/L})^L]
           = 1 − 2^L + ((N−1)^{V/L} + 2)^L
           ≤ 1 − 2^L + 2^L (N−1)^V    for any L ≥ 2 and (N−1)^{V/L} ≥ 2.

Therefore T(V, L, N) ∈ O(N^V 2^L). This means the time complexity is exponential for non-constant inputs L or V. The time complexity is polynomial for a non-constant input N. These complexities hold both for the worst and the best case if a brute-force search is used.
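As a plausibility check, the sum in Equation 4.9 can be evaluated numerically and compared against the closed form 1 − 2^L + ((N−1)^{V/L} + 2)^L that follows from the binomial theorem (valid when V is a multiple of L, as assumed in the setup above). The function names are illustrative only.

```python
from math import comb

def T_bruteforce_sum(V, L, N):
    """Worst-case step count of the brute-force search (Equation 4.9):
    T = 1 + sum_{i=0}^{L-1} 2^i * C(L, i) * (N-1)^(V - V*i/L).
    Assumes V is a multiple of L, so every exponent is an integer."""
    return 1 + sum(2**i * comb(L, i) * (N - 1) ** (V - V * i // L)
                   for i in range(L))

def T_bruteforce_closed(V, L, N):
    """Closed form obtained via the binomial theorem (V/L integral)."""
    return 1 - 2**L + ((N - 1) ** (V // L) + 2) ** L
```

For instance, a small scenario with V = 10 VMs, L = 5 leases and N = 4 nodes already yields 161,020 appropriateness checks, illustrating the exponential blow-up in V.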

Proposed Search Heuristic

Using the proposed search heuristic, the complexity is calculated as given below. The worst-case time complexity is given by:

T(V, L, N) = L + ((N−1)V / L) · [ L + Σ_{i=1}^{L−1} Σ_{j=i}^{L−1} C(j, i) · 2^i ]    (4.10)

Again, the derivation can be found in Section A.2. By applying the properties of summed binomial coefficients it follows that

T(V, L, N) = L + ((N−1)V / L) · Σ_{i=0}^{L−1} C(L, i+1) · 2^i.

Application of the binomial theorem gives T(V, L, N) = L + ((3^L − 1) / (2L)) · (N−1)V.

Therefore T(V, L, N) ∈ O(3^L NV). This means the time complexity is exponential for a non-constant input L. For non-constant inputs V or N the complexity is linear. So the proposed algorithm shows an acceptable time complexity for a non-constant number of VMs and nodes, while for a non-constant number of leases it still exhibits an exponential complexity in the worst case. However, the worst-case time complexity is relevant only to a certain extent. First, the worst-case time complexity corresponds to a scenario that will not occur in real life. In this scenario every possible proposal for manipulating leases and VMs via migration and suspend is checked and dismissed because no set of manipulations is appropriate such that it will resolve policy violations on time. This causes the algorithm to loop over all possible set combinations proposable by the algorithm. Finally all violation-contributing leases will be stopped. This scenario will not occur: If a given timelimit of a violated policy indeed renders any suspend and migration impossible, then the aforementioned meta-check will stop the search before that timelimit is supposed to be exceeded, which also results in stopping the remaining violation-contributing leases. So for a big number of leases and small timelimits, the search is stopped by the meta-check before looping through all possible manipulations. For a big number of leases and greater timelimits, the algorithm does not need to loop through all lease manipulations because appropriate migrations or suspends will be found.
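Equation 4.10 and its closed-form simplification L + ((3^L − 1)/(2L))·(N−1)V can likewise be checked numerically (again assuming V is a multiple of L; the function names are illustrative only).

```python
from math import comb

def T_heuristic_sum(V, L, N):
    """Worst-case step count of the guided search (Equation 4.10):
    T = L + ((N-1)V/L) * (L + sum_{i=1}^{L-1} sum_{j=i}^{L-1} C(j,i) * 2^i).
    Assumes V is a multiple of L so the result is an integer."""
    inner = L + sum(2**i * sum(comb(j, i) for j in range(i, L))
                    for i in range(1, L))
    return L + (N - 1) * V * inner // L

def T_heuristic_closed(V, L, N):
    """Closed form: T = L + (3^L - 1) * (N-1) * V / (2L)."""
    return L + (3**L - 1) * (N - 1) * V // (2 * L)
```

For the same scenario as before (V = 10, L = 5, N = 4) the guided search needs 731 appropriateness checks instead of the 161,020 of the brute-force search, illustrating the reduction from O(N^V 2^L) to O(3^L NV).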
Second, the best-case and average-case time complexity also matter. The best-case time complexity O(VN) is given by

F(V, L, N) = V(N − 1)    (4.11)

which corresponds to a migration of any violating (in the assumed case, every running) VM. Here the complexity is linear in the non-constant inputs V and N and constant in the number of leases. The search heuristic is optimized for the average case, assuming that the worst case is handled by the meta-check. Giving a mathematical derivation for the average-case time complexity is difficult because assumptions on the range of the parameters V, L, N and their frequency distribution over this range have to be made. Also, the timelimit of violated policies (i.e. its range and distribution) then needs to be considered. This derivation is left for future work. A plausibility argument is given instead: As shown for the worst-case complexity, the critical factor is the exponential growth of the number of basic computational units with a rising number of leases, O(3^L). This growth derives from the fact that for any high-priority lease that contributes to a policy violation, a k-combination of preemptable low-priority leases from the LOW list is tested in order to keep the high-priority lease alive. This behavior is intrinsic to the algorithm and does not vanish in the average case. Still, several aspects suggest that the average time complexity is acceptable: a) If lease tuples are tested for preemption in order to keep a high-priority lease running, only in rare cases does the whole set of k-combinations of provisioned leases in the LOW list need to be iterated over. For instance, due to the ordering of the LOW list, its 7th (12th) entry suggests to preempt 3 (4) low-priority leases in favour of a high-priority lease in HIGH. The preemption of multiple low-priority leases in most cases either resolves the policy violations which the high-priority lease was contributing to or frees enough resource slots to migrate the violating VMs of a high-priority lease. Assuming that, e.g.,
the preemption of 4 lower-prioritized leases always suffices for keeping a single high-priority lease alive, O(3^L) becomes O(128 · L)73 for the worst case that any variation of (suspend | stop) needs to be tested on the lease tuples in LOW. This is an acceptable time complexity. b) With increasing timelimits the time complexity tends to O(2^L) (instead of O(3^L)) because the variation of (suspend | stop) for preemptable leases is spared, eliminating the 2^i factor in Equation 4.10. c) For any lease that contributes to a policy violation and whose VMs cannot be migrated because of a low timelimit, the exponent in T(V, L, N) decrements by 1.

For these reasons, the computational complexity of the algorithm employed by the GR is considered acceptable from a theoretical point of view. Empirical results on how the GR fares with respect to timing and the quality of the allocation-pattern changes it finds74 can be found in Chapter 6.

4.4.6.2. Scalability

As mentioned in the previous Section 4.4.6.1, the time needed for the decision phase rises, in the worst case, exponentially with the number of leases and linearly with the number of VMs and nodes. Not only the decision phase but also the execution phase is subject to such scalability issues: With an increasing number of VMs, leases, policies and nodes, the number of potential policy violations and thus also the number of executed manipulations rises. Functional units have been introduced that a) reduce the number of policy violations (RUA) and b) take countermeasures against excessive execution times for manipulations (Local Trigger Unit). Still, it is reasonable to discuss additional

73More precisely, the complexity degrades to O(128 × <number of initially violation-contributing leases>). 74Though also driven by other parameters, an indicator for excessive durations of a decision phase is the stopping of all leases for a reconfiguration cycle. An indicator for the goodness of choices made by the GR is the number of computed results by SecApps.

measures that help to scale the framework with a growing environment.75 In order to do so, potential bottlenecks for the choice and execution of manipulations have to be considered.

Global Reconfigurator The functional unit GR has to decide on which manipulations to choose in case of policy violations. The computational complexity of this task (see Section 4.4.6.1) influences the choice of manipulations. If the number of VMs, leases and nodes increases, then a computational bottleneck can occur which would lead to a preferred choice of stop manipulations or even to locally triggered kills of policy-violating VMs. The proposal is made to introduce multiple GR instances in order to circumvent this issue. This can easily be done by splitting a cluster environment into two logical subclusters, each served by a separate GR. This also requires the set of leases and VMs to be split among these.76 Each GR then manipulates its own leases and VMs that run on the set of nodes in its scope. Since these separate GRs share no common data that needs to be synchronized and every GR only deals with a subset of all leases, VMs, policies and nodes, such a separation will eliminate a computational bottleneck for the GR.
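The proposed split into logical subclusters might be sketched as follows. All names in this fragment are illustrative and not part of the framework; it merely shows how nodes, VMs and leases could be partitioned into two disjoint GR scopes, assuming that no lease spans both scopes:

```python
def split_scope(nodes, leases, vms, node_of_vm):
    """Partition a cluster into two disjoint subclusters, each to be served
    by its own GR instance (illustrative sketch).

    nodes      -- ordered list of worker node names
    leases     -- mapping: lease name -> set of VM names belonging to it
    vms        -- list of VM names
    node_of_vm -- mapping: VM name -> node the VM currently runs on
    """
    half = len(nodes) // 2
    scopes = (set(nodes[:half]), set(nodes[half:]))
    # A VM belongs to the GR whose scope contains its node.
    vm_scopes = tuple(
        {vm for vm in vms if node_of_vm[vm] in scope} for scope in scopes
    )
    # A lease follows its VMs (assumption: leases do not span scopes).
    lease_scopes = tuple(
        {lease for lease, lease_vms in leases.items() if lease_vms & vm_scope}
        for vm_scope in vm_scopes
    )
    return list(zip(scopes, vm_scopes, lease_scopes))
```

Since the resulting scopes share no nodes, VMs or leases, the two GR instances need no synchronization, which is exactly the property the text relies on.
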

Shared Storage Another potential bottleneck is the shared storage facility used for hosting VMs, more specifically the read/write access to its persistent storage media. An existing bottleneck increases the execution times of manipulations like suspend, resume, start and stop. In order to counteract a bottleneck for the shared storage in a growing environment, the proposal is to use separate shared storage facilities, each associated with a single GR and hosting only the VMs in its scope.

Networking Another factor contributes to increased execution times: the network bandwidth and topology. This factor is relevant for inter-node communication (e.g. for migrations), but also for the node-to-shared-storage communication (e.g. for suspends). In this thesis it is proposed to connect a single shared storage facility to every switch serving worker nodes. Additionally, a single GR is responsible for the worker nodes connected to this switch. Using this topology, the number of physical nodes communicating with each other (via migrations) and with a shared storage (e.g. for suspends) is limited and manageable.

By employing these proposals, the framework scales with an increasing number of physical nodes, VMs, leases and policies.77 An empirical evaluation for this statement is desirable, but out of scope for this thesis.

4.4.6.3. Self-Sustaining Behavior

Any VM manipulation carries some resource-usage overhead, e.g. the transfer of a VM's memory state over the network for a migration. This means that manipulations can trigger new policy violations. Thereby a phenomenon can occur: The handling of policy violations causes manipulations which in turn cause new policy violations. This can lead to a self-sustaining behavior which is not intended. An informal approach to capturing this notion is: Given that running applications (MainApp, SecApps) do not change in resource usage, the frequency of manipulations should tend to

75A growing environment means an increase in the number of VMs, leases, nodes and/or policies. If no additional measures are taken, the percentage of locally triggered kills among all executed manipulations will rise for a growing environment. 76Accordingly, two separate Global Scheduler components with separate GPs are also needed. 77For a rising number of policies, the acquisition of resource utilization statistics needs to be scalable as well. There are existing approaches like SysMES [66], Ganglia [44] and Nagios [76] that can be used for a scalable monitoring of cluster resources.


zero over time, i.e. VM manipulations have to be justified by changes in the resource usage of running applications. If this does not hold, then the framework exhibits self-sustaining behavior. This problem was approached on a very basic, non-mathematical level. The goal was to avoid such behavior in the prototypical implementation. Multiple heuristic methods that aim to reduce its occurrence probability have been applied.

1. Local resource usage adaption decreases the relative number of cases where the GR needs to issue manipulations in order to avoid policy invalidation.78 A decrease in the relative number of VM manipulations also leads to fewer policy violations caused by such manipulations.

2. If the GR needs to react, then by using the Feasibility Estimator (FE) those manipulations are preferred which most probably do not trigger new policy violations. Ongoing manipulations and their overhead are taken into account by the FE.

3. Nodes involved in a manipulation of VMs initiated by the GR and GP will not be the target of future manipulations commanded by the GR and GP for a (reconfigurable) period of 30 s after this manipulation. During this time interval these nodes are excluded from the search space of the GR and GP; any policy violation on such nodes will only be treated by the Local Trigger Unit. Practically this means that a node involved in a policy violation is blocked from being considered for global VM manipulations for a certain time.

4. VMs for which a manipulation has finished are blocked for successive manipulations initiated by the GR and GP for a (reconfigurable) period of 30 s. During this time interval these VMs are excluded from the search space of the GR and GP.

The latter two heuristics79 cause a high number of manipulations ongoing at a point in time to shrink the search space of the GP and GR for a certain time interval, i.e. the number of choosable and executable manipulations within this time frame is decreased. Using this approach, an inhibition is introduced that automatically corrects high manipulation frequencies. This basic method should in most cases suffice to counteract manipulation frequencies that are not justified by changes in the resource usage of running applications. A theoretical and more general discussion of this topic is left for future work (see Chapter 8).
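Heuristics 3 and 4 amount to a cooldown registry for recently manipulated nodes and VMs. A minimal sketch, with invented class and method names (the thesis implementation is not shown here), could look like this:

```python
import time

class CooldownRegistry:
    """Blocks nodes/VMs that took part in a manipulation from the global
    search space for a (reconfigurable) cooldown period.  Illustrative
    sketch, not the framework's actual code."""

    def __init__(self, cooldown_s=30.0):
        self.cooldown_s = cooldown_s
        self._blocked_until = {}  # item name -> unblock timestamp

    def register_manipulation(self, *items):
        """Record that these nodes/VMs just took part in a manipulation."""
        deadline = time.monotonic() + self.cooldown_s
        for item in items:
            self._blocked_until[item] = deadline

    def is_blocked(self, item):
        deadline = self._blocked_until.get(item)
        if deadline is None:
            return False
        if time.monotonic() >= deadline:
            del self._blocked_until[item]  # cooldown expired
            return False
        return True

    def search_space(self, items):
        """Items that GR/GP may still consider for new manipulations."""
        return [i for i in items if not self.is_blocked(i)]
```

Note how a burst of manipulations automatically shrinks the returned search space and how the inhibition decays by itself once the cooldown elapses, which is the self-correcting behavior described above.
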

4.4.7. Summary

In this section the decision logic of the framework has been detailed. Its overall goal is to give SecApps access to cluster resources and to dynamically modify their resource usage so that the MainApp is not affected, additional SecApp results can be computed and cluster usage can be improved. Several functional units realize this decision logic. First, the functional units of the Global Scheduler were explained. The Global Scheduler is a component that runs on a global management server and hosts the functional units Global Provisioner (GP) and Global Reconfigurator (GR). The GP has the purpose of giving a queued, i.e. non-provisioned, lease access to the cluster resources by starting or resuming its VMs. It uses a non-preemptive, run-to-completion approach and employs a priority-scheduling algorithm with backfill. The other functional unit hosted by the Global Scheduler, the GR,

78This is shown in Section 6.4.3. 79These mechanisms, the blocking of VMs and nodes involved in a previous manipulation for a certain time frame, are not used for static policies, e.g. PAllocVM and PUsedCore. The violation of such policies will be handled by the GR without excluding previously concerned nodes and VMs from the search space. The reason for this is that such policy violations and the resulting manipulations cannot provoke successive violations of static policies, since the usage of resources like AllocVM and UsedCore is under full control of the framework.

is a runtime scheduler that decides on how and when to manipulate a provisioned lease by suspending, stopping or migrating its VMs. Its purpose is to modify the resource usage of SecApps in order to satisfy policy constraints. The GR makes these decisions based on an informed depth-first search heuristic with backtracking. Both the GR and GP use VM manipulations to change the allocation of leases in the cluster. Since such manipulations carry considerable overhead concerning resource usage and duration, the appropriateness of manipulations with respect to the policy semantics needs to be evaluated. The Feasibility Estimator (FE) was developed as a means to estimate future resource usage. The GR and GP incorporate these estimations into their decisions and thereby only choose manipulations that presumably resolve existing policy violations without leading to further ones. However, this approach of manipulating VMs based on estimations in order to dynamically adapt the resource usage of SecApps has two drawbacks. First, the control of resource usage is relatively coarse-grained. Second, centralized and remote decision making is inherently prone to (hardware, software) errors, and the decisions themselves may be inappropriate because they are based on estimations. In order to complement the global, coarse-grained decision making of the Global Scheduler, the component Local Controller has been developed. The Local Controller runs decentralized on each node and hosts two functional units, the Resource Usage Adaptor (RUA) and the Local Trigger Unit. The RUA continuously modifies the share of resources like CPU or NetOut usable by a locally running VM. The adaption is realized by a closed-loop integral controller per single resource. The RUA provides fine-grained control over the resource usage of SecApps and therefore complements the coarse-grained resource allocation of the GR.
The Local Trigger Unit is an actor of last resort and realizes the policy semantics by forcefully stopping and aborting VMs whenever policy compliance cannot be maintained by the GR and RUA. The Local Trigger Unit therefore counteracts the second issue mentioned for the Global Scheduler: Erroneous global VM manipulation decisions will not lead to policy invalidations because policy-violating VMs are locally stopped from using resources in time.

The proposed approach realizes the semantics of a set of defined policies. The distinction between different functional units allows for a partial parallelization of the conceptual approach: The units of the Local Controller (RUA and Local Trigger Unit) act independently of and concurrently to the functional units of the Global Scheduler (GP and GR). The GR is parallelized such that it can run multiple concurrent reconfiguration cycles that take care of different resource usage problems (policy violations) occurring in a staged manner. For the coordination between multiple reconfiguration cycles of the GR and provisioning cycles of the GP, a specific locking scheme was developed to enable transactionality while still providing partial parallelization. Potential self-sustaining behavior of the framework is counteracted by excluding VMs and nodes that have taken part in VM manipulations from future manipulations for a reconfigurable time frame of one minute.

5. Realization

Based on the ideas detailed in the last chapter, the proposed approach was implemented in software. The developed software is referred to as "VM-Scheduling Framework" or "framework" throughout this chapter. Purpose and functional aspects of the framework have already been discussed, so the first section of this chapter briefly portrays the software environment and used tools. The framework design is discussed in the second section. The third section introduces use cases and actors. In the fourth section several details concerning the environment-specific implementation are described.

5.1. Environment, Products and Tools

The development of the framework as well as the empirical evaluation was done on x86-64 servers running a Linux derivative, Gentoo Linux. Platform-independent technologies were chosen for the sake of generality and portability. Where a platform-specific implementation could not be avoided, it was ensured that these components are encapsulated and therefore easily replaceable if the framework is to be run in an x86-64 cluster equipped with a non-Unix-compatible Operating System (OS) like MS Windows. This concerns platform-specific components a) for the retrieval/read-out of resource usage statistics and b) for the local adaption of resource usage caused by Virtual Machines.

Development The VM-Scheduling Framework was developed following the "rapid prototyping" software development pattern.1 The development was done using command-line editors Vim and GNU Emacs. A software versioning system, Apache Subversion, was utilized.

Programming The framework is implemented in the Python programming language. Python is an interpreted language with interpreters available for the MS Windows, Mac and Unix OS families. This ensures the aforementioned portability for the developed software. The adopted programming paradigm is object-oriented programming. Inter-process communication between framework components was implemented using Python socket libraries. Both synchronous and asynchronous communication paradigms were employed. Parallelism in the framework was realized using Python threading libraries. The implementation of platform-specific components responsible for resource usage read-out and adaption uses Linux signals, procfs and tools like pv and cpulimit. These components are covered separately in Section 5.4.

Virtual Machines As virtualization platform, Oracle VirtualBox 3.14 ("VBox") was used for the development and evaluation of the framework. The product was configured to run in the so-called "headless" mode, which means that a Virtual Machine (VM) runs decoupled from a user's session in the background and no Graphical User Interface (GUI) is needed to manipulate the state of VMs. VBox offers multiple interfaces to communicate with the hypervisor and single VMs, of which the Command-line Interface (CLI) "VBoxManage" and the Python Application Programming Interface (API) are used in

1The rapid prototyping pattern is a software development approach that is based on minimal initial planning and incremental refinement of specification and implementation.

the framework implementation. These interfaces are instrumented in a wrapper class that encapsulates and abstracts the access to arbitrary VMs. In order to run the framework using a different virtualization platform, the current wrapper class needs to be exchanged for a wrapper class that instruments the specific interfaces of the other product. A running VBox VM is represented by a single OS process ("VBoxHeadless"). This offers the additional possibility to communicate with a specific VM using Linux signals, e.g. a VM can be hard-stopped by sending the respective process a kill signal. Sending a kill signal will also abort ongoing manipulations (e.g. a suspend).2 This communication type is also encapsulated inside the wrapper class. Virtual disks used by a VBox VM are represented by Virtual Disk Image (VDI) files. VM-specific meta-information like the virtual hardware specification is contained in an eXtensible Markup Language (XML)-formatted settings file. Both VM-specific files, the virtual disk file and the settings file, are hosted on a shared storage server via the Network File System (NFS). Any physical node equipped with a VirtualBox installation mounts the remote filesystem. Thereby the local hypervisor acquires access to these files and can run and manipulate VBox VMs.
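A wrapper of the kind described here might be sketched as follows. The class and method names are invented for illustration; the VBoxManage subcommands shown (startvm, controlvm savestate/poweroff) follow VirtualBox's documented CLI but should be checked against the deployed version, and the signal path implements the SIGTERM/SIGKILL escalation described above:

```python
import os
import signal
import time

class VBoxWrapper:
    """Sketch of a wrapper class: builds VBoxManage invocations for the
    manipulation primitives (suitable for subprocess.run) and realizes
    the kill path via Linux signals.  Not the thesis implementation."""

    def start_cmd(self, vm):
        return ["VBoxManage", "startvm", vm, "--type", "headless"]

    def suspend_cmd(self, vm):
        # savestate writes the VM state to disk (suspend primitive)
        return ["VBoxManage", "controlvm", vm, "savestate"]

    def stop_cmd(self, vm):
        return ["VBoxManage", "controlvm", vm, "poweroff"]

    def resume_cmd(self, vm):
        # a saved VM is resumed by starting it again
        return self.start_cmd(vm)

    def kill(self, pid, grace_s=1.0):
        """SIGTERM the VBoxHeadless process; escalate to SIGKILL if it
        has not terminated within grace_s seconds."""
        os.kill(pid, signal.SIGTERM)
        deadline = time.monotonic() + grace_s
        while time.monotonic() < deadline:
            try:
                os.kill(pid, 0)  # probe: is the process still alive?
            except ProcessLookupError:
                return
            time.sleep(0.05)
        os.kill(pid, signal.SIGKILL)
```

Exchanging the virtualization platform would then only require swapping this class for one that builds the equivalent commands of the other product's CLI or API.
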

5.2. Framework Design

The functional units of the framework (Resource Usage Adaptor (RUA), Global Provisioner (GP), Global Reconfigurator (GR)), their purpose and semantics have been discussed in the previous chapter. Also the two main architectural components which host these functional units, the Global Scheduler and the Local Controller, were tentatively introduced. Regarding software design of the framework, the implementation-level description is now stated more precisely. The two architectural components Global Scheduler and Local Controller are software packages. These packages are organizational units that group the framework from an architectural point of view. The basic design of the framework is given in Figure 5.1. It shows the two software packages and their communication paths. The packages are now discussed separately.

Global Scheduler Package This package is installed on a management node3 and consists of specific classes and components. For every functional unit involved in making decisions about changes of the allocation pattern (GP, GR), there is a component with the same name that is part of the Global Scheduler package. These components implement the functionality of the respective functional units. Additionally, the package contains classes used for implementing the cluster model as well as interface classes. The Cluster Model in Figure 5.1 is a set of classes that represent not only the cluster model as defined in Section 4.3.5, but the whole database that all decision instances operate on, i.e. the cluster model, actual and potential future incarnations, the current allocation pattern, the queue, non-provisioned leases and non-assigned VMs. The interface communication classes ("User IF", "Cluster Model IF" and "Manipulation IF") encapsulate communication logic in order to provide interfaces (in the generic sense of the term) to actors and components outside the scope of the Global Scheduler package. These communication partners are users of the framework (see Section 5.3) and Local Controller instances.

2Sending signals is not the specified way of communicating with VBox VMs. It is however a method to instantly relieve a node from resource load generated by a VBox VM. 3A management node is a physical host that is dedicated to hosting administration and management applications only, rather than performing, e.g., scientific computations.


Figure 5.1.: Abstracted package diagram of the VM-Scheduling Framework with the main architec- tural components and their interaction.

Local Controller Package This package is installed on a worker node.4 Since a cluster consists of more than one worker node, multiple installations of the Local Controller package commonly exist.

4A worker node is a physical host eligible to run a Secondary Application (SecApp), so basically every node with a VBox installation.


These instances are referred to as agents.5 The Local Controller package contains the like-named components implementing the functional units RUA and Local Trigger Unit. It also contains interface classes that encapsulate the communication with the local environment: First, there is the Sensor class, responsible for acquiring utilization statistics for every relevant local resource; second, the Actuator class, which realizes the adaption of resource usage commanded by the RUA; and third, the vWrapper class, which bundles the logic for communicating with local VMs and the hypervisor. All packages, components and classes mentioned so far are implemented in Python. Additionally, they are fully platform- and environment-independent except for the interface classes of the Local Controller package. These interface classes are tailored towards the specific virtualization platform (vWrapper) and operating system (Sensor, Actuator) and therefore make use of existing, platform-specific APIs and CLIs.
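The interplay of Sensor, RUA and Actuator forms a closed control loop. The integral controller below is a simplified stand-in for the per-resource controller mentioned in Section 4.4.7: the gain, the limits and the percentage units are assumptions of this sketch, not values from the implementation:

```python
class IntegralCapController:
    """One integral controller per resource (e.g. CPU): if the measured
    node usage exceeds the policy threshold, the cap granted to the local
    VM is integrally reduced; below the threshold it is slowly restored.
    Illustrative sketch only."""

    def __init__(self, threshold_pct, gain=0.5, cap_min=0.0, cap_max=100.0):
        self.threshold_pct = threshold_pct
        self.gain = gain
        self.cap_min, self.cap_max = cap_min, cap_max
        self.cap_pct = cap_max  # start with the resource fully available

    def update(self, measured_pct):
        """One control step: Sensor reading in, new cap out."""
        error = self.threshold_pct - measured_pct  # negative when over threshold
        self.cap_pct = max(self.cap_min,
                           min(self.cap_max, self.cap_pct + self.gain * error))
        return self.cap_pct  # the Actuator would enforce this, e.g. via cpulimit

ctrl = IntegralCapController(threshold_pct=80.0)
caps = [ctrl.update(90.0) for _ in range(3)]  # sustained overload by 10 points
assert caps == [95.0, 90.0, 85.0]             # cap is ratcheted down stepwise
```

The integral action gives the fine-grained, gradual adjustment attributed to the RUA, in contrast to the coarse on/off semantics of global VM manipulations.
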

Persistent Storage Both packages rely on an existing local installation of SQLite Database Man- agement System (DBMS) for persistent data storage. The Global Scheduler package uses a database for storing cluster item and resource specifications as well as all configured policies. The Local Con- troller package uses a database for storing the set of specified policies associated with the local node.

Package Communication The Global Scheduler communicates with all Local Controller agents. Implementation-wise this is realized using Python sockets. The communication is steered by two interface communication classes: First, there is the Manipulation IF class. This class takes control of every VM manipulation which is commanded by the Global Scheduler and initiated by the GR or GP. The Manipulation IF class informs the Local Controller agents about VM manipulations to be executed. It thereby keeps track of these manipulations and captures their execution status, i.e. whether a manipulation successfully finished or failed. A locally triggered abortion of a manipulation is thereby recognized by the Manipulation IF class as a failed manipulation. A locally triggered killing of a VM is also announced by the Local Controller agent to the Manipulation IF class. So the Manipulation IF class is the single point of control that knows about actually performed changes to the allocation pattern. This knowledge needs to be incorporated into the Cluster Model. The Cluster Model classes therefore have a unified interface class that allows to apply changes to them. This Cluster Model IF class is used by the Manipulation IF class to update the current representation of the allocation pattern. Additionally, the Cluster Model IF class also takes responsibility for updating the resource utilization statistics. Depending on the type of cluster item, this information is either retrieved by publish/subscribe from the Sensor classes of the Local Controller agents or actively pulled from passive Simple Network Management Protocol (SNMP) devices like switches. According to this description, any Local Controller interacts with the Global Scheduler. However, no separate communication classes were implemented for this communication on the side of the Local Controller package.
That means that the Sensor, Local Trigger Unit and vWrapper classes directly communicate with the corresponding interface communication classes on the side of the Global Scheduler package.
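The bookkeeping role of the Manipulation IF class can be approximated as follows. This is a sketch with invented method and status names; it only illustrates the single-point-of-control idea, i.e. that the cluster model is updated solely for manipulations that actually succeeded:

```python
class ManipulationIF:
    """Tracks commanded VM manipulations and their execution status
    (illustrative sketch, not the framework's actual class)."""

    def __init__(self, cluster_model_update):
        self._status = {}                       # manipulation id -> (status, vm, action)
        self._update_model = cluster_model_update  # stand-in for the Cluster Model IF

    def command(self, manip_id, vm, action):
        """Announce a manipulation (migrate/suspend/...) to an agent."""
        self._status[manip_id] = ("pending", vm, action)

    def report(self, manip_id, outcome):
        """Called back by a Local Controller agent; a locally triggered
        abortion arrives here as outcome='failed'."""
        _, vm, action = self._status[manip_id]
        self._status[manip_id] = (outcome, vm, action)
        if outcome == "finished":
            self._update_model(vm, action)      # only real changes reach the model

    def pending(self):
        return [m for m, (s, _, _) in self._status.items() if s == "pending"]

applied = []
mif = ManipulationIF(lambda vm, action: applied.append((vm, action)))
mif.command(1, "vm3", "migrate")
mif.command(2, "vm4", "suspend")
mif.report(1, "finished")
mif.report(2, "failed")        # e.g. aborted by the Local Trigger Unit
assert applied == [("vm3", "migrate")]
assert mif.pending() == []
```

The failed suspend leaves the model untouched, so the allocation pattern held by the Global Scheduler never drifts ahead of what actually happened on the nodes.
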

5.3. Actors and Use Cases

For the actual software implementation, two main actors were considered: the lease owner ("owner") and the administrator. The owner is an abstracted person (role) that is interested in running its proprietary application (SecApp) in a dedicated cluster. The administrator is an abstracted person (role)

5In contrast an instance of the Global Scheduler is called master or scheduler.

that both takes care of the business goal associated with the service (providing a cluster environment to owners for running their SecApps) and the goal of maintaining the functionality of the Main Application (MainApp), i.e. to avoid strong interference. The use cases associated with these actors are shown in Figure 5.2.

Figure 5.2.: Use cases for the actors owner and administrator.

The owner controls the state of its proprietary SecApp via lease manipulations, i.e. by submitting, pausing, unpausing and terminating a lease. These actions correspond to the respective state transitions in a lease's life-cycle. The administrator controls the configuration of the framework and thereby defines the trade-off between providing resources for running SecApps and the interference generated towards the MainApp. A main task of the administrator therefore is to define to which extent resources can be used by SecApps. This is done by configuring a target environment (cluster items, nodes, resources) and a set of applied policies. The framework provides a single point of control such that the "Start Framework", "Stop Framework" and all owner use cases are initiated by using the User IF classes of the Global Scheduler on the management node. Specific tasks of the "Configure Framework" use case require less convenient, manual interaction, like installing the VirtualBox platform on worker nodes.

Owner Interfaces For the owner role, four lease use cases (Submit | Pause | Unpause | Terminate) exist. Each of these possibilities to interact with the framework is provided as a dedicated CLI. The Python scripts implementing these use cases can be found in Table 5.1, along with a short description of their purpose.

99 5. Realization

Script Name   Purpose
vsub.py       Submit a lease
vsusp.py      Pause a lease
vres.py       Unpause a lease
vdel.py       Terminate a lease
vinfo.py      Query information

Table 5.1.: Command-line interfaces for lease owners.

The precise syntax for using a script can be queried via the command-line parameter "--help". As an example of instrumenting the interfaces, the submission of a lease (use case "Submit lease") is done as follows:

[tism001] ~/vsched $ ./vinfo.py --app
distcc 10
oge 15
[tism001] ~/vsched $ ./vsub.py --app distcc --size 5 --prio 10
Your lease has been created with ID: 1

In this example, the framework is first queried for existing appliances (preconfigured VMs). The scheduler returns that 10 VMs hosting a distcc6 SecApp and 15 VMs hosting an OGE [81] job scheduling client are governed by the VM-Scheduling Framework.7 Using this information, a lease request for five VMs running distcc is submitted.8 The request is accepted, so a lease is created and information on it is returned. Using the returned lease ID, the owner can reference the lease for later manipulation or for querying status information.

Administrator Interfaces According to the presented use cases, additional options to interact with the framework exist for the administrator. A precondition to starting and stopping the framework (use cases "Start Framework", "Stop Framework") is the proper configuration of the environment and the framework. This encompasses defining and preparing the worker nodes in the scope of the framework (use case "Configure Cluster"), the configuration of policies (use case "Define Policies") and the initialization of the Feasibility Estimator (FE) (use case "Configure FE").

For the use case "Configure Cluster", the installation of the software packages and the SQLite DBMS on management and worker nodes as well as the installation of the virtualization platform VirtualBox on worker nodes is required. For the use case "Configure Framework", information on the worker nodes needs to be made available to the Global Scheduler and policies need to be defined. A simple approach has been chosen in the current implementation: All information is directly manipulated in the database on the management node using an SQLite client. Table "Resource" contains all cluster items (e.g. nodes), their resources (e.g. Central Processing Unit (CPU)), static information on each resource (e.g. total CPU cores available on a node) and information on how to access the resource utilization statistics, e.g. whether the utilization is delivered by a Local Controller sensor or is

6Distcc is a distributed compiler for the C programming language. 7The creation of appliances is not in the scope of this thesis. Appliance building and registration with the framework need to be done in advance. 8In this prototypical implementation the lease owner is allowed to freely choose a priority ∈ [1, 10].

queried via the SNMP protocol. The use case "Define Policies" is used for adapting the framework to the MainApp and is realized accordingly: Table "Policy" contains the configured policies, the resource each is defined on, the threshold and the timelimit. A table "PolAssociation" contains the association between policies from the "Policy" table and nodes specified in the "Resource" table. For instance, P_CPU = (80, 10) × (Node4, Node5) is a policy that specifies usable CPU resources on Node4 and Node5. Accordingly, it is defined as a join over the tables "Resource", "Policy" and "PolAssociation". The use case "Configure FE" needs to be executed once for every new cluster environment in order to initialize the FE. At first it requires finding appropriate estimation parameters for VM manipulations. In Section A.5 a proposal for an according procedure is made. These parameters then have to be inserted into the feasibilityEstimator.py file according to the annotations given in that file. After having configured the framework and environment, the framework can be started (use case "Start Framework"). Via "./schedMaster.py" the Global Scheduler is started on the management node. The configuration in the database is read and the cluster model is set up using this data. Also, Local Controller agents are started automatically on all worker nodes specified in the "Resource" table. All relevant policies for a specific node are propagated to the Local Controller agent's database. Now the framework can be used by lease owners via the use cases mentioned above.
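The described table layout can be sketched with SQLite. The column names below are illustrative assumptions (the text only names the tables and their roles); the join reproduces the example policy P_CPU = (80, 10) applied to Node4 and Node5:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- column names are assumptions for illustration, not the thesis schema
    CREATE TABLE Resource (id INTEGER PRIMARY KEY, node TEXT, kind TEXT);
    CREATE TABLE Policy   (id INTEGER PRIMARY KEY, kind TEXT,
                           threshold REAL, timelimit REAL);
    CREATE TABLE PolAssociation (policy_id INTEGER, resource_id INTEGER);
""")
conn.executemany("INSERT INTO Resource VALUES (?, ?, ?)",
                 [(1, "Node4", "CPU"), (2, "Node5", "CPU")])
conn.execute("INSERT INTO Policy VALUES (1, 'CPU', 80, 10)")  # P_CPU = (80, 10)
conn.executemany("INSERT INTO PolAssociation VALUES (?, ?)",
                 [(1, 1), (1, 2)])  # apply P_CPU to Node4 and Node5

# The policy as applied to concrete nodes is a join over all three tables:
rows = conn.execute("""
    SELECT r.node, p.threshold, p.timelimit
    FROM Policy p
    JOIN PolAssociation a ON a.policy_id = p.id
    JOIN Resource r       ON r.id = a.resource_id
    ORDER BY r.node
""").fetchall()
assert rows == [("Node4", 80.0, 10.0), ("Node5", 80.0, 10.0)]
```

Editing such rows with an SQLite client and re-reading them is exactly the configuration workflow described for the "Define Policies" and "Modify Policies" use cases.
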

For the administrator role, the framework offers the possibility to modify policies at runtime (use case "Modify Policies"). This is done by modifying entries in the "Policy" or "PolAssociation" tables manually via an SQLite client (e.g. the threshold of a policy on AllocVM can be changed for multiple nodes). This change needs to be announced to the framework, which is done by executing a dedicated Python script "vpol.py --update". This in turn causes the Global Scheduler to re-read the configuration, to modify the internal cluster model and to propagate relevant changes to the concerned Local Controller agents. An immediate model-check is performed by the GR to recognize potentially introduced policy violations.9 The stopping of the framework (use case "Stop Framework") is realized by issuing "./schedMaster.py stop".

5.4. Platform-Specific Implementation

In the Local Controller package the three classes interfacing the local environment and hypervisor (Sensor, Actuator, vWrapper) use platform- and environment-specific methods. Their functionality is now briefly explained for the prototypical implementation.

Primitive Realization In this thesis the primitives for manipulating VMs are {start, stop, suspend, resume, migrate}. These primitives are implemented in the vWrapper class. Table 5.2 shows these VM manipulations and the methods that are used for executing a manipulation. For migrating a VM, the command-line interface is used due to licensing restrictions for Oracle VirtualBox. The killing of VMs caused by the Local Trigger Unit is implemented using Linux signals, more precisely SIGTERM and SIGKILL. Upon local trigger firing, a SIGTERM signal is sent and the VBox processes are expected to terminate. If this does not succeed within a second because the hypervisor blocks the termination (e.g. for transactional reasons), a SIGKILL is sent. This forcefully terminates the VBox processes, i.e. running VMs are immediately stopped and ongoing manipulations are aborted. Such an approach to get rid of VMs, i.e. to end the resource usage caused by them within the timelimit, is certainly debatable. It was, however, the only method found for implementing the local triggering

9The administrator may for instance change the number of allocatable VMs for several nodes to zero. In this case an immediate reconfiguration cycle will be started if these nodes are running VMs.


Manipulation | Interface     | Interface Provider
start        | Python API    | Hypervisor
stop         | Python API    | Hypervisor
suspend      | Python API    | Hypervisor
resume       | Python API    | Hypervisor
migrate      | CLI           | Hypervisor
kill         | Linux signals | OS

Table 5.2.: Interfaces used in vWrapper class for VM manipulations.

for VBox in a reliable fashion.10 So, this method does yield the desired effect of enforcing policy compliance, but on the other hand it raises the question of its impact on the consistency of VMs and hosted SecApps. The main problem encountered in the empirical evaluation was that file system inconsistencies on the virtual disk of VMs occurred occasionally (in about 3% of all local trigger firings). After restarting a previously killed VM, these inconsistencies prevented the VM from booting correctly or the SecApp from running without errors. Since no reliable method was found to safely abort VBox VMs and ongoing manipulations within a guaranteed short time frame, two options were left: Either the policy semantics are weakened or these inconsistencies are recovered. In the evaluation phase for the prototypical implementation, recovery of VMs was chosen: After a local trigger fires for a VM, the lease is queued, all assigned VMs are stopped and the assignment relation between the lease and its VMs is dissolved. So far this is the functionality as described in Section 4.3.4. However, in order to cope with the occasional inconsistencies, this functionality was extended such that a) after the initial creation of a VM and b) after every stopping of a VM in the VM's life-cycle, a copy of the virtual disk file is created. Every time the local trigger fires for a VM, it becomes assignable to leases again only after a remedy procedure has been executed. The procedure checks whether the VM is able to boot and run the SecApp. If not, the latest copy of the virtual disk file is used as the new virtual disk file. This resolves the inconsistency issues at the expense of keeping a killed VM from being assigned to a lease again for about a minute (instead of instantly making it assignable again). Further improvements, a) using runtime disk snapshots instead of offline copies of the virtual disk and b) distributed checkpointing for leases to avoid semantic problems, i.e. problems with application state, were considered but not implemented due to time constraints. The empirical evaluation of the framework discussed in the next chapter showed the chosen remedy procedure to be sufficient for the tested environment and SecApps.
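The two-stage kill described above (SIGTERM first, SIGKILL after the one-second grace period) can be sketched as follows. This is a minimal illustration, not the thesis' vWrapper code; the function name and the polling scheme are assumptions:

```python
# Hedged sketch of the Local Trigger Unit's kill escalation: send SIGTERM,
# give the VBox process one second to terminate, then force it with SIGKILL.
import os
import signal
import time

def kill_vm_process(pid, grace=1.0, poll=0.05):
    """Terminate a VM process, escalating from SIGTERM to SIGKILL."""
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "already-gone"
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)          # probe: raises if the process has exited
        except ProcessLookupError:
            return "terminated"      # SIGTERM sufficed
        time.sleep(poll)
    # Hypervisor blocked the termination (e.g. for transactional reasons):
    os.kill(pid, signal.SIGKILL)
    return "killed"
```

As discussed above, the SIGKILL path is what risks the virtual-disk inconsistencies that the remedy procedure then has to repair.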

Resource Usage Metrics Acquisition As stressed in the conceptual work (see Section 4.3.1), information on resource usage is the key to deciding on changes of the allocation pattern. For some resources defined in Table 4.1, such as VmAlloc and SwAlloc, this information is contained in the cluster model and exclusively modified by the Manipulation IF Class. For these resources, no read-out functionality for the retrieval of resource utilization data is needed. In contrast, for resources like CPU and NetOut, the information on resource usage needs to be retrieved from the resource at runtime. This retrieval is done using the Sensor class of the Local Controller package. The Sensor acquires the current utilization using the OS-provided interfaces with the period given in Table 5.3.

10 Using the VBox-provided abort command initially promised to be a feasible solution, but it blocked in several cases.


Resource | OS Interface    | Period (s)
CPU      | /proc/stat      | 0.5
Mem      | /proc/meminfo   | 5
NetIn    | /proc/net/dev   | 1
NetOut   | /proc/net/dev   | 1
DiskIn   | /proc/diskstats | 1
DiskOut  | /proc/diskstats | 1

Table 5.3.: Interfaces used in Sensor class for resource usage data retrieval.

The "/proc filesystem" ("procfs") offers low-level, fine-grained access to resource utilization statistics. The current Sensor implementation reads and processes the files in procfs via the I/O interfaces provided by the Python standard library. Processing is needed for converting the cumulative statistics in procfs to current usage numbers averaged over the past measurement interval. The data obtained by the Sensor class is of interest to the Global Scheduler, the RUA and the Local Trigger Unit. The components of the Local Controller package, i.e. the Local Trigger Unit and the RUA, query the Sensor class periodically to retrieve the metrics relevant for the active policies. The data for the Global Scheduler is actively propagated via publish/subscribe to the Cluster Model IF.
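The conversion from cumulative procfs counters to interval-averaged usage numbers can be sketched for the CPU case. The parsing below assumes the standard /proc/stat field layout (user, nice, system, idle, iowait, ...) and is an illustration, not the thesis' Sensor code:

```python
# Sketch of the cumulative-to-rate conversion described above, for the
# aggregate "cpu" line of /proc/stat.
def parse_proc_stat(text):
    """Return (busy, total) jiffies from the first ('cpu') line."""
    fields = [int(x) for x in text.splitlines()[0].split()[1:]]
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
    total = sum(fields)
    return total - idle, total

def cpu_utilization(sample_a, sample_b):
    """Fraction of CPU time used between two /proc/stat samples."""
    busy_a, total_a = parse_proc_stat(sample_a)
    busy_b, total_b = parse_proc_stat(sample_b)
    dt = total_b - total_a
    return (busy_b - busy_a) / dt if dt else 0.0
```

Sampling `open('/proc/stat').read()` twice, 0.5 s apart, and passing both samples to `cpu_utilization` yields the usage averaged over the past measurement interval, as described above.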

The retrieval, processing and propagation of resource statistics by the Sensor class induces some overhead on the CPU usage of the local node. In the evaluation, up to 0.5% of CPU was consumed by these tasks. The propagation of statistics to the Cluster Model IF of the Global Scheduler caused network traffic of less than 0.5 kByte per Local Controller agent per second. This is acceptable for mid-sized clusters. However, an alternative read-out implementation using C and the Linux System Call Interface (SCI) as well as a revision of the communication paradigm between Local Controller agent and Global Scheduler should be considered in a future implementation.

Resource Usage Adaption The current implementation considers the adaption (capping) of resource shares used by local VMs for the resources CPU, NetIn and NetOut. The general approach of using an integral controller to realize this adaption has been introduced in the previous chapter in Section 4.4.4.1. For the resource CPU, the values for Kp and SP, i.e. the gain and the sampling/actuating period of the controller, were chosen to be Kp = 1, SP = 0.5s. The Actuator class is commanded by the RUA to cap the CPU usage share for each running VM according to the formula stated in Section 4.4.4.1. The method used by the Actuator class for the CPU resource is the alternating sending of the signals SIGSTOP and SIGCONT to a VM process. The variable periods for sending these signals determine the maximum CPU utilization of a (VM) process. The Linux tool "cpulimit", which provides this functionality, was used in the current implementation. Using this method, a maximum deviation from the target CPU usage of 0.5% was achieved.
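The cpulimit-style capping mechanism can be sketched as a duty cycle: within each actuating period the VM process is allowed to run for a fraction of the period equal to the cap and is stopped for the rest. This is an illustrative sketch under that assumption, not the thesis' Actuator code:

```python
# Duty-cycle sketch of SIGSTOP/SIGCONT capping as described above.
import os
import signal
import time

def duty_cycle(cap, period):
    """Split an actuating period into (run, stop) durations for a cap in [0, 1]."""
    cap = max(0.0, min(1.0, cap))
    return cap * period, (1.0 - cap) * period

def throttle(pid, cap, period=0.1, cycles=10):
    """Cap a process' CPU share by alternating SIGCONT and SIGSTOP."""
    run, stop = duty_cycle(cap, period)
    for _ in range(cycles):
        os.kill(pid, signal.SIGCONT)
        time.sleep(run)
        if stop > 0:
            os.kill(pid, signal.SIGSTOP)
            time.sleep(stop)
    os.kill(pid, signal.SIGCONT)   # leave the process running afterwards
```

A cap of 0.25 with a 0.1 s period, for instance, lets the process run for 25 ms and stops it for 75 ms of every period, bounding its average CPU share at roughly 25%.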

For the resources NetIn and NetOut, the values for Kp and SP, i.e. the gain and the sampling/actuating period of the controller, were chosen to be Kp = 1, SP = 1s. The Actuator class is commanded to set the maximum resource usage share for each running VM according to the aforementioned formula. While it was possible to dynamically cap the incoming and outgoing network traffic for single VMs using the virtualization platform Kernel-based Virtual Machine (KVM) [62] with the Linux tools "tc" and "iptables", this attempt failed for Oracle VirtualBox. Therefore, a rather peculiar method was chosen for capping the network traffic in the Actuator class. The dynamic limit on network throughput was

achieved by adapting the CPU usage share of the VM process. The Actuator class employs a control algorithm that dynamically adapts the CPU usage of a VM such that, as a side-effect, the lower of the two target settings for NetIn and NetOut acts as the maximum cap for both resources. Due to this unconventional approach, the actual CPU usage target applied by the Actuator was the minimum of three values: the target CPU usage, the CPU usage calculated to cap NetIn utilization, and the CPU usage calculated to cap NetOut utilization. Clearly, this approach should not be used in a production environment, because it also slows down VMs (i.e. decreases their CPU usage) if only one of the resources CPU, NetIn or NetOut is heavily utilized. This leads to less efficient SecApp performance.

However, it is possible to cap the utilization of any of the presented resources with the proposed method. The implementation of a kernel hook for dynamic VBox traffic shaping is a task left for future work. Alternatively, the use of a different virtualization platform like KVM should be considered. For the resources DiskIn and DiskOut, which were not used in the evaluation of the implementation, the tool "ionice" can be used by the Actuator class for adapting the resource usage at runtime.
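The adaption logic discussed in this section can be summarized in a short sketch: a generic integral-controller update (the exact formula is given in Section 4.4.4.1; the standard integral form below is an assumption) and the min-of-three CPU target used as the VirtualBox network-capping workaround:

```python
# Hedged sketch of the resource usage adaption described above; simplified,
# not the thesis' Actuator/RUA code.

def integral_cap_update(cap, setpoint, measured, K=1.0):
    """One integral-controller step with gain K: every sampling period SP the
    cap is corrected by the deviation of the measured usage from the setpoint,
    clamped to the valid share range [0, 1]."""
    cap += K * (setpoint - measured)
    return max(0.0, min(1.0, cap))

def applied_cpu_cap(cpu_target, cpu_for_netin, cpu_for_netout):
    """VirtualBox workaround: the CPU cap actually applied is the minimum of
    the CPU target and the CPU levels calculated to cap NetIn/NetOut."""
    return min(cpu_target, cpu_for_netin, cpu_for_netout)
```

The clamping reflects that a usage share can never fall below 0% or exceed 100%; the `min` is exactly why a heavily utilized NetIn or NetOut also slows the VM's CPU, as criticized above.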

6. Experimental Results

In this chapter the experimental results obtained for the thesis are described. It is evaluated how these results relate to the thesis goals of providing an adaptable framework for exploiting unused resources without strongly interfering with a Main Application (MainApp) while still improving the result output of Secondary Applications (SecApps). The chapter is structured as follows: The testing environment where the evaluation took place is introduced first. Afterwards, an overview of the presented experiments is given. Then each experiment and its findings are discussed step by step. Finally, the findings are summarized.

6.1. Test Environment

The experiments were conducted in a commodity-hardware-based cluster located at the Kirchhoff Institute for Physics (KIP) in Heidelberg, Germany. This cluster consists of 24 worker nodes, several infrastructure nodes and Gigabit Ethernet interconnects. Specifically, the cluster environment comprises:

• 2 Infrastructure Servers: 4 Intel CPUs x 2.2 GHz x 4 cores, 16 GB RAM, Gentoo Linux, Naming & Directory Services
• 1 File Server: 4 Intel CPUs x 2.2 GHz x 4 cores, 16 GB RAM, Gentoo Linux, NFS
• 24 Worker Nodes: 1 AMD CPU x 1.8 GHz x 2 cores, 4 GB RAM, Gentoo Linux
• 1 Cisco Switch: 100 MBit, 80 ports, Management/BMC Network
• 3 Netgear Switches: 1000 MBit, 40 ports, Full Bidirectional Bandwidth

The network topology can be seen in Figure 6.1a. There is a switch hierarchy: The top-level switch connects the storage and infrastructure nodes and the low-level switches. Each low-level switch connects 12 worker nodes. Besides this Gigabit Ethernet based network, there is also a 100 MBit management network for the Board Management Controllers (BMC) on each node. A photograph of the cluster environment is given in Figure 6.1b. The virtualization platform Oracle VirtualBox 3.14 is installed on all worker nodes. VM virtual disk files are hosted on a file server using NFS and each worker node mounts the respective remote directory.

6.2. Experiment Overview

In this section an overview of the experiments and their contribution is given. As a reminder, the purpose of the framework is a) to prevent strong interference with a non-controllable MainApp and b) to improve efficiency metrics like generated result output and increased resource utilization for the cluster. In the conceptual work of this thesis it was proposed to use policies as configuration items that offer the possibility to find a trade-off between these two potentially conflicting goals. In


(a) Network topology and nodes (b) Photograph, taken in November 2011

Figure 6.1.: Test environment KIP cluster in Heidelberg, Germany.

the following, experimental results are detailed that exemplify the behavior of the VM-Scheduling Framework and show the effect of policy configurations towards these goals for multiple scenarios. These experiments culminate in a final empirical test with the HLT-Chain application, by which the achievement of the thesis goals is shown.

Policy Modification Experiment In this experiment the reaction of the VM-Scheduler towards a) a single policy modification and b) a sequence of policy modifications is evaluated. This is done by using static policies1 that operate on the UsedCore resource, thereby specifying the maximum number of CPU cores usable by VMs on a node. By modifying these policies at runtime, policy violations are forcefully introduced which can solely be resolved by the Global Scheduler, specifically by the Global Reconfigurator (GR). By recording the type and number of manipulations chosen by the GR for a single modification, the quality of the choices made by the GR can be assessed. For a sequence of policy modifications, the cooperation of the GR, which resolves introduced violations, and the Global Provisioner (GP), which tries to provision leases onto free resources, can be assessed. Again, this is done by recording the type and number of chosen manipulations as well as the results generated by a virtualized SecApp.

Efficiency Experiment The above-mentioned "Policy Modification Experiment" discusses how the GR and GP cope with forcefully changed resource availability for SecApps, introduced by explicitly modified policies on the UsedCore resource. However, there are resources, e.g. Central Processing Unit (CPU), whose usage is not fully under the control of the VM-Scheduler. Their utilization not only depends on the resource usage generated by SecApps, but also on that of the MainApp. In this second

1As a reminder, static policies are those policies that operate on resources whose usage depends only on the VM-Scheduling behavior. Examples are the UsedCore and AllocVM resources, whose usage only changes upon framework-initiated manipulations. All other policies are called dynamic policies.

experiment, dynamic policies operating on the CPU resource are employed. By applying different host load patterns which mimic different MainApps and modifying the policy parameters timelimit and threshold, the effect of these parameters on the efficiency of exploiting unused cluster resources and generating additional SecApp results is evaluated. The goal of this experiment is to demonstrate that policies with their parameters timelimit and threshold are feasible configuration items for pursuing the thesis goal of improving the efficiency of cluster usage independent of a particular MainApp resource usage behavior.

Interference Experiment In the former experiment, the effects of policy parameters on cluster usage efficiency are examined for arbitrary MainApp load patterns. However, in order to assess the interference generated by SecApps, a MainApp with a performance metric is needed. In this "Interference Experiment" a MainApp with a performance metric is run which generates considerable CPU and network utilization. Two dynamic policies operating on the resources CPU and NetOut are applied. The performance metric for this MainApp is measured for different threshold and timelimit configurations. Based on the performance metric values, the amount of interference caused by additionally running SecApps is assessed. The goal of this experiment is to demonstrate that policies with their parameters timelimit and threshold are feasible configuration items for pursuing the thesis goal of controlling interference towards a MainApp.

HLT-Chain Experiment In this experiment the HLT-Chain MainApp is run together with the VM-Scheduler. This experiment demonstrates the appropriateness of the thesis approach by showing that there is a policy parameter configuration for which no strong interference occurs, the cluster utilization is improved and additional SecApp results are computed.

6.3. Policy Modification Experiment

The effect of runtime policy modifications, i.e. deliberately changed resource availability for SecApps, is examined in this experiment. A typical scenario is the decision of an administrator to restrict the usage of a (sub)cluster by SecApps at runtime. This can be done by changing the threshold parameter of static policies operating on the UsedCore2 resource, thereby forcefully introducing policy violations for nodes with a running Virtual Machine (VM). An objective of such a modification is that the affected, now policy-violating VMs are removed from the nodes within the timelimit of the respective policy and that these VMs are treated gently. Gentle treatment refers to the choice made by the GR of how to remove a violating VM from a node: While a migration keeps the VM alive and computing, a suspend pauses the computation of results, and stopping a VM even causes the loss of already computed but not yet saved results. The manipulations chosen by the GR are recorded and discussed.

In a second step of this experiment the runtime modification of policies is taken further: A modification is not only applied once, but repeatedly during the experiment's run. This continually changes the number of CPU cores usable by VMs and accordingly how many VMs are allowed to run on the nodes. This sub-experiment shows how the GP and GR cope with this changing resource availability: The GP takes care of provisioning leases if CPU cores become "marked" as usable, while the GR removes VMs from nodes if CPU cores are blocked for SecApps, i.e. when policies are violated. This sub-experiment therefore demonstrates the basic cooperation of GP and GR. In addition, it also represents an important scenario in the ALICE High Level Trigger (HLT) context: the runtime reconfiguration of the

2Alternatively, a policy on the AllocVM resource can also be used to control the number of VMs running in a cluster.


HLT-Chain MainApp. As described later in this chapter (see Section 6.6.1), this application consists of a multitude of processing components distributed over the cluster nodes. A reconfiguration of the HLT-Chain reallocates components from one node to another and spawns new or removes existing processing components. The modification of UsedCore policies is a possibility for the HLT-Chain to explicitly announce such a reconfiguration to the VM-Scheduler and to enforce a corresponding adaption of VM allocations. So, in this sub-experiment a pattern of policy modifications ("reconfiguration pattern") is applied and the manipulations executed by the VM-Scheduler are recorded. Additionally, the quality of the manipulation decisions is assessed. This is done by running a prototypical SecApp inside VMs and comparing the amount of actually computed results to the amount of computable results in the optimal case.3 The effect of the timelimit parameter of modified policies on SecApp result computation and the gentleness of GR decisions is evaluated.

6.3.1. Experiment Layout

The experiment is carried out in the environment described in Section 6.1, consisting of 24 worker nodes. Each cluster node is equipped with a single PUsedCore policy, limiting the number of CPU cores that can be used by VMs on this node. The initial policy configuration for every node is PUsedCore : (2, t), indicating that the whole cluster, i.e. all CPU cores, can be used for running VMs. Before the start of each sub-experiment, leases are submitted to the VM-Scheduler. Every lease is assigned a specific number of VMs. There are

• 12 leases with 1 VM,
• 6 leases with 2 VMs,
• 2 leases with 4 VMs,
• 2 leases with 8 VMs,

making 22 leases with 48 VMs in total. The number and size of these leases have been chosen this way to avoid overly trivial scheduling scenarios, like e.g. 48 leases with 1 VM or 1 lease with 48 VMs. The VMs are configured to use one CPU core and 512 MB RAM. Practically this means that any physical CPU core represents a slot for a single VM, and if all leases are optimally provisioned, all physical CPU cores are used by VMs. All leases have the same priority in this experiment. The leases and their VMs are equipped with RedHat Enterprise Linux (RHEL) 5.1 [91] and run a CPU-bound SecApp, which is a GCC [43] compiler benchmark running in a closed loop. This SecApp is set up such that nearly 100% of the available CPU resources are consumed. In order to measure the number of results computed by the SecApp, a single result is defined as the completed compilation of a specific amount of source code inside a VM. If a VM receives 100% of a CPU core4, a single result computation takes 2.5 min.

For the first sub-experiment, the policy configuration is changed at runtime from PUsedCore : (2, t) to PUsedCore : (0, t) for a number of nodes ∈ {6, 12, 24} and timelimits t ∈ {5, 10, 25, 50}. The resulting manipulations are recorded. Three experiment runs are conducted for every parameter configuration.

3For a non-distributed SecApp, the number of optimally computable results per time is retrieved by running a single VM on an idle node for a certain time frame without manipulating the VM.
4This statement corresponds to running the VM in an idle environment without competing applications and therefore represents the optimal case that can be used as a reference to assess the SecApp result output achieved in the experiment.


For the second sub-experiment, periodical changes to policy configurations are made for random sets of nodes. These periodical changes affect the number of CPU cores available to VMs on cluster nodes. The parameters modified in this sub-experiment are the frequency of policy changes and the timelimit of the changed policies. The resulting manipulations and the number of computed SecApp results are recorded. For repeatedly modifying policy configurations, a pattern of modifications is generated and applied throughout a sub-experiment's run. Such a reconfiguration pattern consists of reconfiguration events. A single reconfiguration event E is described by E = ({node}, {timelimit}, {change}, {type})C. Variable C represents the number of nodes affected and is a random number between 0 and 24. This also means that every reconfiguration event changes the policy configuration for a randomly determined set of nodes with a cardinality of C. The timelimit variable determines the timelimit to be set for the policy of each node. For a specific experiment run, the timelimit variable is fixed to one value ∈ {5, 10, 25, 50}. The change variable represents the quantitative change in the number of CPU cores freed or blocked for VMs on a node. This natural number is modelled using a random walk with variable step size and reflective barriers at zero and two. The type variable denotes whether the number of CPU cores usable by VMs on a node is diminished ("down") or increased ("up"). With a probability of 1/3 each, all nodes of a reconfiguration event have type = up, all have type = down, or each node has a randomly chosen value ∈ {up, down}. This means that for a single reconfiguration event either new VM slots (physical CPU cores) are provided on all chosen nodes (P = 1/3), VM slots are removed on all chosen nodes (P = 1/3), or new slots are provided on some nodes while they are removed on others (P = 1/3).
A reconfiguration pattern is a sequence of events E occurring every RI ∈ {20, 40, 60} seconds. RI is called the reconfiguration interval, i.e. it denotes how much time lies between two reconfiguration events. For every setup (timelimit, RI), an experiment run with 30 reconfiguration events is conducted. Every combination of parameter settings is tested three times.
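The pattern generation described above can be sketched as follows. The event encoding, the step-size range and the exact reflection scheme are assumptions where the text leaves details open:

```python
# Illustrative generator for a reconfiguration pattern as described above.
# Parameter names follow the text; the step-size distribution and the
# per-node event tuples are assumptions.
import random

def reflect(x, lo=0, hi=2):
    """Reflective barriers: fold a random-walk value back into [lo, hi]."""
    while x < lo or x > hi:
        if x < lo:
            x = 2 * lo - x
        if x > hi:
            x = 2 * hi - x
    return x

def reconfiguration_pattern(events=30, timelimit=25, max_nodes=24, seed=0):
    rng = random.Random(seed)
    cores = {n: 2 for n in range(max_nodes)}        # usable CPU cores per node
    pattern = []
    for _ in range(events):
        C = rng.randint(0, max_nodes)               # number of affected nodes
        nodes = rng.sample(range(max_nodes), C)     # random set of cardinality C
        mode = rng.choice(["up", "down", "mixed"])  # each with P = 1/3
        event = []
        for n in nodes:
            t = rng.choice(["up", "down"]) if mode == "mixed" else mode
            step = rng.randint(1, 2)                # variable random-walk step
            cores[n] = reflect(cores[n] + (step if t == "up" else -step))
            event.append((n, timelimit, cores[n], t))
        pattern.append(event)
    return pattern
```

Every generated event then corresponds to one policy modification round applied to the VM-Scheduler, with the per-node core counts always staying within the barriers at zero and two.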

Figure 6.2.: Exemplary reconfiguration pattern (RI = 20s, above) and its effect on the availability of CPU cores for VMs (below).


Figure 6.2 shows an exemplary effect of a reconfiguration pattern with 30 reconfiguration events and RI = 20s. The total number of CPU cores usable by VMs at a specific point in time is shown in the lower part of this figure. The upper part of the figure shows the actual reconfiguration events that lead to this CPU core availability. At times, CPU cores are only taken away from VMs ("blocked") or only made available to VMs ("freed"). Alternatively, both happen for the same reconfiguration event (e.g. at about 120s), which means that on some nodes CPU cores are blocked while on others they are freed.

6.3.2. Results

The first sub-experiment conducted is the singular blocking of a subset of cluster nodes. Figures 6.3, 6.4 and 6.5 show the manipulations chosen and executed by the VM-Scheduler for blocking 6, 12 and 24 nodes using different timelimits. In all of the setups it is visible that the timelimit has a considerable effect on the type of manipulations chosen: The lower the timelimit, the less gently the VMs are treated, i.e. the more often stops are chosen. This is especially obvious for the timelimit of 5 seconds.

Figure 6.3.: VM manipulations chosen by GR when ordered to remove 12 running VMs from 6 nodes within a given timelimit.

An interesting point is the existence of migrations in two setups (Figures 6.3, 6.4) for timelimits of 25 and 50 seconds. At first glance, one would not expect migrations because only a number of nodes are blocked and no additional target nodes for VM migrations are made available. However, this becomes clear when taking into account the existence of leases containing more than one VM. If such a lease with e.g. 8 VMs has a single violating VM, then either this particular VM has to be migrated to another node ("emigration") or the whole lease needs to be preempted. By stopping or suspending a lease and all of its VMs, CPU cores on nodes without a violated policy are freed, which can be used for migrations of violating VMs to such nodes ("immigration"). This is exactly what happens in these cases. It has to be noted that in this particular sub-experiment two configurations experienced kills, i.e. manipulations which could not be finished within the timelimit and had to be killed by the Local Trigger Unit. For blocking 24 nodes with 48 running VMs in 25 seconds (Figure 6.5), 5 suspend manipulations had to be aborted. For blocking 12 nodes with 24 running VMs in 10 seconds,


Figure 6.4.: VM manipulations chosen by GR when ordered to remove 24 running VMs from 12 nodes within a given timelimit.

Figure 6.5.: VM manipulations chosen by GR when ordered to remove 48 running VMs from 24 nodes within a given timelimit.

6 suspend manipulations had to be aborted. This can be attributed to an insufficient estimation of the duration of these manipulations by the Feasibility Estimator (FE). This sub-experiment demonstrates that modifications to static policies can be used to remove VMs from (sub-)clusters within a given timelimit. The chosen manipulations depend on the timelimit parameter, and more gentle manipulations are chosen for higher timelimits. Timelimits were obeyed in every case, even though the Local Trigger Unit had to abort manipulations in two setups.

The second sub-experiment uses a reconfiguration pattern which periodically changes the number of CPU cores available for VMs. Figures 6.6, 6.7 and 6.8 show the number and types of manipulations chosen by the VM-Scheduler for experiment runs with different timelimits and reconfiguration intervals. The figures also contain the number of "Failed" manipulations, i.e. manipulations that failed

to finish within the timelimit and had to be aborted. In contrast to the first sub-experiment, where only manipulations for resolving policy violations were applied by the GR, in this sub-experiment the GP also plays a role by provisioning leases onto newly available CPU cores. Therefore, these figures also contain the manipulations start and resume chosen by the GP. The figures show that with a rising timelimit the number of stops decreases and the share of suspends and migrations increases. For timelimits of 5 seconds, only stops are performed to resolve policy violations induced by reconfiguration events. This effect applies to any reconfiguration interval.

Figure 6.6.: Type and number of total executed and hereof failed VM manipulations for a reconfiguration pattern with RI = 60s and experiment duration of 1800s for different policy timelimits.

Another effect visible for any reconfiguration interval is that the lower the timelimit, the more manipulations are carried out. This is easily explained by the fact that when policy violations have to be resolved more quickly, preempted leases can also be provisioned again earlier, resulting in a higher frequency of GP- and GR-initiated manipulations. Comparing the results for different reconfiguration intervals with each other is difficult since the experiment durations differ.5 It is, nevertheless, obvious that the number of manipulations for a given timelimit is similar for RI = 40s and RI = 60s (Figures 6.6, 6.7), but apparently lower for RI = 20s in Figure 6.8. A possible explanation is that for a low reconfiguration interval of 20 seconds, factors like the duration of the decision and execution phases play a greater role. These factors are independent of the reconfiguration interval and do not scale down with a decreased experiment duration. So the GP probably cannot keep up with the provisioning of leases due to locked VMs, so fewer leases are provisioned and fewer manipulations are issued by the GR in order to resolve policy violations.

While these findings are not remarkable per se, two aspects need to be mentioned: The overlapping of timelimit and reconfiguration interval and the impact of the reconfiguration interval on the efficiency of result computation. The overlapping of timelimit and reconfiguration interval, i.e. cases where the reconfiguration interval is shorter than the timelimit of policies, is of particular interest. An example

5The experiment durations differ because 30 reconfiguration events were used for every reconfiguration interval. The experiment duration therefore scales with the reconfiguration interval.


Figure 6.7.: Type and number of total executed and hereof failed VM manipulations for a reconfiguration pattern with RI = 40s and experiment duration of 1200s for different policy timelimits.

Figure 6.8.: Type and number of total executed and hereof failed VM manipulations for a reconfiguration pattern with RI = 20s and experiment duration of 800s for different policy timelimits.

configuration is timelimit = 50s and RI = 20s, shown in Figure 6.8. An ongoing manipulation being overlapped by a policy violation induced shortly afterwards is something that may happen in a real-life scenario as well: an administrator mistakenly blocks certain nodes, notices the error immediately and tries to revert it by modifying the policies again. Likewise, the GP may decide to provision a lease and resume its VMs on nodes which are affected by a policy violation shortly afterwards, during the execution of the resume manipulation. So what happens in such a case?

113 6. Experimental Results

First, the recognition of policy violations occurs in parallel to the decision and execution phases of GR and GP. Nevertheless, during the execution phase a manipulated VM is locked by the mechanism explained in Section 4.4.5.1. This means that any subsequent manipulation of a single VM has to wait until the previous manipulation of the same VM finishes.6 This can lead to two situations:

1. Manipulation A (e.g. a migration or resume) finishes within the timelimit of a policy violated shortly after manipulation A began. In this case, the remaining time of this timelimit is used for the decision and execution phases needed for resolving the newly introduced policy violation. Consequently, manipulations will be chosen which are shorter than otherwise possible with the given timelimit.

2. Manipulation A exceeds the timelimit of a policy violated shortly after manipulation A began. In this case, the Local Trigger Unit will kill the respective VMs and abort the ongoing manipulations.
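The interplay of VM locking and the timelimit in these two situations can be sketched as follows. This is a minimal illustration, not the framework's actual implementation; the class, function, and return-value names are invented for this sketch.

```python
import threading
import time

class VM:
    """Toy model of a VM whose manipulations are serialized by a lock."""
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()   # held while a manipulation runs
        self.killed = False

def manipulate(vm, duration, timelimit):
    """Try to run a manipulation of `duration` seconds on `vm`.

    A subsequent manipulation first waits for the lock of the previous one.
    If the lock is not obtained within `timelimit`, or the manipulation
    cannot finish in the remaining time budget, the VM is killed
    (situation 2); otherwise the remaining time is used (situation 1)."""
    start = time.monotonic()
    acquired = vm.lock.acquire(timeout=timelimit)
    if not acquired:
        vm.killed = True               # Local Trigger Unit kills the VM
        return "kill"
    try:
        remaining = timelimit - (time.monotonic() - start)
        if duration > remaining:
            vm.killed = True           # manipulation cannot finish in time
            return "kill"
        time.sleep(duration)           # stand-in for executing the manipulation
        return "done"
    finally:
        vm.lock.release()
```

With a free lock and a short manipulation, the call returns "done"; a held lock or an oversized duration leads to a kill, mirroring the two situations above.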

When comparing the manipulations chosen by GR for the three reconfiguration intervals, the ratios of manipulation types can be interpreted in this sense: with a timelimit of 50 seconds, the GR should opt for similar ratios of manipulation types when resolving policy violations. However, while for RI = 60s and RI = 20s about 60% suspends were chosen, RI = 40s stands in contrast with only 30% suspend manipulations. With both RI = 20s and RI = 40s exhibiting an overlapping of reconfiguration interval and timelimit, for RI = 40s the first situation seems to occur more often, i.e. previous manipulations decrease the time left for choosing and executing subsequent manipulations. This results in long-lasting suspends being chosen less often. This is backed by the observation that RI = 60s and RI = 20s had only 8% and 9% stops, respectively, while RI = 40s has a ratio of 26% chosen stop manipulations. For RI = 20s the second situation seems to occur: previous manipulations do not finish within the timelimit of successively violated policies, and manipulations had to be aborted. This can be seen by looking at the ratio of kills: RI = 20s has 13% of GR-initiated manipulations aborted, while RI = 40s has 7% and RI = 60s has less than 1% aborted manipulations.

The effect of the reconfiguration intervals and timelimits on the efficiency of SecApp result computation can be seen in Figure 6.9. Here, not the absolute numbers of computed results are depicted but the relative efficiency with respect to the optimal number of computable results using the available CPU cores. The higher the reconfiguration interval, the more results can be computed for a specific timelimit. This comes as no surprise since for lower reconfiguration intervals the provisioning of preempted leases might not be able to keep up with the speed of the reconfiguration. This applies to resuming VMs, but also to starting VMs, which might not even have fully booted before they are subject to manipulations again. So, for low reconfiguration intervals the overhead of starting/resuming VMs plays a greater role and therefore leads to lower efficiency in computing SecApp results. A second observation is that the longer the timelimit, the more results can be computed for a specific reconfiguration interval. This is due to gentler manipulations for higher timelimits. This effect is especially obvious when comparing the timelimit of 5 seconds to the timelimit of 25 seconds for any reconfiguration interval. The statement does, however, not hold for the reconfiguration interval of 20 seconds and a timelimit of 50 seconds. Here the aforementioned overlapping effect comes into play.

6As a reminder, a mechanism for avoiding self-sustaining scheduling behavior was introduced in Section 4.4.6.3. This mechanism blocks VMs and nodes for manipulations after a previous manipulation for a certain time frame. However, as stated in the mentioned section, the mechanism is not used by GR for static policies like the one used in this experiment. Therefore any subsequent VM manipulation by GR in this experiment can indeed start once a previous manipulation has finished.


Figure 6.9.: Ratio between actually computed and optimally computable SecApp results for different reconfiguration intervals and policy timelimits.

6.3.3. Summary

In this experiment it was shown how the VM-Scheduling Framework copes with static policy modifications, i.e. with manually introduced policy violations. In the first sub-experiment, parts of a cluster that was fully used by running VMs were suddenly blocked for them. This resulted in the removal of the respective VMs. The framework, specifically the GR, was able to obey the timelimits of the violated policies. Manipulations were chosen depending on the timelimit, with lower timelimits causing more stops, while higher timelimits allowed for more suspends and migrations. In the second sub-experiment, a sequence of policy modifications was applied to the running VM-Scheduling Framework. Again, the framework was able to change the allocation and state of VMs accordingly, obeying the timelimits and killing VMs in several cases where manipulations took too long. Shorter timelimits led to a preference of stops over gentler manipulations. The longer the time between policy modifications, the more SecApp results could be computed by VMs. Both sub-experiments show that static policies and their runtime modification can be used to change the usage of a cluster by SecApps easily and in conformance with policy specifications. Policy compliance is achieved even for a timelimit of 5 seconds. Giving the VM-Scheduling Framework more time to react to policy violations will, however, increase the number of computed SecApp results considerably.


6.4. Efficiency Experiment

This experiment evaluates how the VM-Scheduling Framework copes with the changing usage of resources that are not under the full control7 of the framework. An example of such a resource is CPU, whose usage also depends on the MainApp. A dynamic policy that operates on this resource is associated with the cluster nodes. It is expected that the configuration of this policy, i.e. the settings for the parameters timelimit and threshold, has a significant effect on the thesis goals of increased cluster utilization and SecApp result output. In order to evaluate this effect, a set of CPU load patterns, representing a wide range of MainApps, is generated and applied to the cluster environment. The resulting efficiency of using free CPU resources and of computing SecApp results is measured and discussed.

6.4.1. Metrics Discussion

To assess the benefit of the VM-Scheduler with respect to cluster utilization and SecApp result output, metrics need to be defined. The first aspect of interest is the generation of additional cluster resource usage. This metric, however, does not fully reflect the actual benefit of running SecApps in VMs. Three aspects influence the translation of resource usage by SecApps into actually computed results:

1. Running applications in a virtualized environment instead of natively on a node causes a decrease in application performance. This was discussed before in Section 4.3.2, where the argument was made that virtualizing SecApps is needed to provide flexibility and isolation. This overhead is therefore not considered for assessing the benefit of the VM-Scheduler, i.e. no comparison to running SecApps natively is made.

2. A SecApp runs on top of an Operating System (OS), which itself uses resources. This overhead can be neglected while an OS is up and running. However, booting or shutting down an OS generates considerable resource usage which does not contribute to the computation of SecApp results.

3. The manipulation of VMs causes resource usage. This overhead does not contribute to the computation of SecApp results.

In order to account for points 2 and 3, i.e. to assess the overhead, variables and metrics for the resource CPU8 are introduced. Let SI = 1s be the sampling interval. The duration of an experiment run is given by the interval [0, k·SI] = k seconds.

maxCPU_{node} = \sum_{i=1}^{k} maxUsage_{CPU}(node, i)    (6.1)

is the cumulated maximal CPU utilization on a node for an experiment run. Accordingly,

vmCPU_{node} = \sum_{i=1}^{k} \sum_{j=1}^{|VM|} currentProcUsage_{CPU}(node, i, VM_j)    (6.2)

mainCPU_{node} = \sum_{i=1}^{k} currentProcUsage_{CPU}(node, i, MainApp)    (6.3)

7For comparison, the resources UsedCore and AllocVM are fully under the control of the framework.
8The CPU usage is a commonly accepted metric to evaluate the utilization of a cluster. However, the introduced metrics can also be applied in the same manner to other resources.

are the cumulated CPU resource utilization of VMs (6.2) and of the MainApp (6.3) on a node for an experiment run.

buffer_{node} := \frac{maxCPU_{node} \times (100 - threshold\ of\ P_{CPU})}{100}    (6.4)

This variable buffer_{node} reflects the non-usable CPU share which is given by the threshold parameter of the dynamic policy P_{CPU}. The variable results_{vm} represents the total number of results calculated by a single VM during an experiment run. The variable maxresults is the number of results that could be computed by SecApps if 100% of the CPU resources in the cluster were dedicated to VMs.9 Based on these definitions, the resulting metrics are:

ResUsage = \frac{\sum_{node=0}^{\#nodes} vmCPU_{node}}{\sum_{node=0}^{\#nodes} (maxCPU_{node} - mainCPU_{node} - buffer_{node})}    (6.5)

VmConv = \frac{\sum_{vm=0}^{\#vms} results_{vm}}{maxresults} \times \frac{\sum_{node=0}^{\#nodes} maxCPU_{node}}{\sum_{node=0}^{\#nodes} vmCPU_{node}}    (6.6)

ResUsage as defined in Equation 6.5 describes the ratio between the resources which have actually been used by VMs and the theoretically usable resources in the cluster. The theoretically usable resources are limited by the threshold parameter of the policy, which is reflected in the buffer variable.10 The metric ResUsage therefore describes the loss in resource usage due to non-optimal scheduling of VMs and local adaption of resource usage by the Resource Usage Adaptor (RUA) (see Section 4.4.4.1). VmConv as defined in Equation 6.6 describes the ratio between the results actually computed by consuming resources and the theoretical number of results that could have been computed by SecApps using the same amount of resources. This metric therefore reflects the share of VM-utilized resources that was not translated into results but wasted by OS overhead and VM manipulations.

6.4.2. Experiment Layout

The experiment is carried out in the environment described in Section 6.1, consisting of 24 worker nodes. Every node runs a specific synthetic CPU utilization pattern ("load pattern"). Every such node is also used by the VM-Scheduling Framework, i.e. a static policy of P_{AllocVM} : (2, 25) is applied. Additionally, a dynamic policy P_{CPU} is applied with a specific configuration (threshold, timelimit) ∈ {95, 90, 80, 70} × {20, 15, 10, 8, 6}. Multiple load patterns are run against specific policy configurations in this experiment, and the resulting cluster usage and SecApp results are measured. Three experiment runs are performed for every experiment configuration (load pattern × policy configuration). Such a run takes 20 min and starts by applying the pattern, starting the VM-Scheduler and the measurements. A number of leases is submitted and taken care of by the GP, which tries to provision these leases. After the 20 minutes have elapsed, the measurements are stopped and the previously submitted leases are canceled by explicit request to the framework. The leases, VMs and SecApp used in this experiment are identical to the ones used in the "Policy Modification Experiment" (see Section 6.3.1).

9In this experiment, the same VM configuration and SecApp are used as in the "Policy Modification Experiment". Therefore, a single result is computed by a single VM and a maximum for computable SecApp results can be defined.
10There is a certain fuzziness introduced by the buffer variable: CPU usage may exceed the policy threshold for a period defined by the timelimit. Knowing this, ResUsage is considered a synthetic metric, describing not what could be used under the best circumstances, but rather what is usable in plausible cases.


CPU Usage Pattern Generation For testing the effect of a specific policy configuration (threshold, timelimit) ∈ {95, 90, 80, 70} × {20, 15, 10, 8, 6} on the defined metrics, load patterns are generated. The goal of using different load patterns is to show that the framework copes with different load scenarios representing arbitrary MainApps. For producing specific patterns, a program was implemented that generates a node CPU usage with a precision of 0.5% (standard error) at a given point in time, within tenths of a second. The two main load patterns used in this experiment are shown in Figure 6.10.

Figure 6.10.: Node CPU usage generation using the square (green-colored) and triangle (black- colored) pattern types. The CPU usage at a point in time t is a variate of the random variable Y and the frequency of usage changes is realized by the random variable X.

The load pattern drawn in green is called the square load pattern. For this pattern the CPU usage on a node changes instantly. The pattern drawn in black is called the triangle load pattern, representing a steadily (and almost linearly) changing CPU usage. A load pattern is a tuple LP := (Load, Type) with Type ∈ {square, triangle}, representing the type of pattern, and Load with 5 ≤ Load ≤ 95, representing the mean of the generated node CPU usage (in %) over an experiment run. So for a specific tuple LP := (Load, Type) the same mean CPU usage is created for any node in the cluster. The actual development of the CPU usage, however, differs between nodes over an experiment run. For any such pattern, the frequency of load changes is randomly varied between 1/(30s) and 1/(5s) throughout an experiment run. More precisely, X is a random variable X ∈ {Δt | Δt = t_{n+1} − t_n ∧ Δt ∈ ℝ ∧ 5 ≤ Δt ≤ 30}, representing the difference between two points in time (t_{n+1} − t_n) for which a target CPU usage is set. X follows a uniform distribution with variates in [5, 30]. How the target CPU usage for t_{n+1} is reached from t_n is determined by the Type parameter, i.e. the CPU usage changes either instantly or steadily. The target CPU usage for a point in time t_n is defined as the variate of a continuous random variable Y ∈ {y | y ∈ ℝ ∧ 0 ≤ y ≤ 100}. The probability distribution of Y depends on the desired mean load for the specific load pattern. Figure 6.11 shows two different probability densities: the red function represents a distribution for a random variable with expectation value E[Y] = 50, the blue function represents a distribution for a random variable with expectation value E[Y] ≈ 43. So, a desired mean load for a particular load pattern over an experiment run (e.g. 43%) is achieved by realizing Y with

118 6.4. Efficiency Experiment

Figure 6.11.: Two example linear probability density functions f(y) for a random variable Y. Repeated realization yields a mean CPU usage of 50% and 43.3%, respectively.

the according expectation value (e.g. E[Y] ≈ 43). In order to cover the whole range of possible CPU usage values, the probability density is a linear function for any desired mean.11 For a mean of 50% CPU usage it is exactly a uniform distribution, while for all other desired means it is a tilted uniform, i.e. a linear probability distribution calculated according to basic stochastic laws:

P(n ≤ Y ≤ m) = \int_n^m f(y)\,dy with f(y) = ay + b and E[Y] = \int_n^m y f(y)\,dy.

The experiment is conducted by running a specific policy configuration (threshold, timelimit) ∈ ({95, 90, 80, 70} × {20}) ∪ ({90} × {15, 10, 8, 6}) against a load pattern (Load, Type) ∈ {4, 8, 12, ..., 92, 96} × {square, triangle}. This results in 368 runs in total, each lasting 20 min. However, the results show that the desired load means are not exactly met, which probably would require a longer experiment run duration. Nevertheless, the results are explicit enough to make arguments on how the framework copes with specific CPU utilization scenarios with respect to the defined metrics.
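A minimal sketch of the pattern generation, assuming a linear density f(y) = ay + b on [0, 100] and inverse-transform sampling. The coefficient formulas follow from the normalization and mean constraints above; note that a purely linear density stays nonnegative only for means roughly between 33% and 67%, so the handling of more extreme means (as used in the experiment) is left out here. All names are invented for this sketch.

```python
import math
import random

def linear_pdf_coeffs(mean):
    """Coefficients (a, b) of f(y) = a*y + b on [0, 100] with E[Y] = mean.

    Derived from the constraints 5000*a + 100*b = 1 (normalization) and
    (10**6/3)*a + 5000*b = mean. Only valid where f stays nonnegative,
    i.e. for 100/3 <= mean <= 200/3."""
    a = 3 * (mean - 50) / 250_000
    b = 0.01 - 50 * a
    return a, b

def sample_target_load(mean, rng=random):
    """Draw one target CPU usage y in [0, 100] by inverting the CDF
    F(y) = (a/2)*y**2 + b*y at a uniform variate u."""
    a, b = linear_pdf_coeffs(mean)
    u = rng.random()
    if abs(a) < 1e-12:                       # mean = 50: exactly uniform
        return 100 * u
    return (-b + math.sqrt(b * b + 2 * a * u)) / a

def square_pattern(mean, duration, rng=random):
    """Yield (time, target_load) change points for a square pattern:
    a new target every Delta-t ~ U[5, 30] seconds (the random variable X),
    held constant until the next change."""
    t = 0.0
    while t < duration:
        yield t, sample_target_load(mean, rng)
        t += rng.uniform(5, 30)
```

Repeatedly realizing Y this way yields a time series whose mean converges to the desired load; a triangle pattern would additionally interpolate linearly between successive targets.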

6.4.3. Results

For brevity, the term "host load" is used to refer to the synthetically generated cluster CPU usage averaged over all nodes over an experiment run. The term "VM load" refers to the cluster CPU usage generated by VMs over an experiment run. First, the impact of the threshold parameter is examined: multiple run configurations with differing policy thresholds (from 95% to 70%) but the same timelimit (20s) have been set up and tested, and the findings are discussed. The timelimit parameter and its effect on the efficiency metrics are examined in the second part of the section.

Threshold Parameter In Figure 6.12a the relation between host load and resulting VM load is shown for the triangle load pattern. Any measurement point represents a single experiment run. First,

11A normal distribution would favor variates close to the desired mean, potentially sparing occurrences of very low and very high load. On the other hand, a "bathtub curve" (Weibull probability distribution) would do just the opposite, favoring high and low values at the expense of values in the middle of the domain. A linear density function is a reasonable compromise.

the amount of VM load clearly depends on the chosen policy threshold. This comes as no surprise since the threshold directly affects the resource share usable by VMs. Second, the generated host load has a recognizable effect on the VM load. For thresholds of 95% and 90% this relation seems quite linear (with negative slope), i.e. the less load is generated on the physical nodes, the more CPU resources are utilized by VMs. For thresholds of 80% and 70% the measurements follow the same basic pattern, but the curve looks somewhat shaky.12 The same can be said for the square load pattern shown in Figure 6.12b: for any chosen threshold, the VM load increases with decreasing host load.

(a) Triangle load pattern

(b) Square load pattern

Figure 6.12.: Generated cluster CPU usage and resulting VM CPU usage for 20min experiment runs. Every dot represents a single run. Different policy thresholds are depicted in different colors.

For the square load pattern, the shaky curves for thresholds of 70% and 80% are even more obvi- ous. A possible explanation is that there are host load ranges (0%-20% and 40%-60%) for which the VM-Scheduler tries to start/resume two VMs (0%-20%) or one VM (40%-60%) on a node, but has to revert/modify these allocations due to repeated policy violations. This assumption is backed by the fact that for the range 40%-60%, the number of manipulations is 43% higher compared to the range

12The residuals for an applied linear regression are greater for 80% and 70% than for the 95% and 90% thresholds.

of 20%-40%.13 The effect is more obvious for the square load pattern because the host load changes instantly, rather than gradually as for the triangle load pattern, and is therefore more likely to cause a policy violation. These shaky effects, i.e. the clear visual deviations from a linear functional relationship, may be regarded as a non-optimal configuration of the VM-Scheduler and can be avoided by improving the feasibility estimation and the local resource usage adaption. Figures 6.13a and 6.13b show the translation of VM load into computed SecApp results for the triangle and square load patterns. These figures therefore show the waste of CPU resources due to OS overhead and manipulations. At first glance, there is a monotonic, quite linear relation

(a) Triangle load pattern

(b) Square load pattern

Figure 6.13.: VM CPU usage and resulting SecApp results for 20min experiment runs. Every dot represents a single run. Different policy thresholds are depicted in different colors.

between VM load and computed SecApp results for all thresholds and both load patterns. That is, the more VM load is generated, the more SecApp results are computed. Moreover, the VM load to SecApp results conversion, i.e. how many results can be computed with a given VM load, seems essentially identical for all thresholds. This suggests that the conversion of VM-generated CPU usage into beneficial SecApp results is independent of the chosen threshold. However, this can only be

13Measured for the square load pattern and threshold 70%.

said for VM loads below 50%. Higher VM loads were not reached for, e.g., a policy threshold of 70%.

What could be seen so far by looking at the plots is that the threshold has an effect on the VM load, but no visible effect on the translation of VM load into SecApp results. There are, however, host load ranges which are better exploited by the VM-Scheduling Framework than others, which is likely caused by a non-optimal configuration of the VM-Scheduler. When factoring out the generated host load for the different thresholds, the picture becomes clearer. The efficiency metrics, calculated as explained in Section 6.4.1, are shown in Figures 6.14a and 6.14b.

(a) Triangle load pattern (b) Square load pattern

Figure 6.14.: Efficiency metrics (see Section 6.4.1) for different policy thresholds.

For both load patterns, the VMs consume about 80% of the usable CPU resources for every chosen threshold (ResUsage). While the threshold affects the absolute amount of VM load, it has no impact on the relative efficiency of exploiting usable resources. This observation is independent of the applied load pattern. Concerning the translation of VM load into results (VmConv), both load patterns also behave similarly: for thresholds of 90% and 95%, the efficiency lies at about 70% and declines for the other two thresholds to about 60%. This contradicts the tentative visual interpretation that the threshold has no impact on the translation of VM load into SecApp results. What, then, is the reason for this decrease at lower thresholds? Figure 6.15, which shows the average number of manipulations executed for a certain threshold, clearly shows that more manipulations have been conducted for lower thresholds.14 Any manipulation, as well as the OS overhead for booting a VM, uses CPU cycles. Therefore, an increased number of manipulations leads to the lowered efficiency in translating VM load into SecApp results.

14These numbers have been obtained as an average over all experiment runs with square load pattern and timelimit 20 seconds. The averaged cluster CPU utilization was at about 47%.


Figure 6.15.: Average number of VM manipulations for different policy thresholds with an applied square load pattern and timelimit of 20s.

Summary for Threshold Parameter As a summary, the following can be stated: there is an almost linear relation between host load and VM load for any threshold. The threshold parameter affects the total amount of resources used by VMs: the higher the threshold, the more CPU resources are used by VMs. The threshold does not have an impact on the relative efficiency of exploiting usable CPU resources on nodes in the cluster: on average, about 80% of the usable CPU resources15 are used by VMs. The parameter does affect the efficiency with which VM-utilized CPU resources are translated into SecApp results. While for higher thresholds (90%) about 70% of the optimally computable results were obtained, this percentage drops to about 60% for lower thresholds (70%). The reason for this behavior is the increased number of manipulations for lower thresholds. The VM-Scheduler performs similarly for both tested load patterns.

15If the buffer given by the threshold is taken into account.


Timelimit Parameter Next, the impact of the timelimit parameter is examined. Multiple policy configurations with a timelimit ∈ {20, 15, 10, 8, 6} and a threshold of 90% have been set up and tested. Figures 6.16a and 6.16b show the relation between host load and VM load for different timelimits. For both load patterns there is no obvious impact of the timelimit parameter on the CPU usage of

(a) Triangle load pattern

(b) Square load pattern

Figure 6.16.: Generated cluster CPU usage and resulting VM CPU usage for 20min experiment runs. Every dot represents a single run. Different policy timelimits are depicted in different colors.

VMs. Furthermore, there is a linear dependency with negative slope between generated host load and resulting VM load, i.e. the more host load is generated on nodes, the less CPU resources are used by VMs.

The picture is different for the translation of VM load into SecApp results. This effect is depicted in Figures 6.17a and 6.17b for the triangle and the square load pattern, respectively. While at first glance both figures also show a similar, quite linear dependency between VM load and SecApp results for all timelimits, a closer look reveals that a) lower timelimits seem to perform worse than higher timelimits and b) this effect is more obvious for the square load pattern. How can this be explained?


(a) Triangle load pattern

(b) Square load pattern

Figure 6.17.: VM CPU usage and resulting SecApp results for 20min experiment runs. Every dot represents a single run. Different policy timelimits are depicted in different colors.

• Lower timelimits leave less time for the VM-Scheduler to make a decision and execute a VM manipulation. Stops are more likely to occur compared to higher timelimits, which decreases the number of computable results. This effect is shown in Table 6.1, which lists the averaged number of manipulations for the square load pattern. For timelimit = 20s, 6.6 stops are performed, which accounts for about 36% of all manipulations performed in this configuration. For timelimit = 6s, 49.7 stops are performed, which corresponds to about 61% of all manipulations. This higher share of stops for lower timelimits leads to a decreased efficiency in translating VM load into computed SecApp results.

• According to the numbers given in Table 6.1, the total number of manipulations also rises with decreasing timelimit. With an increasing number of manipulations, the OS and manipulation overhead leads to a decreased efficiency in translating VM load into computed SecApp results. The reason for the dependency between timelimit and number of manipulations is the "triggerSensitivity", which is discussed later in Section 6.4.3.


               VM Manipulations         Triggers Fired
Timelimit (s)  Mig    Susp   Stop      Global   Local
6              12.5   18.8   49.7      23.9     17.4
8              13.9   11.3   33.8      25.3     7.7
10             15.4   12.7   22.3      29.5     4.3
15             7.2    8.2    15.9      17.1     2.1
20             6      5.6    6.6       12.9     0.25

Table 6.1.: Number of VM manipulations and fired triggers averaged over all experiment runs for square load pattern and policy threshold of 90%.

• Another hint is given in Table 6.1: not only does the total number of manipulations rise with decreasing timelimit, but also the number of locally triggered kills. While it seems natural that with more manipulations taking place the number of failed manipulations rises as well, the share of aborted manipulations is higher for timelimit = 6s (about 17%) than for timelimit = 20s (about 1%). This can be attributed to a bad feasibility estimation and leads to a decreased efficiency in translating VM load into computed SecApp results.

• The reason why this effect is more obvious for the square load pattern is that an instant and persistent host load change is more likely to cause a policy violation, and therefore manipulations, than the gradual host load change applied in the triangle load pattern. This is due to the global trigger mechanism and is clearly backed by the data given in Table 6.2. So, more manipulations for the square load pattern decrease the efficiency of translating VM load into computed SecApp results.

               VM Manipulations
Timelimit (s)  Square Load   Triangle Load
6              81            40.9
8              59            31.8
10             50.4          26.2
15             31.3          15.1
20             18.2          9.6

Table 6.2.: Number of VM manipulations averaged over all experiment runs for policy threshold of 90%.
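The stop shares quoted above (about 36% for timelimit = 20s and about 61% for timelimit = 6s) follow directly from Table 6.1, whose row totals also match the square-load column of Table 6.2; a minimal check:

```python
# Averaged manipulation counts from Table 6.1 (square load pattern,
# threshold 90%); keys are timelimits in seconds.
manipulations = {
    6:  {"mig": 12.5, "susp": 18.8, "stop": 49.7},
    20: {"mig": 6.0,  "susp": 5.6,  "stop": 6.6},
}

def stop_share(timelimit):
    """Share of stop manipulations among all manipulations for a timelimit."""
    m = manipulations[timelimit]
    total = m["mig"] + m["susp"] + m["stop"]   # 81 for 6s, 18.2 for 20s
    return m["stop"] / total
```

The totals of 81 (6s) and 18.2 (20s) agree with the square-load column of Table 6.2, and the stop shares evaluate to roughly 0.61 and 0.36.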

The shown plots indicate that there is no obvious impact of the timelimit on the usage of free CPU resources, but a certain, not yet quantified effect on the number of computed SecApp results. The efficiency metrics shown in Figures 6.18a and 6.18b make these effects more explicit. For the efficiency of exploiting usable CPU resources (ResUsage), the timelimit of 20 seconds performs best for both load patterns; about 80% of the usable CPU resources are exploited by VMs. There is a slight decline for the other timelimits, which all still reach at least 70% efficiency. So there is a certain, yet limited impact of the timelimit parameter on the efficiency of exploiting usable resources. Concerning the translation of VM load into computed SecApp results (VmConv), there is a considerable decline from 71% for timelimit = 20s to 52% for timelimit = 6s for the square load pattern. For the

triangle load pattern this efficiency drops from 71% for timelimit = 20s to 60% for timelimit = 6s. This decline reflects the relevant impact of the timelimit parameter on the efficiency of translating VM load into SecApp results. Comparing this impact for the two load patterns shows that for any timelimit this efficiency is worse for the square load pattern, which is due to the higher number of manipulations for this load pattern.

(a) Triangle load pattern (b) Square load pattern

Figure 6.18.: Efficiency metrics (see Section 6.4.1) for different policy timelimits.

Summary for Timelimit Parameter As a summary, it can be stated that the timelimit parameter has a minor impact on the efficiency of exploiting usable resources. The efficiency is at about 80% for timelimit = 20s and declines to about 70% for timelimit = 6s. This holds for both tested load patterns. The timelimit parameter does, however, have a considerable impact on the efficiency of translating VM load into computed SecApp results. Here, the usage of low timelimits has an adverse effect on the number of computed SecApp results. This adverse effect exists for both load patterns but is more evident for the square load pattern.

Job Length Concerning the declining VmConv metric for lower timelimits, one additional aspect needs to be considered. So far, these metrics have been calculated under the assumption that the duration for computing a single result in a VM ("job length") is 2.5 min. There are, however, applications (e.g. ALICE Offline batch jobs - AliEn [10]) that may take up to an hour to complete a result computation. For such a job length, the loss induced by stopping a running computation clearly weighs in heavier, with an adverse effect on the efficiency measure VmConv. A recalculation of this efficiency for an example job length of 5 min, i.e. two successive, completed jobs count as one, demonstrates this effect: for the square load pattern, a timelimit of 6 seconds and a job length of 2.5 min, a VmConv efficiency of 52% could be achieved. Doubling the job length makes this efficiency drop to 40%. The decline in efficiency for lower timelimits is relevant and becomes more important the longer an average result computation takes. This effect of course also applies to the efficiency metrics calculated for the threshold experiment.
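The recalculation rule, two successive completed 2.5-min jobs count as one 5-min result, can be sketched as simple bookkeeping. The streak-based representation below is an assumption made for illustration, not the thesis' measurement code.

```python
def results_for_job_length(streaks, factor):
    """Count results when `factor` successive short jobs form one long job.

    `streaks` lists, per VM, the number of *uninterrupted* successively
    completed 2.5-min jobs (each stop or kill starts a new streak). A long
    job of factor * 2.5 min counts once per full run of `factor` completed
    short jobs; leftover short jobs in a streak are wasted work."""
    return sum(streak // factor for streak in streaks)
```

For example, streaks of 5, 3 and 2 completed short jobs yield 10 results at the original job length but only 4 once two short jobs form one long job, illustrating why VmConv drops further for longer jobs under frequent interruptions.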


Trigger Sensitivity As shown before (see Table 6.2), there is a dependency between the number of manipulations and the timelimit parameter. This parameter reflects the time span for which the resource usage may exceed the threshold while a VM uses this resource. The earlier a global trigger fires, the more time is left for the decision and execution phases of the GR. This means that more time-consuming but gentler manipulations can be chosen. However, not every exceedance of the threshold, even if not handled by GR and Local Trigger Unit, automatically leads to an invalidation of the respective policy: both the resource usage variation caused by the MainApp and the RUA may cause the total resource usage to drop below the threshold again. This means that early global triggering, i.e. a high global triggerSensitivity16, allows for gentler manipulations, but also increases the occurrence probability of globally fired triggers, thereby increasing the total number of manipulations. Late triggering, i.e. a low global triggerSensitivity, decreases the occurrence probability of globally fired triggers and leads to fewer manipulations. However, late triggering leaves less time for the decision and execution phases of the GR, thereby leading to less gentle manipulations. The choice of the triggerSensitivity is therefore a trade-off. It is, however, not reasonable to choose the same triggerSensitivity for all timelimit configurations. For any timelimit, the triggerSensitivity should be chosen such that upon triggering there is still enough time left to execute a set of manipulations. For a low timelimit of 6 seconds, the global trigger therefore must fire earlier than for a timelimit of 20 seconds. This is the reason why lower timelimits cause more manipulations. In Figure 6.19 the effects of different triggerSensitivity levels are shown for a specific experiment run.17

Figure 6.19.: Effect of triggerSensitivity (see Table 6.3) on VM manipulations and triggering. The effect is shown for a specific setup LP(45, square) and PCPU (90, 8).

The triggerSensitivity levels (Low, Mid, High) refer to the ones defined in Table 6.3. For a low sensitivity, few global triggers fired (29) and most manipulations were stops (5 migrations, 2 suspends, 50 stops). Quite the opposite can be seen for the high sensitivity: here, a large number of global triggers fired (226) and many manipulations (152 migrations, 39 suspends, 165 stops) were executed. The excessive amount of manipulations led to bottlenecks not sufficiently predicted by the Feasibility Estimator (FE) and consequently to a considerable number of locally triggered kills (36). The reasonable compromise therefore is the medium sensitivity as given in Table 6.3. In the experiment, this medium sensitivity allowed for both fewer manipulations than with the high triggerSensitivity and for choosing not only stop manipulations (26 migrations, 19 suspends, 29 stops). This contributes to a better SecApp result computation. This medium sensitivity was used in all experiments discussed so far.

16As a reminder, the triggerSensitivity was defined in Section 4.4.3.1 by the number of subsequent, failed model-checks for a policy. A high number corresponds to a low triggerSensitivity.
17Run configuration: 20min, PCPU : (90, 8), Load Pattern: (square, 50).
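The counting of subsequent failed model-checks that defines the triggerSensitivity (see footnote 16) can be sketched as follows; the class name and the usage trace are hypothetical, only the counting logic mirrors the definition from Section 4.4.3.1:

```python
# Hypothetical sketch of the global trigger: it fires only after
# `max_failed_checks` consecutive failed model-checks of a policy.
# A higher count corresponds to a *lower* triggerSensitivity (later firing).
class GlobalTrigger:
    def __init__(self, max_failed_checks):
        self.max_failed_checks = max_failed_checks
        self.failed = 0

    def check(self, usage, threshold):
        """Return True when the trigger fires for this policy."""
        if usage > threshold:
            self.failed += 1
        else:
            self.failed = 0   # usage recovered, e.g. via RUA or MainApp variation
        return self.failed >= self.max_failed_checks

# A transient exceedance that recovers in time never fires a low-sensitivity
# trigger, while a high-sensitivity trigger fires early on the same trace:
trace = [95, 95, 95, 95, 80, 95, 95, 95]   # illustrative usage samples (%)
high = GlobalTrigger(3)                     # high sensitivity
low = GlobalTrigger(6)                      # low sensitivity
```

This also illustrates why high sensitivity produces more globally fired triggers: even exceedances that would have recovered on their own are escalated to the GR.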

                    Timelimit (s)
Sensitivity    25   20   15   10    8    6
Low             6    5    4    3    2    1
Mid             7    6    5    4    3    2
High            8    7    6    5    4    3

Table 6.3.: TriggerSensitivity configurations (in s) for different timelimits.

Local Resource Usage Adaption In the conceptual work (Section 4.4.4) it has been argued that a subsidiary framework design, i.e. handling resource usage generated by SecApps locally as far as possible, helps to relieve the GR from making global changes of the allocation pattern. If resource utilization can be adapted locally, fewer globally triggered policy violations occur, thereby decreasing the computational effort needed by the GR and reducing the number of executed manipulations. According to the proposed GR algorithm, this in general will lead to the choice of gentler manipulations and fewer locally triggered kills, which in turn increases the efficiency of computing SecApp results. An experiment has been conducted where the framework ran with disabled resource usage adaption.
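A minimal sketch of such a local, feedback-controlled adaption, under assumed names and a simple proportional scheme (the actual RUA controller of the framework may differ):

```python
# Hypothetical sketch of a local resource usage adaption (RUA) step:
# when MainApp usage plus the summed VM caps would exceed the policy
# threshold, all local VM caps are shrunk uniformly before the violation
# is escalated to the global GR.
def adapt_vm_caps(main_app_usage, vm_caps, threshold):
    """Return new VM caps (in % CPU) such that MainApp usage plus the
    summed VM caps stays at or below the threshold, if possible."""
    budget = max(threshold - main_app_usage, 0.0)
    total = sum(vm_caps)
    if total <= budget or total == 0:
        return list(vm_caps)          # nothing to do locally
    scale = budget / total            # shrink all VMs uniformly
    return [cap * scale for cap in vm_caps]
```

For example, with a MainApp usage of 60%, two VMs capped at 20% each and a threshold of 90%, the caps shrink to 15% each, keeping the node at the threshold without any global manipulation.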

(a) Policy violations (b) Local kills

Figure 6.20.: Effect of local resource usage adaption by RUA on the number of locally triggered kills and globally triggered policy violations. The effect is shown for a specific setup LP(45, square) and PCPU (90, 8).

Figure 6.20a shows the number of globally triggered policy violations depending on the applied trig- gerSensitivity for both enabled and disabled resource usage adaption for a specific experiment run. It

can be seen that a local RUA decreases the number of policy violations the GR has to handle, which in turn decreases the number of globally commanded manipulations. Figure 6.20b shows the effect of these setups on locally triggered kills of VMs and/or aborted VM manipulations. Again, it is clearly visible that an enabled local resource usage adaption decreases the number of locally triggered kills. The conceptual decision to have a locally acting functional unit (RUA) is thereby confirmed to be appropriate for the purpose of decreasing the number of globally handled policy violations.

6.4.4. Summary

In this experiment it has been examined how the VM-Scheduling Framework copes with a changing usage of resources over which the framework does not have full control.18 The CPU was chosen as a prototypical resource whose usage also depends on the MainApp. It was evaluated what impact a policy configuration, i.e. settings for the threshold and timelimit parameters, has on the goals of increased cluster utilization and generated SecApp result output. In order to represent a wide range of potential MainApps, synthetic CPU usage patterns were generated and applied to a test environment. It could be shown that the amount of resources which were exploited by VMs has an almost linear relation (with negative slope) to the synthetically generated CPU usage. This property of linearity could be observed independently of the chosen threshold and timelimit.19 Also, it was shown that a higher threshold corresponds to a higher total resource usage by VMs. When factoring out the buffer, i.e. the amount of non-usable resources determined by the threshold, the share of exploitable resources used by VMs was between 70% and 80% for any chosen timelimit and threshold. Concerning the translation of exploited resources to actually computed SecApp results, the threshold parameter has a minor impact. While for higher thresholds (90%) about 70% of the optimally computable results were obtained, for lower thresholds (70%) this percentage drops to about 60%. The timelimit parameter also has an impact on the translation of exploited resources to computed results. There is a decline from a relative efficiency of 71% for the timelimit of 20 seconds to 52% for the timelimit of 6 seconds. For both the threshold and timelimit parameters, this decline depends on the average SecApp result computation time. The larger the computation time, the less efficient the translation of used resources to actually computed results.
These findings suggest that using higher timelimits and higher thresholds will yield better results with respect to the thesis goals of increasing the cluster usage and generating additional result output. To see the full picture, it is now necessary to evaluate the effect of these parameters towards the amount of interference caused for a MainApp.

6.5. Interference Experiment

To evaluate the amount of interference caused by running SecApps in a cluster it is not sufficient to run a synthetic CPU utilization pattern that represents a MainApp. Instead, an application needs to be run which has a measurable performance metric. Using this metric the amount of interference caused by additionally run SecApps can be assessed. The MainApp used in this experiment not only generates a substantial amount of CPU usage but also considerable network traffic. This requires the usage of dynamic policies for the resources CPU and NetOut. Again it is evaluated what effect the

18For comparison, the framework has full control over the resource AllocVM. 19This statement has not been proven mathematically, but was derived visually. Also, only the property of linearity, not the exact equation and the residuals of a regression are independent of the policy parameters.

policy configuration, i.e. the parameters timelimit and threshold, have on the amount of generated interference. More specifically, the following questions are targeted in this experiment:

• Are there policy configurations for which no interference20 exists?
• If interference can be avoided, will additional cluster usage and SecApp results still be generated?
• Can the amount of interference be tuned by adapting the timelimit and threshold parameters of policies?

6.5.1. Experiment Layout

The experiment is executed in the cluster environment used for the previous experiments (see Section 6.1). Each of the 24 worker nodes is in the scope of the VM-Scheduling Framework, i.e. a static policy of PAllocVM : (2, 25) is associated with every node. Additionally, a dynamic policy PCPU , which operates on the CPU resource, and another dynamic policy PNetOut, which targets the outgoing network traffic, are associated with each node. Different configurations for these policies are applied in this experiment. The leases and VMs used in this experiment are identical to the ones used in the "Policy Modification Experiment" (see Section 6.3.1).

MainApp Design The MainApp used in this experiment consists of two independent tasks that run on each node, one that is CPU-bound and another one that is network-bound. The CPU-bound task is a GCC compiler benchmark running in a closed loop. The number of completed loops having compiled a specific amount of source code, summed over all nodes for a run, is used as the performance metric for this task. The metric is referred to as "CPUResults". The network-bound task is the copying of data from experiment worker nodes to target nodes which are not in the scope of the VM-Scheduling Framework.21 The copying is realized by sending data from /dev/zero on a worker node to /dev/zero on the target host using out-of-the-box Linux tools "dd" and "netcat".22 The amount of copied data, summed over all worker nodes for a run, is used as the performance metric for this task. The metric is referred to as "NetResults". So both tasks together are the MainApp. This MainApp has two separate performance metrics, CPUResults and NetResults.

In order to avoid a permanent full saturation of the CPU and NetOut resources (i.e. Usageres(k, k) = 100) on the nodes, which would prevent any SecApps from running in the cluster, the following is done: The GCC compiler benchmark uses only one of the two available CPU cores. The copying task is configured such that the data rate, i.e. the usage of NetOut, varies over time. This also causes a variation of the CPU usage caused by the copying task. For each node the applied variation pattern is different. However, averaged over the duration of an experiment run, every node has a NetOut usage of about 50% and has copied the same amount of data.23 Using this scheme, both the CPU utilization and the NetOut utilization on every node in the cluster vary over time without inter-node correlation.

20Interference was defined as a statistically significant difference between performance metric values. 21The target nodes have not been mentioned yet in the experiment layout and are chosen such that all worker nodes can simultaneously saturate their outgoing network interfaces. Every copying worker node has its own target node and it is ensured that no contention occurs at switches. 22By using this method the disk resources on the nodes are not involved which otherwise could be a bottleneck limiting the data-rate. 23The variation of the NetOut utilization is realized using the method described in Section 6.4.2. Actually a NetOut load pattern LP := (50, square) is applied. The Linux tool "pv" is used to cap the network utilization.
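The square NetOut load pattern LP := (50, square) mentioned in footnote 23 can be sketched as a symmetric square wave whose mean over a full period equals the target utilization; the amplitude and period below are illustrative assumptions, not the values used in the experiment:

```python
# Illustrative sketch of a square load pattern: the target utilization
# alternates between a high and a low phase of equal length, so the mean
# over a full period equals mean_pct (here: 50%).
def square_pattern(mean_pct, amplitude_pct, period_s, t):
    """Target utilization (%) at time t for a symmetric square wave."""
    high_phase = t % period_s < period_s / 2
    return mean_pct + amplitude_pct if high_phase else mean_pct - amplitude_pct
```

A rate-capping tool (the experiment uses "pv") would then be driven with this target value at each point in time.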


Nevertheless, the values obtained for the performance metrics are constant24 for any experiment run without SecApps.

SecApp Design Every VM used in the experiment hosts a GCC compiler benchmark as described for the previous experiments (see Section 6.3.1). Additionally, each VM runs a network-stressing task, which is implemented using the same method (dd/netcat) as the one described for the MainApp. However, no variation of the data rate is applied; instead, every running VM pushes data over the network as fast as it can.25 The metric used for assessing SecApp result computations is the number of completed compiler loops. This metric is referred to as SecApp results.

The experiment compares the metrics CPUResults and NetResults for experiment runs without running SecApps ("No-VM runs") with those where additional SecApps were run ("With-VM runs"). An experiment run takes 7 minutes. At first 8 No-VM runs are conducted. CPU utilization, CPUResults and NetResults are recorded as a baseline to be compared against With-VM runs. For each policy configuration (threshold, timelimit) ∈ {95, 90, 85, 80, 75} × {25, 15, 8} for the two policies PCPU and PNetOut, three With-VM runs are performed, resulting in a total of 45 With-VM runs. The leases are submitted at the beginning of a With-VM run. For every such With-VM run, the CPU usage of the nodes, CPUResults, NetResults and computed SecApp results are recorded.
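The resulting run matrix can be reproduced with a short sketch (parameter lists as given above; variable names are illustrative):

```python
# The With-VM run matrix of the interference experiment: five thresholds
# crossed with three timelimits, three runs per configuration.
from itertools import product

thresholds = [95, 90, 85, 80, 75]   # in %
timelimits = [25, 15, 8]            # in s
runs_per_config = 3

configs = list(product(thresholds, timelimits))   # 15 configurations
total_runs = len(configs) * runs_per_config       # 45 With-VM runs
```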

6.5.2. Results

As baseline for evaluating the impact of SecApps towards the MainApp, the statistics gathered for the No-VM runs are used. The mean and standard deviation of CPUResults and NetResults for 8 runs are shown in Table 6.4. These values translate to an average cluster CPU utilization of 54% and an average NetOut utilization of about 52%, i.e. 522 MBit/s per node, for a No-VM run.

             Mean        Standard Deviation   Resource Usage (%)
NetResults   666129 MB   1393 MB              52
CPUResults   344.25      4.1                  54

Table 6.4.: Mean and standard deviation of performance metrics and resulting cluster resource usage for No-VM runs

Since CPUResults and NetResults for a single No-VM run are aggregated values over all nodes in the cluster, a normal distribution for these values can be assumed according to the central limit theorem. The specific normal distributions for CPUResults and NetResults represent the performance of a non-interfered MainApp. To evaluate the interference caused by SecApps, the respective values gathered for With-VM runs need to be tested against these distributions. This test is done in two different ways: First, it is evaluated whether there is a statistically significant deviation for With-VM runs with a specific policy configuration. Second, it is evaluated whether there is a dependency between the amount of interference and the policy parameter settings. A null hypothesis is formulated to test for statistically significant differences between the performance metric values

24Constant with a certain standard deviation when considering multiple runs. This deviation is provided later on in Table 6.4. 25The virtualization product used in all experiments (VirtualBox 3.14) showed a practical limitation of about 250MBit/s throughput per ethernet interface. This was the network throughput created by a single VM in this experiment.

obtained for No-VM runs and With-VM runs:

H0 : µ0 = µ1
H1 : µ0 ≠ µ1 (6.7)

where H0 represents the assumption that no statistically significant difference for a metric (CPUResults, NetResults) exists between With-VM and No-VM runs for a specific policy configuration. H1 represents the opposite assumption. This hypothesis H0 now has to be tested for every policy configuration to find out whether there is a statistically significant interference with the MainApp. A significance level of 5% is chosen according to common scientific practice.

An important aspect needs to be discussed before proceeding: By testing for H0 a conclusion like "there is no interference for this policy configuration" cannot be drawn. By rejecting H0 it can only be stated that there is interference; it is not possible to prove that there is no interference for a policy configuration. This problem of hypothesis design is known [13, 34] and non-inferiority and equivalence trials were proposed to circumvent this issue. These, however, require an assumption on what is considered an acceptable difference of the metric means for the control groups. For the MainApp in this experiment no such assumption can be made. Therefore, the proposed hypothesis design does not provide statistical proof, but nevertheless a plausible hint on the existence of interference. It will not be shown that there is no interference for a policy configuration, but rather: Is there enough statistical evidence to assume that there is interference for a specific policy configuration? If so, then the existence of interference is shown. If not, then the hypothesis ("there is no interference") could not be rejected using this statistical test. Retaining H0 then translates to: The probability that the values found for With-VM runs could also have been obtained for No-VM runs is too high to reject H0.

For testing the significance a "two-sample unpooled t-test for unequal variances" is used. Tables 6.5a and 6.5b show the results (t-values) of these significance tests for CPUResults and NetResults respectively. The values highlighted in grey represent a retained H0; the other values stand for a rejected H0, i.e. an accepted H1.

(a) t-Values for CPUResults

                Timelimit (s)
Threshold (%)    8      15     25
75              47.28  20.9    9.28
80               4.4    0.74   0.2
85               1.8    0.1    0.01
90               0.01   0.01   0.01
95               0.01   0.01   0.01

(b) t-Values for NetResults

                Timelimit (s)
Threshold (%)    8      15     25
75              46.78  94.06  20.36
80               2.52  21.62  17.1
85               2.08   5.62   2.56
90               1.94   3.14   3.06
95               1.24   1.42   1.4

Table 6.5.: Two-sided t-test values for different policy configurations and performance metrics.
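The statistic behind these t-values is the two-sample unpooled (Welch's) t-test; a minimal sketch with illustrative sample data (the actual run data is not reproduced here):

```python
# Sketch of the "two-sample unpooled t-test for unequal variances"
# (Welch's t-test): the variances of the two samples are not pooled.
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """t-statistic for H0: mu_a = mu_b with unpooled sample variances."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)   # sample variances
    return (mean(sample_a) - mean(sample_b)) / sqrt(va / na + vb / nb)
```

The resulting t-value is then compared against the critical value of the t-distribution at the 5% significance level (with Welch-Satterthwaite degrees of freedom) to decide whether H0 is rejected.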

For CPUResults given in Table 6.5a this means: For every policy configuration PCPU except (75,8), (75,15) and (75,25) interference is shown. For (75,8), (75,15) and (75,25) the existence of interference could not be proven. For NetResults given in Table 6.5b this means: For every policy configuration PNetOut except (75,8), (75,15), (75,25), (80,15), (80,25) and (85,15) interference is shown. For (75,8), (75,15), (75,25), (80,15), (80,25) and (85,15) the existence of interference could not be proven.


This leads to the interpretation: There are policy configurations for which the MainApp is interfered with. There are other policy configurations (threshold of 75% for both PCPU and PNetOut) where no interference could be found. The latter can be regarded, despite the hypothesis design aspect discussed above, as a strong hint that there are policy configurations ((75,8), (75,15), (75,25)) which do not lead to interference. The values for which H0 could not be rejected are those where fewer resource shares were given to SecApps. This observation raises the question of whether there is a dependency between the policy parameter settings and the amount of interference caused. The values for the amount of interference are given in Table 6.6 for the metrics CPUResults and NetResults. These values show that for any given timelimit, the amount of interference strictly increases with the policy threshold.26 This monotonic relation applies to both CPUResults and NetResults. So, choosing a lower threshold for a policy will decrease the amount of interference. This makes the threshold parameter an appropriate configuration item to control interference. It is clear that choosing a too low threshold might lead to a situation

(a) CPUResults

                Timelimit (s)
Threshold (%)    8      15     25
75               2.21   4.46   6.75
80               8.33  14.71  18.67
85              21.5   24.38  41.25
90              34.25  42.71  43.88
95              48.21  50.08  51.59

(b) NetResults (in MB)

                Timelimit (s)
Threshold (%)    8      15     25
75               1060     27   1701
80               5205   2136   2257
85               9172   4600   6209
90               9901   8180   9508
95              14411  12425  14125

Table 6.6.: Amount of interference for different policy configurations.

where no VMs are run at all. So one could suspect that for the policy configurations where no interference was observed ((75,8), (75,15), (75,25)) this could have happened. This, however, is not the case. In Figure 6.21 it can be seen that additional CPU utilization (7%) and SecApp results (24) were generated for the policy configuration (75,15). But this figure also clearly shows that using a higher threshold enhances the benefits of the VM-Scheduler, i.e. additional cluster utilization and generated SecApp results, considerably. When using the VM-Scheduler, there is a trade-off to be made between the amount of interference tolerated and the benefit expected from VM-Scheduling. This trade-off is realized by choosing appropriate policy configurations and should be geared towards avoiding strong interference, i.e. keeping a MainApp functional according to pre-defined Service-Level Target (SLT) metrics. For the timelimit parameter, such a clear conclusion ("interference and VM benefit scale with threshold") cannot be drawn. As shown in Table 6.6, for the metric CPUResults the amount of interference strictly increases with the chosen timelimit. For NetResults, however, this statement does not hold. For timelimits of 15 and 25 seconds the amount of interference also increases with the timelimit. The timelimit of 8 seconds stands out, however: except for the threshold of 75%, it generates the highest amount of interference. A possible explanation is that for low timelimits there are more manipulations than for higher timelimits due to the triggerSensitivity. Manipulations like migration and suspend, but also successive starts and resumes, carry considerable overhead in terms of network usage, which may be the cause for the high amount of interference for the timelimit of 8 seconds.

26A statistical test for significance or even curve fitting is omitted at this place. The observations are used only to demon- strate some basic insight into the effect of parameters.


Figure 6.21.: Cluster CPU exploited by VMs and accordingly computed SecApp results for different policy thresholds.

Three conclusions can be drawn from the observations concerning the timelimit parameter:

• The impact of network manipulations has been underestimated and should be re-weighted in the Feasibility Estimator (FE). This should result in preferring stops over, e.g., suspends of VMs more often.

• For the experiment layout, this means that more timelimit configurations (e.g. 10, 20, 40 seconds) should have been tested to provide more factual data to investigate. Identical configurations for PCPU and PNetOut had been chosen in the current experiment layout. Differing configurations for these (e.g. a lower threshold for PNetOut than for PCPU ) could yield interesting results.

• The amount of interference increases with the timelimit for the CPUResults metric. For the metric NetResults, choosing a lower timelimit (15 seconds) yields a lower amount of interference than a higher timelimit (25 seconds). However, choosing a too low timelimit (8 seconds) comes with a considerable penalty in the amount of interference concerning NetResults.

The effect of the timelimit parameter towards the benefit of the VM-Scheduler, i.e. computed SecApp results and cluster utilization, is exemplified in Figure 6.22, which depicts these values retrieved for a threshold of 85%. The number of SecApp results increases with rising timelimits. The same relation holds for the cluster utilization. Regarding the feasibility of the timelimit parameter as a configuration item to make a trade-off between interference and VM-Scheduler benefit, the following can be said: The higher the timelimit, the more SecApp results and additional cluster utilization is generated. For CPU-bound tasks a higher timelimit will also lead to a higher amount of interference. For network-bound tasks such a relation could not be shown. So, for CPU-bound MainApps the timelimit parameter can be used to make a trade-off between amount of interference and benefit of the VM-Scheduler such that no strong interference occurs. For network-bound tasks the timelimit parameter is not an appropriate configuration item for making this trade-off.


Figure 6.22.: Cluster CPU exploited by VMs and accordingly computed SecApp results for different policy timelimits.

6.5.3. Summary

It could be shown that there are policy configurations for which no (statistically significant) interference can be observed and which increase the cluster utilization and lead to the generation of SecApp results. The effect of the policy parameters timelimit and threshold on the amount of interference was examined. A monotonic relation between the threshold value and the amount of interference was observed. This makes the policy threshold an appropriate configuration item to control interference and avoid strong interference for a MainApp. Results for the timelimit parameter are inconclusive. Concerning the amount of interference generated for a CPU-bound MainApp, there is a monotonic relation between timelimit and amount of interference. No such relation was found for the interference with a network-bound application. Based on these findings, it can be stated that interference can, in a controlled manner, be modified by the threshold parameter. The timelimit parameter can be used to control interference if the MainApp is not network-bound. More experiments and presumably a better feasibility estimation are needed to assess the effect of the timelimit parameter on the interference with a network-bound MainApp.

6.6. HLT-Chain Experiment

In the previous experiments the chosen approach was evaluated for different host load patterns (see Section 6.4) and a specific, yet constructed MainApp (see Section 6.5). Such in-situ experiments, however well planned, can provide arguments but no certainty. The VM-Scheduler therefore has to be tested against a MainApp whose correctness of functionality is specified using a performance metric. Only such a specification allows to make a statement on whether there are benefits in using the VM-Scheduler with a specific MainApp while at the same time preventing strong interference. In this experiment, the VM-Scheduler is evaluated against the HLT-Chain application.


The goal of this experiment is to show that:

1. SecApps can run in parallel to the HLT-Chain without causing strong interference.
2. The average cluster CPU usage thereby is improved.
3. Additional results can be computed by SecApps run inside of VMs.

In a first sub-experiment both the HLT-Chain and the VM-Scheduler are running in parallel for a certain time and appropriate metrics are evaluated with respect to the stated goals. A second sub-experiment is made to assess whether a more demanding scenario with sudden cluster utilization changes can also be coped with by the VM-Scheduler. A typical case is the starting and stopping of the HLT-Chain, something that happens multiple times a day at A Large Ion Collider Experiment (ALICE) and should not be affected by running SecApps. Another typical scenario is the starting and stopping of the VM-Scheduler at runtime of the HLT-Chain. To evaluate the impact of such cluster-wide changes, a sequence of starting and stopping both the VM-Scheduler and the HLT-Chain is tested in the second sub-experiment.

6.6.1. HLT-Chain Application

A reminder about the properties of the HLT-Chain is needed before taking a look at the sub-experiment layout. The HLT-Chain application is a soft real-time application used for processing physics data gathered at the ALICE experiment at CERN. The HLT-Chain has a distributed, hierarchical structure. The structure is made up of components that are placed on physical nodes and communicate with each other. Data is inserted at components representing the leaves of the hierarchy and is subsequently processed and propagated throughout the hierarchy until fully processed and stored by Data Acquisition (DAQ). The hierarchy and several processing paths in distinct colors are shown in Figure 6.23.

Figure 6.23.: Dataflow for the HLT-Chain application. Data arrives at leaf nodes, is processed towards the root node along (here: colored) data paths and is finally stored by Data Acquisition (DAQ).


According to the application’s experts, the proper functioning of the HLT-Chain can be assessed by evaluating the number of events waiting to be processed during an HLT-Chain run. "Event" in this context is the basic unit of data to be processed by the HLT-Chain MainApp. The metric to be analyzed therefore is the queue-length of waiting events. The application status is ok unless the queue-length reaches 3000 at any point in time during an HLT-Chain run. The HLT-Chain is not tested in its native environment, the HLT-Cluster of the ALICE experiment. When the ALICE experiment entered production state in November 2010, rules became active that disallow software testing and evaluation in the HLT-Cluster. It is the recommended practice for software developers to test HLT-Chain components not in the HLT-Cluster, but in a test cluster. A software package was provided for eased installation of the HLT-Chain in test clusters. Additionally, a software module called "file-publisher" allows to replay recorded collision data with the original data rate. The file-publisher is the de-facto standard for testing the HLT-Chain. This guideline, using the file-publisher in a test cluster, was followed to evaluate the framework for this thesis.
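The "queue-length reaches 3000" criterion can be expressed as a small sketch; the function and field names are illustrative, only the criterion itself is taken from the text:

```python
# Illustrative sketch: summarize a run from its per-second queue-length
# samples and check the strong-interference criterion (queue-length < 3000
# at every point in time).
from statistics import mean, pvariance

def evaluate_run(queue_lengths, limit=3000):
    """Return summary statistics and the pass/fail status of a run."""
    return {
        "mean": mean(queue_lengths),
        "variance": pvariance(queue_lengths),
        "ok": max(queue_lengths) < limit,
    }
```

A run is classified as free of strong interference as long as no single sample reaches the limit, regardless of the mean and variance (which are reported separately in the results below).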

6.6.2. Experiment Layout

For this experiment 25 nodes are used from the environment27 presented in Section 6.1. Each node hosts a number of HLT-Chain components dedicated to specific physics computations like cluster finding, tracklet finding, tracking and merging. The HLT-Chain in this experiment is configured to process 100 events per second. This means that short-term variations (on the scale of seconds) of this frequency can occur, but in the long run (on the scale of minutes) the file-publisher seeks to adjust the input data rate such that for a 30min run an average frequency of 100Hz for incoming events is achieved. The precise configuration of the HLT-Chain and the allocation of its components to physical nodes is given in Section A.1. Every node running the HLT-Chain application is also a target node for the VM-Scheduler. This is realized by associating each node with a static policy of PAllocVM : (2, 10) or PAllocVM : (0, 10) respectively, depending on whether running SecApps in an experiment run is desired or not. The leases, VMs and SecApp used in this experiment are identical to the ones used in the "Policy Modification Experiment" (see Section 6.3.1). In the first sub-experiment two different setups are compared. The HLT-Chain is run for 30 minutes without SecApps ("No-VM run") in the first setup. In the second setup, the HLT-Chain is run again for 30 minutes, this time with additionally run SecApps ("With-VM run"). At t0 = 0s the VM-Scheduler is given access to the physical nodes by switching from PAllocVM : (0, 10) to PAllocVM : (2, 10), which causes leases to be provisioned. Policies PCPU : (95, 15), PNetOut : (75, 15) and PNetIn : (75, 15) are associated with every node. The metrics measured are the CPU utilization per node, the MainApp performance metric queue-length and the number of computed SecApp results. In the second sub-experiment, the HLT-Chain and the VM-Scheduler are alternatingly started and stopped.
A run consists of a sequence of actions ∈ {StartChain, StartVM, StopChain, StopVM}. The "StartChain" and "StopChain" actions are realized by using the HLT-Chain command-line interface. The "StartVM" and "StopVM" actions are realized by switching from PAllocVM : (0, 10) to PAllocVM : (2, 10) and vice versa for all nodes of the cluster. A run takes about 40 minutes. The metrics measured are the CPU utilization per node and the MainApp performance metric queue-length. The number of SecApp results is not recorded because the repeated starting and stopping of the VM-Scheduler does not allow a reasonable interpretation of this metric.
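The per-node policy configuration described above can be made concrete with a small sketch. The parameter tuples are taken from the text; the dictionary layout, the helper function and the interpretation of PAllocVM's first parameter as "number of allocatable VMs" are illustrative assumptions, not the thesis implementation:

```python
# Sketch of the per-node policy configuration used in the HLT-Chain experiment.
# Parameter tuples as given in the text; layout and helper are illustrative.
node_policies = {
    "PAllocVM": (2, 10),   # switched to (0, 10) when no SecApps shall run
    "PCPU":     (95, 15),  # (threshold in %, timelimit in s)
    "PNetOut":  (75, 15),
    "PNetIn":   (75, 15),
}

def secapps_enabled(policies: dict) -> bool:
    """Switching PAllocVM between (0, 10) and (2, 10) grants or revokes the
    VM-Scheduler's access to a node, as described above (assumed semantics)."""
    return policies["PAllocVM"][0] > 0
```

Flipping the PAllocVM tuple of every node is then exactly the "StartVM"/"StopVM" action of the second sub-experiment.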

27An identically configured physical node was added because of HLT-Chain requirements, thus having 25 nodes in this experiment.


6.6.3. Results

In the first sub-experiment, the queue-length averaged over an experiment run was measured.28 For the first setup (No-VM run), the average queue-length was µ = 8.93 with a variance of σ2 = 9.09. The queue-length did not reach 3000 for any single measurement, so the HLT-Chain performed according to its specification.29 For the second setup (With-VM run), the average queue-length was µ = 10.72 with a variance of σ2 = 28.71. This difference, especially the increased variance, suggests that SecApps have an influence on the MainApp in the With-VM run. The important question, however, is whether this is relevant, i.e. whether there is strong interference: Did additionally running SecApps render the application's performance unacceptable? Figure 6.24 shows the queue-length for the duration of the With-VM run. The highest value observed for the queue-length was about 350. Therefore, no strong interference occurred according to the criterion "queue-length < 3000".

Figure 6.24.: Queue-length of HLT-Chain in 30min With-VM run. Displayed resolution is one queue-length value per second.

Figure 6.25a depicts the change in cluster CPU utilization between the two setups. The average utilization rises from 48.69% in the No-VM run to 79.27% in the With-VM run, effectively improving the total cluster load by about 63%. Figure 6.25b shows two heatmaps for the CPU usage on single nodes averaged over the experiment duration, thereby detailing the change in cluster usage between the two setups. Each box in these figures represents a physical node, and its coloring shows the average node CPU utilization over an experiment run. The upper heatmap shows the first setup, where no SecApps were run. It can be seen that the CPU usage is not spread evenly over all nodes: some nodes have a higher CPU usage and others a lower one. This is due to the HLT-Chain's component topology. The lower heatmap in Figure 6.25b represents the average node CPU usage for the second setup (With-VM run). Not only could a higher load be obtained for every single node and the cluster in total, but the differences in CPU utilization between nodes visible in the upper heatmap could also be levelled.

Figure 6.25.: Average CPU usage in 30min HLT-Chain run with and without additional VMs. (a) Cluster CPU usage. (b) Node CPU usage.

The additional cluster usage generated by running SecApps allowed an additional 158 results to be computed by SecApps. With (2.5 min × 1 VM) needed for a single result, this corresponds to the number of results computable by 14 VMs running at full scale for 30 minutes. Assuming that the full cluster could have been used by VMs, this is an efficiency of 14/50 = 28%.30 Taking into account that the HLT-Chain utilized about 50% of the total available CPU resources, the efficiency of SecApp result computation31 even rises to about 14/25 = 56%. The behavior of the VM-Scheduler in this experiment, i.e. the distribution of VM manipulations over the experiment runtime, is shown in Figure 6.26 using aggregations for time bins of 25 seconds. The figure shows that the largest number of manipulations was executed in the first 300 seconds of the experiment, declining to zero after 400 seconds. This set of manipulations consists of starts (34) and very few migrations (3). In this experiment there was thus a certain lead-in phase in which the scheduler tried to find suitable allocations for VMs and provisioned the respective leases. After this lead-in phase, the local resource usage adaption was sufficient to guarantee policy compliance and no further manipulations were needed.

Figure 6.26.: Number of VM manipulations distributed over 30min HLT-Chain run, accumulated per 25s time bin.

28The measurement interval was ∆t = 1s. The queue-length is treated in a unitless form.
29Multiple runs were carried out with this very configuration beforehand to test the stability of the HLT-Chain (see Section A.1). With an inter-run variance of σ2 = 5.47 for five runs and all runs performing according to specification, the HLT-Chain was considered to run stably.
30A VM uses one CPU core. With 25 nodes having two CPU cores each, there are 50 CPU cores in total.
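The efficiency figures above follow from a short calculation; all numbers are taken from the text, the sketch merely reproduces the arithmetic:

```python
# Reproducing the SecApp efficiency arithmetic from the text (illustrative sketch).
results = 158             # SecApp results computed in the With-VM run
minutes_per_result = 2.5  # (2.5 min x 1 VM) per single result
run_minutes = 30
cores = 25 * 2            # 25 nodes with two CPU cores each; one VM uses one core

results_per_full_vm = run_minutes / minutes_per_result  # 12 results per VM at full scale
vm_equivalents = results / results_per_full_vm          # ~13.2, reported as 14 VMs
print(round(vm_equivalents, 1))
print(f"{14 / cores:.0%}")        # efficiency w.r.t. the whole cluster
print(f"{14 / (cores / 2):.0%}")  # w.r.t. the ~50% of the CPUs left free by the HLT-Chain
```

The 158 results thus correspond to roughly 13.2 full-scale VM equivalents, rounded to 14 in the text, giving 14/50 = 28% and 14/25 = 56% respectively.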

In the first sub-experiment it was shown that there is a policy configuration which enables the running of SecApps without causing strong interference to the HLT-Chain, while increasing the cluster CPU usage and allowing the computation of additional SecApp results. The second sub-experiment now evaluates the VM-Scheduler in a more demanding scenario, with a sequence of actions ∈ {StartChain, StartVM, StopChain, StopVM} applied in an experiment run taking 40 min. The sequence of actions is given in Table 6.7. Figure 6.27 shows the development of the queue-length metric for this sub-experiment. The action sequence is also shown in this figure, together with the points in time when the actions were applied. It can be seen that the queue-length metric never exceeds 400. According to the criterion mentioned before ("queue-length < 3000"), there is therefore no strong interference towards the HLT-Chain.

Figure 6.27.: Queue-length of HLT-Chain in 40min experiment with a sequence of actions applied (see Table 6.7). Displayed time resolution is one queue-length value per second.

Action   StartChain  StartVM  StopChain  StartChain  StopVM  StopChain
Time(s)          75      310       1000        1400    2000       2250

Table 6.7.: Sequence of actions for 40min HLT-Chain experiment.

31This efficiency does not take into account the buffer defined by the threshold setting. If the efficiency is calculated according to the metrics defined in Section 6.4.1, then ResUsage = 66% and VmConv = 42%. However, these efficiencies do not consider the additional restrictions set up by PNetOut and PNetIn.

Figure 6.28a shows the development of the cluster CPU usage over the duration of the sub-experiment. The usage rises from close to zero to about 45% at the start of the HLT-Chain at t = 75s.32 The additional start of SecApps beginning at t = 310s improves the

Figure 6.28.: (a) Cluster CPU usage with the sequence of actions (one value/0.5s) and (b) VM manipulations (accumulated per 10s time bin) in 40min HLT-Chain experiment.

32The starting of the HLT-Chain consists of multiple successive phases. The starting time of the final phase is used as the reference point "StartChain".


CPU usage to about 81% until the HLT-Chain is stopped at t = 1000s. At this point, a fast drop of CPU usage to about 40% occurs. This comes as no surprise, since the CPU usage generated by the HLT-Chain vanishes. Figure 6.28b shows the number and type of the corresponding VM manipulations. When the chain is stopped at t = 1000s, several suspends (19) occur. This is presumably due to the fact that the stopping of the HLT-Chain itself produces an additional short-term spike in CPU or network usage, which causes policy violations and resulting manipulations. The VM-Scheduler compensates for the CPU usage drop by provisioning queued leases, i.e. starting and resuming VMs. This can be seen in Figure 6.28b, which details the manipulations at that point in time (t > 1000s), and by looking at the cluster usage in Figure 6.28a: The usage rises up to about 70% again, this time only by running SecApps while the HLT-Chain is still stopped. This shows that the framework achieves a high cluster CPU usage whether the HLT-Chain is running or not and automatically adapts to changes in the HLT-Chain run state. At t = 1400s the HLT-Chain is started again. The Scheduler reacts with a greater number of suspends (17) to free resources for the HLT-Chain. The queue-length does not reach the critical watermark of 3000 during that phase, so the starting of the HLT-Chain is not interfered with by the manipulations executed by the VM-Scheduler. After the HLT-Chain has been started, the suspended VMs remain in that state; only two VMs are started. This is because the average CPU utilization is already at a quite high level (> 70%), leaving few resources for additional VMs. The average CPU utilization then remains at about 75% until VM-Scheduling is stopped at t = 2000s by switching PAllocVM : (2, 10) to PAllocVM : (0, 10) for every node, which causes all VMs to be stopped and the CPU usage to decline. Again, no harmful effects on the performance metric queue-length could be found.

In this second sub-experiment it could be seen that even a demanding scenario like the stopping and starting of the VM-Scheduler does not produce relevant, i.e. strong, interference to the HLT-Chain. Also, the stopping and starting of the HLT-Chain could be handled gracefully by the VM-Scheduler without causing strong interference.

6.6.4. Summary

In the sub-experiments it was shown that, using the implemented approach, SecApps can be run in addition to the HLT-Chain application without causing strong interference to it. By running SecApps in the 30-minute first sub-experiment, the cluster CPU utilization was raised from 49% to 79% and 158 additional SecApp results were computed. This corresponds to about 56% of the optimally computable SecApp results, had all unused CPU resources been translated to SecApp results. The second sub-experiment showed the usability of the VM-Scheduler in demanding scenarios like HLT-Chain start-up. First, a running VM-Scheduler could adapt to sudden changes in cluster resource usage without strongly interfering with a starting or stopping HLT-Chain. Second, the VM-Scheduler was able to achieve a high cluster CPU utilization (> 70%) independent of the run state of the MainApp. Third, a running HLT-Chain was not strongly interfered with by the starting or stopping of the VM-Scheduler.


6.7. Summary of Empirical Results

Four experiments have been conducted. In the "Policy Modification Experiment" discussed in Section 6.3, a user interface to the VM-Scheduling Framework was used to modify active policy configurations at runtime. Using a sequence of such modifications, the amount of cluster resources available to SecApps was repeatedly changed and the framework needed to adapt the VM allocations accordingly. It was shown that the VM-Scheduling Framework copes with such changes, i.e. newly available resources are used for provisioning VMs and running VMs are manipulated on time if needed to preserve policy compliance. It was also shown that the effect of these modifications on the number of computed SecApp results depends on the frequency of modifications and the timelimit of the modified policies. In the "Efficiency Experiment" of Section 6.4, the effect of the policy parameters timelimit and threshold on SecApp result computation and additionally exploited cluster CPU resources was examined. It could be shown that the greater the policy threshold and timelimit, the more SecApp results can be computed and the more cluster CPU utilization can be generated. Measurements indicate that this observation holds for arbitrary existing MainApp CPU utilization patterns. In the "Interference Experiment" presented in Section 6.5, the effect of the policy parameters timelimit and threshold on the amount of interference towards a MainApp was examined. It could be shown that the amount of interference rises with the applied policy threshold. Concerning the policy timelimit, it was shown that the amount of interference towards a CPU-bound MainApp rises with the applied timelimit parameter. For a network-bound MainApp such a relation could not be found. The aforementioned experiments 6.3, 6.4 and 6.5 indicate that policies are appropriate configuration items to control the amount of interference towards a MainApp, the additionally generated cluster CPU utilization and the number of computed SecApp results.
In the "HLT-Chain Experiment" discussed in Section 6.6 it was demonstrated that a policy configuration exists that allows running the VM-Scheduling Framework in a cluster environment with a time-critical MainApp, the HLT-Chain application. It was shown that no strong interference occurred, the cluster CPU utilization could be increased from 49% to 79% and 158 additional SecApp results could be computed. This number of SecApp results corresponds to 56% of the optimally computable SecApp results. The ability of the framework to cope with the demanding scenario of starting and stopping the HLT-Chain application without causing strong interference was demonstrated as well.

7. Summary and Conclusion

In this chapter the thesis approach is summed up and a conclusion is given. As a reminder, the goal was to develop the concept and implementation of a software framework ("VM-Scheduler") for running additional applications ("SecApps") in a dedicated cluster with a time-critical application ("MainApp"). This goal was constrained by: a) The MainApp must not be affected in its proper functionality, b) this has to be achieved without controlling the MainApp and c) cluster usage has to be increased and additional computational results have to be generated. An examination of current state-of-the-art research showed that both the increase of resource usage / result output and the achievement of performance goals for applications are topics covered by researchers and developers. However, no existing approach satisfies the goal stated for this thesis: Researchers either assume controllability of the MainApp, provide insufficient isolation between MainApp and SecApps or lack flexibility in the methods applicable for maintaining the proper functionality of the MainApp. Based on these findings, the concept and prototypical implementation of a VM-Scheduler have been developed. The prototypical implementation was evaluated empirically and found to be appropriate for achieving the thesis goal.

In the first section of this chapter the thesis work is summed up. In the second part a conclusion is drawn and the contribution of this thesis is given.

7.1. Summary

The challenge of this thesis is that running SecApps in a dedicated cluster can affect the proper functionality of the MainApp. According to the given goal this needs to be prevented. As a first step, it was necessary to define what "affecting the proper functionality of a MainApp" means. A statement on whether a MainApp is affected by SecApps requires a predefined performance metric (e.g. throughput) for the MainApp. Based on this performance metric, the terms "interference" and "amount of interference" were introduced. These allow for an assessment of the effect that SecApps have on the performance of a MainApp. It was argued that not every statistically significant effect (i.e. interference) is relevant. A statement on relevance can only be made if a target value or range exists for the performance metric of a MainApp, which serves as a criterion for the proper functionality of the MainApp. Using this criterion, the term "strong interference" was introduced to capture the notion of "SecApps affect the proper functionality of a MainApp".

Based on these definitions, it was examined how SecApps can be given access to cluster resources such that no strong interference occurs while cluster usage is increased and additional SecApp results can be computed. It was argued that potential reasons for interference have to be found and counteracted. The reasons found are: a) software incompatibility, b) security holes, c) software bugs and d) competition for sparse resources between SecApps and the MainApp. By using Virtual Machines, a specific environment is provided for every single SecApp. Thereby any negative effects arising from software incompatibilities are prevented and the scope of runnable SecApps is broadened. VMs also efficiently isolate SecApps from the MainApp, thereby avoiding the effects

of potential security issues and software bugs. Regarding the fourth reason (resource competition), it was argued that a decision instance is required to evaluate the usage of resources at runtime, to recognize potential resource competition and to decide accordingly on the amount of resources given to VMs hosting SecApps. To enable a decision instance to prevent or limit the duration of resource competition, explicit constraints on the resource usage of SecApps ("policies") were introduced. A policy defines a maximum time span ("timelimit") for which SecApps are allowed to use a resource if the total usage of that resource exceeds a certain value ("threshold"). The basic idea is that policies are configurable ("configuration items") and that the amount of interference (i.e. the effect on the MainApp) scales with the settings of the policy parameters threshold and timelimit.
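The threshold/timelimit semantics can be made concrete with a small sketch; the class and method names here are illustrative and not taken from the thesis implementation:

```python
from dataclasses import dataclass

@dataclass
class Policy:
    """One resource-usage policy: SecApps may drive the total usage of
    `resource` above `threshold` for at most `timelimit` seconds."""
    resource: str      # e.g. "CPU", "NetOut"
    threshold: float   # total usage (in %) above which the timelimit clock runs
    timelimit: float   # maximum allowed excess duration in seconds

class PolicyMonitor:
    """Tracks how long the total usage has exceeded the threshold and
    reports a violation once the timelimit is passed (illustrative sketch)."""
    def __init__(self, policy: Policy):
        self.policy = policy
        self.exceeded_since = None  # time when usage first went above the threshold

    def update(self, total_usage: float, now: float) -> bool:
        """Feed one measurement; return True if the policy is violated."""
        if total_usage <= self.policy.threshold:
            self.exceeded_since = None  # back below the threshold: reset the clock
            return False
        if self.exceeded_since is None:
            self.exceeded_since = now
        return (now - self.exceeded_since) > self.policy.timelimit
```

A PCPU : (95, 15) policy then corresponds to `Policy("CPU", 95, 15)`: a violation is flagged only after the total CPU usage has stayed above 95% for more than 15 seconds.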

Based on a set of configured policies, a decision instance decides when and where to run SecApps in the cluster. If necessary, the decision instance also modifies the resource usage of running SecApps in order to satisfy the policy constraints. A single SecApp is represented by an abstraction called lease. A lease consists of at least one VM hosting the application logic of the SecApp. The lease abstraction and the VM encapsulation of SecApps provide standardized methods for the decision instance to manipulate arbitrary SecApps and to modify their resource usage. The decision instance has a two-layered architecture and comprises four main functional units. One layer is the Global Scheduler, which resides on a global management node and hosts the functional units Global Provisioner (GP) and Global Reconfigurator (GR). The second layer is the Local Controller, which runs on every worker node1 and hosts the functional units Resource Usage Adaptor (RUA) and Local Trigger Unit.

The GP takes care of leases that are not running2 and makes scheduling decisions on when and where to provision a lease by starting or resuming its VMs. It uses a non-preemptive, run-to-completion approach and employs a priority-scheduling algorithm with backfill. The other functional unit of the global layer, the GR, was developed as a runtime scheduler that decides how and when to manipulate a running lease3 by suspending, stopping or migrating its VMs. The GR makes these decisions based on an informed depth-first search heuristic with backtracking. The purpose of the GR is to modify the resource usage of running leases if needed to satisfy the policy constraints. It uses the method of manipulating VM states. Since such manipulations carry considerable overhead concerning resource usage and duration, the GR is complemented in its purpose by the functional units of the local layer. The RUA is part of the Local Controller and continuously modifies the share of resources like CPU or NetOut usable by a locally running VM. The adaption is realized by a closed-loop integral controller per single resource. The RUA provides a fine-grained method with minimal overhead to modify the resource usage of leases; it acts locally on a node and independently of the GR. Thereby it decreases the relative number of cases in which the GR is required to initiate VM manipulations to satisfy policy constraints. The fourth functional unit is the Local Trigger Unit, which also is part of the Local Controller. It is an actor of last resort that forcefully restores policy compliance by stopping and aborting VMs whenever compliance cannot be maintained by the GR and RUA.
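The RUA's per-resource control loop could, in a much simplified form, be sketched as follows; the gain and cap limits are invented for illustration and are not the thesis parameters:

```python
class IntegralCapController:
    """Closed-loop integral controller adapting the resource cap of a local VM
    (sketch of the RUA idea for one resource, e.g. the CPU share in %)."""

    def __init__(self, target_total: float, gain: float = 0.5,
                 cap_min: float = 5.0, cap_max: float = 100.0):
        self.target_total = target_total  # desired total node usage, e.g. the policy threshold
        self.gain = gain                  # integral gain (illustrative value)
        self.cap_min, self.cap_max = cap_min, cap_max
        self.cap = cap_max                # cap currently granted to the VM

    def step(self, measured_total: float) -> float:
        """One control step: accumulate the error into the cap, shrinking it
        while the node is over target and growing it when there is headroom."""
        error = self.target_total - measured_total  # negative when over target
        self.cap = min(self.cap_max, max(self.cap_min, self.cap + self.gain * error))
        return self.cap
```

In each control interval the measured total usage of the resource would be fed in and the returned cap applied to the VM through the OS interface.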

This conceptual proposal was implemented in a platform-independent manner using the Python programming language. The only platform-specific code is that which is responsible for interfacing the virtualization platform, capping local VM resource usage and querying resource statistics. However,

1A worker node is a physical node in the cluster that is intended to run SecApps. 2According to the terminology used in this thesis, these leases are in queued state and wait for access to computational resources. 3According to the terminology used in this thesis, these leases are in provisioned state and are currently using cluster resources.

due to the object-oriented software design these functionalities are encapsulated in classes with clearly defined interfaces and can be replaced easily.

Several experiments were carried out to test the implemented VM-Scheduler. In two experiments, the effects of the policy parameter settings for timelimit and threshold on the SecApp result output, the additional cluster CPU usage and the amount of interference caused towards a prototypical MainApp were evaluated. The results indicate that all three target metrics scale monotonically increasing with the threshold parameter setting. The results for the timelimit parameter were inconclusive. While the additional cluster CPU usage and the SecApp result output also scale monotonically increasing with the timelimit setting, this does not hold for the amount of interference: The monotonically increasing relation only holds if a CPU-bound MainApp is present. If a network-bound MainApp runs in the cluster, then a lower timelimit setting can also lead to a higher amount of interference. It was shown that the amount of interference can be controlled by configuring the threshold parameter of policies. This means that an upper limit can be set for the effect that running SecApps have on a MainApp. Policies therefore are appropriate configuration items to customize the VM-Scheduler for a cluster and MainApp environment and to specify a trade-off between the tolerated amount of interference and the benefit gained from SecApps. The conceptual approach of adapting the resource usage of SecApps at runtime in accordance with configured policies is thereby shown to be feasible for avoiding strong interference. While these experiments examined the general idea and the adaptability of the VM-Scheduler to arbitrary environments, the appropriateness of the thesis approach was clearly demonstrated for the HLT-Chain application in the KIP cluster.
In this experiment, the VM-Scheduler increased the cluster CPU utilization from 49% to 79%, the targeted data rate of 100 Hz for the HLT-Chain was achieved, hence no strong interference occurred, and additional SecApp results (code compilation) were computed. The theoretically exploitable CPU share of 100% − 49% = 51% was translated to SecApp results with an efficiency4 of 56%.

7.2. Conclusion

The thesis provides the concept and an implementation for a framework ("VM-Scheduler") that is able to run additional applications ("SecApps") in a dedicated cluster with a time-critical application ("MainApp"). The pivotal challenge is that of preventing the MainApp from being affected by running SecApps. The driving ideas to meet this goal are: a) the isolation of SecApps by running them in Virtual Machines and b) the dynamic resource allocation for VMs, i.e. runtime adaption of the resource share used by SecApps. This adaption is realized by using both globally initiated VM manipulations (migration, suspend/resume, start/stop) and decentralized, feedback-controller based capping of resource usage using OS interfaces. An informed depth-first search heuristic was developed to choose appropriate VM manipulations in a time-constrained manner. The proposed concept uses application-agnostic, OS-provided metrics only. No interface for manipulating a MainApp is needed. The implementation is platform-independent and only requires a cluster environment to provide a Python interpreter, TCP/IP network communication, a shared storage facility and a virtualization platform. It was shown empirically that the framework provides reasonably tunable configuration items ("policies") that allow a trade-off between the benefit gained from running SecApps and the tolerated amount of interference for many different MainApp scenarios. These arguments hint at the general applicability of the concept. Fulfilment of the thesis goal was directly demonstrated for a specific

4Efficiency is the ratio between actually computed results and optimally computable results if all the free CPU cycles were used.

time-critical application, the HLT-Chain application. This application processes experimental physics data under soft real-time conditions. Applying the VM-Scheduler to a cluster running this application made it possible to increase the cluster CPU utilization from 49% to 79%, to compute additional SecApp results and to maintain the proper functionality of the HLT-Chain application. Fulfilment of the three goal constraints is therefore considered achieved: First, it is possible to customize the framework such that the MainApp is not affected. Second, no control is exercised over the MainApp. Third, the cluster resource usage is increased and additional results are computed.

Contribution The main contribution of this thesis is that it allows unused CPU resources to be exploited and additional SecApp results to be generated in a dedicated cluster running the HLT-Chain application, without affecting the proper functionality of this application. The novel and distinctive property of the provided VM-Scheduler is that it assumes no runtime knowledge or active control of the time-critical application in order to ensure its proper functionality. The framework only monitors the amount of unused resources and dynamically modifies the resource share consumed by SecApps. A second aspect distinguishes this thesis: The VM-Scheduler uses multiple methods for the dynamic modification of the SecApp resource share. While single methods have been studied separately before, no existing solution provides the conceptual flexibility presented here, which combines the multiple methods of VM migration, suspend and stop as well as feedback-control based resource usage adaption.

8. Future Work

During the work for this thesis, two challenges were encountered that are not fully covered. The first one arises from a theoretical consideration: how to make sure that the framework does not exhibit self-sustaining behavior. The second one concerns runnable SecApps: The scope of SecApps was constrained to best-effort schedulable applications, and no empirical evaluation was made for VMs hosting tightly coupled parallel SecApps. These aspects are now discussed and suggestions are given on how to advance them in future work.

Self-Sustaining Behavior In this thesis it was mentioned that VM scheduling by the GR may cause self-sustaining behavior: The handling of policy violations causes manipulations, which in turn cause new policy violations. Basic heuristics were introduced to decrease the occurrence probability of such a phenomenon: a) handling VM resource usage locally by the RUA to decrease the number of VM manipulations, b) preferring manipulations that are unlikely to cause new policy violations, c) excluding VMs that are currently being manipulated and d) excluding nodes with ongoing manipulations for a specific time frame after a manipulation was issued.

An informal description of self-sustaining behavior was proposed: Given that the running applications (MainApp, SecApps) do not change in resource usage, the frequency of manipulations should tend to zero over time, i.e. VM manipulations have to be justified by changes in the resource usage of running applications. The framework exhibits self-sustaining behavior if this does not hold. As a first indication, in the HLT-Chain experiment (see Section 6.6.3) it could be seen that after a lead-in phase in which many manipulations occurred, the frequency of manipulations dropped to zero. This is a hint that the applied heuristics work, but cannot be regarded as a proof. In future work the qualitative description of self-sustaining behavior requires a more formal or quantitative definition. Based on such a definition, the behavior of the framework needs to be evaluated. Inspiration for a theoretical stance on this phenomenon may come from different research fields: Signal theory and control theory both use Bounded-Input Bounded-Output (BIBO) and Nyquist stability criteria for linear, time-invariant systems. For nonlinear systems the circle stability criterion is proposed [42]. In dynamical systems theory, which basically abstracts the aforementioned theories to nonlinear, time-variant systems described by differential equations with few parameters, the Lyapunov stability criterion is used for assessing the development of a system over time [35]. Complex systems theory extends dynamical systems theory by making statements about systems that have a very large number of parameters, e.g. interdependent or interacting entities in biological organisms, actors in an economy or solid state physics. This, for instance, leads to other criteria like criticality in cellular automata or chaotic behavior in random boolean networks [15, 50].
Game theory, used for describing the interaction between decision-making agents in economic models, proposes the Nash equilibrium as a stability criterion [69]. A cluster with network-connected nodes and running distributed applications can be understood as a system evolving over time. Whether a formalization of the behavior of the framework is sufficiently possible and whether the mentioned criteria meet the notion of self-sustaining behavior has to be evaluated. Apart from a theoretical approach to defining and examining the behavior of the framework, it might also be needed to improve the practical measures taken to counteract self-sustaining behavior. In the author's

view, the application of inhibitors, i.e. feedback control mechanisms on multiple layers (per VM, lease, node, sub-cluster, cluster), is a feasible way to control the number of manipulations, and thereby the occurrence probability of self-sustaining behavior. Further research work and empirical studies are required to examine this assumption.
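One crude way to make the informal criterion operational is to bin the manipulation timestamps over the observation horizon and require that the per-bin counts never increase; this is a hypothetical sketch, not a definition from the thesis:

```python
def manipulation_rate_decays(timestamps, horizon, bins=4):
    """Bin manipulation timestamps (seconds since run start) over the
    observation horizon and check that the per-bin counts are non-increasing,
    i.e. that the manipulation frequency tends towards zero (crude sketch)."""
    width = horizon / bins
    counts = [0] * bins
    for t in timestamps:
        if 0 <= t < horizon:
            counts[min(int(t // width), bins - 1)] += 1
    return all(earlier >= later for earlier, later in zip(counts, counts[1:]))
```

Applied to the first sub-experiment, a lead-in burst of manipulations followed by silence would pass this check, while a late burst of manipulations under constant application resource usage would flag potential self-sustaining behavior.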

SecApp Types In the scope discussion of this thesis (see Section 4.1) it was stated that only best-effort schedulable applications should be used as SecApps. Source code compilation was chosen as a prototypical representative for the empirical framework evaluation. This evaluation left open the question of how tightly coupled, parallel applications fare as SecApps. If no explicit deadlines or other performance constraints are given, then such applications, e.g. Message Passing Interface (MPI) based fluid dynamics simulations, can be scheduled in a best-effort manner. This case was not explicitly evaluated. In this thesis it was assumed that suspending a lease consisting of multiple VMs is possible for any distributed SecApp. So, in the current implementation no special focus was laid on the avoidance of synchronization issues. A lease is simply suspended by initiating the suspend manipulation for all of its VMs in parallel. However, for tightly coupled SecApps, like MPI applications, such an approach might cause an application to enter an erroneous state after resuming the lease. Researchers propose different methods [121, 2, 36, 78] for system-level checkpointing of parallel applications, trying to avoid synchronization issues. The current approach (parallel suspending) and the proposals made need to be evaluated, and the most reliable solution has to be incorporated into the framework. Besides evaluating the suitability of the framework for tightly coupled, parallel SecApps, the scope of runnable SecApps could also be broadened to SLT-constrained applications. A reason was given for not considering SLT-constrained SecApps: There is no guarantee that there will be enough available resources for a SecApp at a future point in time, because the MainApp resource usage is not sufficiently predictable. In order to extend the range of runnable SecApps, a number of physical nodes could be reserved for VM scheduling.
Such nodes are then completely under the control of the VM-Scheduling framework and can be used as emergency migration targets for VMs of SLT-bound or interactive SecApps if no other resources are available. Thus, guarantees could be given for a subset of SecApps, extending the scope of runnable SecApps. Another option is the analysis of resource usage patterns of the MainApp. Time series techniques [16] could be used to identify phases with a reliably predictable amount of unused resources, which in turn could be used for, e.g., running interactive applications. Such a paradigm shift would have to be accompanied by a modification of the scheduling algorithm used by GP. Currently, GP employs a basic priority-based scheduling algorithm with backfill, and priorities are calculated upon lease request submission. In order to satisfy deadlines, to avoid starvation and to enable fair-share scheduling for leases, dynamic priority calculation that takes into account past queueing and computation times should be considered for future framework releases.
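The parallel lease suspend discussed earlier in this section can be sketched as follows. This is a minimal illustration, not the framework's actual code; `suspend_vm` is a hypothetical stand-in for the product-specific suspend primitive (e.g. `VBoxManage controlvm <vm> savestate`):

```python
from concurrent.futures import ThreadPoolExecutor

def suspend_vm(vm_id):
    """Hypothetical placeholder for the product-specific suspend primitive."""
    return f"suspended {vm_id}"

def suspend_lease(vm_ids):
    """Suspend all VMs of a lease in parallel, as in the current implementation.
    Note: no synchronization between VMs is attempted, which may leave tightly
    coupled (e.g. MPI-based) SecApps in an inconsistent state after resume."""
    with ThreadPoolExecutor(max_workers=len(vm_ids)) as pool:
        return list(pool.map(suspend_vm, vm_ids))

print(suspend_lease(["vm01", "vm02", "vm03"]))
```

A coordinated, system-level checkpoint of the whole lease would replace the naive `pool.map` with a barrier-style protocol, which is exactly what the cited checkpointing methods provide.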

A. Appendix

Additional information on topics covered in this thesis is given in this chapter. First, the HLT-Chain application configuration used for the experiments (Section 6.6) is described. Second, the derivation of the formulas for the computational complexity of the GR algorithm is detailed. Third, the terms cloud computing and Everything as a Service (EaaS) stack, as well as its constitutive layers, are explained. Fourth, an introduction to platform virtualization is given and several products are mentioned. Fifth, general best practices are elaborated for customizing the VM-Scheduling Framework towards a cluster and its MainApp.

A.1. HLT-Chain Application Setup

The HLT-Chain application used for the "HLT-Chain Experiment" (Section 6.6) was configured to process a continuous replay of raw data that was recorded for a specific ALICE experiment run. The respective run has the ID 146616. Due to the limited availability of resources in the test cluster, only data for a specific sub-detector (Time Projection Chamber (TPC), 2 slices) was processed. The data was propagated to the HLT-Chain application using the file-publisher method with a configurable replay frequency. The replay frequency reflects the number of events to be processed per second. The replay was tested for the frequency range between 80 Hz and 120 Hz. It turned out that for frequencies greater than 100 Hz the application did not run sufficiently stably for at least 30 minutes without reaching or exceeding the maximum number of pending (queued) events. Therefore the highest acceptable frequency (100 Hz) was chosen for the "HLT-Chain Experiment". The precise layout of the application, i.e. which components of the HLT-Chain were placed on which physical nodes, is given in Table A.1. For more detailed information on the HLT-Chain application please refer to [104].

A.2. Derivation of Complexity Formulas

The equation

T(V, L, N) = 1 + \sum_{i=0}^{L-1} 2^i \binom{L}{i} (N-1)^{V - iV/L}

was given for the computational time complexity of a brute-force search for finding applicable VM and lease manipulations. The equation's derivation is now detailed.

The variables N = |{node}|, L = |{lease}| and V = |{VM}| are used in the equation, reflecting the number of nodes, provisioned leases and running VMs known to the VM-Scheduler. Every lease has the same number of associated VMs and every such VM violates a policy. There are (N-1)^V possibilities to migrate violating VMs without preempting a lease. There are \binom{L}{i} possibilities to choose i leases for preemption. There are two ways to preempt a single lease (suspend/stop). After preempting i leases, (V - iV/L) running VMs remain whose migration can be tried. Preempting all leases means no further migration can be tried.


Node   Components
ti019  TPCCATracker
ti020  TPCCATracker
ti027  TPCCATracker
ti037  TPCCATracker
ti038  TPCCATracker
ti039  TPCCATracker
ti040  TPCCATracker
ti041  TPCCATracker
ti047  GMRelay1, ClusterFinder
ti049  GMRelay2, ClusterFinder
ti050  TPCCAGlobalMerger, ClusterFinder
ti051  TPCCAGlobalMerger, ClusterFinder
ti052  TPCCAGlobalMerger, ClusterFinder
ti053  TPCCAGlobalMerger, ClusterFinder
ti054  GlobalEsdConverter
ti055  GlobalEsdConverter
ti056  GlobalEsdConverter
ti057  GlobalEsdConverter
ti058  GlobalTrigger, ClusterFinder
ti059  GlobalTrigger, ClusterFinder
ti060  BarrelMultiplicityTrigger
ti061  BarrelMultiplicityTrigger
ti062  BarrelMultiplicityTrigger
ti064  OutputRelay, ClusterFinder, ClusterFinder
ti067  BarrelMultiplicityTrigger
ti075  OutputRelay, ClusterFinder, ClusterFinder

Table A.1.: Allocation of HLT-Chain components to physical nodes for the HLT-Chain experiments.

So the total number of choosable sets of manipulations is calculated by:

(N-1)^V + 2^1 \binom{L}{1} (N-1)^{V - V/L} + 2^2 \binom{L}{2} (N-1)^{V - 2V/L} + \dots + 2^{L-1} \binom{L}{L-1} (N-1)^{V - (L-1)V/L} + 1
= 1 + \sum_{i=0}^{L-1} 2^i \binom{L}{i} (N-1)^{V - iV/L}

For a brute-force search every possible result (choosable set of manipulations) is checked against policies. The number of model-checks is therefore equal to the best- and worst-case time complexity, which is precisely the equation given above.
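The count of brute-force model-checks can be evaluated directly; a small sketch under the assumption that V is divisible by L (every lease has V/L VMs):

```python
from math import comb

def brute_force_checks(V, L, N):
    """Model-checks of the brute-force search:
    T(V, L, N) = 1 + sum_{i=0}^{L-1} 2^i * C(L, i) * (N-1)^(V - i*V/L).
    Assumes every lease has V/L VMs, each violating a policy."""
    vms_per_lease = V // L
    total = 1  # preempting all L leases: no further migration is tried
    for i in range(L):
        total += 2**i * comb(L, i) * (N - 1)**(V - i * vms_per_lease)
    return total

# Example: 8 VMs in 4 leases on 5 nodes
print(brute_force_checks(V=8, L=4, N=5))
```

Even for this tiny configuration the count exceeds 10^5, which illustrates why an exhaustive search is impractical at cluster scale.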

For the proposed search heuristic a worst-case time complexity of

T(V, L, N) = L + \frac{(N-1)V}{L} \left( L + \sum_{i=1}^{L-1} \sum_{j=i}^{L-1} 2^i \binom{j}{i} \right)

was given. This equation was derived as follows: There are L violation-contributing leases. For the first violation-contributing lease in HIGH it is tried to migrate its violating VMs. Since this is done using

a first-fit approach, there are at most (N-1)V/L model-checks. If after the tried migrations the current lease is still contributing to a violation, then it is subsequently tried to preempt a set of other leases and again tried to migrate the violating VMs of the current lease. If all combinations of lease preemptions and successive migrations fail, the lease itself is preempted using suspend or finally stop. For any such possibility a model-check is conducted. So the total number of checks for the first violation-contributing lease is calculated as:

\frac{(N-1)V}{L} + \frac{(N-1)V}{L} \left( 2^1 \binom{L-1}{1} + 2^2 \binom{L-1}{2} + \dots + 2^{L-1} \binom{L-1}{L-1} \right) + 1

For the second lease in HIGH the same is tried. Assuming the worst case that only the preemption of the previous lease was successful, there still exist L-1 violation-contributing leases. For the current lease the number of checks accordingly is:

\frac{(N-1)V}{L} + \frac{(N-1)V}{L} \left( 2^1 \binom{L-2}{1} + 2^2 \binom{L-2}{2} + \dots + 2^{L-2} \binom{L-2}{L-2} \right) + 1

For every lease in HIGH such a summand exists, so there are L summands. For the last lease in HIGH the potential preemption of other leases is redundant:

\frac{(N-1)V}{L} + 1

The total number of checks in the worst-case therefore is:

L + \sum_{i=0}^{L-1} \left[ \frac{(N-1)V}{L} + \frac{(N-1)V}{L} \sum_{j=1}^{L-i-1} 2^j \binom{L-i-1}{j} \right] = L + \frac{(N-1)V}{L} \left( L + \sum_{i=1}^{L-1} \sum_{j=i}^{L-1} 2^i \binom{j}{i} \right)

This is precisely the equation given above for the worst-case time complexity of the heuristic search.
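The equality of the per-lease sum and the closed form can be verified numerically; a sketch, again assuming V/L VMs per lease:

```python
from math import comb

def heuristic_checks_sum(V, L, N):
    """Per-lease sum: L + sum_{i=0}^{L-1} [ m + m * sum_{j=1}^{L-i-1} 2^j * C(L-i-1, j) ]
    with m = (N-1)V/L model-checks per first-fit migration round."""
    m = (N - 1) * V // L
    total = L
    for i in range(L):
        inner = sum(2**j * comb(L - i - 1, j) for j in range(1, L - i))
        total += m + m * inner
    return total

def heuristic_checks_closed(V, L, N):
    """Closed form: L + m * ( L + sum_{i=1}^{L-1} sum_{j=i}^{L-1} 2^i * C(j, i) )."""
    m = (N - 1) * V // L
    double = sum(2**i * comb(j, i) for i in range(1, L) for j in range(i, L))
    return L + m * (L + double)

# Both forms agree, e.g. for 8 VMs, 4 leases, 5 nodes
print(heuristic_checks_sum(8, 4, 5), heuristic_checks_closed(8, 4, 5))
```

Compared with the brute-force count for the same configuration, the heuristic's worst case is several orders of magnitude smaller.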

A.3. Cloud Computing and EaaS Stack

The concepts of cloud computing and EaaS are briefly portrayed in this section.

Cloud Computing Cloud computing is a currently adopted metaphor describing a class of technologies that make software and hardware remotely available to customers in a transparent manner. An important aspect is the service orientation of these approaches, i.e. providing customer value on demand and making the service provider accountable for the negotiated service level. The most intuitive way to approach cloud computing is via the Everything as a Service (EaaS) stack, which describes a range of services delivered by cloud service providers. This stack is explained in the following.

EaaS Stack Everything as a Service (EaaS) is an industry-inspired taxonomy of services that extends the Service Oriented Architecture (SOA) paradigm to arbitrary, billable services in distributed computing. The most common layers of the EaaS stack are now given:

Humans as a Service For the sake of completeness, the Humans as a Service (HaaS) layer is mentioned. Personnel (humans) can be regarded as a service because they provide value to customers, are billable, and service terms can be stated explicitly. In general, software support and hotlines can be regarded as HaaS, though they are rarely referred to this way.


Software as a Service With the ubiquitous availability of the Internet, the business model of software companies nowadays turns to strategies different from selling products and licenses. Today, software can also be leased. Here the software itself is not installed on a customer's computer infrastructure but in a computing facility outside of the customer's scope. The customer forms a contract with the software hoster and leases a time frame during which the software can be used but not owned. This layer of the EaaS stack is called Software as a Service (SaaS). Typical examples of software that is provided as a service are Customer Relationship Management (CRM) and collaboration frameworks. Please see [79] for an overview of current SaaS projects.

Platform as a Service The Platform as a Service (PaaS) strategy can be understood as a structural extension of classical web hosting services. As with the latter, a service provider offers computer infrastructure and an installed platform (OS, web server) to be used by customers to run their software. These customers pay a fee or rent. Such services nowadays not only cover web hosting but a whole range of other platform services, such as development and runtime environments in which customers can run their own software. Typical examples are MS Azure [125], which provides a leasable runtime environment for .NET applications, and Google Apps [48], which provides a runtime environment for applications based on the Google App Engine.

Infrastructure as a Service In recent years, the Infrastructure as a Service (IaaS) strategy has become more and more important. Companies rely on using Information Technology (IT), but do not want to invest in the installation, management and maintenance of their own computing facilities. From an economic point of view, IT infrastructure is a perfect business object for outsourcing, since a high Total Cost of Ownership (TCO) can be avoided. A typical outsourcing scenario is that a company leases infrastructure in a remote computing facility (cluster) which is operated by external service providers. IaaS describes this aspect of making remote infrastructure leasable on a temporal basis. Such infrastructure ranges from single computers to a whole Local Area Network (LAN) consisting of a large number of servers. In addition, server management applications like Domain Name System (DNS), Dynamic Host Configuration Protocol (DHCP) and directory services, as well as additional infrastructure items like Virtual Local Area Networks (VLANs), gateways and proxies, can be leased. Furthermore, backup and storage facilities are offered. Even though servers and other infrastructure items can be real hardware, the actual enabling technology of the IaaS layer is platform virtualization. This technology gives IaaS providers the flexibility to offer services in a fine temporal, functional and structural granularity. Typical examples adopting the IaaS paradigm are Amazon EC2 [4], Amazon S3 [5] and Nebula [77]. While Amazon EC2 is the brand for offering server hardware, S3 is the term used by Amazon to refer to their storage services. Nebula offers both computing and storage facilities. Other terms related to the EaaS paradigm are High Performance Computing as a Service (HPCaaS) and Data Intensive Computing as a Service (DICaaS). Both refer to large-scale scientific computing and its outsourcing to external service providers.
While the first targets scientific number-crunching1 applications, the second focuses on infrastructure and applications which process and store huge amounts of data. When linking this thesis to one of these loosely defined business terms, IaaS is the term that applies best.

1Number-crunching is a common, but slightly colloquial term for describing computing-intensive tasks that require a large amount of CPU resources.


IaaS, SaaS and PaaS are typically referred to as cloud services. A cloud therefore is a set of computing facilities (clusters) that hosts a service belonging to these layers and transparently offers it to customers. Transparency refers to the fact that the customer only knows and uses the interfaces (Application Programming Interface (API), rich client, web browser), but is not aware of the location or back-end technology of the hosting environment.

A.4. Virtualization Technologies and Products

The term virtualization encompasses a number of technologies. These have in common that they provide an abstracted environment for software, e.g. an OS or applications. Please refer to Smith and Nair [97] for an introduction and taxonomy. In this thesis, the term virtualization is used for hardware (platform) virtualization, more specifically same Instruction Set Architecture (ISA) hardware virtualization2, which provides an environment for a specific OS. Such an environment is called a Virtual Machine (VM).

Technologies Platform virtualization adds an intermediate layer between a virtualized operating system and the actual hardware. Such an intermediate layer is called hypervisor or Virtual Machine Monitor (VMM), depending on the product used for virtualization. Typically, a distinction between VMM type I and VMM type II is made. A VMM type I ("bare-metal hypervisor") runs directly on the native hardware, i.e. it carries its own basic routines to manage hardware resource access. In contrast, a VMM type II runs on top of a conventional, non-virtualized operating system ("host operating system"), either integrated into the kernel/core of the host OS or residing in user space. A VMM type II therefore uses the routines provided by the host operating system to access hardware resources. In this section, the term hypervisor is used instead of VMM. Beyond this basic classification of hypervisor technologies, three main types of platform virtualization techniques can be distinguished:

Full Virtualization The crux in virtualizing a platform and its operating system is that the OS core (Windows) or kernel (Unix/Linux) of a virtualized operating system ("guest operating system") is unaware of being virtualized. Therefore kernel-level commands of the guest OS try to access hardware directly. The virtualized kernel assumes full privileges to access the hardware, but lacks these privileges because it does not operate on the lowest layer ("ring-0") of an actual OS architecture3 but on a virtualization layer. Guest kernel calls which try to access the hardware will cause an access violation, and exceptions will be thrown by the host OS kernel. The basic principle of full virtualization is to catch such exceptions via the intermediate virtualization layer (hypervisor), to translate the request into privileged calls to the real hardware and to return a proper result to the caller (instead of an exception). This mechanism is called trap-and-emulate and is the core functionality used in full virtualization. This strategy is expensive, as every ring-0 call of the guest causes additional overhead. A refinement has been proposed for specific VM products: Binary translation tries to circumvent trap-and-emulate by looking at the instruction registers before instructions are executed. Potential ring-0 access instructions are replaced beforehand with appropriate calls to the hypervisor. This mechanism has two purposes: First, access violations and exception handling can be avoided this way and a faster

2 The virtual hardware provides the same CPU architecture (instruction set) as the underlying real hardware CPUs.
3 This layer, ring-0, is a privilege domain in OS architecture. Commands executed in this domain are not restricted in their access to hardware. Ring-0 corresponds to the kernel space of the host OS.


execution of virtualized systems is possible. Second, on x86 architectures commands exist which have different semantics depending on which level (ring-0 or ring-3) they are issued from. Since the virtualized system is unaware of being virtualized, such calls will have faulty semantics. This is captured by the virtualization layer using binary translation beforehand.

Paravirtualization Paravirtualization is a technique targeted at circumventing the performance issues of full virtualization. With paravirtualization, the guest operating system is made aware of the fact that it is virtualized. This means that specific drivers (network, disk) and the virtualized kernel do not try to access hardware directly via ring-0 calls, but contact the hypervisor via a specific Application Binary Interface (ABI) instead. This way the overhead of binary translation and trap-and-emulate is gone. However, this technique requires the virtualized OS to be modified, which is a practical drawback for production roll-out cycles and updates.

Hardware-assisted Virtualization This type of virtualization uses an approach similar to full virtualization. While for full virtualization the trap-and-emulate/binary translation is carried out in software, in hardware-assisted virtualization the binary translation is done in hardware instead. Modern x86 processors have extensions (Intel-VT, AMD-V) that natively support such behavior. These extensions also allow virtualized Input/Output (I/O) drivers direct access to real hardware.

Products

There are multiple mature products on the market that provide platform virtualization for x86 desktop and server environments. The following list is not exhaustive, neither concerning the number of products mentioned nor the features described. A more comprehensive overview of virtualization products and further references can be found in [29].

KVM Kernel-based Virtual Machine (KVM) [62] is a virtualization product for Linux OS derivates. It relies on hardware-assisted virtualization and partially uses binary translation. It provides a type II hypervisor, running as a loadable module of the Linux kernel. It supports checkpointing, suspend/resume and migration. KVM is open source and part of many Linux distribution repositories. Paravirtualized I/O drivers for the virtualized OS are available. KVM in its current release (1.2.0) supports the dedication of hardware Peripheral Component Interconnect (PCI) devices4 to virtual guest systems. It allows different virtual disk types to be used, such as qcow, qcow2, raw images and vmdk.

VMware ESX VMware ESX [123] is a product targeted at server-based computing clusters. It relies on a type I hypervisor. This bare-metal hypervisor is a minimalistic Linux kernel ("vmkernel") with integrated virtualization support. To date, ESX uses full virtualization with binary translation; in addition, hardware virtualization extensions are exploited. The product line splits up into a freely available ESXi edition with reduced functionality and the fully featured commercial ESX product. VMware ESX is the market leader in commercially deployed virtualization setups for computing clusters (as of 2011, see [128]). ESX supports Storage Area Network (SAN)/Network-Attached Storage (NAS) based shared storage for virtual disks and provides its own proprietary cluster file system, the Virtual Machine File System (VMFS).

4PCI is a bus used for connecting I/O devices with the processor.


VMware Workstation VMware Workstation [112] is a desktop and small-business server virtualization product. In comparison to the ESX product line, VMware Workstation is a type II hypervisor product, running on top of a host operating system (Linux, Windows). Binary translation is the main technology used for virtualization. Hardware assistance via Intel-VT or AMD-V can additionally be enabled if the hardware supports these features. VMware Workstation supports neither migration nor checkpointing; however, suspend/resume is provided.

Xen Xen [11] is an open source virtualization product for Linux using the paravirtualization approach. It uses a hybrid type I+II hypervisor. Even though there is a native host operating system ("dom0"), the hypervisor is part of the host operating system's kernel (rather than a loadable module) and therefore can be regarded as preceding the host OS kernel with prioritized access to hardware resources. Due to the use of paravirtualization, a guest OS needs to be modified such that ring-0 calls are redirected to the hypervisor rather than letting them try to access the hardware directly. The drawback is that closed source OSs (like Windows) cannot be virtualized unless explicitly supported by the OS vendor. The claimed advantage is improved performance [11]. The architecture of Xen not only requires the guest system to be adapted but also needs a modification to the dom0 host OS. This is of practical relevance for maintenance and roll-out cycles. Xen offers several distinctive features: it not only allows the migration of VMs, but also offers the runtime adaption of guest properties like the number of virtual CPU cores, the capping of CPU usage per VM and the attribution of Random Access Memory (RAM) to VMs.

Oracle VirtualBox VirtualBox is a virtualization platform based on a type II hypervisor. It targets both desktop and server environments and is freely usable for personal and educational use. Full virtualization (binary translation) and optionally hardware-assisted virtualization are supported. VirtualBox runs on a multitude of operating systems like Windows, Mac OS X and Linux. It supports checkpointing, suspend/resume and migration and allows the usage of multiple virtual disk types like vdi, vmdk and cow. Block-level access to virtual disks on shared storage media using iSCSI is supported.

Virtual Machine Manipulation Primitives

The introduced platform virtualization products use different naming conventions for denoting the manipulation of VMs. These naming conventions are given in Table A.2. The first column lists the manipulations by the name used in this thesis, while the other columns represent the corresponding product-specific terms.

Naming      Technique        KVM             VirtualBox     XEN             VMware ESX
start       start            start           start          start           start
stop        stop             stop            poweroff       shutdown        shutdown
suspend     suspendToDisk    savevm          suspend        save            suspend
resume      resumeFromDisk   loadvm          resume         restore         resume
migration   hot migration    live-migration  teleportation  live-migration  live-migration

Table A.2.: Naming of VM manipulation primitives for different platform virtualization products.
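A scheduling framework that targets several virtualization products needs a translation from its own primitive names to the product-specific terms. A minimal sketch of such a lookup, following Table A.2 (the function and dictionary names are illustrative, not the framework's actual API):

```python
# Mapping of framework manipulation primitives to product-specific terms,
# as listed in Table A.2.
PRIMITIVES = {
    "start":     {"kvm": "start",          "virtualbox": "start",
                  "xen": "start",          "esx": "start"},
    "stop":      {"kvm": "stop",           "virtualbox": "poweroff",
                  "xen": "shutdown",       "esx": "shutdown"},
    "suspend":   {"kvm": "savevm",         "virtualbox": "suspend",
                  "xen": "save",           "esx": "suspend"},
    "resume":    {"kvm": "loadvm",         "virtualbox": "resume",
                  "xen": "restore",        "esx": "resume"},
    "migration": {"kvm": "live-migration", "virtualbox": "teleportation",
                  "xen": "live-migration", "esx": "live-migration"},
}

def product_command(primitive, product):
    """Translate a framework primitive to the product-specific term."""
    return PRIMITIVES[primitive][product]

print(product_command("suspend", "kvm"))
```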


A.5. Feasibility Estimator Customization

An important aspect, the tuning of the Feasibility Estimator (FE), has to be considered when customizing the framework a) to a new MainApp and b) to a new cluster environment. The goal of this tuning is to enable the FE to make reasonably good estimations of the costs of running VMs and of executing VM manipulations. Poor estimations will not break the framework functionality, but may lead to an increased rate of globally triggered policy violations and locally triggered kills. This would decrease the benefit of the framework, i.e. the computation of additional SecApp results and improved cluster resource usage. The steps taken for adjusting the FE to the experimental environment, the KIP cluster, are now described; they serve as a proposal for later FE tuning for other clusters:

The Estimation Equation The duration of a single VM manipulation can be described by its starting point t_a and its ending point t_b with duration = t_b - t_a. For any ongoing VM manipulation ∈ {start, stop, suspend, resume, migrate} there are manipulation processes5 on the concerned nodes that realize these manipulations: manipProc ∈ {startProc, stopProc, suspProc, resumeProc, migOutProc, migInProc}.6 The manipulation processes use resources. While all these processes cause a certain CPU overhead, some processes like migInProc or resumeProc mainly cause overhead towards the resource NetIn, and others like migOutProc and suspProc cause overhead towards the resource NetOut on a node. The FE is called at a point in time t_k and given a resource res, a future point in time t_{k+i} and a set of manipulations. Its purpose is to return the resource usage of res at point in time t_{k+i} given that the set of manipulations is applied at t_k. Equation A.1 shows the approach used by the FE to estimate the future usage of a resource at t_{k+i}.

Usage_res(k+i, 1) ≈ Min( 100, ProcUsage_res(k+i, 1, MainApp)
                            + \sum_{n=1}^{N} ProcUsage_res(k+i, 1, VM_n)
                            + \sum_{r=1}^{R} ProcUsage_res(k+i, 1, VM_r)
                            + \sum_{m=1}^{M} ProcUsage_res(k+i, 1, manipProc_m) )     (A.1)

The meaning of the ProcUsage_res terms in equation A.1 is now explained:

• The ProcUsage_res term in the first line of equation A.1 represents the projected future resource usage of the running MainApp, using its averaged usage over the past five measurement intervals at t_k.

• The summand in the second line incorporates the resource usage of every VM_n that is in running state at t_k, contributes to resource usage and at t_{k+i} still runs or is being manipulated. The future resource usage of every such VM_n is estimated using its averaged resource usage over the past five measurement intervals at t_k.

5 These manipulation processes are abstractions and do not necessarily exist as separate OS processes.
6 The migration primitive consists of two manipulation processes, migOutProc on the source node and migInProc on the target node.

158 A.5. Feasibility Estimator Customization

• The summand in the third line reflects the summed resource usage of every VM_r that at t_k does not contribute to resource usage, but at t_{k+i} is expected to be running and contributing to resource usage (e.g. newly started, resumed or migrated VMs).

• The summand in the last line of equation A.1 reflects the summed resource usage of every ongoing manipulation, represented by manipProc_m, at t_{k+i}.

Several values are needed for solving equation A.1 for a given resource, a set of manipulations and a point in time t_{k+i}:

1. Current resource usage of the MainApp and of already running VMs (lines one and two). This information is available by querying the cluster model.

2. VM manipulations that take place in the interval [k, k+i]. Ongoing manipulations at t_k are known to the FE. Manipulations to be started, i.e. proposed manipulations, are parameters that are passed to the FE when it is queried by a functional unit like GP or GR.

3. Resource usage of newly started, resumed and migrated VMs on the target node (third line). An estimation for these values is given later in this section.

4. Duration and resource usage of VM manipulations ongoing at t_{k+i}, i.e. duration and resource usage of their corresponding manipulation processes manipProc_m (fourth line). An estimation for these parameters is described next.
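Given these inputs, the estimation of equation A.1 reduces to a capped sum. A minimal sketch, with illustrative argument names (percentages) rather than the framework's actual interface:

```python
def estimate_usage(res_cap, main_app_usage, running_vm_usages,
                   incoming_vm_usages, manip_proc_usages):
    """Sketch of equation A.1: estimated usage of one resource at t_{k+i}.
    All arguments are percentages.
      - main_app_usage:     projected MainApp usage (5-interval average)
      - running_vm_usages:  VMs already running at t_k and still active
      - incoming_vm_usages: newly started/resumed/migrated-in VMs
      - manip_proc_usages:  overheads of ongoing manipulation processes
    """
    total = (main_app_usage
             + sum(running_vm_usages)
             + sum(incoming_vm_usages)
             + sum(manip_proc_usages))
    return min(res_cap, total)  # a resource cannot exceed 100 % usage

# Example: MainApp at 40 %, two running VMs, one VM migrating in
print(estimate_usage(100, 40, [15, 10], [12], [30]))
```

The `min` term models the cap at 100 %: if the raw sum exceeds the capacity, the FE predicts a saturated (and thus potentially policy-violating) resource.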

Resource Usage of VM Manipulations To obtain an estimation of the resource usage of VM manipulations, the resource usage of single manipulations in an idle environment7 is measured first. It is assumed that this measured resource usage is the maximum resource usage a manipulation can generate for node resources. Additional cluster activity, e.g. a running MainApp, other running VMs and manipulations, only leads to competition for resources, prolongs the duration of a single manipulation and thereby decreases a manipulation's average resource usage over time. Figures A.1a and A.1b show the results of such an experiment for suspending and migrating a single VM in the idle KIP cluster (see 6.3.1, 6.1). For suspending a single VM (A.1a) it can be seen that once the manipulation starts, a drop in CPU usage occurs. This is due to the fact that a suspend consists of two phases: First the VM is halted, hence the CPU usage of the VM itself drops. Then the VM-related memory and register contents, which are required for a later resume, are transferred to the shared storage. The latter phase requires some CPU, but in most cases less than a previously running VM. For the NetOut resource an obvious increase in usage can be seen for the duration of the suspend manipulation. The variable ProcUsage_res(k+i, 1, manipProc)8 is calculated from these experiments by subtracting the latest known resource usage of a running, not yet manipulated VM from the maximum resource usage observed for a single interval during the manipulation of this very VM. For a manipulation taking i seconds this means:

ProcUsage_res(k+i, 1, manipProc) = max{ Usage_res(k+1, 1), Usage_res(k+2, 1), ..., Usage_res(k+i, 1) } - ProcUsage_res(k, 1, VM)

For a conservative estimation, only the highest observed value is used instead of taking into account the varying resource usage during a manipulation. For the manipulation processes startProc, resumeProc

7 Nodes, storage and network are idle except for the single VM and its manipulation. The node configuration is homogeneous.
8 This variable describes the resource usage of a manipulation process manipProc at t_{k+i} and contributes to equation A.1 in line four.


(a) Suspend resource usage (b) Outgoing migration resource usage

Figure A.1.: Resource usage for NetOut and CPU plotted for a single VM suspend (left) and VM migration (right). The VirtualBox VM has one virtual CPU core, 512 MB RAM and runs GCC. The node has two CPU cores and a 1 GBit network interface. An NFS server is used as shared storage. At t = 0 s the VM is up and running and there is no other activity on the node or in the cluster; at t = 6 s the manipulation starts. Start and end of manipulations are indicated by vertical bars.

and migInProc the subtrahend is redundant, since no previous resource usage by the manipulated VM exists on the target node at t_k. For the suspend manipulation in figure A.1a this means:

ProcUsage_CPU(k+6, 1, suspProc) = 47 - 45 = 2
ProcUsage_NetOut(k+6, 1, suspProc) = 40 - 9 = 31

The previous resource usage of a VM determines the outcome of this calculation, and in real life this usage is subject to changes. So for a very low CPU usage (e.g. 7%) of a running VM, the observed maximum of additional CPU usage during a suspend may be higher than 2%. Nevertheless, the basic statement can be made that in most cases a suspend lowers the CPU usage and increases the NetOut usage during the manipulation. These values, ProcUsage_CPU(k+6, 1, suspProc) = 2 and ProcUsage_NetOut(k+6, 1, suspProc) = 31, are considered an estimated maximum percentage for the additional resource usage caused during a suspend manipulation. The overhead of other manipulations is retrieved accordingly. A further example is the migration of a VM. The resource usage for this manipulation on the source node is shown in figure A.1b. Here the resource usage pattern considerably differs from the suspend scenario. This is due to the pre-copy technique used for migrations. Simply said, until all memory content of a being-migrated VM is correctly transferred to the target node, the VM keeps running on the source node.9 That results in a considerable overhead for the CPU and NetOut resources during a migration. The resulting values for the migOutProc are:

ProcUsage_CPU(k+6, 1, migOutProc) = 79 - 49 = 30
ProcUsage_NetOut(k+6, 1, migOutProc) = 41 - 10 = 31

9 For more precise information on techniques for hot migrations and their properties please refer to Akoush et al. [3].


These values are regarded as an estimation of the maximum percentage of overhead of a process migOutProc on a source node. The values for stopProc, migInProc, startProc and resumeProc have been obtained accordingly and can be seen in Table A.3.

manipProc     CPU(%)  NetIn(%)  NetOut(%)
suspProc      2       0         31
migOutProc    30      0         31
stopProc      0       0         0
migInProc     12      30        0
resumeProc    15      25        0
startProc     20      20        0

Table A.3.: Additional resource usage caused by different manipulation processes. These values are specific to the KIP cluster and a VirtualBox VM with one virtual CPU core, 512 MB RAM, running GNU C Compiler (GCC).
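For illustration, table A.3 can be encoded as a simple lookup structure. The values are the KIP-cluster measurements from the table; the Python names are hypothetical:

```python
# Table A.3 as a lookup structure (values from the KIP-cluster measurements).
MANIP_OVERHEAD = {
    #              CPU  NetIn  NetOut  (percent)
    "suspProc":    (2,   0,    31),
    "migOutProc":  (30,  0,    31),
    "stopProc":    (0,   0,    0),
    "migInProc":   (12,  30,   0),
    "resumeProc":  (15,  25,   0),
    "startProc":   (20,  20,   0),
}

def overhead(manip, resource):
    """Estimated maximum additional usage (percent) of 'resource'
    caused by manipulation process 'manip'."""
    idx = {"CPU": 0, "NetIn": 1, "NetOut": 2}[resource]
    return MANIP_OVERHEAD[manip][idx]

print(overhead("suspProc", "NetOut"))  # → 31
```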

Several values in this table are equal to zero. This either means that the specific manipulation does not use a resource (e.g. NetIn for suspProc) or that starting a manipulation immediately causes a drop in the resource usage (e.g. CPU for stopProc).10 A peculiarity is the start manipulation. By definition the start manipulation ends once the VM begins to boot, which occurs with negligible delay after the command has been issued.11 However, during the boot process such a VM stresses the NetIn resource by loading parts of the virtual disk image from shared storage. On the other hand, such a booting VM is already subject to local resource usage adaption by RUA, which caps the resource usage of a VM. For the prototypical implementation, the duration of a start manipulation is therefore the time from issuing the command until the virtualized SecApp starts calculating. While the VM is booting, it is already under the control of RUA and uses values of 20% for NetIn and CPU.

Duration of VM Manipulations For solving equation A.1 the duration of single manipulations needs to be known. The durations seen in figures A.1a and A.1b in the last paragraph are not valid representatives of real-life durations; they rather represent a minimal duration. When there is activity in the cluster, e.g. multiple manipulations are executed concurrently, bottlenecks may occur, presumably for the network and shared storage resources. This prolongs the duration of manipulations. Figures A.2a and A.2b show the duration of the longest-lasting single manipulation when up to 40 VMs are concurrently12 stopped and suspended respectively. It can be seen that a rising number of concurrently manipulated VMs prolongs this maximum duration for a single manipulation. The figure representing the suspend manipulation also shows that having six VMs on a node does not have a considerable impact on the duration of the longest-lasting suspend. Most probably the bottleneck is not on the local node, but rather the route and access to the shared storage. The same can be said for the stop scenario. A linear regression for the stop manipulation (two VMs/node) gives y = 0.08x + 0.35. A linear regression for the suspend manipulation (two VMs/node) yields y = x + 5.3.

10 For instance, in the case of stopping a VM, the VM instantly stops consuming CPU cycles and the overall usage of the CPU drops. Even though the stopping of the VM itself also uses CPU cycles, this amount is relatively small in comparison to the CPU usage of a running VM. Therefore, the CPU overhead of a stopProc is considered to be zero.
11 Given the time resolution of one second, which is used here.
12 The start of manipulation for any VM is at the same point in time t_a.

A. Appendix

(a) Maximum duration of VM suspend (b) Maximum duration of VM stop

Figure A.2.: Maximum duration of single suspend and stop manipulations when up to 40 VMs are concurrently manipulated. The measurements are separated by the number of VMs run- ning on each node. The VirtualBox VM has one virtual CPU core, 512 MB RAM and runs GCC. The node has two CPU cores and a 1GBit network interface. An NFS server is used as shared storage.

Given that both manipulation types, suspend and stop, stress the same resources (outgoing network route, shared storage write access), the simplistic assumption is made that these manipulation types, and only these, interfere with each other. Suspends slow down the duration of stops and vice versa. No other manipulation type (resume, start, migrate) interferes with these because a) local node resources do not seem to be the bottleneck and b) global network routes and shared storage access have different directions.13 With this approach the duration of a single stop or suspend manipulation is calculated as follows:

SuspAndStops = <total number of ongoing and proposed suspends and stops>

duration(suspProc) = (5.3 + SuspAndStops) s
duration(stopProc) = (0.35 + SuspAndStops / 12.5) s

Figure A.3 shows the duration of the longest-lasting migration when up to 20 VMs are concurrently migrated to target nodes connected to the same switch.14 The total number of concurrently executed migrations has no visible impact on the duration. For a rising number of VMs/node, however, an extended duration is visible. Probably there is a bottleneck for the network or CPU resources of the concerned nodes. Looking only at the number of VMs/node, a linear regression over the values for 20 (18 for 6 VMs/node) concurrently migrated VMs gives y = 0.85x + 3.75, with x being the number of outgoing migrations on the same node. In analogy to the method used above, it is simplistically assumed that any manipulation that stresses the same resources as a migration (suspend, stop on a source node; resume and start on a target node) has the same impact on the

13 These assumptions are problematic for some scenarios. However, for the prototypical implementation and evaluation they yielded sufficiently satisfying results, which was assessable using the local trigger rate. It is however advised to use a more sophisticated experimental setup with concurrent, potentially staged manipulations of different types when tuning the FE for another environment in order to retrieve more reliable and theoretically founded estimates.
14 Target nodes are idle. All VMs on a source node are migrated to the same target node.


Figure A.3.: Maximum duration of a single migration when up to 20 VMs are concurrently manip- ulated. The measurements are separated by the number of VMs running on the source node. Target nodes are idle. The VirtualBox VM has one virtual CPU core, 512 MB RAM and runs GCC. The node has two CPU cores and a 1GBit network interface.

duration as a migration does. With this approach the duration of a migration is calculated as follows:

ManipsSrcNode = <number of ongoing and proposed suspends, stops and outgoing migrations on the source node>
ManipsTrgNode = <number of ongoing and proposed resumes, starts and incoming migrations on the target node>

duration(migOutProc) = (0.85 · max(ManipsSrcNode, ManipsTrgNode) + 3.75) s
duration(migInProc) = (0.85 · max(ManipsSrcNode, ManipsTrgNode) + 3.75) s

Finally, the same procedure is carried out for concurrent start and resume manipulations.15 As for suspend and stop, the total number of concurrently executed manipulations is decisive, i.e. the network route and the shared storage read access seem to be the relevant bottleneck that determines the duration of these manipulations. In analogy to the method described above, the duration of starts and resumes is calculated as follows:

StartsAndResumes = <total number of ongoing and proposed starts and resumes>

duration(resumeProc) = (9.2 + StartsAndResumes) s
duration(startProc) = (21 + StartsAndResumes) s
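Taken together, the duration estimates derived in this section can be sketched as a single hypothetical helper (the constants are specific to the KIP cluster; the migration case assumes the fitted regression y = 0.85x + 3.75 applied to the larger of the two per-node manipulation counts):

```python
# Hypothetical helper collecting the duration estimates of this section.
# Constants are specific to the KIP cluster.
def duration(manip, susp_and_stops=0, manips_src=0, manips_trg=0,
             starts_and_resumes=0):
    """Estimated duration in seconds of one manipulation, given the numbers
    of concurrent (ongoing and proposed) manipulations."""
    if manip == "suspProc":
        return 5.3 + susp_and_stops
    if manip == "stopProc":
        return 0.35 + susp_and_stops / 12.5
    if manip in ("migOutProc", "migInProc"):
        # fitted regression over outgoing migrations per node
        return 0.85 * max(manips_src, manips_trg) + 3.75
    if manip == "resumeProc":
        return 9.2 + starts_and_resumes
    if manip == "startProc":
        return 21 + starts_and_resumes
    raise ValueError("unknown manipulation: " + manip)
```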

Resource Usage of New VMs Having an estimate of the duration and resource usage of single manipulations, only the resource usage of newly started, resumed or migrated VMs is missing to solve equation A.1.

For a newly started VM, the usage of a resource like AllocVM is given by the resource’s definition. Once a new VM is started, the VM uses one allocation slot and the overall usage of that resource is

15 The duration of a start manipulation is determined by ta = start of manipulation and tb = start of SecApp.

incremented by one. The usage of resources like UsedCore and Mem is given by the VM's configuration file. If, e.g., a VM using two virtual CPU cores and 1 GB memory is started, then these values are regarded as the VM's usage of the respective resources and the overall resource usage is adapted accordingly. For the resource CPU, the VM's resource usage is capped by 100 × |cores_VM| / |cores_node|, which for the KIP cluster and the used VM configuration is 50%. However, the resource usage of any running VM is controlled by RUA and is therefore in most cases much smaller. The value of 20% was chosen for NetIn, NetOut and CPU as an estimate of the initial resource usage values. This choice is a compromise between a too low estimate, which would increase the probability that a policy violation occurs, and a too high value, which would reduce the probability that a VM is started, hence leaving resources unused and leases waiting for provisioning:

ProcUsage_res(k + i, 1, VM) = 20

For a migrated VM, the usage of a resource like AllocVM is given by the resource's definition. Once a VM is migrated to a target node, the VM uses one allocation slot on this node and the overall usage of this resource is incremented by one. The usage of resources like UsedCore and Mem is given by the VM's configuration file. If, e.g., a VM using two virtual CPU cores and 1 GB memory is migrated, then these values are regarded as the VM's usage of the respective resources and the overall resource usage is adapted accordingly. For resources like CPU, NetIn and NetOut, the VM's resource usage on the source node prior to starting the migration is used as the base for estimating the resource usage on the target node. However, since a migration is often used to resolve policy violations, it is reasonable to assume that there was resource shortage on the source node, which also decreased the share of resources usable by this very VM. Therefore the resource usage of a VM on the target node is the minimum of the VM's resource usage on the source node and the value of 20% for newly started VMs:

ProcUsage_res,Target(k + i, 1, VM) = min(ProcUsage_res,Source(t, 5, VM), 20)

For a resumed VM, the same as for a migrated VM applies. For resources like AllocVM, Mem and UsedCore the predefined values are used. For resources like CPU, NetIn and NetOut, the resource usage of a VM is the minimum of the resource usage prior to its suspension and the value of 20% for newly started VMs:

ProcUsage_res(k + i, 1, VM) = min(ProcUsage_res(t_suspend, 5, VM), 20)

Summary With these estimates of the resource usage of newly started, resumed or migrated VMs, of the resource overhead of VM manipulation processes and of the duration of manipulations, equation A.1 can be solved by the FE for a given resource, given proposed manipulations and a future point in time t_k+i. The approach taken to estimate the duration and the resource usage generated by manipulations can be criticized for two reasons. First, it lacks a sufficient theoretical foundation, which limits its applicability to general environments. However, for this prototypical implementation such a specific prediction model is sufficient to show the functionality of the framework for the given experimental environment. Second, since several parameters are left out of the estimation16 and the used parameters require a more sophisticated modelling, the calculated

16 Other parameters that probably have an influence are VM memory size, network bandwidth and page dirty rate. The page dirty rate is relevant for a migration and represents the probability distribution for the number of memory pages that at a point in time are marked dirty (modified) on the source node after having already been transferred to a target host. Such

resource usage at a future point in time may deviate considerably from the actual resource usage. This means that the framework bases its global manipulation decisions on possibly bad assumptions. However, this is not a problem, since the framework provides mechanisms, RUA and local triggering, to decrease the resource usage of VMs and to enforce policy compliance even for bad estimates.
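The initial-usage estimates for started, resumed and migrated VMs can be summarized in a small sketch (hypothetical names; the 20% compromise value is the one chosen above):

```python
# Hypothetical sketch of the initial-usage estimates for CPU, NetIn and
# NetOut; 20% is the compromise value chosen in the text.
INITIAL_ESTIMATE = 20  # percent

def initial_usage(event, previous_usage=None):
    """Estimated usage of a VM on the node where it appears.

    event: 'start', 'resume' or 'migrate-in'; previous_usage is the VM's
    usage (percent) before suspending resp. on the source node."""
    if event == "start":
        return INITIAL_ESTIMATE
    # Resumed/migrated VMs: previous usage, capped at the 20% estimate,
    # since resource shortage may already have throttled the VM.
    return min(previous_usage, INITIAL_ESTIMATE)

print(initial_usage("migrate-in", previous_usage=35))  # → 20
print(initial_usage("resume", previous_usage=8))       # → 8
```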

To sum it up, the proposed best practice for making estimates of the future resource usage in a cluster requires the following:

1. Determine a good estimate for the initial resource usage of a started, resumed or migrated VM.

2. Determine the additional resource usage for manipulations of VMs.

3. Measure the impact of the number of concurrent write access manipulations (suspend, stop) on the duration of these manipulations and apply curve fitting.

4. Measure the impact of concurrent manipulations for a node on the duration of migrations and apply curve fitting.

5. Measure the impact of the number of concurrent read access manipulations (resume, start) on the duration of these manipulations and apply curve fitting.

6. Use the retrieved values and functions with equation A.1.
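Steps 3 to 5 amount to fitting a line to measured maximum durations over the number of concurrent manipulations. A minimal ordinary-least-squares sketch (hypothetical helper, illustrative data):

```python
# Minimal ordinary-least-squares fit y = a*x + b, as used for the
# duration models above (hypothetical helper, illustrative data).
def linear_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx  # slope, intercept

concurrent = [1, 5, 10, 20, 40]           # concurrent stop manipulations
max_dur = [0.45, 0.75, 1.15, 1.95, 3.55]  # measured maxima in s (illustrative)
a, b = linear_fit(concurrent, max_dur)
```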

a page requires re-transmission. Please see [75, 27, 119] for more information.


B. Abbreviations

ABI Application Binary Interface

ALICE A Large Ion Collider Experiment

API Application Programming Interface

BLO Business-Level Objective

BMC Board Management Controller

CERN Conseil Européen pour la Recherche Nucléaire

CLI Command-line Interface

CPU Central Processing Unit

CRM Customer Relationship Management

DAQ Data Acquisition

DICaaS Data Intensive Computing as a Service

DNS Domain Name System

DHCP Dynamic Host Configuration Protocol

EaaS Everything as a Service

FE Feasibility Estimator

GP Global Provisioner

GR Global Reconfigurator

GUI Graphical User Interface

GCC GNU C Compiler

HLT High Level Trigger

HPCaaS High Performance Computing as a Service

IaaS Infrastructure as a Service

I/O Input/Output

IT Information Technology


ITIL Information Technology Infrastructure Library

ISA Instruction Set Architecture

HaaS Humans as a Service

KIP Kirchhoff Institute for Physics

KPI Key Performance Indicator

KVM Kernel-based Virtual Machine

LHC Large Hadron Collider

LAN Local Area Network

LSF Load Sharing Facility

MPI Message Passing Interface

MainApp Main Application

NAS Network-Attached Storage

NFS Network File System

OS Operating System

OLA Operational-Level Agreement

OGE Oracle Grid Engine

PCI Peripheral Component Interconnect

PID Proportional-Integral-Derivative

PaaS Platform as a Service

RAM Random Access Memory

RUA Resource Usage Adaptor

SaaS Software as a Service

SAN Storage Area Network

SecApp Secondary Application

SLA Service-Level Agreement

SLT Service-Level Targets

SOA Service Oriented Architecture

TCO Total Cost of Ownership

TCP/IP Transmission Control Protocol/Internet Protocol

TPC Time Projection Chamber

SMP Symmetric Multiprocessing

SNMP Simple Network Management Protocol

QoS Quality of Service

VMFS Virtual Machine File System

VLAN Virtual Local Area Network

VM Virtual Machine

VMM Virtual Machine Monitor

XML eXtensible Markup Language


Bibliography

[1] The ALICE Collaboration, "The ALICE Experiment at the CERN LHC," JINST 3:S08002, 259pp, (2008)

[2] Agarwal S. "Distributed Checkpointing of Virtual Machines in Xen Framework," Master Thesis, Indian Institute of Technology, Kharagpur, (2008)

[3] Akoush S., Sohan R., Rice A., Moore A.W., and Hopper A., "Predicting the Performance of Virtual Machine Migration," Proc. International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems 2010, Miami, (2010)

[4] Amazon Elastic Compute Cloud (Amazon EC2), http://aws.amazon.com/ec2/

[5] Amazon Simple Storage Service (Amazon S3), http://aws.amazon.com/s3/

[6] Anderson D. P., "BOINC: A System for Public-Resource Computing and Storage," Proc. 5th IEEE/ACM International Workshop on Grid Computing, Washington, (2004)

[7] Anderson D.P., Cobb J., Korpela E., Lebofsky M., and Werthimer D., "SETI@home: An Experiment in Public-Resource Computing," Journal Commun. ACM, Vol. 45, No. 11. (November 2002), pp. 56-61, (2002)

[8] Andreolini M., Casolari S., Cola M., and Messori M., "Dynamic Load Management of Virtual Machines in a Cloud Architectures," Proc. International Conference on Cloud Computing 2009, Beijing, (2009)

[9] Ansel J., Cooperman G., and Arya K., "Scalable User-Level Transparent Checkpointing for Cluster Computations," Technical Report, CERN, arXiv:cs/0701037, (2008)

[10] Bagnasco S., Betev L., Buncic P., Carminati F., Cirstoiu C., Grigoras C., Hayrapetyan A., Harutyunyan A., Peters A.J., and Saizet P., "AliEn: ALICE Environment on the GRID," J. Physics, Conf Ser. 119, (2008)

[11] Barham P., Dragovic B., Fraser K., Hand S., Harris T., Ho A., Neugebauer R., Pratt I., and Warfield A., "Xen and the Art of Virtualization," Proc. SOSP2003, Bolton Landing, (2003)

[12] Bestavros A., and Spartiotis D., "Probabilistic Job Scheduling for Distributed Real-time Appli- cations," Proc. First IEEE Workshop on Real-Time Applications, New York, (1993)

[13] Blackwelder W.C., ""Proving the null hypothesis" in clinical trials," Controlled Clinical Trials 3.4 (1982): 345-353, (1982)

[14] Bobroff N., Kochut A., and Beaty K., "Dynamic Placement of Virtual Machines for Managing SLA Violations," Proc. Integrated Network Management, Munich, (2007)


[15] Boccara N., "Modeling Complex Systems," Springer Publishing Company, Inc., (1st ed.), 2004, (2004)

[16] Box G.E.P., Jenkins G.M., and Reinsel G.C., "Time Series Analysis: Forecasting and Control," Prentice-Hall, Inc., (3rd ed.), Upper Saddle River, NJ, USA, (1994)

[17] Buisson J., Andre F., and Pazat J.L., "A Framework for Dynamic Adaptation of Parallel Components," Proc. Parallel Computing 2005: 65-72, Malaga, (2005)

[18] Buyya R., Yeo C.S., and Venugopal S., "Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities," Proc. International Conference on High Performance Computing and Communications, Dalian, (2008)

[19] Byoungjip K., "Comparison of the Existing Checkpoint Systems," Technical report, IBM T.J. Watson, October 2005, (2005)

[20] Casavant T.L., and Kuhl J.G., "A Taxonomy of Scheduling in General-purpose Distributed Computing Systems," Journal IEEE Transactions on Software Engineering, 14, 2 (February 1988), (1988)

[21] Chakode R., Yenke B.O., and Mehaut J.F., "Resource Management of Virtual Infrastructure for On-demand SaaS Services," Proc. International Conference on Cloud Computing and Service Science 2011, 352-361, Noordwijkerhout, (2011)

[22] Chase J., Irwin D., Grit L., Moore J., and Sprenkle S., "Dynamic Virtual Clusters in a Grid Site Manager," HPDC 2003, Seattle, (2003)

[23] Checconi F., Cucinotta T., and Stein M., "Real-Time Issues in Live Migration of Virtual Machines," Proc. Euro-Par 2009, Parallel Processing Workshops, Delft, (2009)

[24] Chen P.P-S., "The Entity-Relationship Model: Toward a Unified View of Data," ACM Transactions on Database Systems, Vol. 1, 9-36, (1976)

[25] Chen Q., Mehrotra R., Dubeyy A., Abdelwahed S., and Rowland K., "On State of The Art in Virtual Machine Security," In Proc. IEEE Southeastcon, 2012, Orlando, (2012)

[26] Choi H.W., Kwak H., Sohn A., and Chung K., "Autonomous Learning for Efficient Resource Utilization of Dynamic VM Migration," Proc. International Conference on Supercomputing 2008, Cairo, (2008)

[27] Clark C., Fraser K., Hand S., Hansen J.G., Jul E., Limpach C., Pratt I., and Warfield A., "Live Migration of Virtual Machines," Proc. 2nd Conference on Symposium on Networked Systems De- sign and Implementation, 273-286, Berkeley, (2005)

[28] CloudZoom Virtual Appliance Marketplace, http://cloudzoom.com/

[29] Comparison of Platform Virtual Machines, http://en.wikipedia.org/wiki/Comparison_of_platform_virtual_machines

[30] The Comprehensive TeX Archive Network, http://www.ctan.org/

[31] Condor Job Checkpointing Limitations, http://research.cs.wisc.edu/condor/manual/v7.3/1_4Current_Limitations.html


[32] Condor World Map, http://research.cs.wisc.edu/condor/map/

[33] DMTF Common Information Model, http://dmtf.org/standards/cim

[34] Denis D.J., "Alternatives to Null Hypothesis Significance Testing," Theory And Science 4.1, ISSN 15275558, (2003)

[35] Devaney R., "An Introduction to Chaotic Dynamical Systems," American Institute of Physics, (2nd ed.), 2003, (2003)

[36] Emeneker W. and Stanzione D., "Increasing Reliability through Dynamic Virtual Clustering," Proc. High Availability and Performance Computing Workshop, Santa Fe, (2006)

[37] Enomalism Elastic Computing Platform, http://www.enomalism.com/

[38] Eucalyptus Cloud, http://www.eucalyptus.com/eucalyptus-cloud

[39] Fausett L. (Ed.), "Fundamentals of Neural Networks: Architectures, Algorithms, and Applications," Prentice-Hall, Inc., (1st ed.), Upper Saddle River, NJ, USA, (1994)

[40] Feitelson D.G., Rudolph L., and Schwiegelshohn U., "Parallel Job Scheduling - A Status Report," In Proc. 10th Int. Conference on Job Scheduling Strategies for Parallel Processing (JSSPP’04), New York, (2004)

[41] Frachtenberg E., and Schwiegelshohn U., "New Challenges of Parallel Job Scheduling," In Proc. 13th Int. Conference on Job Scheduling Strategies for Parallel Processing (JSSPP’07), Seattle, (2007)

[42] Franklin G.F., Powell D.J., and Emami-Naeini A., "Feedback Control of Dynamic Systems," Prentice-Hall PTR, (4th ed.), Upper Saddle River, NJ, USA, (2001)

[43] GNU C Compiler, http://gcc.gnu.org/

[44] Ganglia Monitoring System, http://ganglia.info/

[45] Ganis G., Iwaszkiewicz J., and Rademakers F., "Data Analysis with PROOF," Proc. ACAT2008, Erice, (2008)

[46] Gmach D., Rolia J., Cherkasova L., Belrose G., and Turicchi T., "An Integrated Approach to Resource Pool Management: Policies, Efficiency and Quality," Proc. DSN 2008, Anchorage, (2008)

[47] Gmach D., Rolia J., Cherkasova L., and Kemper A., "Resource Pool Management: Reactive versus Proactive or let’s be Friends," Comput. Netw. 53, 17 (December 2009), 2905-2922, (2009)

[48] Google Apps for Business, http://www.google.com/enterprise/apps/business/

[49] Grit L., Irwin D., Yumerefendi A., and Chase J., "Virtual machine hosting for networked clusters: building the foundations for autonomic orchestration," Proc. VTDC2006, Tampa, (2006)

[50] Gros C., "Complex and Adaptive Dynamical Systems: A Primer," Springer Publishing Company, Inc., (2nd ed.), 2009, (2009)

[51] Gross D., Shortle J.F., Thompson J.M., and Harris C.M., "Fundamentals of Queueing Theory," Wiley-Interscience (4th ed.), 2008, New York, NY, USA, (2008)


[52] Hasselmeyer P., Schubert L., Koller B., and Wieder P., "Towards SLA-supported Resource Management," Proc. International Conference on High Performance Computing and Communications, Munich, (2006)

[53] He L., Jarvis S.A., Bacigalupo D.A., Spooner D.P., and Nudd G.R., "Performance-Aware Load Balancing for Multiclusters," Proc. 2nd International Symposium on Parallel and Distributed Processing and Applications, Hong Kong, (2004)

[54] Hermenier F., Lorca X., Menaud J.M., Muller G., and Lawall J., "Entropy: A Consolidation Manager for Clusters," Proc. International Conference on Virtual Execution Environments 2009, Newport Beach, (2009)

[55] Hu L., Jin H., Liao X., Xiong X., and Liu H., "Magnet: A Novel Scheduling Policy for Power Reduction in Cluster with Virtual Machines," Proc. Cluster 2008, Tsukuba, (2008)

[56] Hungershoefer J., Streit A., and Wierum J-M., "Efficient Resource Management for Malleable Applications," Technical Report PCTR-003-01, December 2001, Paderborn, (2001)

[57] IEEE 802 LAN/MAN Standards Committee, http://www.ieee802.org/

[58] Information Technology Infrastructure Library (ITIL) Homepage, http://www.itil-officialsite.com/

[59] Irwin D., Chase J., Grit L., Yumerefendi A., Becker D., and Yocum K. G., "Sharing Networked Resources with Brokered Leases," Proc. USENIX Technical Conference, Boston, (2006)

[60] JBoss Application Server, http://www.jboss.org/

[61] Jackson D.B., "Maui Scheduler: A Multifunction Cluster Scheduler," Computing with Linux, MIT Press, Cambridge, MA, pp. 351-368, (2008)

[62] Kernel Based Virtual Machine, http://www.linux-kvm.org/page/Main_Page

[63] Khanna G., Beaty K., Kar G., and Kochut A., "Application Performance Management in Virtualized Server Environments," Proc. Network Operations and Management Symposium 2006, Vancouver, (2006)

[64] Kiyanclar N., Koenig G.A., and Yurcik W., "Maestro-VC: A Paravirtualized Execution Environment for Secure On-Demand Cluster Computing," Proc. CCGrid 2006, Singapore, (2006)

[65] Kochut A., and Beaty K., "On Strategies for Dynamic Resource Management in Virtualized Server Environments," Proc. International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, Istanbul, (2007)

[66] Lara C.E.M., "The SysMES Framework: System Management for Networked Embedded Systems and Clusters," PhD Thesis, University of Heidelberg, 2011, (2011)

[67] "The Large Hadron Collider," Nature Insight, Vol. 448 Issue No. 7151 / 19 July, (2007)

[68] Lee C., Lehoczky J., Siewiorek D., Rajkumar R., and Hansen J., "A scalable solution to the multi-resource QoS problem," Proc. Real-Time Systems Symposium 1999, Phoenix, (1999)


[69] Leyton-Brown K., and Shoham Y., "Essentials of Game Theory: A Concise, Multidisciplinary Introduction," Morgan & Claypool Publishers, (1st ed.), San Rafael, CA, USA, (2008)

[70] libvirt Virtualization API, http://libvirt.org/

[71] Linux Documentation, http://linux.die.net/

[72] Liu C., Zhao Z., and Liu F., "An Insight into the Architecture of Condor - A Distributed Scheduler," Proc. CNMT2009, Wuhan, (2009)

[73] Maghraoui K.E., Desell T.J., Szymanski B.K., and Varela C.A., "Malleability, Migration and Replication for Adaptive Distributed Computing over Dynamic Environments," Parallel Processing and Applied Mathematics, Gdansk, (2007)

[74] Meng X., Pappas V., and Zhang L., "Improving the Scalability of Data Center Networks with Traffic-Aware Virtual Machine Placement," Proc. INFOCOM 2010, San Diego, (2010)

[75] Nagarajan A.B., Mueller F., Engelmann C., and Scott S.L., "Proactive Fault Tolerance for HPC with Xen Virtualization," Proc. International Conference on Supercomputing, 23-32, 2007, New York, (2007)

[76] Nagios Monitoring System, http://www.nagios.org/

[77] Nebula, http://www.nebula.com/

[78] Nicolae B., "BlobCR: Efficient checkpoint-restart for HPC applications on IaaS clouds using virtual disk image snapshots," Proc. International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Orsay, (2011)

[79] OpenCrowd Cloud Taxonomy: Software-As-A-Service, http://cloudtaxonomy.opencrowd.com/taxonomy/software-as-a-service/

[80] Oracle Cloud Resource Model API, http://www.oracle.com/technetwork/topics/cloud/oracle-cloud-resource-model-api-154279.pdf

[81] Oracle Grid Engine, http://www.oracle.com/us/products/tools/oracle-grid-engine-075549.html

[82] Oracle Virtual Assembly Builder, http://www.oracle.com/us/products/middleware/application-server/virtual-assembly-builder-067878.html

[83] Oracle Virtual Box, http://www.virtualbox.org/

[84] Padala P., Hou K. Y., and Shin K. G., "Automated Control of Multiple Virtualized Resources," Proc. European Conference on Computer systems and Cloud Computing 2009, Nuremberg, (2009)

[85] Pascual J.A., Navaridas J., and Miguel-Alonso J., "Effects of Topology-Aware Allocation Policies on Scheduling Performance," Springer-Verlag, Job Scheduling Strategies for Parallel Processing, 138-156, ISBN 978-3-642-04632-2, (2009)

[86] Pepple K., "Deploying OpenStack (Real Time Bks)," O’Reilly Media, Inc., ISBN 9781449311056, (2011)


[87] Pinedo M.L., "Scheduling: Theory, Algorithms, and Systems," Springer Publishing Company, Inc., (3rd ed.), 2008, (2008)

[88] Platform Load Sharing Facility, http://www.platform.com/workload-management/high-performance-computing

[89] Raghavendra R., Ranganathan P., Talwar V., Wang Z., and Zhu X., "No Power Struggles: Coordinated Multi-Level Power Management for the Data Center," Proc. International Conference on Architectural Support for Programming Languages and Operating Systems 2008, Seattle, (2008)

[90] "Real Time Scheduling Theory: A Historical Perspective," Journal Real-Time Systems, Vol. 28, Issue 2-3, 101-155, Nov-Dec 2004, (2004)

[91] Red Hat Enterprise Linux, http://www.redhat.com/products/enterprise-linux/

[92] Rolia J., Cherkasova L., Arlitt M., and Andrzejak A., "A Capacity Management Service for Resource Pools," Proc. International Workshop on Software and Performance 2005, New York, (2005)

[93] rPath Cloud Engine, http://www.rpath.com/product/rpath-cloud-engine.php

[94] Ruth P., Rhee J., Xu D., Kennell R., and Goasguen S., "Autonomic Live Adaptation of Virtual Computational Environments in a Multi-Domain Infrastructure," Proc. International Conference on Autonomic Computing 2006, Dublin, (2006)

[95] Sha L., Abdelzaher T., Arzen K.E., Cervin A., Baker T., Burns A., Buttazzo G., Caccamo M., Lehoczky J., and Mok A.K., "Real Time Scheduling Theory: A Historical Perspective," Real-Time Syst. 28, 2-3 (November 2004), 101-155, Norwell, (2004)

[96] Shoykhet A., Lange J., and Dinda P., "Virtuoso: A System For Virtual Machine Marketplaces," Technical Report NWU-CS-04-39, July, 2004

[97] Smith J.E., and Nair R., "The Architecture of Virtual Machines," Journal IEEE Computer, Vol. 38, Issue 5, 32-38, May 2005, (2005)

[98] Sonnek J., Greensky J., Reutiman R., and Chandra A., "Starling: Minimizing Communication Overhead in Virtualized Computing Platforms Using Decentralized Affinity-Aware Migration," CSE, University of Minnesota, Tech. Rep. TR09-030, Minnesota, (2010)

[99] Sotomayor B., Keahey K., and Foster I., "Combining Batch Execution and Leasing Using Virtual Machines," Proc. CCA'08, Hagen, (2008)

[100] Sotomayor B., Montero R.S., Llorente I.M., and Foster I., "Resource Leasing and the Art of Suspending Virtual Machines," Proc. International Conference on High Performance Computing and Communications 2009, Seoul, (2009)

[101] Sotomayor B., Montero R.S., Llorente I.M., and Foster I., "Virtual Infrastructure Management in Private and Hybrid Clouds," IEEE Internet Computing, vol. 13, no. 5, pp. 14-22, Sep./Oct. 2009, (2009)

[102] Stage A., and Setzer T., "Network-Aware Migration Control and Scheduling of Differentiated Virtual Machine Workloads," Proc. International Conference on Software Engineering, Vancouver, (2009)


[103] Stallings W., "Operating Systems Internals and Design Principles (fifth international edition)," Prentice Hall ISBN 0-13-147954-7, (2004)

[104] Steinbeck T.M., "The ALICE High Level Trigger Data Transport Framework," Ph.D. Thesis, University of Heidelberg, arXiv:cs.DC/0404014, Heidelberg, (2004)

[105] Steinbeck T.M., Lindenstruth V., and Tilsner, H., "New Experiences with the ALICE High Level Trigger Data Transport Framework," Proc. CHEP04, Interlaken, (2004)

[106] Stoess J., Klee C., Domthera S., and Bellosa F., "Transparent, Power-Aware Migration in Virtualized Systems," Proc. GI/ITG Fachgruppentreffen Betriebssysteme, Karlsruhe, (2007)

[107] Sun M.H., and Blough D.M., "Fast, Lightweight Virtual Machine Checkpointing," Technical Report, Georgia Institute of Technology, GIT-CERCS-10-05, (2010)

[108] TORQUE Resource Manager, http://www.adaptivecomputing.com/products/open-source/torque/

[109] Tesauro G., Jong N.K., Das R., and Bennani M.N., "A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation," Proc. International Conference on Autonomic Computing 2006, Dublin, (2006)

[110] TurnKey Linux Virtual Appliance Library, http://www.turnkeylinux.org/

[111] Turner A., Sangpetch A., and Kim H.S., "Empirical Virtual Machine Models for Performance Guarantees," Proc. Large Installation System Administration 2010, San Jose, (2010)

[112] VMware Workstation, http://www.vmware.com/products/workstation/overview.html

[113] VMware vSphere Feature Comparison, http://www.vmware.com/virtualization/why-choose-vmware/vsphere-feature-comparison/most-trusted-virtual-infrastructure.html

[114] VMware vSphere, http://www.vmware.com/products/vsphere/mid-size-and-enterprise-business/overview.html

[115] Van H. N., Tran F. D., and Menaud J.M., "Autonomic virtual resource management for service hosting platforms," Proc. ICSE Cloud 09, Vancouver, (2009)

[116] Van H.N., Tran F.D., and Menaud J.M., "SLA-Aware Virtual Resource Management for Cloud Infrastructures," International Conference on Computer and Information Technology 2009, Dhaka, (2009)

[117] Venkatesha S., Sadhu S., and Kintali S., "Survey of Virtual Machine Migration Techniques," Workshop Advanced Operating Systems, University of California, Santa Barbara, (2009)

[118] Verma A., Ahuja P., and Neogi A., "Power-Aware Dynamic Placement of HPC Applications," International Conference on Supercomputing, Island of Kos, (2008)

[119] Voorsluys W., Broberg J., Venugopal S., and Buyya R., "Cost of Virtual Machine Live Migration in Clouds: A Performance Evaluation," Proc. CloudCom 2009, 1st International Conference on Cloud Computing, pp. 254-265, Beijing, (2009)


[120] Walters J. P., Bantwal B., and Chaudhary V., "Enabling Interactive Jobs in Virtualized Data Centers," Proc. CCA 2008, Hagen, (2008)

[121] Walters J. P., and Chaudhary V., "FT-OpenVZ: A Virtualized Approach to Fault-Tolerance in Distributed Systems," Proc. International Conference on Parallel and Distributed Computing Systems 2007, Las Vegas, (2007)

[122] WebSphere Application Server, http://www-01.ibm.com/software/webservers/appserv/was/

[123] Website of VMware ESX, http://www.vmware.com/de/products/vi/esx/

[124] Weng C., Li M., Wang Z., and Lu X., "Automatic Performance Tuning for the Virtualized Cluster System," Proc. ICDCS 2009, Montreal, (2009)

[125] Windows Azure, http://www.windowsazure.com/en-us/

[126] Wood T., Shenoy P., Venkataramani A., and Yousif M., "Black-Box and Gray-Box Strategies for Virtual Machine Migration," Proc. NSDI ’07, Cambridge, (2007)

[127] Wood T., Tarasuk-Levin G., Shenoy P., Desnoyers P., Cecchet E., and Corner M.D., "Memory Buddies: Exploiting Page Sharing for Smart Colocation in Virtualized Data Centers," Proc. International Conference on Virtual Execution Environments 2009, Newport Beach, (2009)

[128] Worldwide Quarterly Server Virtualization Tracker, http://www.idc.com/tracker/showproductinfo.jsp?prod_id=39

[129] Xiong P., Zhikui W., Gueyoung J., and Calton P., "Study on Performance Management and Application Behavior in Virtualized Environment," Network Operations and Management Symposium 2010, Osaka, (2010)

[130] Xu J., Adabala S., and Fortes J., "Towards Autonomic Virtual Applications in the In-VIGO System," Proc. International Conference on Autonomic Computing 2005, Seattle, (2005)

[131] Zhang J., and Figueiredo R. J., "Application classification through monitoring and learning of resource consumption patterns," Proc. IEEE IPDPS 2006, Rhodes Island, (2006)

[132] Zhao M. and Figueiredo R.J., "Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources", Proc. International Workshop on Virtualization Technologies in Distributed Computing, Reno, (2007)

[133] Zhikui W., Yuan C., Gmach D., Singhal S., Watson B.J., Rivera W., and Xiaoyun Z., "AppRAISE: Application-Level Performance Management in Virtualized Server Environments," IEEE Transactions on Network and Service Management, (2009)

[134] Zhu X., Wang Z., and Singhal S., "Utility-Driven Workload Management using Nested Control Design," Proc. American Control Conference 2006, Minneapolis, (2006)

[135] Zhu X., Young D., Watson B. J., Wang Z., Rolia J., Singhal S., McKee B., Hyser C., Gmach D., Gardner R., Christian T., and Cherkasova L., "1000 Islands: An Integrated Approach to Resource Management for Virtualized Data Centers," Proc. ICAC-09, Tiruchirappalli, (2009)


Virtual Machine Scheduling in Dedicated Computing Clusters

Problem Statement

Computing clusters serve, among other purposes, to execute distributed applications. These include applications with time-critical behavior, which process a continuous stream of input data and must meet specific timing constraints. Examples of such constraints are the maximum response time for user requests in web and cloud applications, or the targeted online processing frequency in particle physics experiments. A common approach to ensuring these timing constraints is over-provisioning: the time-critical application is operated in a hardware and software environment tailored specifically to it, a dedicated cluster. This guarantees that the targeted application performance is achieved for every data input rate specified in advance. However, this approach has a drawback: if the input data rate is lower than the specified maximum rate, hardware resources are not utilized satisfactorily. One example is the HLT-Chain application [8], which processes data of the ALICE experiment at CERN [1] during the experiment's runtime. Here, fluctuations in the data rate to be processed lead to under-utilization of the cluster environment. For reasons of cost and efficiency, however, exploiting temporarily underused resources is desirable.

This can be realized by additionally making the dedicated hardware and software environment available to third-party applications. Doing so, however, raises several challenges:

• Third-party applications require a suitable software and operating system environment. If they are incompatible with the existing environment (libraries, OS, architecture), deploying them requires a laborious adaptation of either the existing environment or the third-party application itself. Neither the costs¹ of such an adaptation nor the alternative restriction to third-party applications that run without adaptation are desirable.

• Installing and operating third-party applications in a dedicated environment can cause security problems: both explicitly placed malicious code and flaws in the operational concept or software implementation of third-party applications can enable attacks. Possible consequences are the loss of sensitive data, performance degradation, or even a crash of the time-critical application.

• Operating third-party applications can introduce software errors ("bugs") into the dedicated environment: errors such as memory leaks in third-party applications can impair the operating system environment and thus also the specification-compliant operation of the time-critical application.²

¹Costs here mean the entirety of all costs (total cost of ownership), which includes, besides acquisition costs, operating, personnel, and opportunity costs, among others.
²For example, under Linux a memory leak can trigger the emergency routine "Out of Memory Killer" [6], which forcibly terminates userspace processes in order to free main memory. Exempting processes required by the time-critical application is configurable, but this is not sufficient to ensure its proper operation: on the one hand, this heuristic procedure offers no guarantee that no required process is terminated; on the other hand, executing the routine entails considerable swapping overhead and thereby reduces system performance.

• At runtime, a third-party application uses resources (e.g., CPU, network, RAM) that were dimensioned for exclusive use by the time-critical application. This can lead to competition between applications and, in the event of resource scarcity, cause the time-critical application to violate its timing constraints. The extent of resource usage by third-party applications must therefore be controlled. If the input data rate varies, and with it the resource demand of the time-critical application, the resource usage of third-party applications must be adapted dynamically at runtime to avoid violating the timing constraints.

Based on these challenges, the objective of this thesis can be formulated as follows: the goal is to provide a software framework that enables the operation of third-party applications in a dedicated cluster in order to exploit unused resources. The framework must guarantee that the functionality of a time-critical application is not impaired. The software framework may influence the resource usage behavior of the third-party applications, but has no means of controlling the time-critical application itself.³ Operating third-party applications must increase the efficiency of the cluster, i.e., a) the number of computed results is increased and b) the CPU utilization of the cluster is raised.

In the following, the time-critical application is called the Main Application (MainApp). A third-party application to be operated in the cluster is called a Secondary Application (SecApp).

Existing Approaches

An evaluation showed that only one product, Condor [3], aims at increasing the efficiency of a cluster through SecApps while taking into account the non-impairment of a separately existing MainApp. This product, however, offers only limited means to adapt the resource usage of SecApps at runtime. Likewise, protection against software errors and security vulnerabilities of SecApps is not guaranteed. It is therefore not usable for the stated objective. Besides this product, there are works addressing partial aspects of the objective: both the specification-compliant operation of a single application and the efficiency increase of a cluster are topics in research and development. To meet fluctuating resource demands of an application, the use of virtualization technologies is currently being considered (e.g., [10][9][2]). These offer manipulation primitives with which the resource demand of virtualized applications can be met dynamically.⁴ Existing approaches, however, assume virtualization of the MainApp, investigate only a single manipulation primitive (e.g., hot migration), or offer no sufficiently tested implementation. In the area of cluster efficiency, job scheduling has a long history. Job schedulers (e.g., [3][5][7]) enable the parallel operation of multiple applications in a cluster and are able to optimize certain target metrics such as used CPU, job throughput, and job waiting time. However, they offer few means to revise resource allocation decisions once made and are therefore not flexible enough to react to the varying resource needs of a MainApp. Likewise, they offer no sufficient protection against software bugs and attacks and are thus not suitable for realizing the objective of this thesis.

³This excludes approaches that use the API of a time-critical application to ensure its functionality. Likewise, hypervisor commands cannot be used to scale a virtualized time-critical application. The reason for this exclusion is the non-applicability to the HLT-Chain application use case.
⁴Depending on the concrete virtualization platform, possible runtime modifications include suspension, hot migration, and dynamic adaptation of the number of CPU cores used by a virtual machine.

Proposed Solution

For SecApps to be operated in a dedicated cluster, it must be ensured that the runtime behavior of the MainApp is not impaired. To assess such impairment, a quantitative performance metric is needed. This thesis introduces the terms interference, magnitude of interference, and strong interference. Interference exists if, due to the operation of a SecApp, a statistically significant deviation of a performance metric of the MainApp can be observed, e.g., a significant increase of the mean response time per user request. Magnitude of interference denotes the difference between the metric values measured with and without additional SecApps. Such an observed deviation of the performance metric (interference) does not yet reveal whether the impairment of the MainApp is relevant for its correct functionality. For this, a target value for the performance metric must be specified. If such a specification exists, e.g., in the form of a Key Performance Indicator⁵ or an otherwise formulated target value such as a maximum response time per user request, then the relevance of the influence can be assessed. Strong interference exists if both a statistically significant deviation (interference) exists and a certain target value for the performance metric of the MainApp is not reached. Thus, the goal of not impairing a running MainApp can be equated with the avoidance of strong interference.
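The three terms can be sketched in a few lines of Python. The significance test shown (a two-sample z-test under a normal approximation) and the function names are illustrative assumptions, not necessarily the statistical procedure used in the thesis:

```python
from math import sqrt
from statistics import mean, stdev

def interference(baseline, loaded, z=1.96):
    """Interference: a statistically significant deviation of a MainApp
    performance metric (e.g. mean response time) while SecApps run.
    Returns (significant, magnitude); magnitude is the metric difference."""
    m0, m1 = mean(baseline), mean(loaded)
    # Standard error of the difference of the two sample means
    # (normal approximation -- an illustrative choice).
    se = sqrt(stdev(baseline) ** 2 / len(baseline)
              + stdev(loaded) ** 2 / len(loaded))
    magnitude = m1 - m0
    return abs(magnitude) > z * se, magnitude

def strong_interference(baseline, loaded, target):
    """Strong interference: the deviation is significant AND the target
    value (here: an upper bound on the metric) is missed."""
    significant, _ = interference(baseline, loaded)
    return significant and mean(loaded) > target
```

With response-time samples as input, a shift from a mean of 10 ms to 15 ms is interference; it becomes strong interference only if the 15 ms mean also exceeds the specified target value.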

To avoid strong interference, its potential causes must be identified. As mentioned above, a) software incompatibility, b) security vulnerabilities, c) software errors, and d) resource competition can impair a MainApp. This thesis proposes encapsulating SecApps in virtual machines (VMs). Virtualization effectively separates the runtime environments of the MainApp and the SecApps. On the one hand, this prevents interference caused by security and software problems of SecApps; on the other hand, it allows a dedicated operating system environment to be provided for each individual SecApp. Thus, even SecApps that would otherwise not run due to incompatibility can be operated. To counter competition between applications, the use of a decision instance is proposed. This decision instance is supplied with criteria based on which it decides on the initial placement of virtual machines in the cluster, their runtime re-placement, and local access to node resources. These criteria are called policies. Policies are configurable and state to what extent third-party applications may use a particular resource. Concretely, a policy P_A(X, Y) is defined by three parameters: A is a resource, X is a threshold, and Y is a time limit. Defining a policy sets the maximum time (Y) during which a third-party application may participate in the use of a resource while the utilization of resource A exceeds the value X. For example, the policy P_CPU(80, 10) is to be interpreted as: the one-second average of the CPU utilization (of a node) may exceed 80% in at most 10 consecutive measurements while a VM uses the CPU. This directive forces the decision instance to keep the CPU utilization of the node below 80% whenever possible and, if necessary, to stop the CPU usage of VMs.
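A minimal sketch of such a policy check, assuming one utilization sample per measurement interval; the class and method names are hypothetical and not taken from the framework's actual code:

```python
class Policy:
    """P_A(X, Y): while the utilization of resource A exceeds the
    threshold X (percent), VMs may keep using A for at most Y
    consecutive measurements."""

    def __init__(self, resource, threshold, time_limit):
        self.resource = resource      # A, e.g. "CPU"
        self.threshold = threshold    # X, in percent
        self.time_limit = time_limit  # Y, in measurement intervals
        self._consecutive = 0

    def update(self, utilization):
        """Feed one utilization sample; True means the decision instance
        must now revoke the VMs' access to this resource."""
        if utilization > self.threshold:
            self._consecutive += 1
        else:
            self._consecutive = 0
        return self._consecutive > self.time_limit
```

With P_CPU(80, 10), ten consecutive samples above 80% are still tolerated; the eleventh triggers an intervention, and any sample at or below the threshold resets the count.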

Based on a set of policies defined by the administrator, the decision instance decides when and where in the cluster SecApps may run, and modifies their placement and resource usage where necessary. For abstraction and uniform manipulation of arbitrary SecApps, these are represented by so-called leases. A lease consists of a set of

⁵Key Performance Indicator (KPI) is a concept proposed in ITIL [4] for specifying criteria for the correct functionality of an application.

VMs, leases, policies, and nodes can be dealt with. The proposed concept was implemented in a platform-independent manner in the Python programming language. Oracle VirtualBox was used as the prototypical virtualization solution. Platform-dependent components for obtaining resource utilization statistics and for locally modifying the resource usage of VMs were realized with standard Linux facilities. These are, however, encapsulated in an object-oriented way and can be replaced when the framework is used in alternative environments.
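As an illustration of obtaining such utilization statistics with standard Linux facilities, the following sketch derives node-wide CPU utilization from two snapshots of the kernel's cumulative counters in /proc/stat. The thesis does not specify this exact mechanism, so this is an assumption:

```python
import time

def _cpu_times():
    # First line of /proc/stat: "cpu  user nice system idle iowait irq ..."
    with open("/proc/stat") as f:
        values = [int(v) for v in f.readline().split()[1:]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
    return sum(values), idle

def cpu_utilization(interval=1.0):
    """Node-wide CPU utilization in percent, averaged over `interval`
    seconds, from the difference of two counter snapshots."""
    total0, idle0 = _cpu_times()
    time.sleep(interval)
    total1, idle1 = _cpu_times()
    busy = (total1 - total0) - (idle1 - idle0)
    return min(100.0, max(0.0, 100.0 * busy / max(total1 - total0, 1)))
```

Such a reading, taken once per measurement interval, is exactly the kind of sample a policy like P_CPU(80, 10) is evaluated against.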

Empirical Results

The empirical part of this thesis investigated two aspects. First, it was examined whether policies are suitable for scaling the number of generated SecApp results, the CPU utilization of the cluster, and the influence on the MainApp (interference). Second, it was tested whether a policy configuration can be found such that SecApps can be operated alongside the HLT-Chain application without strong interference occurring. For the first aspect, synthetic applications were introduced that generate temporally and spatially⁸ varying CPU and network loads in the cluster. The purpose of these synthetic applications is to allow the behavior of the framework to be examined for different load patterns representing unknown MainApps. It was shown that both the number of produced SecApp results⁹ and the additional cluster CPU utilization scale with the policy parameters threshold and time limit. For interference, it was shown that the influence on the MainApp decreases with a decreasing threshold. Based on these results, policies are suitable for configuring a trade-off between the benefit of third-party applications and the generated interference for different environments and MainApps. The usability of the approach is demonstrated concretely for the HLT-Chain application use case: in experiments, the software framework with a suitable policy configuration was used to compile software code in parallel to a running HLT-Chain application. Through this parallel use of the cluster environment, its CPU utilization was raised from 49% to 79%, additional results were computed, and no relevant impairment of the runtime behavior of the HLT-Chain was observed.

Summary and Conclusion

This thesis provides a software framework that enables the operation of third-party applications alongside a time-critical application in a dedicated cluster. The core problem was to ensure that the time-critical application is not impaired in its functionality by these third-party applications. The core ideas for solving this problem are a) the isolation of third-party applications through platform virtualization and b) the dynamic adaptation of the resource usage of VMs. The adaptation is realized through globally initiated VM manipulations (hot migration, start/stop, suspend/resume) and through decentralized regulation of the resource access of VMs based on operating system mechanisms. It was shown that the provided configuration units (policies) allow a trade-off between the benefit of third-party applications and the tolerated influence on the time-critical application. The usability of the framework was demonstrated concretely for the HLT-Chain application. The thesis extends the state of research in that it a) offers a previously unavailable way to safely operate third-party applications in a dedicated cluster, b) combines previously uncombined methods for this purpose, and c) makes resource allocation decisions using operating system information alone, thereby offering a generic approach for a wide variety of environments.

⁸Spatial variation of the load means that the utilizations of the network interfaces and the CPUs of the cluster nodes are varied independently of one another.
⁹A GCC compiler benchmark was used as the prototypical third-party application.

References

[1] The ALICE Collaboration, "The ALICE Experiment at the CERN LHC," JINST 3:S08002, 259pp, (2008)

[2] Chakode R., Yenke B.O., and Mehaut J.F., "Resource Management of Virtual Infrastructure for On-demand SaaS Services," Proc. International Conference on Cloud Computing and Service Science 2011, 352-361, Netherlands, Noordwijkerhout, (2011)

[3] Liu C., Zhao Z., and Liu F., "An Insight into the Architecture of Condor - A Distributed Scheduler," Proc. CNMT2009, China, Wuhan, (2009)

[4] Information Technology Infrastructure Library (ITIL) Homepage, http://www.itil-officialsite.com/

[5] Jackson D.B., "Maui Scheduler: A Multifunction Cluster Scheduler," Beowulf Cluster Computing with Linux, MIT Press, Cambridge, MA, pp. 351-368, (2008)

[6] Linux Memory Management Information Homepage, http://linux-mm.org/OOM_Killer

[7] Platform Load Sharing Facility, http://www.platform.com/workload-management/high-performance-computing

[8] Steinbeck T.M., Lindenstruth V., and Tilsner, H., "New Experiences with the ALICE High Level Trigger Data Transport Framework," Proc. CHEP04, Switzerland, Interlaken, (2004)

[9] VMware vSphere Homepage, http://www.vmware.com/products/vsphere/mid-size-and-enterprise-business/overview.html

[10] Zhu X. et al., "1000 Islands: An Integrated Approach to Resource Management for Virtualized Data Centers," Proc. ICAC-09, India, Tiruchirappalli, (2009)