Climbing Mont Blanc and Scalability
Total Page:16
File Type:pdf, Size:1020Kb
Climbing Mont Blanc and Scalability Christian Chavez Master of Science in Computer Science Submission date: July 2016 Supervisor: Lasse Natvig, IDI Norwegian University of Science and Technology Department of Computer and Information Science Problem Statement Climbing Mont Blanc and Scalability Climbing Mont Blanc (CMB) is a system for evaluation of programs executed on modern heterogeneous multi-cores such as the Exynos Octa chips used in, e.g., Samsung Galaxy S5 and S6 mobile phones, see https://www.ntnu.edu/idi/card/cmb. CMB evaluates both performance and energy efficiency and provides the possibility of performance rank- ing lists and online competitions. A first version of the system is available and under trial use. This master thesis project is focused on improving the system with increased scalability so that the system can handle more user submissions per hour. The project involves the following subtasks: 1. Study the existing solution in CMB for automatic system monitoring and recovery and suggest improvements. 2. Describe and implement a dispatcher in the current CMB system that will allow the use of multiple XU3 backends to serve concurrent user submissions. 3. Test the dispatcher with two or three XU3 boards. Implement a simple script for generating a synthetic load (simulating users and submission) to be able to evaluate the scalability of a new CMB variant using the developed dispatcher. 4. Describe how the dispatcher can be used to allow different boards than the XU3 to be used together with XU3 boards. Discuss what effects this will have on other parts of the system. If time permits: 5. Propose possible throttling techniques for cases where CMB gets too many/frequent submissions from a single user, or in total from all active users. 6. Propose compilation/Makefile improvements, and the possibility of a makefile that can be edited by problem-setter through admin-interface. 7. Propose how more detailed statistics such as performance counter values can be given as feedback to the CMB user. 8. Propose the addition of more programming languages and libraries. 9. Propose an extension of the dispatcher into a load-balancing broker. 10. Implement some of the proposed solutions after approval by, and in collaboration with the CMB team. The master thesis project is part of the EECS Strategic Research project at IME (www. ntnu.edu/ime/eecs). Supervisor: Prof. Lasse Natvig i Abstract This thesis details a proposed system implementation upgrade for the CMB system, accessible at climb. idi.ntnu.no, which profiles C/C++ code for its en- ergy efficiency on an Odroid-XU3 board, which utilises a Samsung Exynos 5 Octa CPU, and has an ARM Mali-T628 GPU. Our proposed system implementa- tion improves the robustness of the code base and its execution, in addition to permitting an increased throughput of submissions profiled by the system with the implementation’s dispatcher which allows the sys- tem to utilise several Odroid-XU3 backends for the energy and timing measurement profiling. Our tests show that our implementation can achieve more than 4x speedup with “Hello World” submissions using a parallelized web server, and around 2x speedup with “Shortest Path” submissions using a serial web server. ii Preface I have many I want to acknowledge for all their help, support, and comradeship during my time in academia (not just for the duration of this master project, at the end of the line), but individuals of note include the following: • Lasse Natvig, my supervisor for this Master project, for his patience, wisdom, and guidance through the course of this project. • IDI’s Technical group, especially Arne Dag Fidjestøl, Jan Grønsberg, and Erik Houmb, for their support and tips in the development of this project’s proposed system implementation. • Sindre Magnussen, for his help in understanding and learning the workings of the CMB system. • Dag Frode Solberg, Christoffer Viken, (and the rest of the crowd from NTNU’s PVV), for all those hours, spent as my rubber ducks and coming with tips and guidance when my technical competence was insufficient. • Finn Inderhaug Holme, for saving my bacon when the power connections of the equipment in Trondheim had to be disconnected and reconnected after I had moved to Oslo due to the delays incurred during this project. • My family for their support and love throughout. • And anyone else whose notable assistance may have (temporarily!) been forgotten during the time of this writing. Note: This report is rather long and was not condensed down as is the norm, due to the time constraints described on the next page. However, I want to make it clear that if you care about trees, you should not print this report in its entirety. Chapters5, 6, and AppendixesA,C,D, are all rather long, and especially boring to read on paper. iii The main challenge faced during this project At the outset of this 21-week, contractually decided period allotted for this master project, Celery was chosen by the author of this project with the support of the supervisor, Lasse Natvig, to be a promising and efficient way to solve a majority of the challenges of this project. However, on the 15th of March, in a meeting with a member of the institutes’s (IDI’s) Technical Group (Arne Dag Fidjestøl), and the other master student currently working on the CMB project (Sindre Magnussen), it was concluded that Celery, while fit for the task, was introducing more complexity into the system, than what was currently needed. This conclusion was based on the fact that the earlier project future use estimates of the CMB system were too ambitious, and thus the CMB system did not need to support so potentially high frequencies and concurrently submitted submissions by its user base. Therefore, in week 10 of this project (start of the project’s 21-week allotted time was the 11th of January), it was decided that a proposed implementation based upon the use of Celery, would be of little use to future iterations of the CMB system, at this time. Thus, the author of this project has reversed all efforts of implementing the use of Celery into CMB and has instead landed upon (and implemented) the proposed system implementa- tion described in Chapter4. In addition to compensation for a three week documented sick-leave, this project has received an extension of two additional weeks to compensate for this setback. On top of all that, the author of this paper had to move residence from Trondheim to Oslo in the last four weeks of this project, as the move had been planned and scheduled before the delays happened and the compensation was given. The author of this paper has had to omit goals and desired implementations/improve- ments of this project (and paper research), due to this time-constraining setback, but wants to state that had this project been granted 4-6 more weeks, the complete (or near- complete) implementation of both the database, and automatic system monitoring and recovery (described in Chapters4 and9) may well have been realized. Abbreviations and Glossary ARM = Advanced RISC Machine (http://www.arm.com/) BSC = Barcelona Supercomputing Center CMB = Climbing Mont Blanc https://climb.idi.ntnu.no/ HPC = High-Performance Computing MB = Mont-Blanc (The EU Project: https://www.montblanc-project.eu/) SoC = System(s)-on-Chip VM = Virtual Machine iv Table of Contents Problem Statementi Abstract ii Preface iii Abbreviations and Glossary iv Table of Contentsv List of Tables ix List of Figuresx List of Listings1 1 Introduction3 1.1 Motivation...................................3 1.2 Project Goals..................................4 1.2.1 Automatic System Monitoring and Recovery............5 1.2.2 The Dispatcher.............................5 1.3 Thesis Structure................................5 2 Background7 2.1 Mont-Blanc, The EU Project.........................7 2.2 The Climbing Mont Blanc System......................8 2.3 Odroid-XU3...................................9 2.4 Concurrency Softwares Considered......................9 2.4.1 ZeroMQ................................. 10 2.4.2 Celery.................................. 11 v 3 Related Work 15 3.1 Online Judges - Websites for Competitive Programming.......... 15 3.2 Backend Parallelization Projects....................... 16 4 Proposed System Solution 19 4.1 The outset state of CMB........................... 19 4.1.1 Proposed solution to environment variables/settings........ 22 4.2 Automatic system monitoring and recovery................. 23 4.3 Upgrade from Python 2.7 to 3.4........................ 26 4.3.1 Git submodules............................. 27 4.4 Database changes................................ 27 4.4.1 Changes necessitated by the Dispatcher............... 28 4.4.2 Potential changes for software language support........... 29 4.5 The Dispatcher................................. 29 4.6 The Backend.................................. 31 5 Server Installation Instructions 35 5.1 Getting the Code................................ 36 5.2 Install Instructions and Pre-Requisites.................... 37 5.3 Starting the Server............................... 39 5.3.1 Start-up Script Differences...................... 39 6 Backend Installation Instructions 40 6.1 Install Instructions and Pre-Requisites.................... 41 6.2 Getting the code................................ 44 6.3 Starting the Backend.............................. 46 7 Methodology 47 7.1 Hardware & Hardware Set-Up......................... 47 7.1.1 CMB Server............................... 48 7.1.2 Backends................................ 48 7.2 Software