Flexible Computing with Virtual Machines

by

H. Andrés Lagar-Cavilla

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto

Copyright 2009 by H. Andrés Lagar-Cavilla

Abstract

Flexible Computing with Virtual Machines

H. Andrés Lagar-Cavilla

Doctor of Philosophy

Graduate Department of Computer Science

University of Toronto

2009

This thesis is predicated upon a vision of the future of computing with a separation of functionality between core and edges, very similar to that governing the Internet itself.

In this vision, the core of our computing infrastructure is made up of vast server farms with an abundance of storage and processing cycles. Centralization of computation in these farms, coupled with high-speed wired or wireless connectivity, allows for pervasive access to a highly-available and well-maintained repository for data, configurations, and applications. Computation in the edges is concerned with provisioning application state and user data to rich clients, notably mobile devices equipped with powerful displays and graphics processors.

We define flexible computing as systems support for applications that dynamically leverage the resources available in the core infrastructure, or cloud. The work in this thesis focuses on two instances of flexible computing that are crucial to the realization of the aforementioned vision. Location flexibility aims to, transparently and seamlessly, migrate applications between the edges and the core based on user demand. This enables performing the interactive tasks on rich edge clients and the computational tasks on powerful core servers. Scale flexibility is the ability of applications executing in cloud environments, such as parallel jobs or clustered servers, to swiftly grow and shrink their footprint according to execution demands.

This thesis shows how we can use system virtualization to implement systems that provide scale and location flexibility. To that effect we build and evaluate two system prototypes: Snowbird and SnowFlock. We present techniques for manipulating virtual machine state that turn running software into a malleable entity which is easily manageable, is decoupled from the underlying hardware, and is capable of dynamic relocation and scaling. This thesis demonstrates that virtualization technology is a powerful and suitable tool to enable solutions for location and scale flexibility.

Dedication

To those who are no longer here, yet remain.

To those who arrive with a smile like no other.

And to those who are always by my side.

Acknowledgements

First of all I must recognize the invaluable help that my academic advisors have provided in guiding my very stubborn self through the quandaries of uncertain research directions, chronic rejection of papers, and the overall existential graduate student angst. In other words, thanks for all the support, and specific gratitude to one Prof. Eyal de Lara for his undeterred willingness to lock horns with me in the domain of ideas on a weekly basis. Thank you, Eyal. Profs. Satya (M. Satyanarayanan) and Mike Brudno have also been tremendous sources of guidance, each in his own characteristic style, and each almost a de-facto co-advisor for a little while. Profs. David Lie, Angela Demke-Brown, Bianca Schroeder and Ashvin Goel have always been around to bounce ideas off. Finally, Prof. Steve Hand was an excellent, candid and insightful external examiner.

On a more practical note, this thesis describes work performed throughout a period of four years in collaboration with a number of co-authors whom I'd like to acknowledge: Niraj Tolia, Adin Scannell, Joseph Whitney, Stephen Rumble, Philip Patchin, and Profs. Dave O'Hallaron, M. Satyanarayanan, Michael Brudno and Eyal de Lara. I am profoundly grateful for the advice and feedback provided by these people throughout these years, and for the help provided when writing papers for submission to peer-reviewed venues.

Friends and occasional collaborators, in Argentina and in Canada, new and old, thanks, always: Jing Su, Adin Scannell, Patricio Simari, Lionel Litty, Sebastián Sardiña and everyone else I forgot or have included already.

Claudia, Beatriz, Gustavo, my grandparents, and now Lucas: for you and because of you.

Contents

1 Flexible Computing
    1.1 To the Point: What is Flexible Computing Then?
    1.2 Virtualization
    1.3 Snowbird: Location Flexibility
    1.4 SnowFlock: Scale Flexibility
    1.5 Outlook
    1.6 Thesis Statement
    1.7 Organization

2 Related Work
    2.1 Virtualization
        2.1.1 A Bit of History on Virtual Machine Monitors
        2.1.2 Modern Realizations of Virtualization
        2.1.3 Microkernels and Containers
    2.2 Computation Migration Through the Stack
        2.2.1 Application-Specific Methods
        2.2.2 Toolkit-based Migration
        2.2.3 Language-based Migration
        2.2.4 Process Migration
        2.2.5 Container Migration
        2.2.6 The Case for Virtual Machine-based Migration
        2.2.7 The Grid
        2.2.8 Automated Migration Policies
    2.3 Managing Virtual Machine State
        2.3.1 VM Consolidation and the Rise of the Cloud
        2.3.2 Cloning State

3 Location Flexibility with Snowbird
    3.1 Introduction
    3.2 Design and Implementation
        3.2.1 Choice of VM Technology
        3.2.2 Interaction-Aware Migration Manager
        3.2.3 Hardware-Accelerated Graphical Interface
        3.2.4 The WANDisk Storage System
    3.3 Usage Experience and Experimental Evaluation
        3.3.1 Application Benchmarks
        3.3.2 Interactive Session Replay
        3.3.3 Experimental Configurations
        3.3.4 Experimental WAN Testbed
        3.3.5 Results
    3.4 Implementation Limitations and Future Extensions

4 Virtualized Graphics Acceleration with VMGL
    4.1 Rationale
    4.2 Background
        4.2.1 Hardware Acceleration
        4.2.2 OpenGL Primer
        4.2.3 X11 Hardware Acceleration
        4.2.4 Graphics Virtualization
    4.3 Design and Implementation
        4.3.1 Architecture
        4.3.2 OpenGL Transport
        4.3.3 Suspend and Resume Functionality
        4.3.4 Porting GPU Drivers For Xen
        4.3.5 Limitations
    4.4 Evaluation
        4.4.1 Benchmarks
        4.4.2 Experimental Setup
        4.4.3 Performance
        4.4.4 VMM Portability
        4.4.5 Portability Across Guest Operating System
        4.4.6 Suspend and Resume
        4.4.7 Sensitivity to Screen Resolution
        4.4.8 Sensitivity to Multi-Core Processing
        4.4.9 Concurrent Guests
    4.5 Work Related to VMGL
    4.6 Future Directions and Conclusion

5 Scale Flexibility with SnowFlock
    5.1 Introduction
    5.2 VM Fork: Usage and Deployment Model
    5.3 Challenges and Design Rationale
    5.4 SnowFlock Implementation
        5.4.1 API
        5.4.2 Simplifying Adoption
        5.4.3 VM Descriptors
        5.4.4 Memory On Demand
        5.4.5 Avoidance Heuristics
        5.4.6 Multicast Distribution
        5.4.7 Virtual I/O Devices
    5.5 Evaluation
        5.5.1 Applications
        5.5.2 Macro Evaluation
        5.5.3 Micro Evaluation
    5.6 Summary
        5.6.1 Future directions for SnowFlock

6 Conclusion

Appendix: Attribution

Bibliography

Chapter 1

Flexible Computing

This thesis is predicated upon a vision of the future of computing which is shaped, primarily, by the following three trends:

• First, a pervasively-available, high-speed, wide-area interconnect: the Internet.

• Second, the economies of scale in raw processing and storage power that make widespread deployment of datacenters possible.

• And third, the specialization of user-centric client hardware into relatively less powerful GUI-oriented devices, which we will call rich clients.

The confluence of these three trends lends credence to a model with a separation of functionality between core and edges, very similar to that governing the Internet itself.

In this model, the core of the computing infrastructure is made up of vast server farms with an abundance of storage and processing cycles. Centralization of computation in these farms, coupled with high-speed wired or wireless connectivity, allows for pervasive access to a highly-available and well-maintained repository for our data, configurations, and applications. Computation in the edges is concerned with provisioning application state and user data to rich clients, notably mobile devices such as smart phones or small portables, equipped with powerful displays and graphics processors.


While sounding admittedly utopian, this vision has the potential of combining the salient characteristics of two previous iterations in our collective computing experience: the mainframe and the personal computer.

The mainframe era made management of computing systems simple(r) by tasking personnel with maintaining a single (complex) computer. Users could connect at will, within the premises of the organization, and access their data and environment with a seamless transition between endpoints or terminals. These endpoints, however, were thin clients or dumb terminals, which were incapable of providing a proper interactive experience due to a lack of sophistication in their design, as well as exposure of the user interactions to network latency and jitter.

Personal computers provided the average user with a dedicated private computing system. They brought along the excellent interaction capabilities crystallized in modern GUIs, and made possible the application domains of multimedia, office productivity, visualization and design. However, the side-effects of endowing each user with one (and recently more than one) full-blown computer system include botnets, worm spread, widespread bitrot of domestic systems, fatal data loss or compromise, and major administrative burden in the enterprise environment.

Our vision endows users with the ability to leverage excellent interactive performance with rich edge devices, yet preserves a centralized model providing pervasive access to their data and applications residing in the core. The model of computing where applications reside in big clusters accessed via Internet connections – possibly wide-area connections to geographically distant locations – is usually referred to as cloud computing.

1.1 To the Point: What is Flexible Computing Then?

In this thesis we define cloud computing as the ability to seamlessly reach into Internet-connected server farms to tap into computation resources exceeding those which are locally available. This notion requires a degree of flexibility from applications, such that they are dynamically able to leverage cloud resources. Our focus is on providing systems support to afford that flexibility to applications. We define flexible computing as systems support for applications that dynamically leverage cloud resources.

The scope of this definition is rather vast, and the work presented here focuses on two specific manifestations: location and scale flexibility. The realization of our vision is critically dependent on these two building blocks. Location flexibility, or more simply application relocation, aims to, transparently and seamlessly, migrate applications between the edges and the core based on user demand. When performing interactive tasks the application resides on a rich client capable of graphics acceleration located on the user end. When the application enters a processing-intensive stage, it is relocated over the wide-area to more powerful hardware, potentially with higher-bandwidth access to a dataset of interest.

Scale flexibility is the ability of applications executing in cloud environments to swiftly grow and shrink their footprint according to execution demands. As more and more of our logic resides in cloud servers, the need for dynamic scale flexibility increases. Many applications are intrinsically in need of scaling as a function of incoming load, underlying processing availability, or other environmental conditions. Some prominent examples include parallel applications, clustered web servers, and data-intensive jobs involving information of interest to the end-users. Such information may be provided from sources such as bodily sensor readings, preference elicitation engines, multi-media annotations, social networking traces, etc. Cloud-based deployment promises the ability to dynamically leverage large amounts of infrastructure resources for these applications. Scale flexibility provides this notion with a well-known API that has simple semantics: stateful replication of an entire cloud computing element, much like fork logically copying the entire address space of a parent process to a child process. We achieve these useful and intuitive semantics in an efficient manner by recognizing that the main bottleneck in a cloud or cluster environment is the scalability of the shared network interconnect. Without sacrificing runtime application performance, we show that we can very conservatively utilize the network pipe to replicate executing tasks to multiple physical hosts within sub-second time frames.
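To make the analogy concrete, the following minimal C program (an illustration only, using the standard POSIX fork call rather than any interface introduced in this thesis) shows the semantics being borrowed: the child starts with a logical copy of the parent's entire address space, and changes made by either side remain private.

    /* A minimal illustration of the fork semantics referenced above:
     * the child starts with a logical copy of the parent's entire
     * address space, yet typical implementations copy pages lazily. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int data = 42;                /* state the child inherits */
        pid_t pid = fork();
        if (pid < 0) { perror("fork"); exit(1); }
        if (pid == 0) {
            /* Child: sees a copy of 'data'; changes stay private. */
            data *= 2;
            printf("child sees %d\n", data);
            exit(0);
        }
        waitpid(pid, NULL, 0);        /* Parent: its copy is untouched. */
        printf("parent still sees %d\n", data);
        return 0;
    }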

1.2 Virtualization

A unifying theme throughout this thesis is the use of virtualization technology. Virtualization can be defined as the ability to sandbox an operating system (OS) by using software constructions that provide the OS with the illusion that it is running on top of real hardware in isolation. Conceived initially in the 1960s, and overlooked for a couple of decades, virtualization is now a fundamental building block of a multitude of systems in use today. Virtualization offers a number of desirable properties, such as isolation, security, tight user control of her own sandbox, and ease of configuration and accounting for system administrators, that have fostered its adoption particularly as a cloud computing enabler.

In particular, we identify two properties of virtualization which are of great value for this work. First, by sandboxing an entire software stack, including the OS, a Virtual Machine (VM) encapsulates most internal software dependencies and can effectively decouple software from hardware. This property turns hardware resources into fungible entities that can be re-purposed on the fly, by applying new software stacks to them.

The second – and highly related – property is that a serialized VM is a “bag of bits” that can be run on any x86 hardware, i.e. on the vast majority of hardware in existence.

While there are a few caveats to the above assertion, this property allows for malleable manipulation of computing state, such as the ability to relocate or replay execution, in ways not achieved by techniques operating at the language or OS level.

This thesis demonstrates the use of virtualization to implement systems that provide scale and location flexibility. It presents techniques for manipulating VM state that also turn running software into a malleable entity which is easily manageable and re-purposed on-the-fly, capable of dynamically relocating and scaling. As a consequence, this thesis shows that virtualization technology is the right enabler for solutions to the scale and location flexibility problems. Furthermore, it shows that virtualization can be used for other purposes beyond collocation and isolation, such as the transparent relocation or resizing of computing tasks.

This thesis encompasses two main projects: Snowbird and SnowFlock. In Snowbird we explore an implementation of location flexibility based on virtualization. The SnowFlock project studies scale flexibility in shared clusters of virtual machines.

1.3 Snowbird: Location Flexibility

We call bimodal those applications that alternate between crunch stages of resource-intensive computation, and cognitive stages of involved user interaction. Bimodal applications can be found in scientific computing, bioinformatics, graphics design, video editing, and other disciplines. Consider, for instance, a 3D modeling application. The user spends a significant amount of time interacting with the application through a sophisticated 3D GUI, designing avatars and scripting an animation. Afterwards, she launches a full-scale rendering of the animation, which is an easily parallelizable CPU-intensive task. Similar workflows can be observed in scientific applications in which a calculation-intensive simulation is followed by a GUI-intensive visualization of the results.

The crunch tasks are better served in large servers, where multi-core parallelism can be leveraged aggressively. Furthermore, if the crunch phase consumes a large dataset, execution at a location granting high-bandwidth access to the dataset is desirable. Crunch phases are thus executed removed from end-users, who use thin clients to control the remotely executing application. However, cognitive phases are better served by executing as close to the user as possible. First, thin clients are limited by the propagation delay, which is hard to optimize and ultimately limited by the speed of light and geographical distances. Furthermore, desktop users, and particularly those who use sophisticated GUIs, are used to leveraging local graphics hardware acceleration to provide a smooth, fast, and visually appealing interactive response. Remote execution with thin clients is thus at a fundamental disadvantage when attempting to deliver a crisp interactive experience comparable to that provided by desktop computing as we understand it today.

Bimodal applications are thus torn between two conflicting performance goals, and two conflicting modes of execution. We attack this problem in the first part of this thesis.

With Snowbird, we use virtualization to provide application migration in a seamless and transparent manner. This allows bimodal applications to execute locally during cognitive phases and remotely during crunch phases. With virtualization, we provide transparency by preserving the user environment, and allowing unmodified use of legacy or closed source code. Seamlessness is achieved by leveraging live-migration techniques.

Three contributions complete the Snowbird project. First, an interaction-aware migration manager monitors the application's activity to detect changes of modality and to trigger the necessary relocations, particularly those related to a decay in the quality of the application's interactive performance. The migration manager completes the transparency arguments: no changes to applications are required to make them migratable. Second, a graphics acceleration virtualization subsystem allows applications running inside a virtual machine to use hardware acceleration and achieve near-native graphics performance during cognitive phases. Third, a wide-area, peer-to-peer storage subsystem addresses the challenges of migrating the large amount of state contained in a virtual machine image.
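The following sketch illustrates the kind of policy loop an interaction-aware migration manager might run; the helper functions and thresholds are hypothetical placeholders for the monitoring and migration facilities, and the actual design is described in Chapter 3.

    /* A sketch (not Snowbird's actual code) of the interaction-aware policy
       described above. The extern hooks are hypothetical: they stand in for
       the monitoring and migration facilities detailed in Chapter 3. */
    #include <unistd.h>

    #define CPU_CRUNCH_THRESHOLD      0.90  /* sustained CPU use => crunch phase   */
    #define INTERACTIVE_LAG_THRESHOLD 0.15  /* seconds of UI lag => cognitive need */
    #define SAMPLING_PERIOD_SECS      5

    typedef enum { AT_EDGE, AT_CORE } location_t;

    extern double sample_cpu_utilization(void);      /* hypothetical monitor hook */
    extern double sample_interactive_lag(void);      /* hypothetical monitor hook */
    extern void   migrate_vm(const char *dest_host); /* hypothetical VMM hook     */

    void migration_manager_loop(location_t where)
    {
        for (;;) {
            double cpu = sample_cpu_utilization();
            double lag = sample_interactive_lag();

            if (where == AT_EDGE && cpu > CPU_CRUNCH_THRESHOLD) {
                migrate_vm("core-server");      /* crunch detected: move to the core */
                where = AT_CORE;
            } else if (where == AT_CORE && lag > INTERACTIVE_LAG_THRESHOLD) {
                migrate_vm("edge-client");      /* interactivity decays: move back   */
                where = AT_EDGE;
            }
            sleep(SAMPLING_PERIOD_SECS);
        }
    }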

1.4 SnowFlock: Scale Flexibility

Cloud computing promises the ability to quickly scale the footprint of applications executing on third-party premises, without troubling the application provider with infrastructure acquisition and maintenance costs. This nascent paradigm is founded on the economies of scale of storage and processor hardware, and the maturity of virtualization technology. We claim, however, that the current practices are based on recycling existing techniques that are inadequate for future uses of the cloud environment.

We identify two problems with existing cloud practices to scale the footprint of an application. To leverage more physical resources, VMs need to be instantiated to populate those resources. VMs are typically created by booting off a template image: this demands the "swapping in" of large amounts of VM disk state, typically in the order of four to sixteen GBs, from backend storage. With multiple GBs of VM state being transmitted over the network, this process takes "minutes" (as anecdotally pointed out in the trade literature by the Amazon Elastic Compute Cloud (EC2) [10] and others), and lacks scalability. This is because the network fabric already sustains intense pressure from application data needs, which scale aggressively with the abundance of processors and disk platters. The interconnect thus becomes one of the most crucial performance bottlenecks in a datacenter, and multiple creations of dozens of VMs by hundreds of users will only compound the latency of application scaling.

In addition to the performance constraints of the current techniques, dynamic scaling in cloud computing environments is unreasonably hard to program. An application provider is tasked with dealing with complexity that is not germane to the application logic. First, the developer must find work-arounds for the latency involved in spawning new virtual machines. This task typically involves engines performing load prediction for the purposes of pre-allocating VMs, and also keeping VMs idle once they are not immediately useful anymore. Second, since new VMs are typically fresh boots from a template image, current application state needs to be explicitly pushed to the new pristine VMs – typically by logging in, copying files, and configuring and starting services.

We propose a cloud API that simplifies the management of spawning new VMs and propagating application state to the new computing elements. SnowFlock implements the notion of VM fork: replication of an entire VM to multiple copies in parallel. Since the objective is application scaling, each VM clone resides on a different physical host, ready to leverage the compute resources available there. SnowFlock mimics the well-understood semantics of UNIX fork: it provides logically instantaneous stateful replication of the entire address space of the computing element, in this case a VM. Full replication of processes and threads is a paradigm that programmers understand very well, and that they have used very effectively over decades to program applications that dynamically grow their footprint. Two prime examples of the usefulness of fork include parallel computation and scaling a clustered load-balancing server by growing its worker pools.
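The following sketch suggests how an application might use a VM-fork primitive with these semantics to grow a worker pool; the vmfork_* identifiers are illustrative placeholders rather than the literal SnowFlock API, which is presented in Chapter 5.

    /* A sketch of how a VM-fork API with UNIX-fork-like semantics might be
       used to grow a parallel worker pool. The vmfork_* names are
       illustrative, not the actual SnowFlock interface (see Chapter 5). */
    extern int  vmfork_clone(int nclones);    /* returns 0 in parent, 1..n in clone */
    extern void vmfork_exit(void);            /* clone discards itself when done    */
    extern void process_slice(int id, int n); /* application-specific work          */

    void scale_out(int nclones)
    {
        int id = vmfork_clone(nclones);       /* stateful replication of the whole VM */
        if (id > 0) {
            /* Clone: inherits the parent's full memory and disk state, so it can
               pick up its share of the work immediately, with no re-initialization. */
            process_slice(id, nclones);
            vmfork_exit();
        }
        /* Parent VM continues here, e.g. to collect results from the clones. */
    }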

SnowFlock thus provides an API that programmers can easily use to develop dynamically resizable parallel tasks in the cloud. To be relevant, SnowFlock must not only provide stateful replication of VMs, but it must also do so with excellent performance. We achieve this by relying on three insights. First, that although the entire address space is logically replicated, child VMs tend to access a small subset of the parent VM state. Second, that memory and disk access patterns exhibit high similarity among clones. Because the VM clones are all involved in the same computation, they tend to need the same pieces of code and data, one of the main reasons why we provide stateful cloning. We exploit that similarity in access patterns to better provide VM state to clones. And third, that after being cloned, VMs tend to include in their working set important chunks of state they generate on their own, and that does not need to be fetched from the parent. SnowFlock combines these insights to enable almost instantaneous (i.e. less than a second) scale flexibility for parallel services running in virtualized clouds, and frugal utilization of shared network resources (for example, less than a hundred MBs of VM state transmitted for 32 replicas of a VM with 1GB of RAM and several GBs of disk).
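The following simplified sketch conveys the flavour of fetching VM memory on demand while applying an avoidance heuristic; the helper functions are illustrative placeholders, and the real mechanisms are detailed in Chapter 5.

    /* A simplified sketch of the memory-on-demand idea outlined above (the
       real mechanism and its avoidance heuristics appear in Chapter 5).
       Function and field names are illustrative only. */
    #include <stdint.h>
    #include <string.h>
    #include <stdbool.h>

    #define PAGE_SIZE 4096

    extern bool page_present(uint64_t pfn);                    /* already fetched?  */
    extern bool write_will_overwrite_whole_page(void);         /* heuristic hint    */
    extern void fetch_page_from_parent(uint64_t pfn, void *b); /* network fetch     */
    extern void map_page(uint64_t pfn, const void *contents);

    /* Invoked when a clone touches a page it does not yet have. */
    void clone_page_fault(uint64_t pfn, bool is_write)
    {
        uint8_t buf[PAGE_SIZE];

        if (page_present(pfn))
            return;                              /* nothing to do                   */

        if (is_write && write_will_overwrite_whole_page()) {
            /* Avoidance heuristic: the clone is about to generate this state
               itself, so do not waste the network fetching the parent's copy. */
            memset(buf, 0, sizeof(buf));
        } else {
            fetch_page_from_parent(pfn, buf);    /* lazy, on-demand propagation     */
        }
        map_page(pfn, buf);
    }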

1.5 Outlook

The contributions in this thesis provide two building blocks for the vision outlined at the beginning. They enable computation to seamlessly move between the edges and the core, according to the role that is most convenient in each case, i.e. resource-intensive computation remotely, and interaction-heavy cognitive loops using rich client hardware locally. For applications residing in the core or cloud, this thesis shows how to efficiently scale the footprint of the task dynamically, provides a natural API that simplifies cloud programming, and addresses applications such as parallel backends or clustered load-balancing servers.

These are two crucial building blocks in our vision, but not the only ones. For example, security is a paramount concern. Cloud facilities offer a prime environment for launching high-scale malicious endeavours without risking identification; we have begun to look at combating this problem [113]. Additionally, our vision pushes for a specialization of hardware and functionality that could lead to a specialization of the software itself.

Software running on specialized edge and core devices does not need as much versatility as current platforms, which could lead to efforts in replacing or pruning the functionality of general purpose operating systems.

The use of system virtualization is limited by one important factor: the compatibility of the underlying processor ABI. This has not been, in general, a major concern, as most hardware in use is x86-compatible, with minor variants. The vision outlined here, however, is susceptible to the emerging importance of ARM-based devices in the smartphone arena, which are simply not binary-compatible with x86 "big iron"¹. This is an important challenge to our vision, one that in some cases might demand shifting our focus up the stack into language-based virtualization (i.e. bytecode interpretation).

¹ A toyish relativism.

1.6 Thesis Statement

To wrap up this opening chapter, we summarize the position of this thesis in the following paragraphs. We then describe the organization of this thesis and briefly introduce the contents of each chapter.

Computing is bound to change fundamentally in the next few years. The traditional model of personal computing will be reshaped by three cooperating forces: the popularity of rich user-centric client devices such as netbooks and smartphones; the pervasive access to high-bandwidth connectivity made possible by the Internet; and the existence of large compute farms made possible by processor and disk economies of scale.

The confluence of these three forces can result in a paradigm shift that promises to combine the best of the previous two iterations of our collective computing experience: the mainframe and personal computing eras. Our applications and data can be hosted in pervasively-accessible, powerful, well-maintained, and highly reliable datacenters, providing the simple yet powerful model of an always-on centralized point of access reminiscent of the mainframe era. The excellent connectivity, processing and graphics power of client devices can allow us to retain the rich and crisp interactive performance of personal computers that made multimedia, design and productivity applications possible. This paradigm of applications residing in the network is called cloud computing.

In this thesis we focus on two fundamental challenges preventing the realization of a cloud computing vision. We first show how applications can be automatically moved between the client devices at the user location (the edge) and the powerful datacenters buried deep within the network (the core). Applications from domains such as design, engineering, multimedia or scientific computing can then seamlessly switch their site of execution to best accommodate the current activity. When designing an animation or visualizing the result of a simulation, applications execute at the user end, eliminating the negative effects of network delay and leveraging graphics acceleration hardware. When performing intensive computations to align DNA sequences or to produce the final rendering of a feature film, applications execute in the datacenter, leveraging compute power far beyond that available to a single user. These relocations are achieved without requiring any changes to existing applications or imposing the use of any programming language, operating system, or software toolkit.

The second challenge pertains to efficient high-performance execution in the core or datacenter. We propose a new approach to scale out applications: in under a second, a single virtual computer encapsulating an application can replicate itself to take over hundreds of processors. This is achieved with semantics that are easily understandable by programmers, closely matching standard operating system facilities in use for over two decades. Our proposed approach is an order of magnitude more efficient in its utilization of network resources, the fundamental bottleneck in a datacenter, and is thus in turn an order of magnitude more efficient in its capability to scale an application to hundreds of computing nodes. Our work opens the door to much more efficient and opportunistic use of datacenter compute power by applications such as parallel jobs, clustered web servers, or computing tool chains. It also enables new applications, such as instantaneous disposable clusters, on-the-fly debugging of distributed systems, and opportunities to save energy by turning off servers that are now idle thanks to the increased efficiency.

Both challenges described here, and their solutions, share a common thread: the ability of systems software to allow applications to dynamically leverage the resources of the cloud computing model. We call this ability flexible computing.

The other common thread permeating this work is the use of system-level virtualization. VMs have the ability to encapsulate an entire software stack, serialize it, and allow it to run at any location supporting the x86 ABI, without any other software or hardware demands. This powerful property allows for the use of sophisticated techniques when manipulating VM serialized state that enable many applications, such as the two presented in this thesis: the ability to dynamically relocate and scale applications within a cloud framework. We use this property as an organizing principle to put our flexible computing work in the context of contemporary and relevant systems work. Additionally, we show that this property makes system-level virtualization a powerful and suitable tool to solve some of the challenges brought forward by the impending paradigm shift.

1.7 Organization

The contents of the thesis are organized as follows:

• Chapter 2 reviews previous work related to this thesis. It presents a historical survey of virtualization and compares virtualization to other techniques for encapsulating and transporting execution state. In this chapter we argue the case for virtual machines as the right vehicle for location flexibility. We conclude by showing how the flexibility in manipulating computing state afforded by virtualization is being put to use in the recent systems research projects relevant to both scale and location flexibility.

• Chapter 3 presents Snowbird [107], our virtualization-based location flexibility prototype.

• Chapter 4 describes VMGL [103], a building block of Snowbird that evolved into a widely used software package.

• Chapter 5 motivates SnowFlock [105, 104], our scale flexibility prototype, introduces the techniques behind its realization, and presents a comprehensive evaluation.

• Chapter 6 concludes this thesis with a look towards future directions.

Chapter 2

Related Work

In this chapter we review the abundant literature regarding the subjects discussed throughout this thesis. The chapter opens with a section describing virtualization history and techniques. This is followed by a study of the methods employed to perform computation migration at different levels of the software stack, i.e. the OS, the programming language, etc. This study serves to articulate one of the main points of this thesis, namely that VMs are an appropriate level of abstraction at which to encapsulate (and move) computation, because they can be easily serialized, they internally satisfy all application software dependencies, and they demand a very widely available interface for execution, the x86 ABI. Having established this point, we present an exploration of recent contributions in the area of virtualization-based systems research. The common thread in all these contributions is the exploitation of the flexibility VMs provide in terms of manipulating execution state, a property that the work presented in this thesis relies on extensively.

2.1 Virtualization

We present definitions and study the evolution of Virtual Machine Monitors. In the first half of this Section we introduce fundamental definitions and chronicle the inception of Virtual Machine technology in the late 1960s, followed by a discussion of its rebirth as a hot topic for systems research. In the second half, we review the state of the art in the very active field of virtualization, as well as modern variations on the original concept of Virtual Machine Monitors. Finally, we review microkernels and operating system (OS) containers, two techniques that provide kernel virtualization while being substantially different at an architectural level.

2.1.1 A Bit of History on Virtual Machine Monitors

The earliest work on the concept of a Virtual Machine (VM) dates back to as early as 1969 [61]. IBM is credited with the earliest implementations of virtualization, dating back to the M44/44X [166]. The efforts by IBM were best crystallized in their seminal work on the VM/370 system [88, 39]. A thorough review of that early work can be found in Goldberg's 1974 survey [67].

While many definitions of the VM and VMM terms exist, we choose to cite the widely accepted definition put forward by Popek and Goldberg [144]:

    A virtual machine is taken to be an efficient isolated duplicate of the real machine. [...] As a piece of software a VMM has three essential characteristics. First, the VMM provides an environment for programs which is essentially identical to the original machine; second, programs running in this environment show at worst only minor decreases in speed; and last, the VMM is in complete control of system resources.

One fundamental property established by this definition is isolation: communication between two virtual machines co-existing in the same real machine faces the same limitations as it would if the two virtual machines were actual physically disjoint real machines.

The definition also highlights three basic properties which we further describe:

The efficiency property: A statistically dominant part of the instructions of an arbitrary program execute directly on the hardware, without VMM intervention.

The resource control property: An arbitrary program cannot affect system resources allocated to it. The VMM must be invoked instead.

The equivalence property: A program running under a VMM environment performs in a manner indistinguishable from an execution without a VMM, barring two exceptions: timing and resource limits.

More informally, a VMM is a software layer that inserts itself between the Operating System (OS) and the hardware. The VMM multiplexes and virtualizes the hardware: it gives multiple OSs the illusion that they are each executing on top of hardware with full privileges and in isolation, and it can hide the actual hardware characteristics, offering instead virtual constructions. This allows commodity software stacks, from the operating systems onwards, to be run without modifications in a sandboxed environment.

VMMs, as exemplified by the IBM 370 case, were initially used for time-sharing facilities. As multi-programming operating systems like UNIX evolved, the use of VMs conversely started to dwindle. During most of the eighties, VMMs remained on the fringes of the systems community's interests. Slowly, interest in VMMs regained traction as their use for novel applications became practical. Microsoft's Windows 95 OS contained a light-weight VMM to run legacy DOS and Windows 3.1 applications in MS-DOS mode [97]. Research on fault tolerance virtualized the interrupt architecture of a machine so that a hypervisor could shield unmodified operating systems from hardware faults [27].

A very conspicuous incarnation of the virtual machine concept took place in the area of interpreted languages. The Forth language [150], a contemporary of the first system virtualization designs, used a virtual machine runtime to interpret subroutines called "words." Perhaps the most well known user of a language virtual machine is the Java language [69], with other adopters including interpreted languages like Tcl [134] or Python [158]. A language VM like the Java Virtual Machine (JVM) virtualizes a fictitious hardware abstraction that allows an intermediate representation of the language to be portably executed on different underlying architectures, as well as sandboxed for security purposes.

The rebirth of "hardware" (or system or OS) VMs as a major force in the systems software research arena can be attributed to the Disco project [29] and its successor, Cellular Disco [70]. The main insight provided by Disco is that while hardware evolves continuously, commodity OSs have become stolid due to their size and complexity. Disco thus allows commodity OSs to take advantage of scalable multiprocessor hardware by implementing a VMM that isolates the OSs from the nature of the machine. Disco builds a cluster of virtual machines running on the multiprocessor, and obtains good performance by optimizing common distributed systems interconnect mechanisms (like NFS) through page-sharing tricks implemented by the VMM. It is worth noting that the authors of Disco later became the founders of VMware Inc. VMware has undoubtedly been the main industry player responsible for the growth of virtualization technology.

An important observation is that Disco uses VM technology in an unforeseen context: optimizing the execution of unmodified system software on a novel hardware architecture.

In the last decade, the trend of leveraging VM technology for novel applications has remained vigorous. Heralded by Chen and Noble’s widely cited position paper [34], applications of VM technology to a wide range of systems problems have surfaced in top systems research venues, covering subjects as diverse as security [48], network code injection [201], mobile computing [100, 165], and software maintenance [163].

2.1.2 Modern Realizations of Virtualization

The popularity of VM technology has also translated to multiple research VMMs, most notably Xen [21]. Xen is an x86 hypervisor architecture: the VMM code is kept to a bare minimum, while control tools and I/O device drivers are offloaded to a privileged administrative VM. A competing approach is that of hosted VMMs: the VMM runs as an extension to an otherwise unvirtualized hosting OS; no software layer interposes between the hosting OS and the hardware, and the VMs implicitly use the drivers of the hosting OS. VMware Workstation [196] and KVM [95] are two examples of hosted VMMs.

Unlike other VMMs, Xen has had a deep impact in the research and industry communities. This is due primarily to its adoption of an open-source license that fostered collaboration from multiple parties and greatly enabled research in the applications of virtualization. Before the advent of Xen, the best open source option for research in virtualization was the User-Mode Linux architecture [43]. Another fundamental characteristic of Xen is its use of paravirtualization, a technique that differentiates it from other VMMs along two dimensions: Instruction Set Architecture (ISA) virtualization and I/O device virtualization.

Following Popek and Goldberg's criteria [144], the original Intel x86 architecture cannot be virtualized, at least in a straightforward manner [156]. This is because the x86 ISA contains sensitive instructions that are not privileged; in other words, instructions that affect, or whose behaviour depends on, the configuration of system resources do not trap if executed in user mode. This is problematic because in typical operation a VMM will divest the OSs it virtualizes of their kernel mode execution privileges. However, when an OS attempts to access or configure a system resource (e.g. the Global Descriptor Table, via the SGDT instruction) in user mode, the instruction does not behave as intended, but no trap is triggered, thus preventing the VMM from being alerted and from executing that instruction on behalf of the virtualized kernel.
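The following user-space program (x86-64, GCC inline assembly) makes the problem concrete: SGDT executes without trapping in user mode and exposes descriptor-table state, leaving a classical trap-and-emulate VMM with no opportunity to intervene. (On recent processors with the UMIP feature enabled, the instruction may instead fault.)

    /* A user-space illustration of a sensitive but unprivileged x86
       instruction: SGDT runs without trapping and reveals the GDT register,
       so a trap-and-emulate VMM never gets a chance to interpose. */
    #include <stdio.h>
    #include <stdint.h>

    struct __attribute__((packed)) descriptor_table_register {
        uint16_t limit;
        uint64_t base;
    };

    int main(void)
    {
        struct descriptor_table_register gdtr;
        __asm__ volatile ("sgdt %0" : "=m" (gdtr));
        printf("GDT base = 0x%016llx, limit = 0x%04x\n",
               (unsigned long long)gdtr.base, gdtr.limit);
        return 0;
    }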

The first widespread x86 VMM, VMware, used dynamic binary rewriting to virtualize x86 sensitive instructions on the fly and translate them into traps into the VMM. Xen's paravirtualization approach instead modifies the kernel to replace sensitive but unprivileged instructions with hypercalls: jumps into the hypervisor that execute the virtualized equivalent of the instruction. The difference between dynamic binary rewriting and paravirtualization is akin to that between interpretation and translation of computer programs: paravirtualization holds the promise of better performance by modifying the kernel only once and producing binaries that do not need further supervision. As we will see shortly, paravirtualization opens the door to other performance enhancements. While paravirtualization preserves the same kernel ABI, and consequently needs no modification to user-level code, it does need access to a kernel's source code and cannot support unmodified OSs. This is a serious disadvantage considering the widespread adoption of Microsoft's closed-source Windows OS.

The dismal virtualization profile of the x86 ISA was recently changed by Intel's and AMD's latest iterations of their processor lines, which include virtualization extensions code-named Vanderpool [90] and Pacifica [3], respectively. With these extensions, Xen became able to support Windows VMs, and other VMMs entirely based on hardware support were introduced, like KVM. While [2] disputes the performance of hardware virtualization, multiple revisions released recently have substantially improved the performance of the virtualization extensions, by incorporating techniques such as nested paging for MMU virtualization, IOMMU for I/O address space protection, and tagged TLBs for more efficient context switching to and from the VMM.

Paravirtualization is also a defining characteristic in the area of I/O virtualization. Typically, full machine virtualization entails external emulation of I/O devices to match the state the drivers residing within the virtualized kernel expect to find. Even for modern VMMs, this arduous task has only been successfully tackled for rather simple hardware devices, such as the PCNet 100 Mbps network card or the Cirrus Logic VGA card. While VMware offers its own proprietary I/O emulation constructions, the QEMU project [23] for full system emulation has come in handy for open-source VMMs using processor virtualization extensions (like KVM or Xen). Unfortunately, emulated I/O performance is orders of magnitude worse than real I/O performance. Paravirtualization allows virtualizing at alternative interfaces that enable far better performance. By intercepting at intermediate stages of the I/O stack of a kernel, paravirtualization allows the virtualization software to service whole Ethernet packets, batches of generic block I/O requests, or SCSI command packets, rather than reverse-engineer the meaning of each single DMA request or MMIO write. For these reasons, I/O paravirtualization seems to be emerging as the de facto option for high-performance VMMs.
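The sketch below conveys the general shape of such a paravirtualized channel: a shared producer/consumer ring through which the guest posts whole block requests in batches. It is an illustration of the idea only, and does not reproduce Xen's actual ring layout.

    /* A simplified producer/consumer ring of the kind used by paravirtualized
       I/O channels: the guest posts whole block requests and the driver domain
       consumes them in batches. Illustrative only; not Xen's actual layout. */
    #include <stdint.h>

    #define RING_SIZE 32                      /* power of two */

    struct blk_request {
        uint64_t sector;                      /* starting sector        */
        uint32_t nr_sectors;                  /* length of the request  */
        uint32_t write;                       /* 0 = read, 1 = write    */
        uint64_t guest_buffer;                /* guest-physical buffer  */
    };

    struct shared_ring {
        volatile uint32_t req_prod;           /* advanced by the guest   */
        volatile uint32_t req_cons;           /* advanced by the backend */
        struct blk_request req[RING_SIZE];
    };

    /* Guest side: enqueue a request; returns 0 on success, -1 if the ring is full. */
    static int ring_post(struct shared_ring *r, const struct blk_request *req)
    {
        if (r->req_prod - r->req_cons == RING_SIZE)
            return -1;
        r->req[r->req_prod % RING_SIZE] = *req;
        __sync_synchronize();                 /* publish payload before the index */
        r->req_prod++;
        /* A real implementation would now notify the backend (e.g. an event channel). */
        return 0;
    }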

One crucial feature of modern VMMs is the capability to serialize the state of a VM. At a given point in time, this state is composed of the values of CPU registers, the contents of memory frames, and metadata concerning I/O virtualization. By canonicalizing host-specific variables, like mappings between VM physical frames and actual physical frames (or machine frames, in Xen and Disco parlance), the serialized state is made host-independent. This allows for suspend and resume capabilities: a VM can be paused, serialized into secondary storage, and then resumed at a later point in time on the same or another physical machine. This feature has been aggressively exploited by research projects exploring user mobility [165, 32, 30]. An interesting variant is live-migration [35], in which contents of memory frames are iteratively pushed to a destination host while execution continues. By marking pages as they are modified, successive iterations reduce the differences between source and destination VM images. When the differing state is sufficiently small, the VM is paused, the last set of frames copied, and execution transferred to the destination host. This enables minimal downtime while relocating an executing VM.
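A sketch of the iterative pre-copy loop follows; the helper functions are hypothetical stand-ins for VMM facilities such as dirty-page tracking, page transfer, and pause/resume.

    /* A sketch of the iterative pre-copy loop described above. The extern
       helpers are hypothetical stand-ins for VMM facilities. */
    #include <stddef.h>

    extern size_t get_and_reset_dirty_set(unsigned long *pfns, size_t max);
    extern void   send_pages(const unsigned long *pfns, size_t n);
    extern void   pause_vm(void);
    extern void   resume_on_destination(void);

    #define STOP_AND_COPY_THRESHOLD 256   /* pages small enough for a short pause */
    #define MAX_ITERATIONS          30    /* bound work for write-heavy guests    */

    void live_migrate(void)
    {
        static unsigned long pfns[1 << 20];
        size_t dirty = get_and_reset_dirty_set(pfns, sizeof(pfns) / sizeof(pfns[0]));

        for (int i = 0; i < MAX_ITERATIONS && dirty > STOP_AND_COPY_THRESHOLD; i++) {
            send_pages(pfns, dirty);                  /* copy while the VM still runs */
            dirty = get_and_reset_dirty_set(pfns, sizeof(pfns) / sizeof(pfns[0]));
        }

        pause_vm();                                   /* brief downtime               */
        send_pages(pfns, dirty);                      /* final differing state        */
        resume_on_destination();
    }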

2.1.3 Microkernels and Containers

We close this Section by exploring two alternative designs to traditional VMMs: microkernels and container-based virtualization. In some quarters, the debate between microkernels and virtual machine monitors has become quite heated [77, 79, 157]. We will not dwell here on the details of microkernel implementation [1, 109], or the relative merits or disadvantages of microkernel virtualization vs. traditional VMMs. Suffice it to say that a microkernel can mimic VMM functionality by adapting a commodity OS kernel to run on top of the microkernel, and have the commodity kernel offer the same ABI to user-level applications [101]. In other words, paravirtualization can be leveraged to adapt OSs to a microkernel setting. A microkernel can thus run multiple OS containers independently of each other, achieving very similar protection guarantees to those offered by a VMM.

Container-based virtualization has been used to provide a class of VMMs in which the isolation property is sacrificed for performance [168]. Containers implement OS multiplexing by interposing at the system call level and separating processes in different subdomains. Virtual machines can thus be realized by allocating different containers to each machine abstraction. Multiple file system views can be supported through chroot techniques, and multiple network identities through bridging of virtual network interfaces. Standard QoS and filtering techniques like iptables, ebtables, traffic control disciplines and token bucket filters overloaded on the standard CPU and disk schedulers complete the picture. The salient point of container-based virtualization is that all containers share the same kernel instance. While this promises better performance, one bug triggered in the kernel brings down the entire container population, thus creating a single point of failure. Prominent container solutions include Solaris zones [147] and Linux VServers [111], currently in heavy use by the PlanetLab platform [139].
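As a minimal illustration of the chroot technique mentioned above (one ingredient of container isolation, and by no means a complete or secure container on its own), the following program confines a command to an alternative file system root; it must be run with sufficient privileges.

    /* Minimal illustration of a per-process file system view via chroot.
       This is one ingredient of container isolation, not a secure container. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 3) {
            fprintf(stderr, "usage: %s <new-root> <command> [args...]\n", argv[0]);
            return 1;
        }
        if (chroot(argv[1]) != 0) { perror("chroot"); return 1; }
        if (chdir("/") != 0)      { perror("chdir");  return 1; }  /* enter new root */
        execvp(argv[2], &argv[2]);                                 /* confined view  */
        perror("execvp");
        return 1;
    }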

2.2 Computation Migration Through the Stack

In this Section we review the different implementations of the computation migration paradigm, moving from higher-level abstractions to lower-level ones. For each realization, we point out its benefits and shortcomings. In Section 2.2.6, we argue for VM-based migration and show how it aggregates most benefits achieved by previous computation migration projects. We also review the grid as a prime example of a computation migration platform. We conclude by studying related work on migration policies.

2.2.1 Application-Specific Methods

When wishing to relocate application components on-the-fly, developers can always resort to careful application partitioning and custom communication protocols. This is particularly relevant for the domains where bimodal applications are found. Many custom application partitionings have been developed to deal with remote or distributed visualization of scientific or medical phenomena.

A good example of this approach is the Virtual Microscope project [5], which uses heavily optimized and custom-built server and client components coupled with sophisticated caching algorithms to achieve good interactive performance. The Virtual Microscope exploits dataset locality along a number of dimensions to intelligently prefetch data based on the current view. Microscopic slides are offered at different levels of resolution. When the user zooms into a particular region of a microscopic slide, she is likely to request neighbouring regions at the same resolution, or the same region at higher resolutions. Another typical action is changing the focal plane of the current view. Based on these semantic hints, the Virtual Microscope is able to hide latency during slide browsing by continuously predicting the images to be fetched. We point out that the driving design criterion of this system is maximization of the interactive response during visualization.

Benefits: This approach typically achieves maximum performance optimization. The application developer has full semantic knowledge of what should execute where and how to optimally transfer control and data.

Disadvantages: The heavy investment in developer effort cannot be easily adapted to other applications. There is a disregard for legacy code or closed-source applications.

2.2.2 Toolkit-based Migration

The previous approach clearly entails a high level of domain-specific developer effort involved in manually partitioning a bimodal application. In an attempt to achieve some degree of generality and to isolate common subtasks in the process of application partitioning, the scientific community has produced a number of programming toolkits. Again, most efforts have focused on distributed visualization tasks, highly relevant to the domains covered by bimodal applications. Examples include Dv [115], GVU [41], Visapult [25], SciRun [136], and Cactus [68]. We further describe Cactus and Dv as two representative examples.

Cactus [68] is a framework built around a (surprise) arboreal metaphor. There is a core called "the flesh" and pluggable modules called "thorns". Applications are written as a set of thorns that are invoked and coordinated by the flesh. Thorns interact using the flesh API. Utility thorns include IO routines, an HTTP server, and an MPI wrapper for communication between thorns residing in separate hosts. Configuration files are provided in a Cactus Specification Language, describing the structure and inter-relations between a set of thorns. The Cactus Specification Tool parses and checks configuration files for consistency. Upon application startup, the flesh parses command-line arguments, launches the thorns, and schedules their execution. The amount of specialization that code must undergo to fit the Cactus framework is notable.

Dv [115] is a distributed visualization framework built on top of an active frames paradigm. Active frames encapsulate code and data and are executed and forwarded by Java user-level processes called frame servers. A set of frames can be spread out across multiple frame servers and serve as a pipeline of data filters. In this fashion, data can undergo multiple transformations before being rendered on the user's desktop.

Benefits: Toolkits achieve high performance optimization, with better generality than pure application-specific partitioning schemes.

Disadvantages: Applications must be written to a particular interface. Developer effort and source code availability are still needed, and application porting is in most cases far from trivial. Seemingly none of the toolkits mentioned above is widely used, implying that their design efficiently accommodates only a particular application sub-domain. Toolkits also constrain application development to the set of supported languages.

2.2.3 Language-based Migration

Language-based code mobility is another well-explored approach to moving computation. The best example of early work in this genre is Emerald [92]. A more recent example is one.world [73]. The growth in popularity of Java and its support for remote method invocation [140] has made migration of code components via language facilities feasible and relevant to a wide range of computing environments.

The main appeal of language-based migration is the possibility to hide the intricate details of how data and code are accessed behind the shroud of a uniform language abstraction. Emerald realizes this through an object oriented model. Polymorphism allows the developer to remain oblivious to the specific mode of communication used between objects; the underlying implementation can transparently switch from messaging remote objects to sharing memory with local objects. The object abstraction also allows for fine-grained encapsulation of small amounts of data for relocation.

As polymorphism and object encapsulation are salient features of object-oriented programming, most language-based code mobility methods are hosted by an existing or novel OO language. The symbiotic relation between both techniques is evident in the naming, as multiple research code mobility projects are also called “object migration” prototypes.

In the area of object migration we highlight Abacus [11] and Rover [91]. Both systems expose an API that allows developers to structure applications around the concept of relocatable objects. In Abacus, the objects, like Dv's active frames, compose a pipeline of data filtering stages for cluster data-mining applications. Objects are checkpointed and dynamically migrated between cluster hosts to improve data locality. In Rover, Relocatable Data Objects (RDOs) enable mobile computing-aware application writing. When connectivity to databases or file systems is interrupted, and conflicts are detected during a later reconciliation phase, RDOs are shipped from the client to the server. RDOs express richer semantics, the desired operation(s) rather than the final value(s), and thus can be used to re-enact the queued operations in the server while guaranteeing total ordering and consistency.

One final relevant reference in the area of object migration is Coign [86]. Coign is based on the use of a library construct: Microsoft's Component Object Model (COM). Coign takes a Windows binary and profiles the communication pattern among the multiple COM components it is made of. Through graph-cutting algorithms, Coign can identify subgraphs of objects with high functional correlation and common data needs, and relocate the entire subgraph to maximize performance.

Benefits: Language constructs such as objects provide very fine-grained control over code relocation, with a seamless language interface.

Disadvantages: This approach does not work for legacy applications that were not written in the specified language. Legacy code is very common in scientific applications, with Fortran programs still in active use today. Many object-oriented languages and relocation frameworks are supported by interpreters, resulting in a performance overhead that is undesirable for many applications.

2.2.4 Process Migration

Closest in spirit to the work in this thesis is the large body of research on process migration. This is an operating system abstraction that allows a running process to be paused, relocated to another machine, and be continued there. Process migration is completely transparent in the sense that it does not require any changes to applications once they are functional on a migration-enabled platform.

Many prototype implementations of process migration have been built over the last 20 years. Examples include Demos [146], V [183], Accent [207], MOSIX [19], Sprite [46], Charlotte [17], Utopia [210] and Condor [114]. Yet, despite its research popularity, no operating system in widespread use today (proprietary or open source) supports process migration as a standard facility. The reason for this paradox is that process migration is excruciatingly difficult to get right in the details, even though it is conceptually simple. A typical implementation involves so many external interfaces that it is easily rendered incompatible by a modest external change. In other words, process migration is a brittle abstraction: it breaks easily, and long-term maintenance of machines with support for process migration requires excessive effort.

Another important culprit against the adoption of process migration is the problem

of residual dependencies [124], or non-transferable kernel state. Condor [114] best exem-

plifies this problem. Condor reflects all system calls on open files and sockets to the host

where the process started execution. This is clearly inefficient for any I/O-bound task.

Furthermore, reflecting system calls back to the originating host prevents Condor from

supporting fairly useful OS abstractions like signals, shared memory, memory mapped

files, and timers (!).

By closely integrating the process migration facility with an underlying distributed

file system, many of the residual dependency problems are reduced to distributed file

system efficiency problems. This is the case with Sprite [46]. Sprite, however, is still

forced to forward many system calls (e.g gettimeofday) to the originating host, and in practice remains limited to batch-style processing tasks.

In Charlotte [17], process execution is entirely structured around message channel abstractions called links. The links interface is elegantly suited to shield the processes from where their dependencies actually reside, similar to the use of polymorphism in object migration. While this provides for a seamless interface, it still does not solve the performance problem, and sacrifices UNIX compatibility.

Process migration support through message channels is ideally suited to microkernel systems, given the omnipresent IPC abstraction dominating microkernel design. Demos [146] and MOSIX [19] are representative examples of message-based microkernels supporting process migration. Accent, the predecessor of Mach [1], did not follow a microkernel design but distinguished itself by introducing copy-on-write for a process' memory pages, or lazy copying. This successful optimization is based on the insight that processes touch a relatively small portion of their address space and it is thus wasteful to transfer the entire virtual memory image up front.

V [183] subsumes many of the aforementioned design points. V is a distributed diskless microkernel: it runs on diskless workstations supported by an external file server. The microkernel IPC abstractions lend themselves to message-channel process structuring, providing a high degree of transparency in terms of dependency location. This transparency is curiously sacrificed by the need for explicit command-line binding of a task to a location (i.e., "start povray in host X"). V is a harbinger of VM live-migration: it uses pre-copying to iteratively prefetch the contents of a process' virtual memory.

By the early nineties, the frustration with the residual dependency problem resulted in implementations that attempted to completely avoid the problem. Utopia [210] is such an example, a load sharing facility that supports relocation of a task only upon startup. This restriction is based on the study by Eager et al. [49] that questions the value of relocating processes during execution for the purpose of load sharing. Utopia's constrained scenario enables the use of a relatively simple user-level harness in charge of resource book-keeping, thus providing compatibility with unmodified applications and OSs. A local-area distributed file system completes the picture.

Benefits: Process migration promises complete transparency.

Disadvantages: This approach is plagued by residual dependencies. It is also dependent on a particular OS and set of libraries, and subject to incompatibilities even at the version level. Typically, fully functional and/or high-performing implementations do not support standard OSs. In general, process migration constrains the internal structure of the application to a single process.

2.2.5 Container Migration

OS containers like Linux VServers [111] and Virtuozzo [178] interpose on all system calls to multiplex the kernel to several process containers. As shown by VServers, this technique is able to provide great performance, although with a lesser degree of isolation than that provided by virtual machines [168]. The Zap system [133] uses the same principle to encapsulate multiple processes and serialize them in a virtual machine-like fashion. Several other projects employ the same techniques [20, 145, 102]. Encapsulation of multiple processes is appealing due to its promise of preserving the semantics of applications made up of several subcomponents.

Contrary to the claims of full OS independence, we perceive a high degree of correlation of the container abstraction to a particular OS, and also high sensitivity to multiple implementation parameters. For example, Zap expends a substantial effort in virtualizing the proc and dev Linux filesystems, something that is alien to Windows and widely varying across UNIX implementations. In particular, device virtualization through dev is fragile, and exposed to the lack of semantics caused by the unification of most devices under a generic character device interface.

Genericity in UNIX's interfaces results in another important shortcoming for system call virtualization: dealing with ioctls. The ioctl system call is a generic interface for component-specific messaging between the kernel and user space. As such, its arguments are pointers to unknown data, and domain-specific knowledge is needed to know how much data to transfer [179, 177]; deep copying may be needed if the structure being pointed to contains pointers itself. Zap and its descendants, while claiming to effectively virtualize ioctls, do not acknowledge this problem.

A third problem of container virtualization is the extraction of kernel state kept on behalf of the application. This is very similar to the residual dependency issue afflicting process migration systems. As shown in the previous Section, either UNIX compatibility is retained by ignoring the problem (Condor, Utopia), or sacrificed on behalf of a more suitable design (Charlotte, V). Zap claims to achieve the best of both worlds, while providing no hints as to how it extracts from the kernel buffered data from, e.g., open socket connections. The problem is actually acknowledged three years later [102]: the solution is to read all data out of a socket after the container has been checkpointed, and to reinject it before the container is resumed. This solution only covers receive queues, and OS-implementation knowledge is needed for send queues. Furthermore, important TCP semantics (push and urgent flags, exact sequencing, etc.) are lost with this method.

Benefits: Better performance than interpreted languages and full VMs. Containers wrap multiple processes and can thus relocate the full extent of applications composed of multiple processes and helper scripts.

Disadvantages: Containers are deeply OS-specific, and as such, the virtualization effort can scale to the size of an OS. Otherwise, incompatibilities are exposed for multiple subsystems: ioctls, TCP semantics, etc.

2.2.6 The Case for Virtual Machine-based Migration

We have covered multiple computation migration systems. While all techniques reviewed are exceedingly beneficial in the context of a particular operating configuration, no system has achieved widespread usage. We identify three important problems:

• Transparent legacy support.

• Total OS and language independence.

• Bundling of all computational dependencies within the same container.

We argue that virtual machines (as in hardware VMs supported by VMMs like Xen or VMware) provide the right abstraction for computation relocation. This is not because of an "almost magical" quality of the PC ABI [157], but rather because "there is only one hardware manufacturer out there" [65]. In other words, VMs virtualize at the most universal interface, the x86 hardware interface.

An x86 VM requires no host operating system changes, and builds on the very stable and rarely-changing interface to hardware provided by its virtual machine monitor (VMM). As long as the VMM can be easily deployed, so too can serialized x86 VMs. The myriad interfaces that must be preserved during migration are part of the guest OS, and so the code and state implementing these interfaces is transported with the application. Computation migration enabled through x86 VMs does not require applications to be written for a particular OS or in any particular language. In addition, the internal structure of the application is unconstrained. For example, the application can be a single monolithic process, or it can be a tool chain glued together with multiple scripts and helper libraries.

The stability of interfaces provided by x86 virtualization comes at a price. The VM approach involves a much larger amount of state than any other method of computation migration. This is an acceptable tradeoff for four reasons. First, network bandwidth, particularly between research institutions, is growing at an increasing pace with the deployment of wide-area gigabit and optic fiber networks. Second, storage is almost obscenely dense and cheap. Third, and fortunately, much of a VM's state rarely changes and can therefore be persistently cached at potential destinations, or derived from similar VMs through content-addressing techniques. Finally, library OSs, initially proposed by the Exokernel project [52] and now being actively deployed in Xen-like environments [12, 184], substantially reduce the amount of supervisor code attached to the application, removing unnecessary general-purpose support and resulting in "lean and mean" VMs.

The universality and uniformity of x86 hardware is not to be taken for granted. While we do not pretend to ignore the existence of MIPS, PowerPC, or other architectures, we do highlight that the deployment of x86 desktops and servers is several orders of magnitude larger. We also acknowledge the incompatibilities caused by vendor extensions to the x86 ISA, such as Intel's SSE and AMD's 3DNow!. While this problem can be solved through emulation or dynamic binary rewriting, we argue here that VMs are, for the most part, unaffected. Most extensions (power management, virtualization itself) do not affect VMs but rather the VMM and host system. For purely performance-oriented extensions like SSE, vendors are constantly upgrading competing extension support: AMD's Athlon 64 chips have supported Intel's SSE3 extensions since 2004. Ultimately, with the integration of heterogeneous components into the CPU die, like Graphics Processing Units (GPUs) or the Cell co-processing units [74], x86 performance extensions will likely be left unused. While it is unclear how processor heterogeneity will affect virtualization implementations, we have reasons to be optimistic. For instance, general-purpose GPU programming, an incipient practice among scientific computing developers [22], is enabled through the use of the OpenGL API, which we have conveniently already virtualized with VMGL.

2.2.7 The Grid

The importance of flexible computing, and particularly the need to relocate computing tasks, is acknowledged by the existence of harbingers like the Grid. In its fullest expression, the Grid would allow the matching of computing tasks with the best-suited provider, and the execution of the task at that location. Today, the Grid is implemented through a diverse set of middleware toolkits and standards of considerable density, all made possible by the proliferation of committees. Some examples of grid computing toolkits are Globus [59], Condor in its second iteration [181], and OGSA [87], all used by the scientific computing community today.

While there is considerable variation in the functionality provided by each toolkit, a representative sample includes finding idle machines, authenticating and logging in users, remotely executing an application on a machine, transferring results from one machine to another, checkpointing and restarting applications, and so on. A developer typically constructs a script or wrapper application that uses one of the aforementioned toolkits to chain together a sequence of individual computations and data transfers across a collection of machines.

What we argue for is orthogonal to the capabilities of the Grid. Virtualization-based computation migration complements the functionality provided by Grid toolkits, by transforming a single monolithic application into an entity that can be easily migrated under toolkit control. The recent adoption of virtualization by Globus and other projects [57, 94, 174] indicates that the use of virtualization is a trend gaining critical mass. The closest work in the area intersecting grid computing and virtualization is from the Virtuoso project [110], focusing on resource scheduling for interactive VMs running on a single host.

Virtualization-based computation encapsulation is also advocated by other global computing initiatives, due to the efficient support that it provides for resource-control and secure isolation. The XenoServers [99] platform, from which the Xen VMM originated, and PlanetLab [139], are prime examples.

2.2.8 Automated Migration Policies

We close this Section by reviewing work on policies for automating the migration of computational components. Migration automation has focused exclusively on balancing and sharing of resources like CPU, memory and data locality, with no work using the quality of interactive response to trigger migrations.

Following good software practices, policy and implementation have been carefully separated in most computation migration prototypes. This is one of the strong points of Charlotte [17], and the overriding design principle of Utopia [210]. Utopia performs load sharing by deploying Load Information Manager (LIM) modules on all participating hosts. Load information distribution and consensus decisions are achieved through classical distributed algorithm practices. The modular structure of Utopia is so robust that it persists today in the commercial products offered by Platform Computing Inc. [142].

The most complete effort in process migration load balancing policies is probably the work in MOSIX [19]. MOSIX integrates multiple features:

• information dissemination mechanisms to react to changes in the load on the execution sites;

• migration cost estimation to decide whether to migrate, based on competitive analysis;

• algorithms that prevent instabilities not only per application but for a cluster as a whole (e.g., flood prevention); and

• consideration for multiple resources such as memory consumption and file system I/O.

Other systems worth noting are Demos [146] and Abacus [11]. Demos uses hysteresis techniques to prevent frivolous and repeated migration between two hosts when an application is using multiple resources. Because Abacus is concerned with data locality, it creates a data flow graph monitoring the transfer of bytes of data across system objects. Abacus predicts future data consumption by extrapolating the data flow behaviour during the past H time units. The prediction is integrated into a cost/benefit analysis that also considers an estimation of the time it would take for an object to migrate, based on its size. The cost/benefit analysis drives object relocation in the cluster.
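
As a concrete illustration of this kind of cost/benefit reasoning, the sketch below weighs the benefit of co-locating an object with its data, predicted by extrapolating the data flow observed over the past H time units, against the cost of shipping the object itself. This is our own simplification of the idea, not Abacus code, and every name and unit in it is an assumption of the example.

    # Hypothetical sketch of an Abacus-style cost/benefit migration test.
    # bytes_moved_recently: bytes the object exchanged with a remote host
    # during the past horizon_h seconds; obj_size: serialized object size.
    def should_migrate(bytes_moved_recently, horizon_h, obj_size,
                       link_bandwidth, lookahead):
        # Predict future remote traffic by extrapolating the recent rate.
        predicted_rate = bytes_moved_recently / horizon_h
        predicted_remote_bytes = predicted_rate * lookahead

        # Benefit: time saved by turning remote transfers into local ones.
        benefit = predicted_remote_bytes / link_bandwidth
        # Cost: time needed to ship the object itself over the same link.
        cost = obj_size / link_bandwidth

        return benefit > cost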

2.3 Managing Virtual Machine State

Throughout this thesis we will present methods to efficiently manage the propagation of virtual machine state. One of the salient advantages of virtualization has been its ability to reduce runtime execution state to a simple serialization (or "bag of bits"). As argued in Section 2.2.6, this is made possible by virtualization's ability to encapsulate the entire software stack and thus self-contain all dependencies. Serialized execution state is very malleable and can be stored, copied, or replicated, as this thesis will show in the next chapters. We now revisit recent projects addressing ways to manage VM state.

The Potemkin project [198] implements a honeypot spanning a large IP address range. Honeypot machines are short-lived lightweight VMs cloned from a static template on the same physical machine, with memory copy-on-write techniques.

Live migration [35], previously introduced in Section 2.1, is a classic and very useful way of manipulating VM state through iterative rounds of pre-copying. Remus [40] provides instantaneous failover for mission-critical tasks by keeping an up-to-date replica of a VM in a separate host. Using a modified version of live migration, dirty VM state is continually pushed to the failover site, while a synchronization protocol (similar to Rethink the Sync's [131]) ensures that no output becomes world-visible until synchronization of an epoch (typically the last 25 to 50 milliseconds) succeeds. Hines et al. [83] address post-copy VM migration, a scheme in which the site of execution of a VM is immediately changed while pages of the VM's memory are propagated post-facto.

Another relevant project is Denali [201]. Denali dynamically multiplexes VMs that execute user-provided code in a web-server, with a focus on security and isolation. Finally, operation replay can be used to recreate VM state by starting from the same initial image and replaying all non-deterministic inputs to the VM. This technique is used for example for forensics studies of intrusions [48], and can be used to efficiently migrate VM state optimistically [175].

2.3.1 VM Consolidation and the Rise of the Cloud

The seminal work by Waldspurger [199] in the implementation of VMware's ESX server laid the groundwork for VM consolidation. Through sophisticated memory management techniques like ballooning and content-based page sharing, the ESX server can effectively oversubscribe the actual RAM of a physical host to multiple VMs. Content-based page sharing is a generalization of the notion of content addressable storage, by which similar objects are identified and indexed via a cryptographic hash of their contents. Content-based indexing of VM state has been employed for security [112] and migration purposes [175]. Further work in the area was presented by the Difference Engine group [75], which claims to provide further memory savings by scanning RAM and synthesizing little-used pages as binary diffs of other similar pages.

Content addressable storage is a promising technique for minimizing the amount of state to be transferred. Content-addressing has been heavily used for content sharing [185] and distributed file systems [186]. Content addressability allows cross-indexing of similar content from multiple caches, and thus enables the possibility of minimizing the amount of transferred data to that which is truly "unique" (i.e., no other cache contains it). We believe that the large amount of software sharing between VM images will enable the use of content addressability as a viable means to minimize the amount of disk state replication. For example, in a Xen paravirtual VM, it is extremely likely that the paravirtualized kernel in use will be the unique version distributed by the Xen developers.
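
As a minimal sketch of how content-based sharing works (our own illustration, not the code of ESX, the Difference Engine, or any system cited here), the function below keys each memory page by a cryptographic hash of its contents, so that identical pages collapse onto a single stored copy; all names are invented for the example.

    import hashlib

    def deduplicate_pages(pages):
        """Map each page to a canonical copy keyed by its content hash.

        pages: list of bytes objects (e.g., 4 KB each).
        Returns (canonical_store, page_index): canonical_store holds one
        copy per unique hash, page_index[i] is the hash of page i.
        """
        canonical_store = {}   # content hash -> single shared copy
        page_index = []        # per-page reference into the store
        for page in pages:
            digest = hashlib.sha256(page).hexdigest()
            # Only the first page with given contents is actually stored;
            # later identical pages just reference it (shareable via CoW).
            canonical_store.setdefault(digest, page)
            page_index.append(digest)
        return canonical_store, page_index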

VM-based server consolidation has become a de facto standard for commodity computing, as demonstrated by the Amazon Elastic Compute Cloud (EC2) [10], and other infrastructures like Enomalism [53], GoGrid [66] or Mosso [125]. Manipulation of VM state lies at the heart of most industry standard techniques for managing commodity computing clusters [171], sometimes referred to as clouds. These techniques include suspending to and resuming from disk, live migration [35, 195], and possibly ballooning and memory consolidation. The current deployment models based on these techniques suffer from a relative lack of speed. With VM state ranging in tens of GBs (disk plus memory), it is no surprise that Amazon's EC2 claims to instantiate multiple VMs in "minutes". We will show in Chapter 5 that there is much room for fundamental improvements.

The ability to dynamically move, suspend, and resume VMs allows for aggressive multiplexing of VMs on server farms, and as a consequence several groups have put forward variations on the notion of a "virtual cluster" [51, 161]. The focus of these lines of work has been almost exclusively on resource provisioning and management aspects. Furthermore, implementation is typically performed in a very unoptimized manner, by creating VM images from scratch and booting them, and typically lacks the ability to allow dynamic runtime cluster resizing [58, 209]. Chase et al. [33] present dynamic virtual clusters in which nodes are wiped clean and software is installed from scratch to add them to a long-lived, on-going computation. Nishimura's virtual clusters [132] are statically sized, and created from a single image also built from scratch via software installation.

Further research in aspects of resource management and scheduling of VMs has been focused on scheduling VMs based on their different execution profiles [110] (e.g., batch vs. interactive), grid-like queueing of VMs for deployment as cluster jobs [209], and assigning resources based on current use [9]. One particular resource of interest is energy management in server farms [129].

The Emulab [202] testbed predates many of the ideas behind virtual clusters. After almost a decade of evolution, Emulab uses virtualization to efficiently support large-scale network emulation experiments [80]. An "experiment" is a set of VMs connected by a virtual network, much like a virtual cluster. Experiments are long-lived and statically sized: the number of nodes does not change during the experiment. Instantiation of all nodes takes tens to hundreds of seconds.

2.3.2 Cloning State

Cloning of an executing task is a fundamental OS technique, dating back to the initial implementations of UNIX [155]. Except during bootstrap, the only way to start processes in UNIX is by cloning the running structures of an already existing process, and adjusting relevant values. A separate exec call is necessary to repopulate the memory image of the process with a different executable.

When implementing a cloning facility, lazy-copying of contents, as introduced by Accent [207], is the norm. In other words, pieces of the cloned entity are fetched on demand. One incarnation of this technique is Copy-on-Write (CoW) cloning, in which the cloned entities share all possible resources initially, and only instantiate a private copy of an element before modifying it. Modern UNIX process forking implementations leverage the CoW technique to minimize the amount of memory copying for a clone operation.
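
As a toy illustration of this discipline (our own sketch, not the implementation of any of the systems discussed here), the class below shares all pages between a parent image and its clone, and creates a private copy of a page only on the first write to it; the class and method names are invented for the example.

    class CowImage:
        """Toy copy-on-write memory image: pages are shared until written."""

        def __init__(self, pages):
            self.pages = pages          # shared list of bytes objects
            self.private = {}           # page number -> private copy

        def clone(self):
            # For simplicity, assume the parent has no private pages yet;
            # the clone shares the parent's page list and copies nothing.
            return CowImage(self.pages)

        def read(self, n):
            return self.private.get(n, self.pages[n])

        def write(self, n, data):
            # The first write to a shared page triggers the private copy;
            # data is assumed to fit within the page.
            if n not in self.private:
                self.private[n] = bytearray(self.pages[n])
            self.private[n][:len(data)] = data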

CoW techniques are particularly useful in file system environments for performing snapshots, as exemplified by ZFS [172] and LVM [151]. Wide-area VM migration projects such as the Collective [163] and Internet Suspend/Resume [100] have also used lazy copy-on-reference for VM disk state. The low frequency and coarse granularity of access to secondary storage allows copying large batches of state over low-bandwidth, high-latency links.

Cloning and Copy-on-Write are frequently used techniques in research projects. In the area of virtualization we highlight the Cloneable JVM and the Potemkin Honeyfarm. The Cloneable JVM [93] allows a running Java servlet to effect a full clone of the underlying JVM, with a dual objective: first, to overlay additional MMU-based protection for the running server, on top of the isolation already provided by the JVM; and second, to minimize the startup time for a new JVM by sharing memory pages and "JiT-ed" (Just-in-Time compiled) code from the original JVM. This borrows on the implicit notion shared by many parallel programming paradigms based on cloning: that the new cloned copy is stateful and keeps a memory of all processing performed up to that stage, thus removing the overhead of a cold start.

In Potemkin [198], introduced at the beginning of this Section, lightweight Xen VMs are created when traffic on an unattended IP address is detected. This allows for the flexible instantiation and destruction of honeypot nodes on demand. Honeypot creation is achieved by cloning a base VM image and sharing pages through CoW mechanisms. The honeypot VMs are short-lived, have a very small memory footprint (128 MB) and do not use secondary storage (the file system is mounted in RAM), thereby greatly simplifying the cloning task.

Chapter 3

Location Flexibility with Snowbird

In this chapter we describe the Snowbird project [107]. An overview of the work related to this part of the thesis is presented in Section 2.2.

3.1 Introduction

A growing number of applications in many domains combine sophisticated algorithms and raw computational power with the deep knowledge, experience and intuition of a human expert. Examples of such applications can be found in simulation and visualization of phenomena in scientific computing, digital animation, computer-aided design in engineering, protein modeling for drug discovery in the pharmaceutical industry, as well as computer-aided diagnosis in medicine. These bimodal applications alternate between resource-intensive crunch phases that involve little interaction, and cognitive phases that are intensely interactive. During the crunch phase, short completion time is the primary performance goal, and computing resources are the critical constraint. During the cognitive phase, crisp interactive response is the primary performance goal, and user attention is the critical constraint.

Optimizing both phases is important for a good user experience, but achieving this end is complicated by the large disparity in the performance goals and bottlenecks of the two phases. Today, developers must manually split a bimodal application into a distributed set of components; various programming frameworks have been proposed to assist with this challenging task [5, 25, 68, 136]. Crunch phase components are executed on compute servers or server farms. Distant data servers can sometimes be used if datasets are too large to cache or mirror locally, or are constrained by organizational or regulatory policies that forbid caching or mirroring. In contrast, cognitive phase components are executed locally where they can take advantage of local graphics acceleration hardware. Unfortunately, this approach requires developers to manage communication and coordination between application components, and to be aware at all times of whether a particular component will be executed locally or remotely. This adds software complexity above and beyond the intrinsic complexity of the application being developed, and hence slows the emergence of new bimodal applications.

In this chapter we introduce Snowbird, a system based on virtual machine (VM) technology that simplifies the development and deployment of bimodal applications. Snowbird masks the complexity of creating a bimodal application by wrapping the application, including all its executables, scripts, configuration files, dynamically linkable libraries, and the OS, into a migratable VM. During execution, Snowbird automatically detects phase transitions in the application and migrates the VM, containing the application and its complex web of dependencies, to the optimal execution site. Snowbird neither requires applications to be written in a specific language, nor to be built using specific libraries. Existing closed-source applications can use Snowbird without recoding, recompilation, or relinking.

Snowbird extends existing VM technology with three mechanisms: an interaction-aware migration manager that triggers automatic migration; support for graphics hardware acceleration; and a peer-to-peer storage subsystem for efficient sharing of persistent VM state over the wide area. Experiments conducted with a number of real-world applications, including commercial closed-source tools such as the Maya 3D graphics animation package, show that applications running under Snowbird come within 4% of optimal crunch completion times, while exhibiting crisp interactive performance that is comparable to native local execution.

Figure 3.1: Agent Migration Concept. (Diagram: an agent migrates over the Internet between the user's desktop, a compute cluster, and a data repository, obtaining improved interactivity, compute power, and data access respectively.)

From a more abstract perspective, Snowbird can be viewed as a tool that provides seamless transitions between thick and thin modes of execution. It has long been known that the strengths of thick and thin clients complement each other. Thin clients are attractive in CPU-intensive and data-intensive situations because application execution can occur on remote compute servers close to large datasets. Unfortunately, high network latency and jitter between the application execution site and the user site can lead to poor interactive performance. Thick clients offer a much better user experience in that situation. By transparently morphing an application between thick and thin client modes of execution, Snowbird gives a user the best of both worlds.

3.2 Design and Implementation

The dominant influence on the design of Snowbird was our desire to simplify the creation and deployment of bimodal applications without imposing onerous constraints on developers. The use of VM technology is the key to achieving this goal. It allows a developer to focus on the creation of a single monolithic entity rather than the more difficult programming task of creating a distributed system. This monolithic entity, called an agent, is the migratable embodiment of an application that transparently and seamlessly relocates itself to achieve optimal performance. At run time, Snowbird automatically detects phase transitions and migrates the agent to the optimal execution site. The example illustrated in Figure 3.1 shows an agent that starts at the user's desktop to provide good interactive response during a cognitive phase. It then migrates to several remote sites, where it leverages the superior compute power of a shared-memory multiprocessor and improved I/O performance from proximity to a large dataset. The agent then returns to the desktop for the next cognitive phase.

The logical encapsulation provided by an agent eases the complexity of developing and deploying a bimodal application. Simplicity is reinforced by the fact that all the paraphernalia associated with a large application (such as other processes, dynamically linked libraries, and specific OS features upon which an application relies) is moved atomically with its containing agent. Hence no application-specific code has to be pre-installed in order to run on a site. An agent only requires SSH access credentials to execute on a Snowbird-enabled site. The SSH credentials are also used to encrypt all communications. Note that an agent can be a tool chain composed of several processes executing simultaneously or sequentially in a pipeline fashion.

Snowbird’s use of VM technology offers three significant advantages over existing approaches to code mobility. First, applications do not have to be written in a specific language, be built using specific libraries, or run on a particular OS. Second, legacy appli- cations do not have to be modified, recompiled, or relinked to use Snowbird. This greatly Chapter 3. Location Flexibility with Snowbird 42

RTTs (ms) End Points Min Mean Max c Berkeley – Canberra 174.0 174.7 176.0 79.9 Berkeley – New York 85.0 85.0 85.0 27.4 Berkeley – Trondheim 197.0 197.0 197.0 55.6 Pittsburgh – Ottawa 44.0 44.1 62.0 4.3 Pittsburgh – Hong-Kong 217.0 223.1 393.0 85.9 Pittsburgh – Dublin 115.0 115.7 116.0 42.0 Pittsburgh – Seattle 83.0 83.9 84.0 22.9

Table 3.1: Internet2 Round Trip Times simplifies real-world deployments that use proprietary rather than open-source applica- tions. Third, migration is transparent and seamless to the user, beyond the obviously desirable effects of improved interactive or computational performance.

A second factor influencing Snowbird's design is our desire to support applications at an Internet scale, particularly those using remote datasets over WAN links. It is end-to-end latency, not bandwidth, that is the greater challenge in this context. Table 3.1 shows recent round-trip time (RTT) values for a representative sample of Internet2 sites [130]. The theoretical minimum RTT values imposed by speed-of-light propagation, shown in the last column c, are already problematic. Unfortunately, technologies such as firewalls, deep packet inspection engines, and overlay networks further exacerbate the problem, causing the minimum observed RTT values to far exceed the substantial propagation delays. Although bandwidths will continue to improve over time, RTT is unlikely to improve dramatically. These performance trends align well with the design of Snowbird; the critical resource in VM migration is network bandwidth, while the critical resource for crisp interaction is RTT.

A third factor influencing Snowbird's design is the peer-to-peer (P2P) relationship between migration sites that is implicit in Figure 3.1. Since Snowbird can migrate to any Internet site for which it possesses SSH credentials, there is no notion of clients or servers. Solely for purposes of system administration, Snowbird associates a home host with each agent. This is typically a user's desktop machine or some other nearby computer where the user spends most of her time interacting with the agent. The home host acts as the authoritative machine on which the commands shown in Figure 3.2 are issued. The command line interface for Snowbird includes commands for managing an agent's life cycle, for controlling agent migration, and for system administration.

Life Cycle Commands

createagent <agent> [-f config] [-i image]
    Create the agent called <agent> with localhost as its home. Use default VM settings if [config] is not specified. Populate the agent's virtual disk with [image], if specified. This is a heavyweight operation.

launch <agent>
    Start execution of the specified agent at its home. Involves full powerup of the VM, and bootup of the guest OS.

kill <agent>
    Kill the specified agent at its current execution site. The agent will have to be launched again; it cannot be suspended after a kill.

purge <agent> [hostname]
    Permanently delete all traces of the agent at [hostname] after synchronizing agent state with its home. If no [hostname] is specified, apply to all hosts.

Migration Commands

suspend <agent>
    Suspend execution of the agent at its current site.

resume <agent> [hostname]
    Resume execution of the agent on the specified host. If no host is specified, the agent resumes where it was last suspended.

suspend-resume <agent> <hostname>
    Perform live migration between the agent's current machine and <hostname>.

Maintenance Commands

addhost <agent> <hostname>
    Make <hostname> a potential destination for the specified agent.

sync <agent> [hostname]
    Ensure that the home of the specified agent has a complete and up-to-date copy of its entire state. Synchronize state at [hostname], if specified.

movehome <agent> <hostname>
    Change the home of the specified agent. The agent cannot be active during this operation.

listhosts <agent>
    Display information about the hosts currently available to an agent.

Arguments between angle brackets (< >) are mandatory, while those between square brackets ([ ]) are optional.

Figure 3.2: Snowbird Commands

Figure 3.3: Snowbird Architecture for the Xen VMM. (Diagram: within domain0, the migration manager with its profile, the interactivity and performance sensors, the VNC viewer and display, the Snowbird API, the Xen control stack, and the WANDisk kernel block device module manage the agent VM running above Xen.)

Migration control commands are typically used by the migration manager described in Section 3.2.2. However, they are available for explicit user control, if desired.

Figure 3.3 depicts the different components that make up the Snowbird system. The figure shows the current prototype of the system, which is an instance of Snowbird implemented on top of the Xen VMM. Sections 3.2.1 to 3.2.4 present more details on the different components of the Snowbird architecture depicted in the figure. With the exception of the WANDisk kernel module, all these components are user-space daemons (such as the migration manager), or modifications to existing software running inside a VM (such as the display implemented via a VNC server). Section 3.2.1 expands upon our use of VM technology. Section 3.2.2 describes the interaction-aware migration manager and the sensors it uses. Section 3.2.3 describes the use of hardware-accelerated graphics by VM applications in the context of Snowbird's graphical user interface. Section 3.2.4 presents Snowbird's wide-area, peer-to-peer storage subsystem for sharing persistent VM state at the block level.

3.2.1 Choice of VM Technology

The current version of Snowbird is based on the Xen 3.0.1 VMM. We chose Xen because its open-source nature makes it attractive for experimentation. However, our design is sufficiently modular that using a different VMM such as VMware Workstation will only require modest changes. Most Snowbird components are daemons that run in the host kernel or domain, and hence no changes to the underlying VMM are needed.

Snowbird uses VM migration [165, 163] to dynamically relocate the agent from a source to a target host. To migrate an agent, its VM is first suspended on the source. The suspended VM image, typically a few hundred MBs of metadata and serialized memory contents, is then transferred to the target, where VM execution is resumed. Snowbird uses live migration [35] to allow a user to continue interacting with the application during agent relocation. This mechanism makes migration appear seamless, by iteratively pre-copying the VM's memory to the target while the VM continues to execute on the source host. When the amount of prefetched VM memory reaches a critical threshold, a brief pause is sufficient to transfer control.
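
The pre-copy loop can be sketched as follows. This is our own simplified rendering of the idea rather than Xen's implementation; the dirty-page tracking interface and the two callbacks are assumptions of the example.

    def live_migrate(vm, send_pages, stop_and_copy,
                     threshold_pages=1024, max_rounds=30):
        """Iterative pre-copy: ship memory while the VM keeps running.

        vm.dirty_pages() is assumed to return and reset the set of pages
        written since the previous call; send_pages ships page contents.
        """
        # Round 0: send every page while the VM continues to execute.
        dirty = set(range(vm.page_count))
        for _ in range(max_rounds):
            send_pages(vm, dirty)
            dirty = vm.dirty_pages()
            # Stop iterating once the remaining dirty set is small enough
            # that a brief pause suffices to transfer control.
            if len(dirty) <= threshold_pages:
                break
        stop_and_copy(vm, dirty)   # pause VM, send residue, resume on target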

Modern VMMs allow the creation of VMs with multiple virtual CPUs regardless of the underlying number of available physical cores. By configuring their containing VMs as SMPs, agents are able to transparently leverage the increased computing power of potential multiprocessor migration targets.

3.2.2 Interaction-Aware Migration Manager

While users can explicitly control migration decisions using the commands in Figure 3.2, Snowbird provides system-controlled agent relocation as one of its key features. In other words, the decision to migrate, the choice of migration site, and the collection of information upon which to base these decisions can all happen under the covers, in a manner that is transparent to the user and to the agent.

A key feature of Snowbird is that it accounts for the quality of the application's interactive response when making its migration decisions. This is in stark contrast to the large body of related work on automated process migration policies [19], which concentrates on computationally-intensive applications devoid of interactions. Snowbird uses an interaction-aware migration manager module that bases its decisions on three sources: interactivity sensors that extract relevant data from the Snowbird user interface; performance sensors that extract their data from the VMM; and migration profiles that express the migration policy as transitions of a finite state machine triggered by sensor readings. Snowbird's clean separation between policy and mechanism simplifies the use of different profiles and sensors.

Interactivity Sensors

The interaction sensor is built into Snowbird's graphical user interface. As we describe in more detail in Section 3.2.3, Snowbird executes a VNC viewer on the user's desktop that exports the display output of the agent VM. This VNC viewer is augmented to collect a stream of time-stamped events corresponding to keyboard/mouse inputs and screen updates, as shown in Figure 3.4. The intensity of the user's interactive demand and the quality of the agent's response can both be inferred from this stream.

Figure 3.4: Interaction Intensity and Smoothness. (Timeline: input i is followed by screen updates 1, 2, and 3 at times t1, t2, and t3, before input i+1; the total response time T spans from input i to the last of those updates.)

Our measure of interaction intensity is the number of input events per unit of time. Our measure of interactive response quality is the number of frames per second triggered by an input event. This metric can be derived by assuming that all screen updates are causally related to the most recent input event. The frames per second (FPS) triggered by that input event is thus the number of related screen updates divided by the time from the event to the last of those updates. The FPS metric reflects both the smoothness and the swiftness of an interactive response. Remote interaction usually relies on non-work-conserving thin-client algorithms such as VNC [154], which under adverse network conditions skip frames to "catch up" with the output. Skipping frames in this manner results in jerky on-screen tracking of mouse and keyboard inputs that can be annoying and distracting. For work-conserving thin-client algorithms like X, a low FPS rating means that the same number of screen updates happened in more time, resulting instead in a sluggish response. We thus quantify the quality of the interactive response of an event window as the average FPS yielded by all the inputs in that window. High interaction intensity combined with a low-quality response is the cue used by the migration manager to trigger a remote-to-local transition.
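
Both measures can be computed directly from the time-stamped event stream. The sketch below is a minimal rendering of the metric as just described, attributing every screen update to the most recent input; the event representation is an assumption of the example, not the actual sensor code.

    def interaction_metrics(events, window):
        """Compute interaction intensity and response quality (FPS).

        events: list of (timestamp, kind) pairs, kind in {"input", "update"},
        sorted by timestamp; window: length of the event window in seconds.
        """
        inputs = [t for t, k in events if k == "input"]
        intensity = len(inputs) / window       # input events per unit time

        fps_samples = []
        for i, t_in in enumerate(inputs):
            t_next = inputs[i + 1] if i + 1 < len(inputs) else float("inf")
            # Updates between this input and the next are attributed to it.
            updates = [t for t, k in events
                       if k == "update" and t_in < t <= t_next]
            if updates:
                elapsed = updates[-1] - t_in
                if elapsed > 0:
                    fps_samples.append(len(updates) / elapsed)
        quality = sum(fps_samples) / len(fps_samples) if fps_samples else 0.0
        return intensity, quality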

Performance Sensors

Snowbird provides performance sensors for CPU utilization and network utilization. These sensors periodically poll the VMM for an agent's share of CPU time, and the number of bytes transmitted on its network interfaces, respectively. The poll interval is configurable with a default value of one second.

Figure 3.5: Example Partial FSM. (Three states mapped to hosts: Interaction Intensive at usr.home, CPU Intensive at cycles.org, and Network Intensive at data.edu. CPU > 95% and Net > 4 Mbit/s trigger transitions to the CPU- and network-intensive states, while FPS < 20 & Input > 15 triggers a return to the interaction-intensive state.)

Migration Profiles

A migration profile defines a finite state machine (FSM) that is used to model the agent's behaviour. An agent can thus be considered as an entity defined by the VM containing the application, and the profile expressing the application's behavioural patterns. As shown in Figure 3.5, each state in this machine characterizes a particular level of resource demand and/or interaction. Profile rules define when and how sensor readings should trigger state transitions. The profile also specifies the amount of past sensor information that should be averaged to evaluate the rules, which defaults to ten seconds. Each state defines an optimal execution site. The mapping of profile-defined FSM states to hosts is dependent on the infrastructure available to each user. While the figure presents an example of a typical FSM derived from the three sensors we implemented, profile writers are free to generate more complex FSMs using more sensors and states.

Profile creation involves characterization of an agent's resource usage and may be performed by application developers or by third parties such as user groups, administrators, or technically adept users. In the absence of an application-specific profile, the migration manager uses a generic profile that identifies typical crunch and cognitive phases. The default profile is shown in Figure 3.5, with most of its values being fairly intuitive and conservative. We use the long-established 20 FPS threshold [6] to trigger interactive response-based migrations. The input threshold of 15 inputs per window of ten seconds is derived from observations of the average input event generation rate in our experiments.

We were able to use this generic application profile for all the experiments described in Section 3.3.5.

An interesting possibility is to augment the Snowbird migration manager with many of the features developed by the process migration community in the scope of automated migration, such as selecting destination sites based on their current load [19]. One relevant concern in our environment is the handling of applications with overlapping crunch and cognitive phases, which could compromise the agent's stability by "thrashing" between the two states. The straightforward solution we have implemented is to specify a priority favouring interactive performance when conflicting migration rules are simultaneously triggered. Another solution would be to invoke hysteresis mechanisms [146] to prevent the migration manager from adopting this erratic behaviour.
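
A stripped-down version of such a profile-driven policy, using the default thresholds quoted above (20 FPS, 15 inputs per ten-second window, 95% CPU, 4 Mbit/s), might look as follows. The sensor-reading dictionary, state names, and host names are assumptions of the example; a real profile would be loaded from a configuration file rather than hard-coded, and this sketch is not the actual migration manager.

    # Hypothetical profile: each state names its preferred execution site and
    # the rule (over averaged sensor readings) that causes a transition to it.
    PROFILE = {
        "interaction_intensive": {
            "host": "usr.home",
            # inputs: input events in the ten-second averaging window
            "enter_if": lambda s: s["fps"] < 20 and s["inputs"] > 15,
        },
        "cpu_intensive": {
            "host": "cycles.org",
            "enter_if": lambda s: s["cpu"] > 0.95,
        },
        "network_intensive": {
            "host": "data.edu",
            "enter_if": lambda s: s["net_mbps"] > 4,
        },
    }

    # Interactive response takes priority when rules fire simultaneously,
    # which avoids thrashing between overlapping crunch/cognitive phases.
    PRIORITY = ["interaction_intensive", "cpu_intensive", "network_intensive"]

    def next_state(current, averaged_sensors):
        """Return the new state and its preferred host given sensor averages."""
        for state in PRIORITY:
            if state != current and PROFILE[state]["enter_if"](averaged_sensors):
                return state, PROFILE[state]["host"]
        return current, PROFILE[current]["host"]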

3.2.3 Hardware-Accelerated Graphical Interface

The graphical user interface for an agent has to comply with two requirements. First, a user should be able to interact seamlessly with an agent, regardless of its current location. Second, many of the applications targeted by Snowbird (such as scientific visualization and digital animation) require the use of 3D graphics acceleration hardware, a feature absent from most virtualized execution environments.

To meet these requirements, Snowbird uses VMGL, which is described in depth in Chapter 4 of this thesis. VMGL includes an enhanced thin client interface based on VNC [154], and provides agents with access to 3D graphics hardware acceleration. When the agent is running on a remote host, the thin client protocol is used to communicate screen updates and user inputs (i.e., keystrokes and mouse) over the network. When the agent runs on the user's desktop, the network becomes a loopback connection. Interaction is never interrupted during agent relocation because network connections persist through live migrations: for relocations within the same L2 subnet, a gratuitous ARP reply binds the agent's IP address to the new physical host. Relocations across subnets can be supported with VPN tunnels or L2 proxies like VNETs [174].

Figure 3.6: 3D Support in Snowbird. (Diagram: in the agent VM, GL applications issue GL commands through VMGL while standard applications are rendered by a VNC server; GL commands and VNC updates cross the network to the user's desktop, where a combined VNC viewer composes both streams, using direct rendering for the GL stream, and returns user input.)

VMGL provides applications running in a VM access to 3D graphics hardware acceleration by virtualizing the OpenGL API. This cross-platform API for 3D applications is supported by all major graphics hardware vendors. We use library preloading to masquerade as the system's native GL driver and intercept all GL calls made by an application. GL primitives are then forwarded over the network, using the WireGL protocol [28], to a remote rendering module where they are rendered directly by 3D graphics acceleration hardware. Although this setup allows complete flexibility, we expect the rendering module to execute in the user desktop's administrative VM, physically co-located with the agent VM during cognitive phases.

Figure 3.6 shows how we use VMGL within Snowbird. GL primitives bypass the VNC server and are rendered using 3D hardware on the user's desktop. Updates from non-3D APIs used by standard applications are rendered by the VNC server on its virtual framebuffer and shipped to the viewer. A modified VNC viewer composes both streams and offers a combined image to the user. Input events are handled entirely by the thin client protocol. Similar mechanisms can be used to support the Direct3D rendering API for Windows VMs [197].

3.2.4 The WANDisk Storage System

VM migration mechanisms only transfer memory and processor state; they do not transfer VM disk state, which is typically one to three orders of magnitude larger (many GBs). Therefore, each VM disk operation after migration usually involves network access to the source host. While this is standard practice in the LAN environments that are typical of VM deployments in data centers (SANs, Parallax [121], distributed file systems like Lustre [26]), it is unacceptable for the high-latency WAN environments in which we envision Snowbird being used. A distributed storage mechanism is thus needed to take advantage of read and update locality in disk references.

Furthermore, while several WAN-optimized distributed storage choices were available to us, none of them satisfied two key characteristics of the Snowbird deployment model. First, there are multiple symmetric hosts on which an agent might run, thus precluding storage systems that are limited to single-replica support (DRBD [152]), and making systems that centralize data transfers on a single server undesirable due to the inability to directly transfer state between endpoints. Such centralized systems include NFS [162], AFS [36], Coda [164], and the VM disk mechanisms used by the Collective [163] and Internet Suspend/Resume [165]. Second, in this P2P-like model there is no need to maintain complex multiple-writer synchronization protocols [128], since the agent executes, and modifies its underlying disk state, on a single host at a time.

We have therefore implemented a distributed storage system called WANDisk, which provides efficient WAN access to multiple replicas of an agent's virtual disk. To provide flexibility in the choice of migration site, WANDisk follows a P2P approach in which any Internet host can maintain a persistent replica of the agent's state. To reduce data transfers, WANDisk relies on the persistence of the replicas, which are created on demand as new migration sites are identified. WANDisk's replica control mechanism uses two techniques for optimizing the efficiency of agent migration. First, lazy synchronization is used to avoid unnecessary data transfers to inactive migration sites or for unused parts of a virtual disk. Second, differential transfers are used between replicas to reduce synchronization overhead.

Figure 3.7: WANDisk Storage System Architecture. (Diagram: the agent and the administrative VM access a local chunk store, which synchronizes with remote chunk stores via rsync.)

Figure 3.7 shows the two-tiered WANDisk architecture, which consists of a kernel module and a user-space disk manager, both operating within Xen's administrative VM. The kernel module presents a pseudo block device that is mapped to an agent's virtual block device. All agent-originated block requests are handled by the pseudo block device and redirected into the user-space disk manager.

The disk manager partitions the agent’s virtual disk into chunks and uses a chunk table to keep track of versioning and ownership information. Chunk size is configurable at agent creation time; we use a chunk size of 128 KB in our experiments, which has been found to work well in practice [165]. As the agent modifies blocks in its virtual block device, the mapped chunk’s version number is incremented, and its ownership transferred to the host where the agent is executing. Each host thus “owns” the chunks which the agent modified while executing there. Before the agent accesses any of those chunks at a different host, the chunk table will point WANDisk to the location of the freshest copy.

The chunk table is thus the only piece of metadata necessary for the correct execution of WANDisk, and becomes a crucial addition to an agent's migratable state. To account for this, we have modified live migration in Xen to include the chunk table; however, actual chunk transfers are not involved in the critical path of agent migration. WANDisk fetches chunks exclusively on demand, using the rsync algorithm [188] to perform efficient differential data transfer.
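
Conceptually, the chunk table is a small map from chunk number to a (version, owner) pair. The sketch below illustrates the bookkeeping described above in simplified form; it is not WANDisk's actual code, and the fetch helper stands in for the rsync-based differential transfer.

    CHUNK_SIZE = 128 * 1024   # 128 KB, as used in our experiments

    class ChunkTable:
        """Per-agent map of chunk number -> (version, owning host)."""

        def __init__(self, num_chunks, home):
            self.entries = {i: (0, home) for i in range(num_chunks)}

        def on_write(self, offset, here):
            # A write bumps the chunk's version and transfers ownership
            # to the host where the agent is currently executing.
            chunk = offset // CHUNK_SIZE
            version, _ = self.entries[chunk]
            self.entries[chunk] = (version + 1, here)

        def on_read(self, offset, here, fetch_chunk):
            # Chunks are fetched lazily, and only when the freshest copy
            # lives on some other host; fetch_chunk stands in for the
            # rsync-based differential transfer from the owner.
            chunk = offset // CHUNK_SIZE
            _, owner = self.entries[chunk]
            if owner != here:
                fetch_chunk(chunk, owner)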

The heavyweight sync command shown in Figure 3.2 is available for bringing any replica up to date under explicit user control. This command may be used for performance or reliability reasons. The command blocks until the replica at the specified migration site is both complete and up to date. At this point, agent execution can continue at that site even if it is disconnected from other replicas.

3.3 Usage Experience and Experimental Evaluation

We have gained hands-on usage experience with Snowbird by applying it to four bimodal applications from distinct application domains. None of these applications were written by us, and none had to be modified for use with Snowbird. Two of the applications (Maya and ADF) are commercial closed-source products whose success in the marketplace confirms their importance. The other two applications (QuakeViz and Kmenc15) have open source user communities. Section 3.3.1 describes these applications in more detail.

We found that using these applications with Snowbird was straightforward. Installing each as an agent was no more complex or time-consuming than installing it on a native machine. The only extra step was the creation of an application profile for the migration manager. Our generic application profile proved to be adequate for these four applications, but we recognize that some customization effort may be needed in the case of other applications.

This positive qualitative experience leads to a number of quantitative questions. What performance overheads does Snowbird incur? How much does it improve task completion time in the crunch phase, and crispness of interaction in the cognitive phase? How close is Snowbird's performance to that achievable through optimal partitioning (which is necessarily application-specific)?

Application   Crunch phase (minutes for one processor)   Cognitive phase (minutes)   Parallel mode
Maya                     96                                         29               Process fork
QuakeViz                107                                         23               Pthreads
ADF                     125                                         26               PVM
Kmenc15                  43                                         15               Process fork

Table 3.2: Application Characteristics

The rest of this Section describes our answers to these and related questions. Section 3.3.1 begins by describing the four applications and the benchmarks based on them. Section 3.3.2 then describes our approach to balancing realism and good experimental control in the cognitive phase, even in the face of unpredictable user behaviour. Sections 3.3.3 and 3.3.4 describe our experimental setup. Finally, Section 3.3.5 presents our results.

3.3.1 Application Benchmarks

To demonstrate Snowbird’s broad applicability, we experimented with applications that are representative of the domains of professional 3D animation, amateur video produc- tion, and scientific computing, and include both open source as well as commercial closed source products. For each application, we designed a representative benchmark that con- sists of a crunch and a cognitive phase. Application characteristics are summarized in

Table 3.2, and screenshots can be found in Figure 3.8.

Maya (Digital Animation, closed source)

Maya is a commercial closed-source high-end 3D graphics animation package used for character modeling, animation, digital effects, and production-quality rendering [18]. It is an industry standard employed in several major motion pictures, such as "Lord of the Rings" and "War of the Worlds." Our benchmark encompasses the typical work involved in completing an animation project. During the 29-minute cognitive phase, a digital character is loaded, a number of intermediate positions are generated by tweaking the character's skeleton and joints, and the animation pattern is scripted. The user periodically visualizes a low-fidelity preview of the animation, which Maya generates using graphics hardware acceleration. The crunch phase consists of rendering a photo-realistic version of each frame in the animation. This is a highly parallelizable CPU-intensive process that does not use graphics hardware. Maya allows the crunch phase to be initiated on a remote compute server, thus providing a case of application-specific partitioning against which to compare Snowbird.

Figure 3.8: Application Screenshots. ((a) Maya: Character modeling; (b) QuakeViz: Ground motion isosurface; (c) Kmenc15: Video editing; (d) ADF: Energy density for an amino acid molecule.)

QuakeViz

(Earth Sciences, open source)

This is an interactive earthquake simulation visualizer, and the only benchmark that accesses a remote dataset. Our benchmark consists of the visualization of a 1.9 GB volumetric dataset depicting 12 seconds of ground motion around a seismic source in the Los Angeles Basin [7]. In our experiments, this dataset is kept on a storage server co-located with the remote compute server and accessed via NFS – resulting in LAN-quality access from the compute server but WAN-quality access from the user desktop. During the crunch phase, QuakeViz mines the dataset to extract ground motion isosurfaces: surfaces inside the volume for which all points are moving in the same direction and at the same speed. The result is a set of triangular meshes depicting isosurfaces at successive time steps. Transformations such as smoothing and normals calculation are applied to these meshes to generate a more visually appealing result. In the cognitive phase, the isosurface meshes are rendered on the screen, and the user studies the seismic reaction by moving forwards or backwards in time and zooming, rotating, or panning the isosurfaces. Our benchmark explores 30 different time-steps during its 23-minute long cognitive phase.

ADF

(Quantum Chemistry, closed source)

This is a commercial closed-source tool, used by scientists and engineers to model and explore properties of molecular structures [180]. In the ADF benchmark, the crunch phase consists of performing a geometry optimization of the threonine amino-acid molecule, using the Self-Consistent Field (SCF) calculation method. ADF distributes this intensive calculation to multiple CPUs using the PVM [173] library, providing a second case of application-specific partitioning against which to compare Snowbird. The SCF calculation generates results that are visualized in a subsequent cognitive phase, such as isosurfaces for the Coulomb potential, occupied electron orbitals, and cut-planes of kinetic energy density and other properties. Analyses of these properties through rotation, zooming, or panning are examples of the actions performed during the 26-minute cognitive phase.

Kmenc15

(Video Editing, open source)

Kmenc15 is an open-source digital editor for amateur video post production [98]. Users can cut and paste portions of video and audio, and apply artistic effects such as blurring or fadeouts. Kmenc15 can process and produce videos in a variety of standard formats. This benchmark does not exploit graphics hardware acceleration. In the 15-minute cognitive phase of our benchmark, we load a 210 MB video of a group picnic and split it into four episodes. We then edit each episode by cropping and re-arranging portions of the recording and adding filters and effects. The crunch phase converts the four edited episodes to the MPEG-4 format. Kmenc15 converts the four episodes in parallel, exploiting available multiprocessing power.

3.3.2 Interactive Session Replay

One of the challenges in evaluating interactive performance is the reliable replay of user

sessions. To address this problem, we developed VNC-Redux, a tool based on the VNC protocol that records and replays interactive user sessions. During the session record phase, VNC-Redux generates a timestamped trace of all user keyboard and mouse in- put. In addition, before every mouse button click or release, VNC-Redux also records a snapshot of the screen area around the mouse pointer. During replay, the events in the trace are replayed at the appropriate times. To ensure consistent replay, before replaying mouse button events the screen state is compared against the previously captured screen snapshot: if sufficient discrepancies are detected, the session must be reinitialized and Chapter 3. Location Flexibility with Snowbird 58 the replay restarted. Screen synchronization succeeds because VNC, like most other thin client protocols, is non work-conserving and can skip intermediate frame updates on slow connections. This results in the client always reaching a stable and similar (albeit not always identical) state for a given input. Therefore, given an identical initial application state, the entire recorded interactive session can be reliably replayed.

Unfortunately, the simple screen synchronization algorithms used by other replay tools [208] do not work well in high-latency environments. These algorithms typically perform a strict per-pixel comparison with a threshold that specifies the maximum number of pixel mismatches allowed. Something as simple as a mouse button release being delayed by a few milliseconds due to network jitter can cause a 3D object's position to be offset by a small amount. This offset causes the algorithm to detect a large number of pixel mismatches, stalling replay.

To address this problem, we developed an algorithm based on Manhattan distances to estimate image “closeness”. For two pixels in the RGB colour space, the Manhattan distance is the sum of the absolute differences of the corresponding R, G, and B values.

If a pixel's Manhattan distance from the original pixel captured during record is greater than a given distance threshold, it is classified as a pixel mismatch. If the total number of pixel mismatches is greater than a pixel difference threshold, the screenshots being compared are declared to be different. Our experiments indicate that this improved matching algorithm works well over high-latency networks: we successfully replayed interactive traces over emulated 100 ms and 66 ms links.
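The thesis does not include the matching code itself; the following is a minimal sketch of the Manhattan-distance comparison just described, assuming the recorded and current screen regions are available as equally sized RGB pixel arrays (the function name and threshold values are illustrative, not VNC-Redux's actual parameters).

```python
import numpy as np

def screens_match(recorded, current, distance_threshold=30, mismatch_threshold=100):
    """Compare two RGB screenshots (H x W x 3 uint8 arrays) of the area around the pointer.

    A pixel is a mismatch if the Manhattan distance between its RGB values in the
    recorded and current screenshots exceeds distance_threshold. The screenshots
    are declared different when the number of mismatched pixels exceeds
    mismatch_threshold.
    """
    # Per-pixel Manhattan distance: sum of absolute differences of R, G, and B.
    diff = np.abs(recorded.astype(np.int32) - current.astype(np.int32)).sum(axis=2)
    mismatches = np.count_nonzero(diff > distance_threshold)
    return mismatches <= mismatch_threshold
```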

3.3.3 Experimental Configurations

We investigate four configurations for executing bimodal applications:

• Local Execution: The application executes exclusively in an unvirtualized environment on a typical desktop-class machine. During interactive phases, 3D graphics are rendered using locally available hardware acceleration. This represents the best scenario for cognitive phases, but the worst case scenario for crunch phases.

• Remote Execution: The application executes exclusively in an unvirtualized environment on an SMP compute server located behind a WAN link and close to external datasets. This represents the best scenario for crunch phases. As the user interacts with the application over a WAN link using a standard VNC thin client, 3D rendering on the remote server is software based, representing the worst case scenario for the cognitive phases.

• Partitioned: The application executes in an unvirtualized environment on the desktop-class machine, but is able to ship intensive computation to the remote compute server in an application-specific manner. This execution mode combines the best of remote and local execution, but is fully dependent on application support and requires multiple installations of the application. Only Maya and ADF provide this mode of execution.

• Snowbird: Snowbird is used to dynamically switch between local and remote execution modes, independently of application support or the lack of it. Both the user's desktop and remote compute server run the Snowbird infrastructure, as depicted in Figure 3.3: the Xen VMM, WANDisk, the hardware-accelerated agent GUI, and the migration manager. All benchmarks are initiated in an agent running at the user's desktop, with the WANDisk state at all hosts synchronized to a pristine original state. The single generic application profile is used for all of our experiments.

By running the complete benchmark in each of the Remote and Local modes, we obtain two sets of results. First, a measure of what is clearly undesirable: running the crunch phase on an underpowered configuration (Local), and interacting with an application executing behind a WAN link (Remote). By comparing against these results we quantify the benefits of Snowbird in terms of reduced completion time for the crunch phase and improved interactive performance for the cognitive phase.

Figure 3.9: Experimental Testbed – the user desktop connects to the compute server and its co-located storage server (LAN) over an emulated WAN link.

Conversely, the execution of the crunch and cognitive phases on the Remote and Local configurations, respectively, represents the ideal application partitioning. This provides an upper bound on the performance of any manual partitioning, as each phase is executed in the most advantageous location and no cost for communication overhead or added computational complexity is included. We compare Snowbird's performance against these sets of results to quantify its overhead.

Finally, we can compare Snowbird to manual application partitioning for those applications that provide that option. While we expect manual application partitioning to be very close to optimal performance for both application phases, we also expect Snowbird to provide similar crunch and interactive performance.

3.3.4 Experimental WAN Testbed

Figure 3.9 shows our experimental testbed, consisting of a user desktop, which is a 3.6 GHz Intel Pentium IV equipped with an ATI Radeon X600 graphics card, and a compute server, which is a dual-threaded dual-core 3.6 GHz Intel Xeon. The desktop and server communicate through a NetEm-emulated WAN link with a bandwidth of 100 Mbps and RTTs of 33, 66, and 100 ms. These RTTs are conservative underestimates of the values observed between US and Europe, as shown in Table 3.1. We use Linux version 2.6.12, with the paravirtualization extensions enabled for the Snowbird experiments. Agent VMs are configured as 512 MB SMP hosts, allowing them to fully utilize the computing power of the compute server's multiple cores. Snowbird uses the WAN-optimized HPN-SSH [149] protocol for data transfers.
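The exact NetEm invocation is not given in the text; the sketch below shows one plausible way to impose the listed RTTs (the interface name, the choice to split the delay across both endpoints, and the omission of the 100 Mbps rate shaping are assumptions for illustration only).

```python
import subprocess

def add_one_way_delay(dev="eth0", delay_ms=50):
    """Attach a NetEm qdisc that delays outgoing packets on dev.

    Running this with delay_ms=50 on both endpoints yields a 100 ms RTT;
    16.5 ms per side approximates the 33 ms setting. The 100 Mbps cap used
    in the testbed would be layered separately (e.g. with a token bucket
    filter); that part is omitted here.
    """
    subprocess.run(
        ["tc", "qdisc", "add", "dev", dev, "root", "netem",
         "delay", f"{delay_ms}ms"],
        check=True,
    )

# Example (requires root privileges):
# add_one_way_delay("eth0", 50)
```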

Figure 3.10: Crunch Phase Completion Time – completion time in minutes for each application under the Remote, Partitioned (Maya only), Local, and Snowbird (33, 66, and 100 ms RTT) configurations.

3.3.5 Results

This Section presents the results of our experiments with the four benchmarks introduced in Section 3.3.1. All benchmarks include a cognitive and a crunch phase. In Maya and Kmenc15, the cognitive phase precedes the crunch phase, whereas in QuakeViz and ADF, the cognitive phase follows the crunch phase. Maya and ADF are the only applications we used that provide a partitioned execution mode. Unfortunately, the partitioned execution mode of ADF badly underperformed in our WAN testbed: with a 33 ms RTT, crunch phase completion time expands to roughly six times as much as in thin client mode. The vendor-supplied partitioning is designed for tightly-connected cluster computing and hence uses a very "chatty" synchronization protocol. This is an example of Snowbird overcoming the negative effects of an application-specific partitioning scheme that was not designed for WANs.

Crunch Phase

Figure 3.10 shows the total completion time of the crunch phase for the benchmarks and configurations investigated. Each result is the mean of five trials; error bars show the observed standard deviations. For reasons explained earlier, partitioned execution results are not presented for ADF. As Figure 3.10 shows, Snowbird outperforms local execution by a significant margin. Since the impact of RTT on crunch phase performance is very small, we only show it for Snowbird. The crunch phases of all the benchmarks are CPU intensive and benefit from the increased computational power of the multiprocessor server. QuakeViz also takes advantage of the lower latency and increased bandwidth to its dataset, located on the compute server. More specifically, at 33 ms RTT, Snowbird approximately halves the length of the crunch phase for all applications, and comes within 4 to 28% of the ideal performance of the remote configuration. For Maya, it comes within 4 to 9% of the performance obtained through vendor-supplied partitioning.

Table 3.3 shows how long it takes the migration manager to detect the transition into the crunch phase, how long it takes to migrate the agent to the remote compute server, and the amount of time the application is actually paused during the migration.

Each result in this table is the mean of five trials, and the largest standard deviations observed for Detect, Migrate, and Pause are 22%, 4%, and 7% of the corresponding means, respectively. As Table 3.3 shows, the maximum time taken by the migration manager is 14 seconds. Even with the worst-case latency of 100 ms, agent migration never takes more than 70 seconds to complete. In all cases, the agent spends less than 1.5 minutes on the user's desktop after it enters a crunch phase, which amounts to less than 5% of the total benchmark time. The table also shows that the maximum time for which an agent would appear to be unresponsive to user input during migration is six seconds or less. While much higher than downtimes reported for local area VM migration [35], our downtime is due to transferring the last batch of VM memory pages over a WAN link and synchronizing this with the transfer of WANDisk metadata (namely the chunk table). This is an order of magnitude smaller than the best value attainable without live migration (512 MB of VM RAM at 100 Mbps ≈ 41 s).
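As a quick sanity check, the 41 s figure follows directly from transferring the full memory image at line rate (using decimal megabytes, an assumption about the rounding used):

\[
\frac{512\,\mathrm{MB} \times 8\,\mathrm{bits/byte}}{100\,\mathrm{Mbit/s}} \approx 41\,\mathrm{s}
\]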

              Latency = 33 ms          Latency = 66 ms          Latency = 100 ms
Application   Detect  Migrate  Pause   Detect  Migrate  Pause   Detect  Migrate  Pause
Maya          10.8    51.9     3.5     10.8    53.5     4.7     11.5    58.2     5.6
QuakeViz       8.1    49.9     3.5      8.1    49.9     5.0      8.1    55.6     6.3
ADF           12.5    62.0     4.9     11.5    62.0     6.2     13.1    64.9     6.7
Kmenc15        8.1    51.8     4.7      9.1    54.0     5.7      8.4    59.5     6.7

Table 3.3: Crunch Phase Migration Time (all times in seconds)

Cognitive Phase

Figure 3.11 shows the Cumulative Distribution Functions (CDFs) of the number of FPS per interaction for each of our four benchmarks under three configurations: local, remote, and Snowbird. Plots to the right indicate better performance than plots to the left. We show results for different network RTTs for the remote and Snowbird configurations. The cognitive phases for QuakeViz and ADF start on the remote compute server soon after the crunch phase terminates. The migration manager detects this transition and migrates back to the user's desktop. In contrast, the cognitive phases of Maya and Kmenc15 start with the agent already running on the user's desktop. We do not include results for Maya and ADF's partitioned mode, as they are practically identical to local interaction.

Our results show that Snowbird delivers a much better cognitive performance than remote interaction. More importantly, the median number of FPS delivered by Snowbird is above Airey's previously mentioned 20 FPS threshold needed for crisp interactivity [6]. In general, Snowbird's quantitative interactive performance is between 2.7 and 4.8 times better than that delivered by a thin client, with the interactive response in thin client mode rarely exceeding 10 FPS. Even though the agent has to migrate from the compute server to the user's desktop, Snowbird's cognitive performance tends to be independent of the WAN latency. Further, the network latency has a negligible impact on both the time taken before the decision to migrate is made and the time required to migrate the agent; we omit the migration time results for cognitive phases as they are very similar to those in Table 3.3.

Figure 3.11: Interactive Response – CDFs of smoothness (frames per second) for (a) Maya, (b) QuakeViz, (c) ADF, and (d) Kmenc15 under the Local, Remote (33, 66, and 100 ms), and Snowbird configurations.

The results also show that the FPS delivered by Snowbird is not as high as in unvirtualized local interaction. Local execution experiments delivered anywhere between 1.1 and 2.6 times more FPS in the median case. This is essentially an artifact of the early implementation of VMGL used for these experiments – particularly since over 90% of the cognitive phase executes on the local side, in general. As we will show in Section 4.4, a streamlined version of VMGL is capable of delivering between 87% and 100% of the FPS achievable in an unvirtualized configuration, for applications such as 3D games that are far more demanding of the graphics processor. Although purely subjective, we note that despite the early and under-performing version of VMGL used here, once the agent migrates to the local host the user experience delivered by Snowbird is indistinguishable from that of the native configuration for all of the benchmarks.

Summary

Our results confirm that Snowbird offers significant benefits for bimodal applications.

Without any application modifications, relinking, or binary rewriting, bimodal applications are able to improve crunch performance through remote infrastructure. This improvement is achieved without compromising cognitive performance. Even when an application vendor supplies a partitioned mode of operation for cluster computing, Snowbird is able to offer comparable performance (within 4%), and in one case to greatly exceed it over WANs.

3.4 Implementation Limitations and Future Extensions

The Snowbird prototype has certain limitations in functionality and performance. Some of these limitations arise from our goal of rapidly creating an experimental prototype rather than a robust, complete and efficient product. Other limitations have deeper roots in the design of Snowbird, and will therefore require more effort to overcome.

One limitation is that parallel Snowbird applications execute in a single SMP virtual machine. While the current trend of aggressively scaling processors to a hundred or more cores favours our design, some applications might be inherently designed to use multiple machines in a large cluster. Exploring possibilities to allow VMs “arriving” at a cluster to leverage multiple networked physical machines is one of the seed ideas that eventually evolved into the SnowFlock project we present in Chapter 5.

A second limitation arises from our use of system-level virtualization. At startup, an application might configure itself to take advantage of vendor-specific extensions to the x86 instruction set architecture, such as AMD's 3DNow! or Intel's SSE. Upon migration to different hardware, the application will crash when it attempts to execute an unsupported instruction. One possible solution is to use dynamic binary rewriting. Another approach is to extend Snowbird so that it never attempts migration to an incompatible destination. These approaches are in line with current standard practice to tackle this problem [4].

Figure 3.12: When Is Snowbird Useful? Speedup of the crunch phase versus migration time (minutes) for hypothetical crunch times C = 2, 5, and 10 minutes; above each curve Snowbird is beneficial, below it harmful.

A third limitation is that Snowbird assumes a distinct separation of crunch and cognitive phases. Applications that consistently overlap these phases will not benefit from Snowbird. More generally, Snowbird is beneficial only when its agility of adaptation exceeds the rate of change of application behaviour, and when remote execution provides sufficient improvement to overcome the cost of migration. Figure 3.12 illustrates this tradeoff for crunch phases. The horizontal axis shows migration time in minutes, which depends on the quality of the Snowbird implementation, the network conditions and the amount of VM state to transfer. For the current prototype that state is limited to the memory image plus a subset of the virtual disk fetched on-demand by WANDisk. This measure of system agility includes both the swiftness with which migration can be triggered, and the efficiency with which it can be completed. The vertical axis shows the crunch speedup when executing remotely, which depends on the application and the available remote resources. Each curve plots the relation speedup = C/(C − migration time) for three hypothetical applications, where C is the crunch phase completion time when executing locally. Above each curve, Snowbird is beneficial; the expected performance gain exceeds the migration cost. Below each curve, Snowbird becomes harmful, as its migration cost eclipses any potential performance gains.
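The break-even reasoning behind Figure 3.12 can be captured in a few lines; the sketch below simply evaluates the speedup = C/(C − migration time) curve, with all concrete numbers being illustrative rather than measurements from the thesis.

```python
def required_speedup(crunch_minutes, migration_minutes):
    """Minimum remote speedup for migration to pay off, per the model
    speedup = C / (C - migration_time). Returns None when migration takes
    longer than the local crunch phase (it can never pay off then)."""
    if migration_minutes >= crunch_minutes:
        return None
    return crunch_minutes / (crunch_minutes - migration_minutes)

def snowbird_beneficial(crunch_minutes, migration_minutes, remote_speedup):
    """True when the expected gain exceeds the migration cost."""
    needed = required_speedup(crunch_minutes, migration_minutes)
    return needed is not None and remote_speedup > needed

# Example: a 10-minute crunch phase, ~1.2 minutes of detection plus migration,
# and a remote server twice as fast as the desktop.
print(required_speedup(10, 1.2))        # ~1.14x needed
print(snowbird_beneficial(10, 1.2, 2))  # True
```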

The simple model shown in Figure 3.12 illustrates how improving migration time broadens the set of applications for which Snowbird is applicable. For a given speedup, workloads with smaller crunch time benefit as migration time decreases. And for a given crunch time, swifter migration reduces the constraints on the quality of the remote resources needed. Conversely, high migration times limit the applicability of Snowbird to applications with long crunch phases, or to remote platforms capable of yielding very high speedups. In the current prototype, detection and change of modality occur in roughly 10 seconds, while the migration that follows typically takes about 60 seconds plus lazy WANDisk chunk fetches. Mapping these values to Figure 3.12 to extract a conservative estimate indicates that crunch phases below ten minutes and speedups below a factor of 2 will probably not show benefits with the current prototype. We note that VM migration latency over the wide area could be greatly improved with some of the techniques used for efficient VM state replication in SnowFlock, explained in Section 5.4.

It should be noted that a complementary attribute of agility is stability, which characterizes the ability of the implementation to avoid frivolous migrations that may lead to thrashing. It is well known from control theory that agility and stability are two sides of the same coin, and have to be considered together in the design of an adaptive system. Improvements to Snowbird's agility may necessitate more sophisticated mechanisms for stability.

Chapter 4

Virtualized Graphics Acceleration with VMGL

This chapter covers in more detail the VMGL [103] subsystem of Snowbird. While originating as a means to provide crisp interactive performance to virtualized bimodal applications, VMGL grew into an ambitious project with the goals of providing graphics hardware acceleration to any OpenGL application, regardless of virtual machine monitor, guest operating system, or graphics hardware model.

4.1 Rationale

Virtual machine monitor (VMM) technology has been put to many innovative uses, including mobile computing [30, 100, 165], system management [32, 163], intrusion detection [48], and grid computing [57]. However, the difficulty of virtualizing graphics processing units (GPUs) has limited the use of virtual machines (VMs) for running interactive applications. The performance acceleration provided by GPUs is critical to high-quality visualization in many applications, such as computer games, movie production software, computer-aided design tools for engineering and architecture, computer-aided medical diagnosis, and scientific applications such as protein modeling for drug synthesis.


Software rendering is the prevalent option for virtualized execution, but for this class of applications it is unacceptably slow.

Virtualizing GPUs is difficult for a number of reasons. First, the hardware interface to a GPU is proprietary, and many technical details are closely held as trade secrets. Hence, it is often difficult to obtain the technical specifications necessary to virtualize a GPU.

Second, because the hardware interface is not public, GPU vendors make significant changes in the interface as their product lines evolve. Trying to virtualize across such a wide range of interfaces can result in a weak lowest common denominator. Third, the software needed to integrate a GPU into an operating system is typically included with the hardware as a closed-source device driver. In a virtualized environment, the driver is unusable for other guest operating systems. For reasons mentioned earlier, the technical details necessary to create a new driver for other guests are typically not available. In summary, virtualization of a hardware component presumes the existence of a standard interface such as the x86 instruction set, or the IDE and SCSI interfaces to disks; GPUs lack such a standard.

We propose a solution that is strongly influenced by how applications actually use GPUs. Many of the virtualization challenges discussed in the previous paragraph would also complicate the authoring of applications. For example, the absence of a stable GPU interface would require frequent application changes to track hardware. Additionally, the large diversity of technical specifications across GPUs and the difficulty of obtaining them publicly would severely restrict the market size of a specific application implementation.

The graphics community avoids these problems through the use of higher-level APIs that abstract away the specifics of GPUs. Practically all applications that use GPUs today are written using one or both of two major APIs: OpenGL [96] and Direct3D [123]. Of these two, OpenGL is the only cross-platform API supported on all major operating systems.

For each supported operating system, a GPU vendor distributes a closed-source driver and OpenGL library. The job of tracking frequent interface changes to GPUs is thus delegated to the GPU vendors, who are best positioned to perform this task. Although OpenGL is a software interface, it has become a de facto GPU interface. We therefore make it the virtualization interface.

VMGL is a virtualized OpenGL implementation that offers hardware accelerated rendering capabilities to applications running inside a VM. VMGL runs the vendor-supplied GPU driver and OpenGL library in the VMM host: the administrative VM for a hypervisor like Xen, or the hosting OS for a VMM like VMware Workstation. The host runs a vendor-supported operating system and has direct access to the GPU. Using a GL network transport, VMGL exports the OpenGL library in the host to applications running in other VMs. When those applications issue OpenGL commands, the commands are transported to the GPU-enabled host and executed there. VMGL thus preserves complete application transparency; no source code modification or binary rewriting is necessary. VMGL also supports suspend and resume of VMs running graphics accelerated applications, thus supporting several novel applications of VM technology [30, 100, 163].

Further, VMGL allows suspended VMs to be migrated to hosts with different underlying GPU hardware. VMGL is not critically dependent on a specific VMM or guest operating system, and is easily ported across them.

We evaluate VMGL for diverse VMMs and guests. The VMMs include Xen on VT and non-VT hardware, and VMware Workstation. The guests include Linux with and without paravirtualization, FreeBSD and Solaris. In experiments with four graphics-intensive applications, including one that is closed source, the observed graphics performance of VMGL comes within 14% or better of native performance, and outperforms software rendering by two orders of magnitude. Although this approach incurs the performance overhead of cross-VM communication, our experimental evaluation demonstrates that this overhead is modest. Moreover, our results also show that multi-core hardware, which is increasingly common, can help in reducing the performance overhead.

As we look toward the future, one important application of VMGL could be found in providing access to the computing power of GPUs to an emerging class of GPU-based scientific applications [22, 62]. The highly parallel and efficient architecture of GPUs has proved tremendously useful in producing high-performance solutions to several scientific problems. Algorithms that solve these problems using GPU processing are written mainly in OpenGL [71]. Hence, we believe that scientific applications running in virtualized environments, like those proposed for the Grid [57], will be able to leverage VMGL for improved performance.

The remaining Sections in this chapter describe the design, implementation and experimental validation of our VMGL prototype for OpenGL virtualization on X11-based systems. We begin in Section 4.2 with an overview of GPUs, OpenGL, and existing techniques to virtualize graphics output. Section 4.3 then describes the detailed design and implementation of VMGL. Section 4.4 presents the experimental validation of VMGL.

Finally, Section 4.5 comments on the body of work on hardware-accelerated graphics virtualization that came after VMGL, before concluding this Chapter in Section 4.6.

4.2 Background

In this Section, we provide an introduction to graphics hardware acceleration and OpenGL, the most commonly used cross-platform 3D API. We also describe how X11-based applications leverage hardware acceleration capabilities. We then describe related work addressing virtualization of graphic output. Since VMGL predates most work aimed at virtualizing graphics hardware acceleration, we focus on projects that provide graphics API interposition and manipulation.

4.2.1 Hardware Acceleration

Almost all modern computers today include a graphics processing unit (GPU), a dedicated processor used for graphics rendering. GPUs have become increasingly popular as general purpose CPUs have been unable to keep up with the demands of the mathematically intensive algorithms used for transforming on-screen 3D objects, or applying visual effects such as shading, textures, and lighting. GPUs are composed of a large number of graphics pipelines (16–112 for modern GPUs) operating in parallel. For floating point operations, GPUs can deliver an order of magnitude better performance than modern x86 CPUs [62].

Modern GPUs range from dedicated graphics cards to integrated chipsets. As individual hardware implementations might provide different functionality, 3D graphics APIs have arisen to isolate the programmer from the hardware. The most popular APIs are OpenGL, an open and cross-platform specification, and Direct3D, a closed specification from Microsoft specific to their Windows platform. We describe OpenGL below in more detail.

4.2.2 OpenGL Primer

OpenGL is a standard specification that defines a platform-independent API for 3D graphics. The OpenGL API supports application portability by isolating developers from having to program for different hardware implementations.

Vendors implement the OpenGL API in the form of a dynamically loadable library that can exploit the acceleration features of their graphics hardware. All OpenGL implementations must provide the full functionality specified by the standard. If hardware support is unavailable for certain functions, it must be implemented in software. This isolates the programmer from having to determine available features at runtime. However, the OpenGL specification does allow for vendor-specific extensions; applications can only determine the availability of these extensions at runtime.

The OpenGL calls issued by an application modify the OpenGL state machine, a graphics pipeline that converts drawing primitives such as points, lines, and polygons into pixels. An OpenGL context encapsulates the current state of the OpenGL state machine. While an application may have multiple OpenGL contexts, only one context may be rendered on a window at a given time. OpenGL is strictly a rendering API and does not contain support for user input or windowing commands. To allow OpenGL contexts to interact with the window manager, applications use glue layers such as GLX for X11-based systems, WGL for Microsoft Windows, and AGL for the Macintosh.

Today, OpenGL is the only pervasive cross-platform API for 3D applications. The competing proprietary API, Microsoft's Direct3D, only supports the Windows operating systems. OpenGL implementations are available for Linux, Windows, Unix-based systems, and even embedded systems. Bindings exist for a large number of programming languages including C, C++, C#, Java, Perl, and Python.

4.2.3 X11 Hardware Acceleration

GLX, the OpenGL extension to the X Window System, provides an API that allows X11-based applications to send OpenGL commands to the X server. Depending on hardware availability, these commands will either be sent to a hardware-accelerated GPU or rendered in software using the Mesa OpenGL implementation [120]. As GLX serializes OpenGL commands over the X11 wire protocol, it is able to support both local and remote clients. Remote clients can only perform non-accelerated rendering.

Using GLX can lead to a significant overhead as all data has to be routed through the X server. In response, the Direct Rendering Infrastructure (DRI) was created to allow for safe direct access to the GPU from an application's address space, while still relying on Mesa for a software fallback. The Direct Rendering Manager (DRM), a kernel module, controls the GPU hardware resources and mediates concurrent access by different applications (including the X server). While the DRI provides a direct path for OpenGL commands, GLX must still be used for interactions with the X window server.

4.2.4 Graphics Virtualization

The most straightforward solution to provide 3D acceleration to a VM is to grant it direct access to the GPU hardware. This scheme provides excellent performance to a single VM, but impedes sharing and multiplexing of the device. One interesting variant of this architecture is to designate a driver VM [108] as owner of the GPU. The driver VM would thus occupy the same role as the host in a classical VMM architecture, with the advantage of isolating potential driver malfunctions in a separate protection domain.

However, in the absence of an IOMMU [24], granting hardware access rights to a VM will weaken the safety and isolation properties of VM technology. A rogue VM with direct hardware access would be able to initiate DMA to and from memory owned by other VMs running on the same machine. Further, a VM with direct hardware access cannot be safely suspended or migrated to a different machine without driver support.

Aside from Blink [78], and preliminary attempts by VMware, VMGL predates most work virtualizing graphics hardware acceleration. We will study Blink alongside relevant projects that arose after VMGL in Section 4.5.

VMGL is based on notions familiar to the area of thin clients and graphics API interposition and manipulation over the network. One relevant form of OpenGL API interposition is AIGLX, which stands for Accelerated Indirect GLX [56], and has been developed to provide accelerated GLX rendering for remote clients. While originally designed to enable OpenGL-accelerated compositing window managers, it could be used as an alternative transport for VMGL. Since AIGLX lacks the transport optimizations used by WireGL, we believe it would severely constrain applicability with its greater bandwidth utilization.

A number of projects for remote visualization of scientific data have tried to optimize remote OpenGL rendering. Some, like Visapult [25], Cactus [68], and SciRun [136], require their applications to be written to a particular interface and are therefore useful only when application source code is available. Other systems [170, 193] render data using remote GPUs and ship the resulting images using slow or lossy thin client protocols such as X11 or VNC.

In summary, VMGL is fundamentally different from traditional thin client protocols (VNC, RDP, ICA) which rely on efficient encoding of 2D pixel maps scraped from the screen. By shipping operations, VMGL is much better suited to supporting rich 3D applications generating dozens of highly detailed frames per second based on spatial transformations of textures stored in memory.

4.3 Design and Implementation

VMGL offers hardware accelerated rendering capabilities to applications running inside a VM. VMGL virtualizes the OpenGL API v1.5, providing access to most modern features exposed by 3D graphics hardware, including vertex and pixel shaders. VMGL also provides VM suspend and resume capabilities. The current VMGL implementation supports Xen and VMware VMMs, ATI, Nvidia, and Intel GPUs, and X11-based guest operating systems like Linux, FreeBSD, and OpenSolaris. VMGL is implemented in userspace to maintain VMM and guest OS agnosticism, and its design is organized around two main architectural features:

• Virtualizing the OpenGL API removes any need for application modifications or relinking, guarantees portability to different guest operating systems, and guarantees compatibility with graphics hardware from different manufacturers.

• Use of a Network Transport guarantees applicability across VMMs and even for different types of VMs supported by the same VMM.

VMGL is publicly available open-source software [106]. For the remainder of this discussion, the term host refers to the administrative VM in Xen, or the underlying OS for hosted VMMs like VMware Workstation. The term guest refers to a VM or domain.

Figure 4.1: VMGL Architecture – the guest runs the application, the VMGL library, and the VMGL X extension; GL commands are forwarded to a VMGL stub on the host, which has direct GPU access, while the viewer handles mouse and keyboard input, windowing commands, and 2D output.

4.3.1 Architecture

Figure 4.1 shows the VMGL architecture, which consists of three user-space modules: the VMGL library, the VMGL stub, and the VMGL X server extension. The figure also shows an application running on the guest VM and a viewer which runs on the host and handles user input and the guest’s visual output.

Applications running inside guests use the VMGL library as a replacement for standard or vendor-specific OpenGL implementations. Upon application startup, the VMGL library creates a VMGL stub on the host to act as a sink for OpenGL commands. The VMGL stub links against the OpenGL library available on the host to obtain direct rendering capabilities on behalf of the virtualized application. When the application inside the guest issues GL commands, the VMGL library forwards those commands to the VMGL stub using a network transport over a loopback connection. Each application uses a different VMGL stub, and each stub executes as a separate process in the host, thus leveraging the address space protection guarantees offered by the vendor OpenGL implementation.
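The command forwarding just described can be illustrated with a deliberately simplified sketch: calls are serialized and pushed over a loopback connection to a per-application stub process. The framing, class name, and port below are invented for illustration; the real implementation uses the WireGL protocol described in Section 4.3.2.

```python
import pickle
import socket
import struct

STUB_PORT = 7000  # illustrative; the real transport negotiates its own endpoint

class GLForwarder:
    """Toy stand-in for the VMGL library: ships each GL call to the host-side stub."""

    def __init__(self, host="127.0.0.1", port=STUB_PORT):
        self.sock = socket.create_connection((host, port))

    def call(self, name, *args):
        # Serialize the call name and arguments, then length-prefix the message.
        payload = pickle.dumps((name, args))
        self.sock.sendall(struct.pack("!I", len(payload)) + payload)

# A host-side stub would loop reading length-prefixed messages and replay each
# call against the vendor OpenGL library it links against, along the lines of:
#   name, args = pickle.loads(read_exactly(conn, length))
#   getattr(gl, name)(*args)
```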

4.3.2 OpenGL Transport

Figure 4.1 also illustrates the use of viewer software, typically based on VNC [154], that displays the 2D output generated by a guest, and captures user input and relays it to the guest. The guest 2D output is generated by an X server drawing to a virtual 2D framebuffer. In the absence of a virtual framebuffer, 2D output is generated by a VNC X server. We modified a VNC viewer to interact with VMGL stubs; the VNC viewer modifications are minimal as we offload most of the functionality onto the stubs. This allowed us to easily add support for the alternative viewers used by Xen and other VMMs like QEMU [23].

To compose the viewer 2D output and the VMGL stub's 3D GL output, we augment the guest's X server with an extension. The VMGL library uses this extension to register windows bound to OpenGL contexts. The extension monitors changes to the size, position and visibility of OpenGL-enabled windows, and forwards those changes to the VMGL stub. The stub applies this information on its GL graphical output by clipping it to remove sectors that are not currently visible in the guest's desktop, resizing it, and finally superimposing it on the viewer window at the appropriate relative coordinates.

The VMGL X extension is a loadable module that can be added to an existing X server configuration. The extension operates at the common device independent layer shared by all variants of X servers, ensuring support for the X11 and VNC servers used inside guests.

The rest of this Section first describes WireGL, the OpenGL network transport used by VMGL. We then describe VMGL's suspend and resume implementation, and discuss driver changes necessary to obtain direct rendering capabilities in a Xen configuration. Finally, we discuss limitations of the current implementation.

The standard OpenGL transport for remote rendering is GLX, previously described in Section 4.2.3. When used over network connections, GLX has two important disadvantages. First, it cannot provide a direct rendering path from the application to the graphics card. This is solved in VMGL by interposing a GL stub that channels GL commands into a direct rendering context. Second, GLX involves costly network round-trips for each and every OpenGL command being invoked. VMGL avoids this cost by leveraging the WireGL protocol [28, 85]. WireGL optimizes the forwarding of OpenGL commands by only transmitting changes to screen-visible state, and by aggregating multiple OpenGL commands in a single network transmission.

WireGL applies the changes to OpenGL state requested by the application to a local cache. Dirty cache contents are flushed lazily as needed. This enables smart discarding or postponing of ineffectual state changing commands. For example, if glTexSubImage is used to modify a texture that is currently not visible, no network packets will be sent until the modified area becomes visible.

WireGL further optimizes network utilization by reordering and buffering commands until a commit point arrives. Geometry commands are buffered in queues. Whenever possible, commands are merged in these queues. For example, consecutive glRotate and glTranslate calls are collapsed into a single matrix modification command. When the application issues state changing or flushing commands (like glFlush or glXSwapBuffers), the buffered block of geometry modifications is sent, along with outstanding state changes associated to that geometry.
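A toy illustration of this buffering strategy (not WireGL's actual data structures): commands are queued, consecutive transform calls are merged into one matrix-modification entry, and the queue is only shipped at commit points such as glFlush or buffer swaps.

```python
class CommandBuffer:
    """Simplified sketch of WireGL-style buffering: queue commands, merge
    consecutive transforms, and only transmit at commit points."""

    FLUSH_COMMANDS = {"glFlush", "glXSwapBuffers"}
    TRANSFORMS = {"glRotate", "glTranslate"}

    def __init__(self, send):
        self.send = send   # callable that ships a batch of commands over the network
        self.queue = []

    def issue(self, name, *args):
        if name in self.TRANSFORMS:
            # Collapse consecutive transform calls into one matrix-modification entry.
            if self.queue and self.queue[-1][0] == "matrix":
                self.queue[-1][1].append((name, args))
            else:
                self.queue.append(("matrix", [(name, args)]))
        elif name in self.FLUSH_COMMANDS:
            # Commit point: ship the buffered geometry and state changes as one batch.
            self.queue.append((name, args))
            self.send(self.queue)
            self.queue = []
        else:
            self.queue.append((name, args))
```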

4.3.3 Suspend and Resume Functionality

VMGL provides support for VM suspend and resume, enabling user sessions to be interrupted or moved between computers [30, 100, 163]. Upon VM resume, VMGL presents the same graphic state that the user observed before suspending while retaining hardware acceleration capabilities.

VMGL uses a shadow driver [176] approach to support guest suspend and resume. While the guest is running, VMGL snoops on the GL commands it forwards to keep track of the entire OpenGL state of an application. Upon resume, VMGL instantiates a new stub on the host, and the stub is initialized by synchronizing it with the application OpenGL state stored by VMGL. While the upper bound on the size of the OpenGL state kept by VMGL is in principle determined by the GPU RAM size, our experiments in Section 4.4.6 demonstrate that it is much smaller in practice.

VMGL keeps state for all the OpenGL contexts managed by the application, and all the windows currently bound to those contexts. For each window we track the visual properties and the bindings to the VMGL X extension. For each context, we store state belonging to three categories:

• Global Context State: Including the current matrix stack, clip planes, light sources, fog settings, visual properties, etc.

• Texture State: Including pixel data and parameters such as border colour or wrap coordinates. This information is kept for each texture associated to a context.

• Display Lists: A display list contains a series of OpenGL calls that are stored in GPU memory in a compact format, to optimize their execution as a single atomic operation at later times. For each display list associated to a context we keep a verbatim "unrolling" of its sequence of OpenGL calls.
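A minimal sketch of the shadow-state bookkeeping implied by these three categories (the data layout and method names are illustrative, not VMGL's actual structures): the library snoops on forwarded commands, records the state they establish, and replays that record into a freshly created stub on resume.

```python
class ShadowGLState:
    """Per-context shadow copy of OpenGL state, replayed into a new stub on resume.
    The categories mirror the three listed above; the bookkeeping is a sketch."""

    def __init__(self):
        self.global_state = {}   # matrix stacks, clip planes, lights, fog, visuals, ...
        self.textures = {}       # texture id -> recorded pixel data and parameters
        self.display_lists = {}  # display list id -> recorded ("unrolled") GL calls

    def snoop(self, category, key, call):
        # Invoked for every forwarded command; record the state it establishes.
        if category == "texture":
            self.textures.setdefault(key, []).append(call)
        elif category == "display_list":
            self.display_lists.setdefault(key, []).append(call)
        else:
            self.global_state[key] = call

    def replay(self, forward):
        # On resume, re-issue the recorded calls against the freshly created stub.
        for call in self.global_state.values():
            forward(call)
        for calls in list(self.textures.values()) + list(self.display_lists.values()):
            for call in calls:
                forward(call)
```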

Like the rest of VMGL, the code implementing OpenGL state restore resides in userspace, thus retaining OS and VMM independence. Furthermore, OpenGL state is independent of its representation in GPU memory by a particular vendor. Therefore, VMGL can suspend and resume applications across physical hosts equipped with GPUs from different vendors. OpenGL vendor extensions can present an obstacle to cross-vendor migration – we discuss this topic at the beginning of Section 4.3.5.

4.3.4 Porting GPU Drivers For Xen

VMMs such as VMware Workstation run an unmodified OS as the host, thus enabling the VMGL stubs to readily take advantage of hardware-specific drivers for direct rendering. However, this is not the case for Xen. Its administrative VM, also known as domain 0, is itself a VM running a paravirtualized kernel, causing incompatibilities with closed-source drivers to arise.

Xen's architecture prevents virtual machines from modifying the memory page tables through direct MMU manipulation. Paravirtualized kernels in Xen need to invoke the mmu_update and update_va_mapping hypercalls to have Xen perform a batch of page table modifications on its behalf. Before manipulating the hardware MMU, Xen will sanity-check the requested changes to prevent unauthorized access to the memory of another virtual machine. Transferring MMU-manipulation responsibilities to the hypervisor has introduced another level of indirection in memory addressing: physical frame numbers in a domain kernel are mapped by Xen into machine frame numbers, the actual memory frame numbers handled by the MMU.

To enable direct rendering functionality, OpenGL implementations need to communicate with the graphics card. This is typically achieved by memory mapping a character device, which results in the kernel remapping GPU DMA areas into the GL library's address space. In the absence of IOMMU hardware support for virtualized DMA addressing [24], Xen needs to interpose on these operations, translate them to machine frame numbers, and sanitize them. Drivers included in the Linux kernel distribution and using the Direct Rendering Manager described in Section 4.2 (e.g. Intel's), use functions that Xen paravirtualizes to provide the proper DMA addressing. Unfortunately, this is not the case with the proprietary closed-source drivers of Nvidia and ATI cards, which for that reason do not work natively with Xen paravirtualized kernels.

Luckily, these drivers are wrapped by an open-source component that is recompiled to match the specifics of the current kernel. As long as all DMA mapping functions are contained in the open-source component, the proprietary driver can be adjusted to run in domain 0. We have ported the fglrx driver version 8.29.6 for an ATI Radeon X600 PCI-Express card. By changing the DMA mapping macro to use a paravirtualization-aware function, we were able to use the driver in domain 0. Similar modifications to the proprietary Nvidia driver version 1.0-8756 also provide direct rendering functionality for domain 0 [138].

4.3.5 Limitations

VMGL currently supports 59 OpenGL v1.5 extensions, including vertex programs, fragment programs, and 13 vendor-specific extensions. We are constantly working to extend VMGL support to more GL extensions. For instance, the Unreal Tournament 2004 benchmark used in the next Section demanded the implementation of a number of extensions including GL_EXT_bgra. The coverage provided by the current set of extensions supported by VMGL is extensive, as indicated by VMGL's widespread use for gaming and scientific visualization frontends. A current source of interest in VMGL is in adapting it as an enhancer of the SciRun [136] distributed scientific visualization engine. However, vendor-specific extensions could represent a source of incompatibility if a VMGL-enabled guest is resumed on a new physical host with a different GPU from the one available where it was last suspended. If the GPU at the resume site does not support some of the vendor-specific extensions in use by an application, we will have to temporarily map their functionality to supported variants, and possibly suffer a performance hit. An alternative solution is to altogether disable vendor-specific extensions, at the expense of sacrificing functionality in some cases.

VMGL currently does not support Windows or MacOS guests. We have not yet developed the necessary hooks into the windowing systems to provide functionality similar to that of our X server extension, although we do not anticipate any major obstacles in this area. Finally, the Direct3D API used by some Windows applications can be supported through Direct3D to OpenGL translation layers, such as WineD3D [38].

Our work so far has focused on portability across VMMs and guest operating systems. We have therefore avoided all performance optimizations that might compromise portability. By carefully relaxing this constraint, we anticipate being able to improve VMGL's performance. As we will show in Section 4.4, for very demanding applications at high levels of concurrency the total bandwidth between application VMs and the OpenGL stubs becomes the performance bottleneck. A shared-memory rather than network transport implementation could relieve this bottleneck. By implementing this optimization in a way that preserves the external interfaces of VMGL, we could enable VMM-specific and guest-specific code to be introduced with minimal negative impact on portability. The network transport would always remain a fallback for environments without support for a shared-memory transport.

4.4 Evaluation

Our evaluation of VMGL addresses the following questions:

Performance. How does VMGL compare to software rendering alternatives, such as the Mesa OpenGL library [120]? How close does it come to providing the performance observed with unvirtualized graphics acceleration?

Portability. Can VMGL be used with different VMMs? Can VMGL be used with different VM types supported by the same VMM? Can VMGL be used with different guest operating systems?

Suspend and Resume. What is the latency for resuming a suspended OpenGL application? What is the size of an application's OpenGL suspended state? Can we migrate suspended OpenGL applications across GPUs from different vendors?

Sensitivity to Resolution. What is the effect of rendering resolution on VMGL performance?

Sensitivity to Multiple Processors. How sensitive is VMGL to processing power? Can it take advantage of multi-core CPUs?

Scalability. How well does VMGL scale to support multiple VMs performing 3D drawing concurrently? A proposed use for VMs is the deployment of virtual appliances [194]. It is expected that users will run multiple virtual appliances in a single physical platform, with perhaps several appliances doing 3D rendering simultaneously.

4.4.1 Benchmarks

Application                    Release Date
Quake 3                        Dec, 1999
Wolfenstein: Enemy Territory   May, 2003
Unreal Tournament 2004         Mar, 2004
Mplayer                        Jun, 2006

Table 4.1: Application Benchmarks

Table 4.1 summarizes the four benchmarks we use in the evaluation of VMGL. We focus our evaluation on computer games and entertainment as these classes of applications have effectively become the driving force in the development of consumer graphics applications and hardware [153]. The four applications benchmarked are as follows:

• Quake 3: Quake III Arena [89] (Figure 4.2 (a)) was first released in December, 1999. Quake 3 employs an extensive array of OpenGL drawing techniques [203], including shader scripts; volumetric textures, fog and lighting; vertex animation; Gouraud shading; spline-based curved-surfaces, and others. This set of features has enabled Quake 3, despite its relative age, to remain a popular application for benchmarking 3D performance [126, 187]. Quake 3 was open-sourced in 2005.

• Enemy: Wolfenstein Enemy Territory [169] (Figure 4.2 (b)) was released in May of 2003. The game is a third-generation successor to the Quake 3 engine, including enhancements such as skeletal animation and substantially increased texture and scenic detail. Enemy Territory's logic was open-sourced in 2004.

Figure 4.2: Benchmark screenshots – (a) Quake 3 Arena; (b) Enemy Territory; (c) Unreal Tournament 2004; (d) Mplayer.

• Unreal: Unreal Tournament 2004 [54] (Figure 4.2 (c)) has a modern graphics engine [191] that exploits a variety of features such as vertex lighting, projective texturing, sprite or mesh particle systems, distance fog, texture animation and modulation, portal effects, and vertex or static meshes. Like Quake 3, Unreal is also heavily favoured by the industry as a de facto benchmark for 3D graphics performance [13, 187]. Unlike Quake 3 and Enemy, this application is closed source.

• Mplayer: Mplayer [127] (Figure 4.2 (d)) is a popular open source media player available for all major operating systems. It supports a number of different video codecs, and a number of output drivers, including texture-driven OpenGL output.

For the first three benchmarks, we replayed publicly available demos for 68 seconds (Quake 3 [89]), 145 seconds (Enemy [204]), and 121 seconds (Unreal [190]), respectively. For the Mplayer benchmark we replayed the first 121 seconds of a video clip encoded at two different resolutions.

4.4.2 Experimental Setup

We tested VMGL with two virtual machine monitors, Xen 3.0.3 [21] and VMware Workstation 5.5.3 [196]. (We thank VMware for granting permission to publish the results of our evaluation.) All experiments were run on a 2.4 GHz Intel Core2 machine with two single-threaded cores, VT hardware virtualization extensions, and 2 GB of RAM. For most experiments, we employed a Radeon X600 PCI-Express ATI graphics card; for a set of suspend and resume experiments we used an Intel 945G PCI-Express Graphics card. Regardless of the VMM, the machine ran the Fedora Core 5 Linux distribution with the 2.6.16.29 kernel in 32 bit mode (paravirtualization extensions added for use with Xen), and X.Org version 7.0 with the fglrx proprietary ATI driver version 8.29.6, or the DRI driver based on Mesa version 6.4.2 for the Intel card. All virtual machines were configured with the same kernel (modulo paravirtualization extensions for Xen), same distribution, 512 MB of RAM, and no swap. We ran each benchmark in three different configurations:

• Native: an unvirtualized environment with direct access to hardware and native OpenGL drivers. VMGL was not used. This represents the upper bound on achievable performance for our experimental setup.

• Guest + Mesa Software Rendering: a virtualized guest using software rendering provided by the Mesa OpenGL library. No hardware rendering facilities were used. This is the commonly available configuration for current users of 3D applications in virtualized environments. The Unreal benchmark refuses to run in the absence of hardware acceleration.

• Guest + VMGL: a virtualized guest using VMGL to provide 3D hardware acceleration. This configuration is depicted by Figure 4.1.

In each of these configurations, all benchmarks were executed at two different resolutions:

• High Resolution: The resolution was set to 1280x1024 except for Mplayer, which had a resolution of 1280x720 (the closest NTSC aspect ratio).

• Low Resolution: The resolution was set to 640x480 except for Mplayer, which had a resolution of 640x352 (the closest NTSC aspect ratio).

We quantify graphics rendering performance in terms of framerate or Frames per

Second (FPS), a standard metric used for the evaluation of 3D graphics [13, 126, 187].

We also measure VMGL’s resource utilization in terms of CPU load and network usage.

All data points reported throughout the rest of this Section are the average of five runs.

All bar charts have standard deviation error bars.


Figure 4.3: VMGL performance – Average FPS, high resolution.

4.4.3 Performance

Figure 4.3 shows the results from running the benchmarks under three configurations: native, Xen paravirtualized guest with VMGL, and Xen paravirtualized guest with Mesa software rendering. All benchmarks in the figure are run in high resolution mode.

First, we observe that VMGL’s performance is two orders of magnitude better than software rendering: the framerate delivered by Mesa ranges from only 0.4 to 4 FPS. From the user’s perspective, this low framerate renders the applications unusable.

Next, we observe that VMGL’s performance approximates that of the native configuration, with the performance drop ranging from 14% for the Enemy benchmark to virtually no loss for the Mplayer and Unreal benchmarks. In our subjective experience, the user experience delivered by VMGL is indistinguishable from that of the native configuration for all of the benchmarks.

Figure 4.3 reports a global average metric. To further understand VMGL’s performance we need to compare its variations in framerate and peak FPS values against those observed under native execution. Crisp interaction with highly detailed graphics applications not only demands a high framerate, but also a uniform experience without jitter [60]. Figure 4.4 plots a cumulative distribution function for the instantaneous FPS across all five trials on each benchmark. Plots to the right indicate better performance than plots to the left; the more vertical a plot is, the smaller the variability in framerate. We exclude the Mesa results given their very low quality; we also exclude the Mplayer benchmark, as it presents a constant framerate of 25 FPS across the remaining configurations.

Figure 4.4: VMGL performance – FPS variability, high resolution.

VMGL results closely follow the behaviour of their native execution counterparts. The variability in frame rates is consistent with that observed under native execution. Differences in the framerate distribution and peak FPS values with respect to native execution are minimal.


Figure 4.5: VMM portability – High resolution.

4.4.4 VMM Portability

Figure 4.5 shows VMGL’s performance for one VMware Workstation and three Xen configurations: Xen HVM leverages Intel’s VT extensions to run an unmodified Linux kernel as a guest, and emulates network I/O using code derived from the QEMU emulator [23]; Xen PV-on-HVM is similar to Xen HVM, but a loadable kernel module provides the guest with Xen-aware paravirtualized network functionality; VMware Workstation runs an unmodified Linux kernel as the guest OS and uses VMware Tools for proprietary network virtualization; finally, Xen Paravirtual is the same Xen paravirtualized guest configuration as that of the Xen + VMGL bars of Figure 4.3.

As expected, Figure 4.5 shows that the quality of network virtualization is a fundamental factor affecting VMGL’s performance. Without paravirtualized extensions, a Xen HVM presents very low FPS ratings. The PV-on-HVM configuration provides almost identical performance to that of Xen paravirtualized guests. VMware Workstation’s similar use of virtualization-aware drivers on an otherwise unmodified OS also yields an order of magnitude better performance than a pure Xen HVM.

Figure 4.6: Guest OS portability – Quake 3, VMware Workstation guests.

4.4.5 Portability Across Guest Operating Systems

VMGL’s userspace design and its implementation in standard programming languages make it easy to port across operating systems. In particular, we have ported VMGL to

FreeBSD release 6.1 and OpenSolaris 10 release 06/06. The source code logic remained unmodified. All necessary changes had to do with accounting for differences in the OS development environment, such as header inclusion, library linking, and tools used in the build process.

To test our VMGL port for these two operating systems, we configured them as

VMware Workstation guests running the open-source Quake 3 port ioquake3 (Quake 3’s authors did not port the application to OpenSolaris or FreeBSD). Figure 4.6 compares the performance of Mesa software rendering and VMGL accelerated rendering for each

OS, including Linux. While FreeBSD did not perform in general as well as OpenSolaris, in both cases VMGL conserves its notable performance advantage over software rendering. While we did not configure our experimental machine to natively run FreeBSD or OpenSolaris, we are confident VMGL will show a trend similar to that shown with Linux and maintain performance on par with an unvirtualized configuration.

4.4.6 Suspend and Resume

To measure the performance of VMGL’s suspend and resume code, we suspended a guest running the benchmarks at five different and arbitrary points in time. We then resumed the guest and verified successful resumption of the OpenGL application. We measured the size of the OpenGL state necessary to synchronize the GL stub to the current application state, and the time it took to perform the entire resume operation. We did not observe any noticeable effect of the suspend and resume code on the application’s framerate performance. The results of these experiments are displayed in Figure 4.7. This Figure displays results for both application resolutions obtained with Xen paravirtualized guests; similar results were obtained with VMware Workstation guests.

The resume time (Figure 4.7 (b)) is strongly dependent on the size of the suspended

OpenGL state (Figure 4.7 (a)), which can be as large as 70 MB for the Enemy benchmark.

Nevertheless, the latency for reinstating the suspended OpenGL state on a new VMGL stub never exceeded 2.5 seconds. Regardless of the suspend point, the size of Mplayer’s state is always the same, as this state is almost exclusively composed of the texture corresponding to the current frame. Since the frame is twice as big on each dimension, the state is four times larger in high resolution mode than in low resolution mode. Finally, we were surprised to note that the size of Quake 3’s OpenGL state is also invariant with respect to the suspend point. We conjecture that Quake 3 pre-allocates the entire

OpenGL state for a new environment before allowing interaction.

We performed a second set of experiments in which we suspended and resumed a guest across two different hosts: our experimental machine and a similar physical host using an Intel 945G GPU. The tests completed successfully with similar latency and state size results. In VMGL, we had to disable five extensions provided by the ATI card but not by Intel’s (including GL_ARB_occlusion_query, for example), and four extensions available in the Intel card but not in ATI’s (including GL_NV_texture_rectangle).

Figure 4.7: Suspend and resume – Xen Paravirtual guest: (a) size of suspended OpenGL state (MBs), (b) resume time (secs).

Figure 4.8: Benchmarks in low resolution mode.

4.4.7 Sensitivity to Screen Resolution

OpenGL drawing primitives use a normalized coordinate system, and rely on the hardware capabilities of the graphics card to scale the geometry to match the current screen resolution. This implies that the higher the resolution, the busier the GPU and therefore the less noticeable the VMGL command marshaling overhead becomes. The slightly counter-intuitive consequence is that it is preferable to run applications under VMGL at higher resolutions, something which is desirable anyway.

Figure 4.8 shows the results from running the benchmarks at low resolution (640x480, Mplayer runs at 640x352) for three configurations: Xen with Mesa, Xen with VMGL, and native. The first three benchmarks generate the same stream of OpenGL commands as in the high resolution experiments (Figure 4.3), and rely on automatic scaling. Mplayer is different, as each frame is generated by synthesizing an appropriately sized texture from the input video data, and therefore it does not involve any hardware scaling. The increased pressure on the VMGL transport is evident for the Enemy benchmark, presenting a performance drop with respect to the unvirtualized baseline to approximately half the rate of frames per second. However, for the remaining three benchmarks the performance of Xen+VMGL closely matches that of the native configuration. Software rendering is still unable to provide reasonable performance, perhaps with the exception of the Mplayer benchmark achieving 17.6 average FPS due to the smaller sized frames.

For the remainder of this evaluation, we concentrate on low-resolution experiments as they bias the results against VMGL.

4.4.8 Sensitivity to Multi-Core Processing

To determine the benefits that VMGL derives from multi-core processing, we also executed all of our application benchmarks after disabling one of the two cores in our experimental machine. These results, presented in Figure 4.9, show a performance drop for Enemy and Unreal, the more modern applications. There is no significant difference for the older applications.

We analyze the benefits arising from a multi-core setup using Unreal as an example. Figures 4.10(a) to 4.10(c) show the differences in resource usage for the single and multi-core cases. The increased CPU utilization possible with dual-core parallelism (Figure 4.10(b)) results in a higher rate of OpenGL commands pushed per second through the VMGL transport (Figure 4.10(c)). The consequence is a higher framerate in the dual-core case (Figure 4.10(a)). Unreal’s behaviour seems to be work-conserving: rather than dropping frames at a low framerate, it takes longer to complete the demo.

The presence of two cores leads to increased resource utilization for a number of reasons. First, multiple cores allow concurrent execution for the two networking stacks: in the guest where the application executes and in the host where the viewer resides. It also allows for parallelizing the marshaling and unmarshaling cycles of OpenGL commands by VMGL. The availability of two cores also ameliorates the VMM’s overhead of constantly needing to context switch between the two VMs, and to switch to the hypervisor to handle the interrupts generated by the bridged networking setup, a previously documented overhead [119, 118].

Figure 4.9: CPU sensitivity – Xen paravirtual + VMGL, low resolution.

4.4.9 Concurrent Guests

To examine VMGL’s ability to support concurrent guests, we compare the performance of two instances of an application executing concurrently in an unvirtualized configuration, to the performance of two instances executing in two separate Xen paravirtual guests.

Figure 4.11 (a) presents the average per-instance FPS results for the concurrent execution of two instances, compared to the average FPS results for a single instance (taken from Figure 4.8). Figure 4.11 (b) normalizes the concurrent execution results against the single-instance results (i.e. Xen + VMGL, concurrent divided by Xen + VMGL, single). The purpose of normalization is to observe the “natural” scalability inherent to the application: simultaneous instances may compete for CPU and memory resources. The additional drop in normalized FPS for the VMGL configurations reflects the overheads of GL marshaling and context-switching VMs.

Figure 4.10: Unreal instantaneous FPS, and CPU and network usage on dual- vs. single-core configurations, low resolution: (a) instantaneous frames per second, (b) CPU usage, (c) network usage. CPU utilization includes all components depicted in Figure 4.1. With a single core, the benchmark takes longer to complete due to the reduced framerate.

The Mplayer benchmark, more representative of a multiple virtual appliance scenario, presents excellent scalability results. We observe decreasing VMGL scalability as the application becomes more heavyweight and places a larger demand on the GL transport: 10% additional overhead for Quake 3, 20% for Enemy, and 43% for Unreal. Figure 4.10(c) indicates that the bandwidth demands of a single instance of Unreal can peak at almost 2 Gbps. Extreme configurations with multiple high-end applications rendering concurrently may impose an aggregate bandwidth demand on VMGL of several Gbps. Under such circumstances, a VMM-specific shared memory transport may be preferable.

Figure 4.11: Concurrent guests with Xen paravirtual: FPS for running two simultaneous instances of each benchmark at low resolution – (a) raw per-instance FPS, (b) normalized FPS.

4.5 Work Related to VMGL

The VMGL project addresses the intersection of virtualization and specialized hardware acceleration, predating most of the work done in the area. In this Section we address work that surfaced between the publication of VMGL and the writing of this thesis.

Several vendors have adopted API virtualization as a method to provide hardware graphics acceleration to VMs. Parallels virtualizes Direct3D [135], while VirtualBox has recently added support for OpenGL virtualization [192]. The best known work in this

field is VMware’s virtualization of Direct3D through a hybrid approach [47]. VMware provides a virtual graphics card with an API that closely matches that of Direct3D.

VMware’s Direct3D library is thus a simple pass-through mechanism, while the bulk of the virtualization work remains in the virtual card internals. The promise of this architecture is in having better control over the memory buffers and DMA areas used by the actual GPU hardware. This could be understood as a weakness of VMGL for applications that aggressively use shader algorithms and rich vertex and texture buffers.

Concurrently with VMGL, the Blink [78] system for the Xen hypervisor strived to provide virtualization of OpenGL primitives for VMs. Blink multiplexes graphical content onto a virtual GPU, with an emphasis on safety in the face of multiple untrusted clients. Blink is VMM-specific as it uses a shared memory transport and does not support suspending or migrating a VM.

In [167], a hybrid approach similar to VMware’s is proposed, using the nascent Gallium3D [72] architecture. Gallium abstracts a common set of operations that most GPUs are bound to support. In this way, 3D drivers can be architected as a common set of operations with a set of device-specific backends. While the jury is still out on the ability of such an architecture to survive in a space dominated by vendor lock-in, the work offers the potential to substantially simplify the development of multiple virtualization strategies for different backends and frontends. Instead of developing each component in the Cartesian product space of guest OS vs. backing device, frontends for each OS can be produced independently of the backends for each device. At this point we must note that VMGL transparently supports any backing device, and that the porting effort to different guest OSs has been trivial – for X11-based systems, admittedly.

An important area of future work for VMGL is the possibility to provide access to GPGPU models of computation, in which a GPU is leveraged as a heterogeneous computing element providing a large number of parallel arithmetic pipelines. Although unexplored, VMGL could potentially support this model, as GPGPU code is synthesized into OpenGL shader programs. However, the gVIM work [76] transposes the idea of API remoting to the CUDA environment, the API provided by NVIDIA to develop GPGPU codes. This work is strikingly similar to VMGL in that it relies on API remoting to leverage GPU hardware. One important problem for GPGPU computation is the management of memory buffers for the card to efficiently consume data and produce results, avoiding multiple unnecessary stages of copying between user space, kernel space, and the internal GPU buffers. gVIM does not address this problem in earnest, which could become an important bottleneck for generalized high-performance computation (as well as for speedy high-detail graphics rendering).

4.6 Future Directions and Conclusion

GPUs are critical to high-quality visualization in many application domains. Running such applications in VMM environments is difficult for a number of reasons, all relating to the fact that the GPU hardware interface is proprietary rather than standardized. VMGL virtualizes the OpenGL software interface, recognizing its widespread use in graphics-intensive applications. By virtualizing at the API level, VMGL is able to support multiple guest OSs and to provide suspend and resume capabilities across GPUs from different vendors. Our experiments confirm excellent rendering performance with VMGL, coming within 14% or better of native hardware accelerated performance measured in frames per second. This is two orders of magnitude better than software rendering, which is the commonly available alternative today for graphics-intensive applications in virtualized environments.

Our results also show that the resource demands of VMGL align well with the emerging trend of multi-core processors. In other words, there is natural and easy-to-exploit parallelism in the VMGL architecture. Our work thus reveals an opportunity for three emerging trends (virtualization, growing use of GPUs by applications, and multi-core processing) to evolve in a mutually supportive way.

VMGL is an open source software project [106] that has been downloaded over ten thousand times. Its widespread adoption is due to its timeliness, its ease of use, and the coverage it provides in terms of supporting a very large and useful set of OpenGL primitives, and supporting multiple VMMs.

As outlined in Sections 4.1 and 4.3.5, VMGL presents ample space for future work, partly due to some of its limitations. First, VMGL does not straightforwardly support non-X11 guests such as Windows or MacOS. As shown in Section 4.5, graphics API virtualization for these operating systems has been made available recently, following the same principles employed by VMGL. Second, VMGL’s goal of supporting multiple VMMs imposed the adoption of a VMM-agnostic transport such as the network-based one currently in use. Performance of the system would be greatly improved by adding VMM-specific shared memory transports. A shared memory transport would eliminate the data copying and processing burden of running two parallel TCP/IP stacks, one in the host and another one in the guest. Third, we believe VMGL can be extremely useful to high-performance computing programmers wishing to leverage the power of graphical processors in a GPGPU setting coupled with virtualization. Finally, the suitability of VMGL as a thin client protocol may be of interest in certain high-bandwidth, low-latency scenarios such as an enterprise LAN. With applications aggressively adopting 3D-based GUIs, a thin client protocol expressing graphics transformations as 3D API commands is likely to perform much better than current protocols, which essentially screen-scrape and transmit pixel maps.

VMGL allows applications packaged as x86 VMs to take advantage of GPU hardware in a manner that is VMM and vendor independent. This property makes VMGL a key enabler to the vision in this thesis, by allowing modern graphics-intensive applications to seamlessly execute on rich edge clients that serve primarily as displays and local caches of state. VMGL complements the capabilities of Snowbird in allowing applications to move between the core and edges of the network without sacrificing the crisp quality of graphics-intensive end-user interaction.

Chapter 5

Scale Flexibility with SnowFlock

In this chapter we describe how we tackle the problem of scale flexibility in clouds of virtualized machines with the SnowFlock project [105, 104].

5.1 Introduction

Cloud computing is transforming the computing landscape by shifting the hardware and staffing costs of managing a computational center to third parties such as Yahoo! [206] or

Amazon [10]. Small organizations and individuals are now able to deploy world-scale services: all they need to pay is the marginal cost of actual resource usage. Virtual machine

(VM) technology is widely adopted as an enabler of cloud computing. Virtualization provides many benefits, including security, performance isolation, ease of management, and flexibility of running in a user-customized environment.

A major advantage of cloud computing is the ability to use a variable number of physical machines and VM instances depending on the needs of the problem. For example a task may need only a single CPU during some phases of execution but may be capable of leveraging hundreds of CPUs at other times. While current cloud APIs allow for the instantiation of new VMs, their lack of agility fails to provide users with the full potential of the cloud model. Instantiating new VMs is a slow operation (typically

taking “minutes” [10]), and the new VMs originate either as fresh boots or replicas of a template VM, unaware of the current application state. This forces cloud users into employing ad hoc solutions that require considerable developer effort to explicitly propagate application state and waste resources by pre-provisioning worker VMs that remain mostly idle. Moreover, idle VMs are likely to be consolidated and swapped out [171, 205], incurring costly migration delays before they can be used.

We introduce VM fork, an abstraction that simplifies development and deployment of cloud applications that dynamically change their execution footprint. VM fork allows for the rapid (< 1 second) instantiation of stateful computing elements in a cloud environment. While VM fork is similar in spirit to the familiar UNIX process fork, in that the child VMs receive a copy of all of the state generated by the parent VM prior to forking, it is different in three fundamental ways. First, our VM fork primitive allows for the forked copies to be instantiated on a set of different physical machines, enabling the task to take advantage of large compute clusters. Second, we have made our primitive parallel, enabling the creation of multiple child VMs with a single call.

Finally, our VM fork replicates all of the processes and threads of the originating VM.

This enables effective replication of multiple cooperating processes, e.g. a customized

LAMP (Linux/Apache/MySql/Php) stack. Previous work [198], while capable of replicating VMs, is limited to clones with limited functionality and a short life span, all executing in the same host.

VM fork enables the trivial implementation of several useful and well-known patterns that are based on stateful replication, e.g., inheriting initialized data structures when spawning new workers. Pseudocode for four of these is illustrated in Figure 5.1: sandboxing of untrusted code, enabling parallel computation, instantiating new worker nodes to handle increased load (e.g. due to flash crowds), and opportunistically utilizing unused cycles with short tasks. All four patterns not only exploit fork’s ability to create stateful workers, but also exploit its ability to instantaneously create workers.

(a) Sandboxing
    state = trusted_code()
    ID = VM_fork(1)
    if ID.isChild():
        untrusted_code(state)
        VM_exit()
    else:
        VM_wait(ID)

(b) Parallel Computation
    ID = VM_fork(N)
    if ID.isChild():
        parallel_work(data[ID])
        VM_exit()
    else:
        VM_wait(ID)

(c) Load Handling
    while(1):
        if load.isHigh():
            ID = VM_fork(1)
            if ID.isChild():
                while(1):
                    accept_work()
        elif load.isLow():
            VM_kill(randomID)

(d) Opportunistic Job
    while(1):
        N = available_slots()
        ID = VM_fork(N)
        if ID.isChild():
            work_a_little(data[ID])
            VM_exit()
        VM_wait(ID)

Figure 5.1: Four programming patterns based on fork’s stateful cloning. Forked VMs use data structures initialized by the parent, such as data in case (b). Note the implicit fork semantics of instantaneous clone creation.

SnowFlock, our implementation of the VM fork abstraction, provides swift parallel stateful VM cloning with little runtime overhead and frugal consumption of cloud I/O resources, leading to good scalability. SnowFlock takes advantage of several key techniques. First, SnowFlock utilizes lazy state replication to minimize the amount of state propagated to the child VMs. Lazy state replication allows for extremely fast instantiation of clones by initially copying the minimal necessary VM data, and transmitting only the fraction of the parent’s state that clones actually need. Second, a set of avoidance heuristics eliminate substantial superfluous memory transfers for the common case of clones allocating new private state. Finally, exploiting the likelihood of child VMs to execute very similar code paths and access common data structures, we use a multicast distribution technique for VM state that provides scalability and prefetching.

We evaluated SnowFlock by focusing on a demanding instance of Figure 5.1 (b): interactive parallel computation, in which a VM forks multiple workers in order to carry out a short-lived, computationally-intensive parallel job. We have conducted experiments with applications from bioinformatics, quantitative finance, rendering, and parallel compilation. These applications are deployed as Internet services [159, 55] that leverage mass parallelism to provide interactive (tens of seconds) response times to complex queries such as finding candidates similar to a gene, predicting the outcome of stock options, rendering an animation, etc. On experiments conducted with 128 processors, SnowFlock achieves speedups coming within 7% or better of optimal execution, and offers sub-second

VM fork irrespective of the number of clones. SnowFlock is an order of magnitude faster and sends two orders of magnitude less state than VM fork based on suspend/resume or migration.

5.2 VM Fork: Usage and Deployment Model

The VM fork abstraction lets an application take advantage of cloud resources by forking multiple copies of its VM, which then execute independently on different physical hosts.

VM fork preserves the isolation and ease of software development associated with VMs, while greatly reducing the performance overhead of creating a collection of identical VMs on a number of physical machines.

The semantics of VM fork are similar to those of the familiar process fork: a parent

VM issues a fork call which creates a number of clones, or child VMs. Each of the forked

VMs proceeds with an identical view of the system, save for a unique identifier (vmid) which allows them to be distinguished from one another and from the parent. However, each forked VM has its own independent copy of the operating system and virtual disk, and state updates are not propagated between VMs.

A key feature of our usage model is the ephemeral nature of children. Forked VMs are transient entities whose memory image and virtual disk are discarded once they exit.

Any application-specific state or values they generate (e.g., a result of computation on a portion of a large dataset) must be explicitly communicated to the parent VM, for example by message passing or via a distributed file system.

The semantics of VM fork include integration with a dedicated, isolated virtual network connecting child VMs with their parent. Upon VM fork, each child is configured with a new IP address based on its vmid, and it is placed on the parent’s virtual network.

Child VMs cannot communicate with hosts outside this virtual network. Two aspects of our design deserve further comment. First, the user must be conscious of the IP reconfiguration semantics; for instance, network shares must be (re)mounted after cloning.

Second, we provide a NAT layer to allow the clones to connect to certain external IP addresses. Our NAT performs firewalling and throttling, and only allows external inbound connections to the parent VM. This is useful to implement, for example, a web-based frontend, or to allow access to a dataset provided by another party.

VM fork has to be used with care as it replicates all the processes and threads of the parent VM. Conflicts may arise if multiple processes within the same VM simultaneously invoke VM forking. For instance, forking a VM that contains a full desktop distribution with multiple productivity applications running concurrently is certainly not the intended use of VM fork. We envision instead that VM fork will be used in VMs that have been carefully customized to run a single application or perform a specific task, such as serving a web page. The application has to be cognizant of the VM fork semantics, e.g., only the

“main” process calls VM fork in a multi-process application.

VM fork is concerned with implementing a fast and scalable cloning interface for

VMs. How those cloned VMs are mapped to underlying physical resources, scheduled, load balanced, etc., is up to resource management software. There is an abundance of commercial resource management software that is particularly adept at following work plans, observing user quotas, enforcing daily schedules, accommodating for downtimes, etc. All of these activities exceed the scope of VM fork, which treats the resource management software as an independent and modular entity that indicates where VM clones should be spawned.

We also note that VM fork is not a parallel programming paradigm. VM fork is meant to efficiently expand a single VM into a set or cluster of identical VMs. This is desirable because it is easy to reason about identical VMs, and it is straightforward to adapt the notion of an instantaneous cluster of identical VMs to multiple frameworks that require scaling of computation. VM fork thus easily accommodates frameworks that provide parallel programming facilities, or sandboxing, or scaling of web servers, to name a few examples. However, VM fork does not provide in itself facilities for parallel programming, such as shared memory regions, locks, monitors, semaphores, etc. There is great potential for future work in exploring the synergy between VM fork and parallel programming libraries. We present some of this work, which is outside of the scope of this thesis, in Section 5.4.2.

Finally, VM fork is not a replacement for any storage strategy. There is a great wealth of knowledge and research effort invested into how to best distribute data to compute nodes for data-intensive parallel applications. As a result of the high variability in application data patterns and deployment scenarios, many high performance storage solutions exist. VM fork does not intend to supersede or replace any of this work. VM fork is concerned with automatically and efficiently creating multiple replicas of a VM, understanding the VM as the unit that encapsulates the code and execution. We thus envision a deployment model in which datasets are not copied into the disk of a VM but rather mounted externally. VM fork will thus clone VMs to execute on data which is served by a proper big-data storage mechanism: PVFS [31], pNFS [143], Hadoop

FS [182], etc. We further reflect on the interesting challenges arising from this separation of responsibility in Sections 5.4.7 and 5.6.

5.3 Challenges and Design Rationale

Performance is the greatest challenge to realizing the full potential of the VM fork paradigm. VM fork must swiftly replicate the state of a VM to many hosts simultaneously. This is a heavyweight operation, as VM instances can easily occupy GBs of RAM. While one could implement VM fork using existing VM suspend/resume functionality, the wholesale copying of a VM to multiple hosts is far too taxing, and decreases overall system scalability by clogging the network with gigabytes of data.

Figure 5.2: Latency for forking a 1GB VM by suspending and distributing the image over NFS and multicast.

Figure 5.2 illustrates this by plotting the cost of suspending and resuming a 1GB VM to an increasing number of hosts over NFS (see Section 5.5 for details on the testbed).

As expected, there is a direct relationship between I/O involved and fork latency, with latency growing to the order of hundreds of seconds. Moreover, contention caused by the simultaneous requests by all children turns the source host into a hot spot. Despite shorter downtime, live migration [35, 195], a popular mechanism for consolidating VMs in clouds [171, 205], is fundamentally the same algorithm plus extra rounds of copying, thus taking longer to replicate VMs.

A second approximation to solving the problem of VM fork latency uses our multicast library (see Section 5.4.6) to leverage parallelism in the network hardware. Multicast delivers state simultaneously to all hosts. Scalability in Figure 5.2 is vastly improved.

However, overhead is still in the range of minutes and several gigabytes of VM state are still transmitted through the network fabric. To move beyond this, we must substantially reduce the total amount of VM state pushed over the network.

Our fast VM fork implementation is based on the following four insights: (i) it is possible to start executing a child VM on a remote site by initially replicating only minimal state; (ii) children will typically access only a fraction of the original memory image of the parent; (iii) it is common for children to allocate memory after forking; and (iv) children often execute similar code and access common data structures.

The first two insights led to the design of VM Descriptors, a lightweight mechanism which instantiates a new forked VM with only the critical metadata needed to start execution on a remote site, and Memory-On-Demand, a mechanism whereby clones lazily fetch portions of VM state over the network as it is accessed. Our experience is that it is possible to start a child VM by shipping only 0.1% of the state of the parent, and that children tend to only require a fraction of the original memory image of the parent.

Further, it is common for children to allocate memory after forking, e.g., to read portions of a remote dataset or allocate local storage. This leads to fetching of pages from the parent that will be immediately overwritten. We observe that by augmenting the guest

OS with avoidance heuristics, memory allocation can be handled locally by avoiding fetching pages that will be immediately recycled, thus exploiting our third insight. We show in Section 5.5 that this optimization can reduce communication drastically to a mere 40 MB for application footprints of 1 GB (∼4%). Although these observations are based on our work with parallel workloads, they are likely to hold in other domains where a parent node spawns children as workers that execute limited tasks, e.g., load handling in web services.

Compared to ballooning [199], memory-on-demand is a non-intrusive approach that reduces state transfer without altering the behaviour of the guest OS. Ballooning a VM down to the easily manageable footprint that our design achieves would trigger swapping and lead to abrupt termination of processes. Another non-intrusive approach for minimizing memory usage is copy-on-write, used by Potemkin [198]. However, copy-on-write limits Potemkin to cloning VMs within the same host, whereas we fork VMs across physical hosts. Further, Potemkin does not provide runtime stateful cloning, since all new VMs are copies of a frozen template.

To take advantage of the fourth insight of high correlation across memory accesses of the children, and to prevent the parent from becoming a hot-spot, we multicast replies to memory page requests. Multicast provides scalability and prefetching: it may service a page request from any number of children with a single response, simultaneously prefetching the page for all children that did not yet request it. Our design is based on the observation that the multicast protocol does not need to provide atomicity, ordering guarantees, or reliable delivery to prefetching receivers in order to be effective. Children operate independently and individually ensure delivery of needed pages; a single child waiting for a page does not prevent others from making progress.

Lazy state replication and multicast are implemented within the Virtual Machine

Monitor (VMM) in a manner transparent to the guest OS. Our avoidance heuristics improve performance by adding VM-fork awareness to the guest. Uncooperative guests can still use VM fork, with reduced efficiency depending on application memory access patterns.

5.4 SnowFlock Implementation

SnowFlock is our implementation of the VM fork primitive. SnowFlock is implemented as a combination of modifications to the Xen 3.0.3 VMM and daemons that run in

Xen’s domain 0 (the administrative and privileged VM). The SnowFlock daemons form a distributed system that controls the life-cycle of VMs, by orchestrating their cloning and deallocation. SnowFlock defers policy decisions, such as resource accounting and the allocation of VMs to physical hosts, to suitable cluster management software (e.g. Usher [117] or Moab [37]) via a plug-in architecture. SnowFlock currently supports allocation management with Platform EGO [142] and Sun Grid Engine [64]. Throughout this Chapter we use a simple internal resource manager which tracks memory and CPU allocations on each physical host.

Figure 5.3: SnowFlock VM Replication Architecture. Condensed VM descriptors are distributed to cluster hosts to spawn VM replicas. Memtap populates the VM replica state on demand, using multicast distribution. Avoidance heuristics reduce the number of necessary fetches.

SnowFlock uses four mechanisms to fork a VM, as illustrated in Figure 5.3. First, the parent VM is temporarily suspended to produce a VM descriptor: a small file that contains VM metadata and guest kernel memory management data. The VM descriptor is then distributed to other physical hosts to spawn new VMs; the entire operation is complete in sub-second time. Second, our memory-on-demand mechanism, memtap, lazily fetches additional VM memory state as execution proceeds. Third, the avoidance heuristics leverage the cooperation of the guest kernel to substantially reduce the amount of memory that needs to be fetched on demand. Finally, our multicast distribution system mcdist is used to deliver VM state simultaneously and efficiently, as well as to provide implicit pre-fetching.

• sf_request_ticket(n, hierarchical): Requests an allocation for n clones. If hierarchical is true, process fork will follow VM fork, to occupy the cores of SMP cloned VMs. Returns a ticket describing an allocation for m ≤ n clones.
• sf_clone(ticket): Clones, using the allocation in ticket. Returns the clone ID, 0 ≤ ID ≤ m.
• sf_exit(): For children (1 ≤ ID ≤ m), terminates the child.
• sf_join(ticket): For the parent (ID = 0), blocks until all children in the ticket reach their sf_exit call. At that point all children are terminated and the ticket is discarded.
• sf_kill(ticket): Parent only, discards the ticket and immediately kills all associated children.

Table 5.1: The SnowFlock VM Fork API

The next Section describes the SnowFlock API. We then introduce work that makes

users oblivious to this API by hiding it within MPI, a well-known parallel computing

middleware. While this work is not strictly part of this thesis, it is highly related to

SnowFlock and introduces a way of achieving completely transparent and efficient on-the-fly application footprint scaling. We continue by describing in detail each of the

four SnowFlock mechanisms. The impact of each of these mechanisms is measured in

detail through microbenchmarks in Section 5.5.3. We finish this Section by discussing

the specifics of the virtual I/O devices of a SnowFlock VM, namely the virtual disk and

network isolation implementations.

5.4.1 API

Table 5.1 describes the SnowFlock API. VM fork in SnowFlock consists of two stages.

First, the application uses sf_request_ticket to place a reservation for the desired number of clones. Such a reservation is obtained by interfacing with the resource management software, which identifies a suitable set of machines on which the owner of the VM can run clones. To optimize for common use cases in SMP hardware, VM fork can be followed by process replication: the set of cloned VMs span multiple hosts, while the processes within each VM span the underlying physical cores. This behaviour is optionally available if the hierarchical flag is set. Due to user quotas, current load, and

other policies, the cluster management system may allocate fewer VMs than requested.

In this case the application has the option to re-balance the computation to account for

the smaller allocation. In the second stage, we fork the VM across the hosts provided

by the cluster management system with the sf_clone call. When a child VM finishes its part of the computation, it executes an sf_exit operation which terminates the clone. A parent VM can wait for its children to terminate with sf_join, or force their termination with sf_kill.

The API calls from Table 5.1 are available to applications via a SnowFlock client

library, with C and Python bindings. The client library marshals API calls and communicates them to the SnowFlock daemon running on domain 0 over the XenStore, a shared memory inter-VM messaging interface.
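As an illustration of how an application might drive this API, the following sketch recasts the parallel computation pattern of Figure 5.1 (b) on top of the Python bindings. The module name snowflock, the exact names of the binding calls, and the helper routines are hypothetical stand-ins introduced only for illustration; they are not the literal interface of the client library.

    # Hypothetical sketch of the Figure 5.1 (b) pattern over the Python bindings.
    # The module name `snowflock` and the call names below are assumptions.
    import snowflock as sf

    def parallel_job(data, requested_clones=32):
        # Stage 1: reserve an allocation; the ticket may grant m <= requested_clones.
        ticket = sf.request_ticket(requested_clones, hierarchical=True)
        # Stage 2: fork the VM across the hosts granted by the ticket.
        clone_id = sf.clone(ticket)
        if clone_id == 0:
            # Parent: block until every child has called exit, then reap the ticket.
            sf.join(ticket)
            return gather_results()
        else:
            # Child: work on the slice selected by its unique clone ID, then exit.
            process_slice(data[clone_id])
            publish_result(clone_id)   # e.g. message passing or a distributed FS
            sf.exit()

    def process_slice(chunk):
        pass                           # application-specific computation

    def publish_result(clone_id):
        pass                           # results must be explicitly communicated (Section 5.2)

    def gather_results():
        pass                           # collect whatever the children published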

5.4.2 Simplifying Adoption

While the SnowFlock API is simple and flexible, it nonetheless demands modification of

existing code bases. Our SnowFlock-friendly implementation of the widely used Message

Passing Interface (MPI) library allows a vast corpus of unmodified parallel applications

to use SnowFlock’s capabilities. Based on mpich2 [16], our implementation replaces the

task-launching subsystem of the library by one which invokes sf_clone to create the desired number of clones. Appropriately parameterized worker processes are started on each clone. Each worker uses unmodified MPI routines from then on, until the point of application termination, when the VMs are joined.
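The gist of that replacement can be sketched as follows; the snowflock module and the way the worker binary is started are again hypothetical stand-ins, shown only to convey where sf_clone slots into the launcher rather than the actual mpich2 modification.

    # Hypothetical sketch of an MPI-style launcher that clones VMs instead of
    # starting remote processes. All names here are assumptions.
    import os
    import snowflock as sf

    def launch(worker_binary, nprocs):
        ticket = sf.request_ticket(nprocs, hierarchical=True)
        rank = sf.clone(ticket)              # 0 in the parent, 1..m in the children
        if rank > 0:
            # Child VM: exec an appropriately parameterized, unmodified MPI worker.
            os.execv(worker_binary, [worker_binary, "--rank", str(rank)])
        # Parent: block until the job finishes, at which point the VMs are joined.
        sf.join(ticket)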

The end result of this work is providing full transparency while still realizing the benefits of cloud computing in general, and SnowFlock in particular. The user operates a single VM within which she has complete control. She uses completely unmodified applications (MPI-based ones in this case), and is never burdened with administration or any other logistics related to maintaining additional machines. However, the user can leverage, instantaneously and with great efficiency, additional hardware resources to dynamically increase the application’s footprint. This is achieved by the user leveraging her unmodified applications, and never becoming directly involved with the SnowFlock API.

More details of this work can be found in our HPCVirt paper [137]. Future work plans include performing a similar adaptation of the MapReduce toolkit [42].

5.4.3 VM Descriptors

We observe that a VM suspend and resume cycle can be distilled to the minimal operations required to begin execution of a VM replica on a separate physical host. We thus modify the standard Xen suspend and resume process to yield VM Descriptors, condensed VM images that allow for swift clone spawning.

Construction of a VM descriptor starts by spawning a thread in the VM kernel that quiesces its I/O devices, deactivates all but one of the virtual processors (VCPUs), and issues a hypercall suspending the VM from execution. When the hypercall succeeds, a privileged process in domain 0 maps the suspended VM memory to populate the descriptor. Unlike a regular Xen suspended VM image, the descriptor only contains metadata describing the VM and its virtual devices, a few memory pages shared between the VM and the Xen hypervisor and control tools, the registers of the remaining VCPU, and the contents of the guest kernel memory management data structures.

The most important memory management structures are the page tables of the VM, which make up the bulk of a VM descriptor. In the x86 architecture each process has a separate page table, although there is a high degree of inter-process page table sharing, particularly for kernel code and data. The cumulative size of a VM descriptor is thus loosely dependent on the number of processes the VM is executing. Additionally, a VM descriptor preserves the Global Descriptor Tables (GDT). These per-processor tables are required by the x86 segmentation hardware, and Xen relies on them to establish privilege separation between the hypervisor and the kernel of a VM.

A page table entry in a Xen paravirtualized VM contains a virtual to “machine” address translation. In Xen parlance, the machine address space is the true physical address space of the host machine, while physical frame numbers refer to the VM’s notion of its own contiguous physical memory. A VM keeps an internal physical-to-machine table describing this additional level of indirection, and Xen maintains the reverse machine-to-physical table. When saving page tables to a descriptor, all the valid entries must be made host-independent, i.e. converted to VM-specific physical frame numbers. Certain values in the VCPU registers and in the pages the VM shares with Xen need to be translated as well.

The resulting descriptor is multicast to multiple physical hosts using the mcdist library we describe in Section 5.4.6, and used to spawn a VM replica on each host. The metadata is used to allocate a VM with the appropriate virtual devices and memory footprint. All state saved in the descriptor is loaded: pages shared with Xen, segment descriptors, page tables, and VCPU registers. Physical addresses in page table entries are translated to use the physical-to-machine mapping of the new host. The VM replica resumes execution by returning from its suspend hypercall, and undoing the actions of its suspend thread: enabling the extra VCPUs and reconnecting its virtual I/O devices to the new frontends.

We evaluate the VM descriptor mechanism in Section 5.5.3. In summary, the VM descriptors can be used to suspend a VM and resume a set of 32 replicas in less than a second, with an average descriptor size of roughly one MB for a VM with one GB of

RAM.

5.4.4 Memory On Demand

Immediately after being instantiated from a descriptor, the VM will find it is missing state needed to proceed. In fact, the code page containing the very first instruction the VM tries to execute upon resume will be missing. SnowFlock’s memory on demand subsystem, memtap, handles this situation by lazily populating the VM replica’s memory image with state fetched from the originating host, where an immutable copy of the VM’s memory from the time of cloning is kept.

Memtap is a user-space process attached to the VM replica that communicates with the hypervisor via a shared memory page and a set of event channels (akin to software interrupt lines), one per VCPU. The hypervisor detects when a missing page will be accessed for the first time by a VCPU, pauses that VCPU and notifies the memtap process with the corresponding event channel. Memtap maps the missing VM page, fetches its contents, and notifies the hypervisor to unpause the VCPU.

To allow the hypervisor to trap memory accesses to pages that have not yet been fetched, we leverage Xen shadow page tables. In shadow page table mode, the x86 register that points to the base of the current page table is replaced by a pointer to an initially empty page table. The shadow page table is filled on demand as faults on its empty entries occur, by copying entries from the real page table. No additional translations or modifications are performed to the copied page table entries; this is commonly referred to as direct paravirtual shadow mode. Shadow page table faults thus indicate that a page of memory is about to be accessed. At this point the hypervisor checks if this is the first access to a page of memory that has not yet been fetched, and, if so, notifies memtap. We also trap hypercalls by the VM kernel requesting explicit page table modifications. Paravirtual kernels have no other means of modifying their own page tables, since otherwise they would be able to address arbitrary memory outside of their sandbox.

On the parent VM, memtap implements a Copy-on-Write (CoW) policy to serve the memory image to clone VMs. To preserve a copy of the memory image at the time of cloning, while still allowing the parent VM to execute, we use shadow page tables in log-dirty mode. All parent VM memory write attempts are trapped by disabling the writable bit on shadow page table entries. Upon a write fault, the hypervisor duplicates the page and patches the mapping of the memtap server process to point to the duplicate. The parent VM is then allowed to continue execution. By virtue of this CoW mechanism used to retain a version of memory as expected by the clones, the total memory accounted to the parent VM grows by the number of pages duplicated. At worst, the VM’s memory footprint will be X times larger, X being the number of clone generations simultaneously existing. In general, this number will be much lower (a similar behaviour was studied in more depth by the Potemkin group [198]).

Memory-on-demand is supported by a single data structure, a bitmap indicating the presence of the VM’s memory pages. The bitmap is indexed by physical frame number in the contiguous address space private to a VM, and is initialized by the VM descriptor resume process by setting all bits corresponding to pages constructed from the VM descriptor. The Xen hypervisor reads the bitmap to decide if it needs to alert memtap. When receiving a new page, the memtap process sets the corresponding bit.
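The following toy model illustrates the role the bitmap plays. It is a simplified Python sketch, not the actual hypervisor and memtap code, and all helper names and sizes are invented for the example.

    # Toy model of the presence bitmap behind memory-on-demand (assumptions only).
    NUM_FRAMES = 8                        # a real VM has hundreds of thousands

    present = [False] * NUM_FRAMES        # indexed by VM-physical frame number
    memory = [None] * NUM_FRAMES          # stand-in for the replica's memory image

    def on_first_touch(pfn):
        # Hypervisor side: a shadow page table fault on an absent page means the
        # faulting VCPU must be paused and memtap alerted over its event channel.
        if not present[pfn]:
            deliver_page(pfn)

    def deliver_page(pfn):
        # Memtap side: fetch the page from the immutable parent copy, map it,
        # and flip the bit so later accesses proceed without any fetch.
        memory[pfn] = "contents of frame %d" % pfn
        present[pfn] = True

    # Frames 0 and 1 were constructed from the VM descriptor at resume time.
    present[0] = present[1] = True
    on_first_touch(5)                     # triggers a fetch
    on_first_touch(0)                     # already present: no fetch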

The guest VM also uses the bitmap when avoidance heuristics are enabled. We will describe these in the next Section.

Certain paravirtual operations need the guest kernel to be aware of the memory-on-demand subsystem. When interacting with virtual I/O devices, the guest kernel hands page grants to domain 0. A grant authorizes domain 0 to behave as a DMA unit, by performing direct I/O on VM pages and bypassing the VM’s page tables. To prevent domain 0 from reading inconsistent memory contents, we simply touch the target pages before handing out the grant, thus triggering the necessary fetches.

Our implementation of memory-on-demand is SMP-safe. The shared bitmap is accessed in a lock-free manner with atomic (test-and-set, etc.) operations. When a shadow page table write triggers a memory fetch via memtap, we pause the offending VCPU and buffer the write of the shadow page table. Another VCPU using the same page table entry will fault on the still-empty shadow entry. Another VCPU using a different page table entry but pointing to the same VM-physical address will also fault on the not-yet-set bitmap entry. In both cases the additional VCPUs are paused and queued as depending on the very first fetch. When memtap notifies completion of the fetch, all pending shadow page table updates are applied, and all queued VCPUs are allowed to proceed with execution.

5.4.5 Avoidance Heuristics

While memory-on-demand guarantees correct VM execution, it may still have a prohibitive performance overhead, as page faults, network transmission, and multiple context switches between the hypervisor, the VM, and memtap are involved. We thus augmented the VM kernel with two fetch-avoidance heuristics that allow us to bypass unnecessary memory fetches, while retaining correctness.

The first heuristic intercepts pages selected by the kernel’s page allocator. The kernel page allocator is invoked when a user-space process requests more memory, typically via a malloc call (indirectly), or when a kernel subsystem needs more memory. The semantics of these operations imply that the recipient of the selected pages does not care about the pages’ previous contents. Thus, if the pages have not yet been fetched, there is no reason to do so. Accordingly, we modified the guest kernel to set the appropriate present bits in the bitmap, entirely avoiding the unnecessary memory fetches.

The second heuristic addresses the case where a virtual I/O device writes to the guest memory. Consider block I/O: the target page is typically a kernel buffer that is being recycled and whose previous contents do not need to be preserved. Again, the guest kernel can set the corresponding present bits and prevent the fetching of memory that will be immediately overwritten.
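A minimal sketch of the two heuristics, continuing the toy bitmap model above, follows; the function names are invented for illustration and do not correspond to guest kernel code.

    # Sketch of the avoidance heuristics over the presence bitmap (assumptions only).
    present = [False] * 8
    memory = [None] * 8

    def allocator_grants_page(pfn):
        # Heuristic 1: the recipient of a freshly allocated page does not care
        # about its previous contents, so mark it present instead of fetching it.
        present[pfn] = True

    def virtual_io_write(pfn, data):
        # Heuristic 2: a virtual I/O device is about to overwrite the whole page
        # (e.g. a recycled block I/O buffer), so stale contents are never fetched.
        present[pfn] = True
        memory[pfn] = data

    allocator_grants_page(3)                       # e.g. malloc-driven allocation
    virtual_io_write(4, "block read from the virtual disk")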

In Section 5.5.3 we show the substantial improvement that these heuristics have on the performance of SnowFlock for representative applications. In summary, the descriptors and the heuristics can reduce the amount of VM state sent by three orders of magnitude, down to under a hundred MBs when replicating a VM with 1 GB of RAM to 32 hosts.

5.4.6 Multicast Distribution

A multicast distribution system, mcdist, was built to efficiently provide data to all cloned virtual machines simultaneously. This multicast distribution system accomplishes two goals that are not served by point-to-point communication. First, data needed by clones will often be prefetched. Once a single clone requests a page, the response will also reach all other clones. Second, the load on the network will be greatly reduced by sending a piece of data to all VM clones with a single operation. This improves the scalability of the system, and allows multiple sets of cloned VMs to better co-exist.

To send data to multiple hosts simultaneously we use IP-multicast, which is commonly supported by high-end and off-the-shelf networking hardware. Multiple IP-multicast groups may exist simultaneously within a single local network, and mcdist dynamically chooses a unique group in order to eliminate conflicts. Multicast clients1 subscribe by sending an IGMP protocol message naming the multicast group to local routers and switches. The switches then relay each message addressed to that multicast group to all switch ports that have subscribed via IGMP.
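For illustration, the sketch below shows how a receiver typically joins an IP-multicast group with the standard sockets API; the kernel then emits the IGMP membership report that snooping switches act upon. The group address 239.255.42.99 and port 9999 are made-up values, and this is not the actual memtap/mcdist client code.

    /* Hedged example: joining an IP-multicast group on Linux.  The kernel
     * sends the IGMP membership report on our behalf. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(9999);                 /* illustrative port */
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("bind"); return 1;
        }

        struct ip_mreq mreq;
        mreq.imr_multiaddr.s_addr = inet_addr("239.255.42.99"); /* made up */
        mreq.imr_interface.s_addr = htonl(INADDR_ANY);
        if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                       &mreq, sizeof(mreq)) < 0) {
            perror("IP_ADD_MEMBERSHIP"); return 1;
        }

        char page[4096];
        ssize_t n = recv(fd, page, sizeof(page), 0); /* one multicast datagram */
        printf("received %zd bytes\n", n);
        close(fd);
        return 0;
    }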

In mcdist, the server is designed to be as minimal as possible, containing only membership management and flow control logic. No ordering guarantees are given by the server and requests are processed on a first-come first-served basis. Ensuring reliability thus falls to receivers, through a timeout-based mechanism.
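A hedged sketch of that receiver-driven reliability follows. request_page, page_is_present and drain_incoming are illustrative stand-ins for the mcdist client internals, and the 50 ms timeout is an assumed value rather than the one used by SnowFlock.

    /* Hedged sketch: the receiver re-sends its request whenever the page
     * fails to arrive within a timeout. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <sys/select.h>

    extern void request_page(int fd, uint64_t frame);   /* illustrative */
    extern bool page_is_present(uint64_t frame);         /* illustrative */
    extern void drain_incoming(int fd);                  /* illustrative */

    void fetch_with_retry(int fd, uint64_t frame)
    {
        while (!page_is_present(frame)) {
            request_page(fd, frame);

            struct timeval tv = { .tv_sec = 0, .tv_usec = 50 * 1000 };
            fd_set rset;
            FD_ZERO(&rset);
            FD_SET(fd, &rset);
            /* Wait for any data; on timeout simply loop and re-request. */
            if (select(fd + 1, &rset, NULL, NULL, &tv) > 0)
                drain_incoming(fd);  /* may satisfy this or other frames */
        }
    }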

We highlight Frisbee [81] as a relevant reference when implementing our own multicast distribution mechanism. Frisbee aims to instantiate hundreds of disk images simultaneously, in preparation for an EmuLab [202] experiment. To our knowledge, Frisbee is the first storage substrate to aggressively employ content multicasting through receiver-initiated transfers that control the data flow. Frisbee and mcdist differ in their domain-specific aspects: e.g. Frisbee uses filesystem-specific compression, not applicable to memory state; conversely, mcdist's lockstep detection, described below, does not apply to Frisbee's disk distribution.

1 In the context of this Section, the memtap processes serving children VMs are mcdist clients.

We begin the description of mcdist by explaining the changes to memtap necessary to support multicast distribution. We then describe two domain-specific enhancements, lockstep detection and the push mechanism, and conclude with a description of our flow control algorithm.

Memtap Modifications

Our original implementation of memtap used standard TCP networking. A single memtap process would receive only the pages that it had asked for, in exactly the order requested, with only one outstanding request (the current one) per VCPU. Multicast distribution forces memtap to account for asynchronous receipt of data, since a memory page may arrive at any time by virtue of having been requested by another VM clone.

This behaviour is exacerbated in push mode, as described below.

On receipt of a page, the memtap daemon executes a hypercall that maps the target page of the VM in its address space. The cost of this hypercall can prove excessive if executed every time page contents arrive asynchronously. Furthermore, in an SMP environment, race conditions may arise when writing the contents of pages not explicitly requested: a VCPU may decide to use any of these pages without fetching them.

Consequently, in multicast mode, memtap batches all asynchronous responses until a threshold is hit, or a page that has been explicitly requested arrives and the mapping hypercall is needed regardless. To avoid races, all VCPUs of the VM are paused. Batched pages not currently mapped in the VM's physical space, or for which the corresponding bitmap entry is already set, are discarded. A single hypercall is then invoked to map all remaining pages; we set our batching threshold to 1024 pages, since this is the largest number of pages mappable with a single context switch to the hypervisor. The page contents are copied, bits are set, and the VCPUs are unpaused. The impact of this mechanism is evaluated in Section 5.5.3.
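The sketch below illustrates this batching logic under the stated assumptions; pause_all_vcpus, map_frames_hypercall and the other helpers are illustrative stand-ins for the real Xen and memtap calls. Only the 1024-frame threshold is taken from the text.

    /* Hedged sketch of memtap's multicast batching. */
    #include <stdbool.h>
    #include <stdint.h>

    #define BATCH_MAX 1024

    struct batch {
        uint64_t    frames[BATCH_MAX];
        const void *payloads[BATCH_MAX];
        int         n;
    };

    extern bool bitmap_is_present(uint64_t frame);                 /* illustrative */
    extern void bitmap_set_present(uint64_t frame);                /* illustrative */
    extern void pause_all_vcpus(void);                             /* illustrative */
    extern void unpause_all_vcpus(void);                           /* illustrative */
    extern void map_frames_hypercall(const uint64_t *f, void **va, int n);
    extern void copy_page(void *dst, const void *src);             /* illustrative */

    static void flush_batch(struct batch *b)
    {
        void       *va[BATCH_MAX];
        uint64_t    keep[BATCH_MAX];
        const void *data[BATCH_MAX];
        int m = 0;

        pause_all_vcpus();                       /* avoid races with the guest */
        for (int i = 0; i < b->n; i++)
            if (!bitmap_is_present(b->frames[i])) {   /* drop already-fetched */
                keep[m]   = b->frames[i];
                data[m++] = b->payloads[i];
            }
        map_frames_hypercall(keep, va, m);       /* one hypercall for the batch */
        for (int i = 0; i < m; i++) {
            copy_page(va[i], data[i]);
            bitmap_set_present(keep[i]);
        }
        unpause_all_vcpus();
        b->n = 0;
    }

    void on_page_received(struct batch *b, uint64_t frame, const void *payload,
                          bool explicitly_requested)
    {
        b->frames[b->n]   = frame;
        b->payloads[b->n] = payload;
        if (++b->n == BATCH_MAX || explicitly_requested)
            flush_batch(b);
    }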

Lockstep Detection

Lockstep execution is a term used to describe computing systems executing the same instructions in parallel. Many clones started simultaneously exhibit a very large amount of lockstep execution. For example, shortly after cloning, VM clones generally share the same code path because there is a deterministic sequence of kernel hooks called during resumption of the suspended VM. Large numbers of identical page requests are generated at the same time.

When multiple requests for the same page are received sequentially, requests following the first are ignored, under the assumption that they are not the result of lost packets, but rather of lockstep execution. These requests will be serviced again after a sufficient amount of time, or number of requests, has passed.
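A minimal sketch of the server-side filter is shown below; the 100 ms window and the frame-count bound are illustrative values, not the parameters used by mcdist.

    /* Hedged sketch of lockstep detection: a repeated request for a frame
     * served very recently is assumed to come from lockstep-executing clones
     * (the earlier multicast response will reach them) and is dropped. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <time.h>

    #define MAX_FRAMES (1u << 18)                 /* illustrative: 1 GB of 4 KB frames */
    #define LOCKSTEP_WINDOW_NS (100 * 1000 * 1000LL)

    static int64_t last_served_ns[MAX_FRAMES];

    static int64_t now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    /* Returns true if the request should be serviced (multicast the page). */
    bool should_serve(uint64_t frame)
    {
        int64_t t = now_ns();
        if (t - last_served_ns[frame] < LOCKSTEP_WINDOW_NS)
            return false;                         /* likely lockstep duplicate */
        last_served_ns[frame] = t;
        return true;
    }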

Push

Push mode is a simple enhancement to the server which sends data proactively. This is done under the assumption that the memory access patterns of a VM exhibit spatial locality. Our algorithm works as follows: the server maintains a pool of counters; when a request for a page comes to the server, in addition to providing the data for that request, the server adds a counter to the pool. The new counter starts at the requested page plus one. When there is no urgent work to do, the counters in the pool are cycled through.

For each counter the current page is sent, and the counter is then incremented. As pages are sent out by the server, through the counters mechanism or due to explicit requests, a corresponding bit is set in a global bitmap. No page is sent twice due to the automated counters, although anything may be explicitly requested any number of times by a client, as in pull mode. Experimentally, we found that using any sufficiently large number of counters (e.g., greater than the number of clients) provides very similar performance.
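The following sketch captures the counter-pool mechanism under those assumptions; send_frame and the sizing constants are illustrative, not the actual mcdist code.

    /* Hedged sketch of push mode: one counter per explicit request, each
     * walking forward from the requested frame; a global "sent" bitmap
     * ensures no frame is pushed twice. */
    #include <stdbool.h>
    #include <stdint.h>

    #define MAX_FRAMES   (1u << 18)        /* illustrative memory size */
    #define MAX_COUNTERS 64                /* illustrative pool size   */

    extern void send_frame(uint64_t frame);        /* multicast the page */

    static uint64_t counters[MAX_COUNTERS];
    static int      n_counters;
    static uint8_t  sent[MAX_FRAMES / 8];

    static bool test_and_set_sent(uint64_t f)
    {
        uint8_t mask = 1u << (f % 8);
        bool was = sent[f / 8] & mask;
        sent[f / 8] |= mask;
        return was;
    }

    /* Called when an explicit request for 'frame' arrives. */
    void on_request(uint64_t frame)
    {
        send_frame(frame);                  /* explicit requests always go out */
        test_and_set_sent(frame);
        if (n_counters < MAX_COUNTERS)
            counters[n_counters++] = frame + 1;   /* push just past the request */
    }

    /* Called whenever the server has no pending requests to answer. */
    void push_idle_work(void)
    {
        for (int i = 0; i < n_counters; i++) {
            uint64_t f = counters[i]++;
            if (f < MAX_FRAMES && !test_and_set_sent(f))
                send_frame(f);
        }
    }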

Flow Control

Sending data at the highest possible rate quickly overwhelms clients, who face a significant cost for mapping pages and writing data. A flow control mechanism was designed for the server which limits and adjusts its sending rate over time. Both server and clients estimate their send and receive rate, respectively, using a weighted average of the number of bytes transmitted each millisecond. Clients provide explicit feedback about their current rate to the server in request messages, and the server maintains an estimate of the mean client receive rate. The server increases its rate limit linearly, and when a loss is detected implicitly by a client request for data that has already been sent, the rate is scaled back to a fraction of the client rate estimate. Throughout our evaluation of SnowFlock, we experimentally adjusted the parameters of this algorithm. We used a rate of increase of 10 KB/s every 10 milliseconds, and we scaled back to three quarters of the estimated client rate upon detection of packet loss.
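A compact sketch of this control loop is given below. The smoothing factor ALPHA is an assumed value, while the 10 KB/s-per-10 ms linear increase and the 3/4 backoff factor are the parameters quoted above; the structure is deliberately reminiscent of additive-increase/multiplicative-decrease congestion control.

    /* Hedged sketch of mcdist's server-side flow control. */
    #include <stdint.h>

    #define ALPHA             0.125         /* weight of each new sample (assumed) */
    #define INCREASE_BYTES    (10 * 1024)   /* +10 KB/s ...                        */
    #define INCREASE_EVERY_MS 10            /* ... every 10 ms of loss-free sending */

    struct flow_ctl {
        double rate_limit;       /* bytes per second the server may send      */
        double client_rate_est;  /* mean client receive rate, bytes per second */
    };

    /* Client request messages carry the client's own receive-rate estimate. */
    void on_client_feedback(struct flow_ctl *fc, double client_rate)
    {
        fc->client_rate_est =
            (1.0 - ALPHA) * fc->client_rate_est + ALPHA * client_rate;
    }

    /* Called every INCREASE_EVERY_MS milliseconds of loss-free operation. */
    void on_tick(struct flow_ctl *fc)
    {
        fc->rate_limit += INCREASE_BYTES;
    }

    /* A request for data already sent implies the client lost the earlier copy. */
    void on_implicit_loss(struct flow_ctl *fc)
    {
        fc->rate_limit = 0.75 * fc->client_rate_est;
    }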

5.4.7 Virtual I/O Devices

Outside of the four techniques addressing fast and scalable VM replication, we need to provide a virtual disk for the cloned VMs, and we must guarantee the necessary network isolation between VMs cloned from different parents.

Virtual Disk

The virtual disks of SnowFlock VMs are implemented with a blocktap [200] driver. Multiple views of the virtual disk are supported by a hierarchy of copy-on-write slices located at the site where the parent VM runs. Each clone operation adds a new CoW slice, rendering the previous state of the disk immutable, and launches a disk server process that exports the view of the disk up to the point of cloning. Children access a sparse local version of the disk, with the state from the time of cloning fetched on-demand from the disk server.

Based on our experience with the memory-on-demand subsystem, we apply similar techniques for the virtual disk. First, we use multicast to distribute disk state to all clones and exploit similarity in access patterns. Second, we devise heuristics complementary to those used for memory. When the VM writes a chunk of data to disk, the previous contents of that chunk are not fetched. For simplicity and convenience, we chose the page size as the granularity of disk operations. No individual request to the disk subsystem will span more than a page-sized page-aligned chunk. CoW operations on the parent VM, data transfer from the disk server to children, and the discard-on-write heuristic are easily and efficiently implemented in this way.
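A hedged sketch of a clone's block read and write paths under this design is shown below; all helper names are illustrative, not the actual blocktap plumbing.

    /* Hedged sketch: a block is served from the clone's sparse local disk if
     * already written or fetched, and otherwise pulled from the disk server
     * exporting the immutable pre-cloning CoW slice. */
    #include <stdbool.h>
    #include <stdint.h>

    #define BLOCK_SIZE 4096   /* disk I/O in page-sized, page-aligned chunks */

    extern bool local_block_present(uint64_t blk);                /* illustrative */
    extern void local_block_read(uint64_t blk, void *buf);        /* illustrative */
    extern void local_block_write(uint64_t blk, const void *buf); /* illustrative */
    extern void mark_local_block_present(uint64_t blk);           /* illustrative */
    extern void disk_server_fetch(uint64_t blk, void *buf);       /* illustrative */

    void clone_disk_read(uint64_t blk, void *buf)
    {
        if (local_block_present(blk)) {
            local_block_read(blk, buf);
            return;
        }
        disk_server_fetch(blk, buf);     /* state frozen at cloning time */
        local_block_write(blk, buf);     /* cache it in the sparse local disk */
        mark_local_block_present(blk);
    }

    /* Discard-on-write heuristic: a full-block write never fetches old contents. */
    void clone_disk_write(uint64_t blk, const void *buf)
    {
        local_block_write(blk, buf);
        mark_local_block_present(blk);
    }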

Much like memory, the disk is replicated to clones “one way”. In other words, the children and master see the same state up to the point of cloning, but there are no sharing or write-through channels back to the master. If sharing is desired at the file system level, it can be trivially implemented by using a distributed file system like NFS over the virtual private network. More specialized – and convenient – mechanisms, such as a key-value store for clones to indicate completion and results of assigned work, are left for future work.

In most cases the virtual disk is not heavily exercised by a SnowFlock child VM. Most work done by the clones is processor intensive, resulting in little disk activity that does not hit kernel caches. Further, thanks to the fetch avoidance heuristic previously mentioned, writes generally do not result in fetches. Finally, we assume that data-intensive applications will in general obtain their data from shared storage provided by the physical cluster. For all these reasons, our implementation largely exceeds the demands of many realistic tasks and did not cause any noticeable overhead for the experiments in Section 5.5.

We will leave the exploration of interactions with virtual disk infrastructures such as Parallax [121] for future work. Parallax aims to support multiple VM disks while minimizing disk creation and snapshot time. Parallax's design resorts to aggressive Copy-on-Write sharing of multiple VM disk images in a cluster, and achieves this by relying on cluster-wide availability of a single (large) storage volume.

Network Isolation

In order to prevent interference or eavesdropping between unrelated VMs on the shared network, either malicious or accidental, we implemented an isolation mechanism operating at the level of Ethernet packets, the primitive exposed by Xen virtual network devices. Before being sent on the shared network, the source MAC addresses of packets sent by a SnowFlock VM are rewritten as a special address which is a function of both the parent and child VM identifiers. Simple filtering rules are used by all hosts to ensure that no packets delivered to a VM come from VMs other than its parent or fellow clones.

Conversely, when a packet is delivered to a SnowFlock VM, the destination MAC address is rewritten to be as expected by the VM, rendering the entire process transparent. Additionally, a small number of special rewriting rules are required for protocols with payloads containing MAC addresses, such as ARP. Despite this, the overhead imposed by filtering and rewriting is imperceptible and full compatibility at the IP level is maintained.
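The sketch below illustrates the flavour of the rewriting and filtering; the locally-administered address prefix and the encoding of parent and clone identifiers into the MAC are invented for illustration and are not SnowFlock's actual scheme.

    /* Hedged sketch: derive a special source MAC from (parent, clone) IDs,
     * rewrite it on egress, and filter on ingress by parent identifier. */
    #include <stdint.h>
    #include <string.h>

    struct eth_hdr {
        uint8_t  dst[6];
        uint8_t  src[6];
        uint16_t ethertype;
    } __attribute__((packed));

    static void snowflock_mac(uint8_t mac[6], uint16_t parent_id, uint16_t clone_id)
    {
        mac[0] = 0x0a;                    /* locally administered, unicast (assumed) */
        mac[1] = 0xfe;
        mac[2] = (uint8_t)(parent_id >> 8);
        mac[3] = (uint8_t)(parent_id);
        mac[4] = (uint8_t)(clone_id >> 8);
        mac[5] = (uint8_t)(clone_id);
    }

    /* Rewrite the source MAC of an outbound Ethernet frame in place. */
    void rewrite_egress(uint8_t *frame, uint16_t parent_id, uint16_t clone_id)
    {
        struct eth_hdr *eth = (struct eth_hdr *)frame;
        uint8_t mac[6];
        snowflock_mac(mac, parent_id, clone_id);
        memcpy(eth->src, mac, sizeof(mac));
    }

    /* Host-side filter: admit a frame only if it comes from the same parent. */
    int same_parent(const uint8_t *frame, uint16_t parent_id)
    {
        const struct eth_hdr *eth = (const struct eth_hdr *)frame;
        return eth->src[0] == 0x0a && eth->src[1] == 0xfe &&
               eth->src[2] == (uint8_t)(parent_id >> 8) &&
               eth->src[3] == (uint8_t)(parent_id);
    }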

5.5 Evaluation

In this Section we first examine the high-level performance of SnowFlock through macro-benchmark experiments using representative application benchmarks. We describe the applications in Section 5.5.1 and the experiments in Section 5.5.2. In Section 5.5.3, we turn our attention to a detailed micro evaluation of the different aspects that contribute to SnowFlock's performance and overhead.

All of our experiments were carried out on a cluster of 32 Dell PowerEdge 1950 servers. Each machine had 4 GB of RAM, 4 Intel Xeon 3.2 GHz cores, and dual Broadcom NetXtreme II BCM5708 gigabit network adaptors. All machines were running the SnowFlock prototype based on Xen 3.0.3, with paravirtualized Linux 2.6.16.29 running as the OS for both host and guest VMs. All machines were connected to two daisy-chained Dell PowerConnect 5324 gigabit switches. Unless otherwise noted, all results reported in the following sections are the means of five or more runs, and error bars depict standard deviations. All VMs were configured with 1124 MB of RAM, which is the memory footprint needed by our most memory-intensive application (SHRiMP). All VMs had a plain file-backed root disk of 8 GBs in size.

In several experiments we compare SnowFlock's performance against a “zero-cost fork” baseline. Zero-cost results are obtained with VMs previously allocated, with no cloning or state-fetching overhead, and in an idle state, ready to process the jobs allotted to them. As the name implies, zero-cost results are overly optimistic and not representative of cloud computing environments, in which aggressive consolidation of VMs is the norm and instantiation times are far from instantaneous. The zero-cost VMs are vanilla Xen 3.0.3 domains configured identically to SnowFlock VMs in terms of kernel version, disk contents, RAM, and number of processors.

5.5.1 Applications

To evaluate the generality and performance of SnowFlock, we tested several usage scenarios involving six typical applications from bioinformatics, graphics rendering, financial services, and parallel compilation. We devised workloads for these applications with runtimes of above an hour on a single-processor machine, but which can be substantially reduced to the order of a hundred seconds if provided enough resources. Application experiments are driven by a workflow shell script that clones the VM and launches an application process properly parameterized according to the clone ID. Results are dumped to temporary files which the clones send to the parent before reaching an sf_exit call. Once the parent VM successfully completes an sf_join, the results are collated. The exception to this technique is ClustalW, where we modify the application code directly.

NCBI BLAST

The NCBI implementation of BLAST [8], the Basic Local Alignment and Search Tool, is perhaps the most popular computational tool used by biologists. BLAST searches a database of biological sequences, strings of characters representing DNA or proteins, to find sequences similar to a query. Similarity between two sequences is measured by an alignment metric that is typically similar to edit distance. BLAST is demanding of both computational and I/O resources; gigabytes of sequence data must be read and compared with each query sequence, and parallelization can be achieved by dividing either the queries, the sequence database, or both. We experimented with a BLAST search using 1200 short protein fragments from the sea squirt Ciona savignyi to query a 1.5 GB portion of NCBI's non-redundant protein database, a consolidated reference of protein sequences from many organisms. VM clones access the database, which is a set of plain text files, via an NFS share, which is a typical setup in bioinformatics labs. Database access is parallelized across VMs, each reading a different segment, while query processing is parallelized across process-level clones within each VM.

SHRiMP

SHRiMP [160] (SHort Read Mapping Package) is a tool for aligning large collections of very short DNA sequences (“reads”) against a known genome: e.g. the human genome. This time-consuming task can be easily parallelized by dividing the collection of reads among many processors. While overall similar to BLAST, SHRiMP is designed for dealing with very short queries and very long sequences, and is more memory intensive, requiring from a hundred bytes to a kilobyte of memory for each query. In our experiments we attempted to align 1.9 million 25 letter-long reads, extracted from a Ciona savignyi individual using the AB SOLiD sequencing technology, to a 5.2 million letter segment of the known C. savignyi genome.

ClustalW

ClustalW [82] is a popular program for generating a multiple alignment of a collection of protein or DNA sequences. Like BLAST, ClustalW is offered as a web service by organizations owning large computational resources [55]. To build a guide tree, ClustalW uses progressive alignment, a greedy heuristic that significantly speeds up the multiple alignment computation, but requires precomputation of pairwise comparisons between all pairs of sequences. The pairwise comparison is computationally intensive and embarrassingly parallel, since each pair of sequences can be aligned independently.

We have modified ClustalW to allow parallelization with SnowFlock: Figure 5.4 shows the integration of API calls into the ClustalW code. After cloning, each child computes the alignment of a set of pairs statically assigned according to the clone ID. The result of each alignment is a similarity score. Simple socket code is used by the children to relay scores to the parent, which then joins the set of children. Replacing VM forking with process forking yields an equivalent parallel program confined to executing within a single machine. Using this implementation we conducted experiments performing guide-tree generation by pairwise alignment of 200 synthetic protein sequences of 1000 amino acids (characters) each.

    sequences = ReadSequences(InputFile)
    ticket = sf_request_ticket(n, hierarchical=true)
    m = ticket.allocation
    ID = sf_clone(ticket)
    for i in sequences:
        for j in sequences[i+1:]:
            if ((i*len(sequences)+j) % m == ID):
                PairwiseAlignment(i, j)
    if (ID > 0):
        RelayScores()
        sf_exit()
    else:
        PairsMatrix = CollatePairwiseResults()
        sf_join(ticket)
        BuildGuideTree(PairsMatrix, OutputFile)

Figure 5.4: ClustalW Pseudo-Code Using SnowFlock’s VM Fork API

QuantLib

QuantLib [148] is an open source development toolkit widely used in quantitative finance. It provides a vast set of models for stock trading, equity option pricing, risk analysis, etc. Quantitative finance programs are typically single-program-multiple-data (SPMD): a typical task using QuantLib runs a model over a large array of parameters, e.g. stock prices, and is thus easily parallelizable by splitting the input. In our experiments we ran a set of Monte Carlo, binomial and Black-Scholes variant models to assess the risk of a set of equity options. Given a fixed maturity date, we processed 1024 equity options varying the initial and striking prices, and the volatility. The result is the set of probabilities yielded by each model to obtain the desired striking price for each option.

Aqsis – Renderman

Aqsis [15] is an open source implementation of Pixar's RenderMan interface [141], an industry standard widely used in films and television visual effects since 1989. This renderer accepts scene descriptions produced by a modeler and specified in the RenderMan Interface Bitstream (RIB) language. Rendering also belongs to the SPMD class of applications, and is thus easy to parallelize: multiple instances can each perform the same task on different frames of an animation. For our experiments we fed Aqsis a sample RIB script from the book “Advanced RenderMan: Creating CGI for Motion Pictures” [14].

Distcc

Distcc [44] is software which distributes builds of C/C++ programs over the network for parallel compilation. It operates by sending preprocessed code directly to a compiling process on each host and retrieving object file results back to the invoking host. Because the preprocessed code includes all relevant headers, it is not necessary for each compiling host to access header files or libraries; all they need is the same version of the compiler. Distcc is different from our previous benchmarks in that it is not embarrassingly parallel: actions are tightly coordinated by a master host farming out preprocessed files for compilation by workers. In our experiments we compile the Linux kernel (version 2.6.16.29) from kernel.org, using gcc version 4.1.2 and the default kernel optimization level (O2).

5.5.2 Macro Evaluation

In our macro evaluation we aim to answer the following three questions:

• How does SnowFlock compare to other methods for instantiating VMs?

• What is the performance of SnowFlock for the representative applications described in Section 5.5.1?

• How scalable is SnowFlock? How does it perform in cloud environments with multiple applications simultaneously and repeatedly forking VMs, even under adverse VM allocation patterns?

Comparison

Table 5.2 illustrates the substantial gains SnowFlock provides in terms of efficient VM cloning and application performance. The table shows results for SHRiMP using 128 processors under three configurations: SnowFlock with all the mechanisms described in Section 5.4, and two versions of Xen's standard suspend/resume that use NFS and multicast to distribute the suspended VM image. The results show that SnowFlock significantly improves execution time with respect to traditional VM management techniques, with gains ranging from a factor of two to a factor of seven. Further, SnowFlock is two orders of magnitude better than traditional VM management techniques in terms of the amount of VM state transmitted. Despite the large memory footprint of the application (1 GB), SnowFlock is capable of transmitting, in parallel, very little VM state. Experiments with our other application benchmarks present similar results and are therefore not shown.

Benchmark                    Time (s)           State Sent (MB)
SnowFlock                    70.63 ± 0.68       41.79 ± 0.7 (multicast)
Suspend/Resume multicast     157.29 ± 0.97      1124 (multicast)
Suspend/Resume NFS           412.29 ± 11.51     1124 × 32 (unicast)

Table 5.2: SnowFlock vs. VM Suspend/Resume. Benchmark time and VM state sent, for SHRiMP with 128 threads.

Application Performance

We tested the performance of SnowFlock with the applications described in Section 5.5.1 and the following features enabled: memory on demand, multicast without push, avoidance heuristics, and hierarchical cloning. For each application we spawned 128 threads of execution (32 4-core SMP VMs on 32 physical hosts) in order to fully utilize our testbed.

SnowFlock was tested against an ideal with 128 threads to measure overhead, and against an ideal with a single thread in order to measure speedup.

Figure 5.5 summarizes the results of our application benchmarks. We obtain speedups very close to the ideal, and total time to completion no more than five seconds (or 7%) greater than the ideal. The overheads of VM replication and on-demand state fetching are small. These results demonstrate that the execution time of a parallelizable task can be reduced, given enough resources, to an “interactive range.” We note three aspects of application execution. First, ClustalW yields the best results, presenting an overhead with respect to the ideal as low as one second. Second, even though most of the applications are embarrassingly parallel, the achievable speedups are below the maximum. Some children may finish ahead of others, such that “time to completion” effectively measures the time required for the slowest VM to complete. Third, distcc's tight synchronization results in underutilized children and low speedups, although distcc still delivers a time to completion of under a minute.

[Figure 5.5: Application Benchmarks. Applications ran with 128 threads: 32 VMs × 4 cores. Bars show speedup, measured against a 1 VM × 1 core ideal; labels indicate time to completion in seconds.]

Scale and Agility

We address SnowFlock’s capability to support multiple concurrent forking VMs. We launched four VMs such that each simultaneously forked 32 uniprocessor VMs. To stress the system, after completing a parallel task, each parent VM joined and terminated its children and immediately launched another parallel task, repeating this cycle five times.

Each parent VM executed a different application. We selected the four applications that exhibited the highest degree of parallelism (and child occupancy): SHRiMP, BLAST, QuantLib, and Aqsis. To further stress the system, we abridged the length of the cyclic parallel task so that each cycle would finish in between 20 and 35 seconds. We employed an “adversarial allocation” in which each task uses 32 processors, one per physical host, so that 128 SnowFlock VMs are active at most times, and each physical host needs to fetch state from four parent VMs. The zero-cost results were obtained with an identical distribution of VMs. Since there is no state-fetching performed in the zero-cost case, the actual allocation of VMs does not affect those results.

[Figure 5.6: Concurrent Execution of Multiple Forking VMs. For each task we repeatedly cycled cloning, processing and joining.]

The results, shown in Figure 5.6, demonstrate that SnowFlock is capable of withstanding the increased demands of multiple concurrent forking VMs. As we will show in Section 5.5.3, this is mainly due to the small overall number of memory pages sent by the multicast server, when pushing is disabled and heuristics are enabled. The introduction of multiple concurrently forking VMs causes no significant increase in overhead, although outliers with higher time to completion are seen, resulting in wider error bars. These outliers are caused by occasional congestion when receiving simultaneous bursts of VM state for more than one VM; we believe optimizing mcdist will yield more consistent running times.

5.5.3 Micro Evaluation

To better understand the behaviour of SnowFlock, and to isolate the factors which cause the small overhead seen in the previous Section, we performed experiments to answer several performance questions:

• How fast does SnowFlock clone a VM? How scalable is the cloning operation?

• What are the sources of overhead when SnowFlock fetches memory on demand to a VM?

• How sensitive is the performance of SnowFlock to the use of avoidance heuristics and the choice of networking strategy?

[Figure 5.7: Time To Create Clones. “Master” is the global view, while “Create Clone” is the average VM resume time for the n clones.]

Fast VM Replica Spawning

Figure 5.7 shows the time spent replicating a single-processor VM to n VM clones, with each new VM spawned on a different physical host. For each size n we present two bars: the “Master” bar shows the global view of the operation, while the “Create Clone” bar shows the average time spent by all VM resume operations on the n physical hosts. The average size of a VM descriptor for these experiments was 1051 ± 7 KB.

Recall that Section 5.4 introduced the process used to replicate a VM. For the purposes of evaluating the time required for VM cloning, we decompose this process into the following steps corresponding to bars in Figure 5.7: 1) suspending the running VM and generating a VM descriptor; these are the “Save Time” and “Xend” components in the “Master” bar, with “Xend” standing for unmodified Xen code we leverage; 2) contacting all target physical hosts to trigger VM instantiation (“Starting Clones”); 3) each target host pulls the VM descriptor via multicast (“Clone Set up” in the “Create Clone” bar); 4) spawning each clone VM from the descriptor (“Restore Time” and “Xend”); and 5) waiting to receive notification from all target hosts (“Waiting for Clones”, which roughly corresponds to the total size of the corresponding “Create Clone” bar).

From these results we observe the following: first, VM replication is an inexpensive operation ranging in general from 600 to 800 milliseconds; second, VM replication time is largely independent of the number of replicas being created. Larger numbers of replicas introduce, however, a wider variance in the total time to fulfill the operation. The variance is typically seen in the time to multicast each VM descriptor, and is due in part to a higher likelihood that on some host a scheduling or I/O hiccup might delay the VM resume for longer than the average.

Memory On Demand

To understand the overhead involved in our memory-on-demand subsystem, we devised a microbenchmark in which a VM allocates and fills in a number of memory pages, invokes SnowFlock to have itself replicated, and then touches (by reading and writing the first byte) each page in the allocated set. We instrumented the microbenchmark, the Xen hypervisor and memtap to timestamp events during the fetching of each page. The results for multiple microbenchmark runs totalling ten thousand page fetches are displayed in Figure 5.8. The “Single” columns depict the result when a single VM fetches state. The “Dual” columns show the results when two VM clones concurrently fetch state.

[Figure 5.8: Page Fault Time. Components involved in a SnowFlock page fetch. “Single” bars are for a single clone fetching state, “Dual” bars for two clones concurrently fetching state.]

We split a page fetch operation into six components. “Page Fault” indicates the hardware overhead of using the shadow page tables to detect first access to a page after VM resume. “Xen” is the cost of executing the Xen hypervisor shadow page table logic. “HV Logic” is the time consumed by our logic within the hypervisor (bitmap checking and SMP safety). “Dom0 Switch” is the time spent while context switching to the domain 0 memtap process, while “Memtap Logic” is the time spent by the memtap internals, consisting mainly of mapping the faulting VM page. Finally, “Network” depicts the software and hardware overheads of remotely fetching the page contents over gigabit Ethernet.

The overhead of page fetching is modest, averaging 275 µs with unicast (standard TCP). Our implementation is frugal, and the bulk of the time is spent in the networking stack. With multicast, substantially higher variances are observed in three components: network, memtap logic, and domain 0 switch. As explained in Section 5.4.6, memtap fully pauses the VM and batches multiple page mappings in a single hypercall with multicast, making the average operation more costly. Also, mcdist's logic and flow control are not as optimized as TCP's, and run in user space. The need to perform several system calls results in a high scheduling variability. A final contributor to multicast's variability is the effectively bimodal behaviour caused by implicit prefetching. Sometimes, the page the VM needs may already be present, in which case the logic defaults to a number of simple checks. This is far more evident in the dual case, in which requests by one VM result in prefetching for the other, and explains the high variance and lower overall “Network” averages for the multicast case.

Sensitivity Analysis

In this Section we perform a sensitivity analysis on the performance of SnowFlock, and measure the benefits of the heuristics and multicast distribution. Throughout these experiments we employed SHRiMP as the driving application. The experiment spawns n uniprocessor VM clones, and on each clone it executes a set of reads of the same size against the same genome. The n clones perform n times the amount of work as a single VM, and should complete in the same amount of time, yielding an n-fold throughput improvement. We tested twelve experimental combinations by: enabling or disabling the avoidance heuristics; increasing SHRiMP's memory footprint by doubling the number of reads from roughly 512 MB (167116 reads) to roughly 1 GB (334232 reads); and varying the choice of networking substrate between unicast, multicast, and multicast with push.

Figure 5.9 illustrates the memory-on-demand activity for a memory footprint of 1 GB and 32 VM clones. Consistent results were observed for all experiments with smaller numbers of clones, and with the 512 MB footprint. The immediate observation is the substantially beneficial effect of the avoidance heuristics. Nearly all of SHRiMP's memory footprint is allocated from scratch when the reads are loaded. The absence of heuristics forces the VMs to request pages they do not really need, inflating the number of requests from all VMs by an order of magnitude. Avoidance heuristics have a major impact in terms of network utilization, as well as in enhancing the scalability of the experiments and the system as a whole. The lower portion of Figure 5.10 shows that the decreased network utilization allows the experiments to scale gracefully, such that even unicast is able to provide good performance, up to 16 clones.

[Figure 5.9: Pages Requested vs. Served. Comparison of aggregate page requests from 32 clones vs. number of pages sent by the memory server.]

Three aspects of multicast execution are reflected in Figure 5.9. First, even under the extreme pressure of disabled heuristics, the number of pages served is reduced dramatically. This enables a far more graceful scaling than that achieved with unicast, as seen in the upper portion of Figure 5.10. Second, lockstep avoidance works effectively: lockstep executing VMs issue simultaneous requests that are satisfied by a single response from the server. Hence, a difference of an order of magnitude is evident between the “Requests” and “Served” bars in three of the four multicast experiments. Third, push mode increases the chances of successful prefetching and decreases the overall number of requests from all VMs, at the cost of sending more pages. The taller error bars in Figure 5.10 (up to 8 seconds) of the push experiments, however, reveal the instability caused by the aggressive distribution. Constantly receiving pages keeps memtap busy and not always capable of responding quickly to a page request from the VM, adversely affecting the runtime slightly and causing the VMs to lag behind. Further, multicast with push aggressively transmits the full set of pages comprising the VM memory state, mostly negating the opportunities for network utilization savings that on-demand propagation, coupled with our avoidance heuristics, creates.

[Figure 5.10: Time to Completion. Uniprocessor VMs; n clones perform n times the work of 1 VM. Points are medians; error bars show min and max times in seconds.]

[Figure 5.11: Pages Requested vs. Time. Page requests over time for three randomly selected VMs out of 32 clones, using multicast with avoidance heuristics enabled.]

Figure 5.11 shows the requests of three randomly selected VMs as time progresses during the execution of SHRiMP. Avoidance heuristics are enabled, the footprint is 1 GB, and there are 32 VM clones using multicast. Page requests cluster around three areas: kernel code, kernel data, and previously allocated user space code and data. Immediately upon resume, kernel code starts executing, causing the majority of the faults. Faults for pages mapped in user-space are triggered by other processes in the system waking up, and by SHRiMP forcing the eviction of pages to fulfill its memory needs. Some pages selected for this purpose cannot be simply tossed away, triggering fetches of their previous contents. Due to the avoidance heuristics, SHRiMP itself does not directly cause any fetches, and takes over the middle band of physical addresses for the duration of the experiment. Fetches are only performed rarely when kernel threads need data.

[Figure 5.12: Sensitivity Analysis with NCBI BLAST. (a) Speedup – 256 MB DB. (b) Speedup – 512 MB DB. Speedup against a single thread, with labels showing absolute times.]

Heuristic-adverse Experiment

While the results with SHRiMP are highly encouraging, this application is not representative of workloads in which an important portion of memory state is needed after cloning. In this scenario, the heuristics are unable to ameliorate the load on the memory-on-demand subsystem. To synthesize such behaviour we ran a similar sensitivity analysis using BLAST. We aligned 2360 queries split among an increasing number n of VM clones.

The queries were run against a portion of the NCBI genome DB cached in main memory before cloning the VMs. With avoidance heuristics enabled, we varied the networking substrate and the size of the cached DB, from roughly 256 MB to roughly 512 MB.

Figure 5.12 (a) plots the speedups achieved for a DB of 256 MB by cloning a uniprocessor VM to n replicas. Speedups are against a zero-cost ideal with a single VM, and the labels indicate the runtime for 32 VMs. We see an almost linear speedup for multicast, closely tracking the speedup exhibited with ideal execution, while unicast ceases to scale after 16 clones. Multicast push shows better performance than unicast but is unable to perform as well as multicast, due to memtap and network congestion. With a larger database (Figure 5.12 (b)) even multicast starts to suffer a noticeable overhead, hinting at the limits of the current implementation.

[Figure 5.13: Memory Pages Served. Aggregate number of pages sent by the memory server to all clones.]

Figure 5.13 shows the number of pages fetched vs. the number of clones for the three networking substrates. The information is consistent with that presented in Figure 5.9. The linear scaling of network utilization by unicast leads directly to poor performance. Multicast is not only the best-performing option, but also the one imposing the least load on the network, allowing for better co-existence with other sets of SnowFlock cloned VMs. These observations, coupled with the instabilities of push mode shown in the previous Section, led us to choose multicast with no push as SnowFlock's default behaviour for the macro-evaluation in Section 5.5.2.

5.6 Summary

In this Chapter, we introduced the primitive of VM fork and SnowFlock, our Xen-based implementation. Matching the well-understood semantics of stateful worker creation, VM fork provides cloud users and programmers the capacity to instantiate dozens of VMs on different hosts in sub-second time, with little runtime overhead and frugal use of cloud I/O resources. VM fork thus enables the simple implementation and deployment of services based on familiar programming patterns that rely on fork's ability to quickly instantiate stateful workers. While our evaluation focuses on interactive parallel Internet services, SnowFlock has broad applicability to other applications, such as flash crowd handling, execution of untrusted code components, instantaneous testing, etc.

SnowFlock makes use of three key observations. First, that it is possible to drastically reduce the time it takes to clone a VM by copying only the critical state, and fetching the VM’s memory image efficiently on-demand. Second, that simple modifications to the guest kernel significantly reduce network traffic by eliminating the transfer of pages that will be overwritten. For our applications, these optimizations can drastically reduce the communication cost for forking a VM to a mere 40 MBs for application footprints of several GBs. Third, the locality of memory accesses across cloned VMs makes it beneficial to distribute VM state using multicast. This allows for the instantiation of a large number of VMs at a (low) cost similar to that of forking a single copy.

In closing, SnowFlock lowers the barrier of entry to cloud computing and opens the door for new cloud applications. VM fork provides a well-understood programming interface with substantial performance improvements; it removes the obstacle of long VM instantiation latencies, and greatly simplifies management of application state. SnowFlock thus places in the hands of users the full potential of the cloud model by simplifying the programming of applications that dynamically change their execution footprint. In particular, SnowFlock is of immediate relevance to users wishing to test and deploy parallel applications in the cloud.

5.6.1 Future directions for SnowFlock

SnowFlock is an active open-source project [189]. Our future work plans involve adapting SnowFlock to big-data applications. We believe there is fertile research ground in studying the interactions of VM fork with data-parallel APIs such as MapReduce [42]. For example, SnowFlock's transient clones cannot be entrusted with replicating and caching data due to their ephemeral nature. Allowing data replication to be handled by the hosts, as opposed to the VMs, endows the VMs with big-data file system agnosticism. In other words, it allows the VMs to remain oblivious to how the big-data replication and distribution policies are actually implemented by the underlying storage stack.

SnowFlock is promising as a potential transforming force in the way clusters or clouds are managed today. A building block of cloud computing is task consolidation, or resource over-subscription, as a means to exploit statistical multiplexing. The underlying rationale for consolidation is that more resources than actually available can be offered.

Since simultaneous peak consumption of allotted resources by all tasks happens very rarely, if ever, the overall capacity of the cloud will not be exceeded. Scale flexibility of the degree offered by SnowFlock is very hard to achieve in clouds using consolidation principles. Currently, consolidation of idle VMs is performed primarily by suspending them to secondary storage, or by live-migrating them to highly over-subscribed hosts where time sharing of the processor by idle tasks is not an issue [171, 205]. However, when an idle VM transitions to a stage of activity, sharing due to consolidation becomes harmful to performance. Resuming or live-migrating a consolidated VM to a host where it can execute with no sharing is costly. These methods are plagued by the same problems as booting from images kept in backend storage: high latency and lack of scalability, due to an abuse of the network interconnect by pushing GBs of VM state.

Beyond its appealing computing model, the performance characteristics of SnowFlock make it an attractive choice to rethink the current consolidation-based model, and effectively do away with the need for consolidation in many cases. SnowFlock is able to instantiate VMs at a very frugal cost, in terms of time and I/O involved for state transmission. Thus, there is no need to keep idle VMs lurking around the cloud, unless they accomplish a specific task; new generic computing elements can be instantiated quickly and inexpensively. In an ideal SnowFlock cloud, there are no idle machines, since a VM is destroyed after it finishes a compute task; its resources are then reclaimed and can be immediately assigned to a new VM. However, a resource of interest that the VM may have accrued throughout its execution is the contents of its buffer cache. These contents can survive VM destruction if administered in collaboration with the host OS. This mechanism is in consonance with the file system agnosticism proposed previously, and is currently being studied by two groups [50, 116].

SnowFlock's objective is performance rather than reliability. While memory-on-demand provides significant performance gains, it imposes a long-lived dependency on a single source of VM state. Another aspect of our future work involves studying how to push VM state in the background to achieve a stronger failure model, without sacrificing the speed of cloning or low runtime overhead.

We view the use of high-speed interconnects, if available, as a viable alternative to multicasting. Huang et al. [84] demonstrate efficient point-to-point VM migration times with RDMA on Infiniband. However, we note that the latency of this method increases linearly when a single server pushes a 256 MB VM to multiple receivers, which is a crucial operation to optimize for use in SnowFlock.

Finally, we wish to explore applying the SnowFlock techniques to wide-area VM migration. This would allow, for example, “just-in-time” cloning of VMs over geographical distances to opportunistically exploit cloud resources. We foresee modifications to memory-on-demand to batch multiple pages on each update, replacement of IP-multicast, and use of content-addressable storage at the destination sites to obtain local copies of frequently used state (e.g. libc).

Chapter 6

Conclusion

In this thesis we have demonstrated that virtualization can be used to relocate tasks, and to dynamically scale the footprint of processing-intensive tasks in a public cloud. These are two fundamental building blocks of a vision of computing that aims to balance the benefits of time-sharing and personal computing. In our vision, users are endowed with specialized rich client devices – portables with graphics hardware – that provide seamless and crisp interactive performance, while applications, configuration and data are pervasively available from centralized well-maintained repositories operating as server farms residing in the network core, or cloud. The building blocks we have demonstrated provide system support for applications to flexibly relocate and resize according to runtime needs. We define this ability as flexible computing.

Our methods leverage and exploit a fundamental property of system virtualization: its ability to encapsulate a complete software stack (OS included) and serialize it. A serialized VM only demands three pre-conditions to execute on any piece of hardware: (i) a matching binary interface (ABI), (ii) network access to keep alive open connections, and (iii) matching underlying virtual machine monitor (VMM) software. Because of the vast adoption of the x86 processor architecture, and the TCP/IP protocol stack that provides robust connection-based semantics (with the help of tunnels, VPNs, MobileIP and other technologies), these constraints are not restrictive.

These assertions regarding the ease of deployment of serialized VM images are not absolutely clear cut and have minor caveats, such as the minute but binary-incompatible differences in the instruction set extensions supported by different x86 models and vendors. Setting those caveats aside for a moment allows us to indulge in an exercise in reductionism: portraying the contributions of this thesis as ways of manipulating and propagating serialized VM state. Location flexibility then becomes a version of mv, and scale flexibility becomes a (parallel) version of cp, operating efficiently on serialized VM images. This overly reductionist point of view integrates under the same framework differential transfer, lazy or on-demand propagation, post-copy, multicast distribution, and heuristics to prune the amount of state propagated. In other words, it integrates the specific techniques developed or used in many parts of this work, and techniques that could be added in future work, such as compression, binary diffing, state recreation via replay, and content-addressable storage. This framework also relates the work done here with many other recent projects that essentially focus on ways to manipulate VM state and study derived applications: Remus [40], Potemkin [198], Difference Engine [75], etc.

The insights presented in this work not only constitute important building blocks of a vision, but also lay out a number of future work directions. Security was singled out in the introduction. We also pointed at a trend towards software specialization matching the underlying hardware specialization. For example, a general purpose OS may become overkill for a device that basically operates as a GUI engine pulling data from the cloud.

Such a device might be perfectly served by a web browser architecture with a TCP/IP stack and a javascript engine. Whilst exaggerated, this simplification points out the specialization of roles percolating upwards into the software realm.

Role and device specialization synergizes with an increased adoption of interpreted languages. Interfaces such as Google Gears [63], the Dojo toolkit [45], or Microsoft's Azure cloud [122] drive forward a re-formulation of applications into a model that cleanly accommodates the separation of roles promoted by this thesis: GUI frontends and data backends. Conveniently, these languages provide increased security and reliability by design, preventing buffer overflows and heap mis-management, and virtualize the underlying code execution interface by relying on interpreted bytecode (with potential performance optimizations via JIT-compilation). The latter is particularly powerful in that it removes limitations on VM-based techniques deriving from ABI incompatibility. In other words, it allows relocation and scaling of computation transparently across ARM (portables) and Intel (server) devices.

We seem, thus, to have come full circle. In this thesis we have used system virtual machines to flexibly and dynamically scale and relocate applications. Future-telling is a proven way of making mistakes. The jury is out, thus, on whether VM-based techniques will dominate the computing landscape, or will be confined to the support of legacy applications. Will most applications be written in javascript and .NET? Will Intel aggressively move the x86 architecture into the small device space? Regardless, we have shown that virtualization holds appropriate solutions for a world in which applications seamlessly relocate between the edges and the core, and efficiently scale their computing footprint when executing in cloud server farms.

Appendix: Attribution

With a few exceptions, the totality of the work and experiments presented in this thesis was implemented and conducted by the author. Said exceptions are a portion of the VNC-Redux prototype, implemented by Niraj Tolia; the majority of the mcdist prototype, implemented by Adin Scannell; a portion of the network filtering subsystem of SnowFlock, implemented by Stephen Rumble; a first cut of the SnowFlock control stack and user-facing API, implemented by Joe Whitney; and the completion of the SnowFlock C user-facing API, carried out by Philip Patchin.

Bibliography

[1] M. J. Accetta, R. V. Baron, W. Bolosky, D. B. Golub, R. F. Rashid, A. Tevanian

Jr., and M. W. Young. Mach: A New Kernel Foundation for Unix Development.

In Proc. of the Summer USENIX Conference, pages 93–112, Atlanta, GA, 1986.

[2] K. Adams and O. Agesen. A Comparison of Software and Hardware Techniques for

x86 Virtualization. In Proc. 12th International Conference on Architectural Support

for Programming Languages and Operating Systems (ASPLOS), pages 2–13, San

Jose, CA, October 2006.

[3] Advanced Micro Devices. Introducing AMD Virtualization. http://www.amd.com/

us-en/Processors/ProductInformation/0,,30_118_8796_14287,00.html.

[4] Advanced Micro Devices. Live Migration with AMD-V Extended Migration

Technology. http://developer.amd.com/assets/Live%20Virtual%20Machine%

20Migration%20on%20AMD%20processors.pdf.

[5] A. Afework, M. D. Beynon, F. Bustamante, A. Demarzo, R. Ferreira, R. Miller,

M. Silberman, J. Saltz, A. Sussman, and H. Tsang. Digital Dynamic Telepathology

- The Virtual Microscope. In Proc. American Medical Informatics Association

(AMIA) Annual Fall Symposium, Lake Buena Vista, FL, November 1998.

[6] J. M. Airey, J. H. Rohlf, and Jr. F. P. Brooks. Towards Image Realism with

Interactive Update Rates in Complex Virtual Building Environments. In Proc.

149 Appendix. Attribution 150

SI3D ’90: Symposium on Interactive 3D Graphics, pages 41–50, Snowbird, UT,

1990.

[7] V. Akcelik, J. Bielak, G. Biros, I. Epanomeritakis, A. Fernandez, O. Ghattas, E. J.

Kim, J. Lopez, D. O’Hallaron, T. Tu, and J. Urbanic. High Resolution Forward

and Inverse Earthquake Modeling on Terasacale Computers. In Proc. ACM/IEEE

Conference on Supercomputing, Phoenix, AZ, November 2003.

[8] S. F. Altschul, T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and

D. J. Lipman. Gapped BLAST and PSI–BLAST: a New Generation of Protein

Database Search Programs. Nucleic Acids Research, 25(17):3389–3402, September

1997.

[9] G. A. Alvarez, E. Borowsky, S. Go, T. R. Romer, R. B. Becker Szendy, R. Golding,

A. Merchant, M. Spasojevic, A. Veitch, and J. Wilkes. Minerva: an Automated

Resource Provisioning Tool for Large-scale Storage Systems. ACM Transactions

on Computer Systems, 19(4):483–518, November 2001.

[10] Amazon.com. Amazon Elastic Compute Cloud (Amazon EC2). http://www.

amazon.com/gp/browse.html?node=201590011.

[11] K. Amiri, D. Petrou, G. R. Ganger, and G. A. Gibson. Dynamic Function Place-

ment for Data-intensive Cluster Computing. In Proc. of the USENIX Annual Tech-

nical Conference, pages 307–322, San Diego, CA, June 2000.

[12] G. Ammons, J. Appavoo, M. Butrico, D. Da Silva, D. Grove, K. Kawachiya,

O. Krieger, B. Rosenburg, E. Van Hensbergen, and R. W. Wisniewski. Libra:

A Library Operating System for a JVM in a Virtualized Execution Environment.

In Proc. Conference on Virtual Execution Environments (VEE), pages 44–54, San

Diego, CA, June 2007. Appendix. Attribution 151

[13] AnandTech. Next-Generation Game Performance with the Unreal Engine: 15-way

GPU Shootout. http://www.anandtech.com/video/showdoc.html?i=1580&p=1.

[14] A. A. Apodaka and L. Gritz. Advanced RenderMan: Creating CGI for Motion

Pictures. Academic Press, 2000.

[15] Aqsis Renderer – Freedom to Dream. http://aqsis.org/.

[16] Argonne National Laboratory. MPICH2. http://www.mcs.anl.gov/research/

projects/mpich2/.

[17] Y. Artsy and R. Finkel. Designing a Process Migration Facility: The Charlotte

Experience. IEEE Computer, 22(9):47–56, September 1989.

[18] Autodesk. Maya. http://www.autodesk.com/maya.

[19] A. Barak, S. Guday, and R. G. Wheeler. The MOSIX Distributed Operating System

- Load Balancing for UNIX, volume 672 of Lecture Notes in Computer Science.

Springer-Verlag, June 1993.

[20] R. Baratto, S. Potter, G. Su, and J. Nieh. MobiDesk: Virtual Desktop Computing.

In Proc. 10th Annual ACM International Conference on Mobile Computing and

Networking, pages 1–15, Philadelphia, PA, September 2004.

[21] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer,

I. Pratt, and A. Warfield. Xen and the Art of Virtualization. In Proc. 17th Sym-

posium on Operating Systems Principles (SOSP), pages 164–177, Bolton Landing,

NY, October 2003.

[22] G. Baron, C. Sarris, and E. Fiume. Fast and Accurate Time-domain Simulations

With Commodity Graphics Hardware. In Proc. Antennas and Propagation Society

International Symposium, pages 193–196, Washington, DC, July 2005. Appendix. Attribution 152

[23] F. Bellard. QEMU, a Fast and Portable Dynamic Translator. In Proc. USENIX

Annual Technical Conference, FREENIX Track, pages 47–60, Anaheim, CA, April

2005.

[24] M. Ben-Yehuda, J. Xenidis, M. Ostrowski, K. Rister, A. Bruemmer, and L. Van

Doorn. The Price of Safety: Evaluating IOMMU Performance. In Ottawa Linux

Symposium, pages 9–20, Ottawa, Canada, June 2007.

[25] W. Bethel, B. Tierney, J. Lee, D. Gunter, and S. Lau. Using High-Speed WANs

and Network Data Caches to Enable Remote and Distributed Visualization. In

Proc. ACM/IEEE Conference on Supercomputing, Dallas, TX, November 2000.

[26] P. J. Braam. The Lustre Storage Architecture, November 2002. http://www.lustre.org/docs/lustre.pdf.

[27] T. C. Bressoud and F. B. Schneider. Hypervisor-based Fault Tolerance. In Proc.

15th Symposium on Operating Systems Principles (SOSP), pages 1–11, Copper

Mountain, CO, December 1995.

[28] I. Buck, G. Humphreys, and P. Hanrahan. Tracking Graphics State for Networked

Rendering. In Proc. ACM SIGGRAPH/EUROGRAPHICS Workshop on Graphics

Hardware, Interlaken, Switzerland, August 2000.

[29] E. Bugnion, S. Devine, and M. Rosenblum. Disco: Running Commodity Operat-

ing Systems on Scalable Multiprocessors. In Proc. 16th Symposium on Operating

Systems Principles (SOSP), pages 143–156, Saint Malo, France, October 1997.

[30] R. Cáceres, C. Carter, C. Narayanaswami, and M. Raghunath. Reincarnating PCs

with Portable SoulPads. In Proc. 3rd International Conference on Mobile Systems

Applications and Services (MobiSys), pages 65–78, Seattle, WA, June 2005.

[31] P. H. Carns, W. B. Ligon III, R. B. Ross, and R. Thakur. PVFS: A Parallel File System for Linux Clusters. In Proc. 4th Annual Linux Showcase and Conference, pages 317–327, Atlanta, GA, 2000.

[32] R. Chandra, C. Sapuntzakis, N. Zeldovich, and M. S. Lam. The Collective: A Cache-

Based System Management Architecture. In Proc. 2nd Symposium on Networked

Systems Design and Implementation (NSDI), Boston, MA, 2005.

[33] J. S. Chase, D. E. Irwin, L. E. Grit, J. D. Moore, and S. E. Sprenkle. Dynamic Vir-

tual Clusters in a Grid Site Manager. In Proc. 12th IEEE International Symposium

on High Performance Distributed Computing (HPDC), pages 90–103, Washington,

DC, 2003.

[34] P. M. Chen and B. D. Noble. When Virtual is Better Than Real. In Proc. 8th

Workshop on Hot Topics in Operating Systems (HotOS), Schloss Elmau, Germany,

May 2001.

[35] C. Clark, K. Fraser, S. Hand, J. Gorm Hansen, E. Jul, C. Limpach, I. Pratt, and

A. Warfield. Live Migration of Virtual Machines. In Proc. 2nd Symposium on

Networked Systems Design and Implementation (NSDI), Boston, MA, May 2005.

[36] J. Howard, M. Kazar, D. Nichols, M. Satyanarayanan, R. Sidebotham, and M. West.

Scale and Performance in a Distributed File System. ACM Transactions on Com-

puter Systems, 6(1):51–81, February 1988.

[37] Cluster Resources Inc. Moab Cluster Suite. http://www.clusterresources.com/

pages/products/moab-cluster-suite.php.

[38] CodeWeavers. WineD3D. http://wiki.winehq.org/WineD3D.

[39] R. J. Creasy. The Origin of the VM/370 Time-Sharing System. IBM Journal of Research and Development, 25(5):483–490, 1981.

[40] B. Cully, G. Lefebvre, D. Meyer, M. Feeley, N. Hutchinson, and A. Warfield. Re-

mus: High Availability via Asynchronous Virtual Machine Replication. In Proc.

5th Symposium on Networked Systems Design and Implementation (NSDI), San

Francisco, CA, April 2008.

[41] K. Czajkowski, M. Thiebaux, and C. Kesselman. Practical Resource Management

for Grid-based Visual Exploration. In Proc. IEEE International Symposium on

High-Performance Distributed Computing (HPDC), pages 416–426, San Francisco,

CA, August 2001.

[42] J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large

Clusters. In Proc. 6th USENIX Symposium on Operating Systems Design and Im-

plementation (OSDI), pages 137–150, San Francisco, CA, December 2004.

[43] J. Dike. User Mode Linux. Prentice Hall, 2006. http://user-mode-linux.sourceforge.net/.

[44] distcc: a Fast, Free Distributed C/C++ Compiler. http://distcc.samba.org/.

[45] Dojo, the Javascript Toolkit. http://www.dojotoolkit.org/.

[46] F. Douglis and J. K. Ousterhout. Transparent Process Migration: Design Alterna-

tives and the Sprite Implementation. Software Practice and Experience, 21(8):1–27,

August 1991.

[47] M. Dowty and J. Sugerman. GPU Virtualization on VMware’s Hosted I/O Ar-

chitecture. In Proc. Workshop on I/O Virtualization (WIOV), San Diego, CA,

December 2008.

[48] G. W. Dunlap, S. T. King, S. Cinar, M. A. Basrai, and P. M. Chen. ReVirt:

Enabling Intrusion Analysis Through Virtual-machine Logging and Replay. In

Proc. 5th USENIX Symposium on Operating Systems Design and Implementation

(OSDI), pages 211–224, Boston, MA, December 2002.

[49] D. L. Eager, E. D. Lazowska, and J. Zahorjan. The Limited Performance Benefits of

Migrating Active Processes for Load Sharing. In Proc. ACM SIGMETRICS Con-

ference on Measurement and Modeling of Computer Systems, pages 63–72, Santa

Fe, NM, 1988.

[50] M. Abd-El-Malek, M. Wachs, J. Cipar, K. Sanghi, G. R. Ganger, G. A. Gibson,

and M. K. Reiter. File System Virtual Appliances: Portable File System Imple-

mentations. Technical Report CMU-PDL-09-102, Carnegie Mellon University, May

2009.

[51] W. Emeneker and D. Stanzione. Dynamic Virtual Clustering. In Proc. IEEE

International Conference on Cluster Computing (Cluster), pages 84–90, Austin,

TX, September 2007.

[52] D. R. Engler, M. F. Kaashoek, and J. O’Toole. Exokernel: An Operating System

Architecture for Application-Level Resource Management. In Proc. 15th Symposium

on Operating System Principles (SOSP), pages 251–266, Copper Mountain, CO,

December 1995.

[53] Enomalism Virtualized Management Dashboard. http://www.enomalism.com/.

[54] Epic Games. Unreal Tournament. http://www.unrealtournament.com/.

[55] European Bioinformatics Institute - ClustalW2. http://www.ebi.ac.uk/Tools/

clustalw2/index.html.

[56] Fedora Project. Rendering project/AIGLX. http://fedoraproject.org/wiki/

RenderingProject/aiglx.

[57] R. J. Figueiredo, P. A. Dinda, and J. A. B. Fortes. A Case For Grid Computing On

Virtual Machines. In Proc. 23rd International Conference on Distributed Computing

Systems (ICDCS), pages 550–559, Providence, RI, 2003.

[58] I. Foster, T. Freeman, K. Keahey, D. Scheftner, B. Sotomayor, and X. Zhang.

Virtual Clusters for Grid Communities. In Proc. IEEE International Symposium

on Cluster Computing and the Grid (CCGrid), Singapore, May 2006.

[59] I. Foster and C. Kesselman. Globus: A Metacomputing Infrastructure Toolkit.

The International Journal of Supercomputer Applications and High Performance

Computing, 11(2):115–128, February 1997.

[60] T. Funkhouser and C. Séquin. Adaptive Display Algorithm for Interactive Frame

Rates During Visualization of Complex Virtual Environments. In Proc. 20th An-

nual Conference on Computer Graphics and Interactive Techniques, pages 247–254,

Anaheim, CA, August 1993.

[61] S. W. Galley. PDP-10 Virtual Machines. In Proc. ACM SIGARCH-SIGOPS Work-

shop on Virtual Computer Systems, Cambridge, MA, 1973.

[62] N. Galoppo, N. Govindaraju, M. Henson, and D. Manocha. LU-GPU: Efficient

Algorithms for Solving Dense Linear Systems on Graphics Hardware. In Proc.

ACM/IEEE Conference on Supercomputing, pages 606–607, Seattle, WA, Novem-

ber 2005.

[63] Google Gears – Improving your Web Browser. http://gears.google.com/.

[64] W. Gentzsch. Sun Grid Engine: Towards Creating a Compute Power Grid. In Proc.

IEEE International Symposium on Cluster Computing and the Grid (CCGrid),

Brisbane, Australia, May 2001.

[65] Ashvin Goel. Utterance, 2006.

[66] GoGrid Cloud Hosting. http://www.gogrid.com/.

[67] R. P. Goldberg. Survey of Virtual Machine Research. IEEE Computer, 7(6), June

1974.

[68] T. Goodale, G. Allen, G. Lanfermann, J. Masso, T. Radke, E. Seidel, and J. Shalf.

The Cactus Framework and Toolkit: Design and Applications. In Proc. 5th Inter-

national Conference Vector and Parallel Processing (VECPAR), Porto, Portugal,

2002.

[69] J. Gosling, B. Joy, and G. L. Steele. The Java Language Specification. Addison-

Wesley Longman Publishing, Boston, MA, 1996.

[70] K. Govil, D. Teodosiu, Y. Huang, and M. Rosenblum. Cellular Disco: Resource

Management Using Virtual Clusters on Shared-Memory Multiprocessors. In Proc.

17th Symposium on Operating Systems Principles (SOSP), pages 154–169, Kiawah

Island, SC, December 1999.

[71] GPGPU – General Purpose Programming on GPUs. What programming APIs exist for GPGPU? http://www.gpgpu.org/w/index.php/FAQ#What_programming_APIs_exist_for_GPGPU.3F.

[72] Tungsten Graphics. Gallium3D. http://www.tungstengraphics.com/wiki/

index.php/Gallium3D.

[73] R. Grimm, J. Davis, E. Lemar, A. Macbeth, S. Swanson, T. Anderson, B. Ber-

shad, G. Borriello, S. Gribble, and D. Wetherall. System Support for Pervasive

Applications. ACM Transactions on Computer Systems, 22(4):421–486, November

2004.

[74] M. Gschwind, H. P. Hofstee, B. Flachs, M. Hopkins, Y. Watanabe, and T. Ya-

mazaki. Synergistic Processing in Cell’s Multicore Architecture. IEEE Micro,

26(2):10–24, March 2006.

[75] D. Gupta, S. Lee, M. Vrable, S. Savage, A. C. Snoeren, G. Varghese, G. M. Voelker,

and A. Vahdat. Difference Engine: Harnessing Memory Redundancy in Virtual Ma-

chines. In Proc. 8th Symposium on Operating Systems Design and Implementation

(OSDI), pages 309–322, San Diego, CA, December 2008.

[76] V. Gupta, A. Gavrilovska, K. Schwan, H. Kharche, N. Tolia, V. Talwar, and

P. Ranganathan. GViM: GPU-accelerated Virtual Machines. In Proc. Work-

shop on System-level Virtualization for High Performance Computing (HPCVirt),

Nüremberg, Germany, April 2009.

[77] S. Hand, A. Warfield, K. Fraser, E. Kotsovinos, and D. Magenheimer. Are Virtual

Machine Monitors Microkernels Done Right? In Proc. 10th Workshop on Hot Topics

in Operating Systems (HotOS), Santa Fe, NM, June 2005.

[78] J. Gorm Hansen. Blink: Advanced Display Multiplexing for Virtualized Appli-

cations. In Proc. 17th International workshop on Network and Operating Systems

Support for Digital Audio and Video (NOSSDAV), 2007.

[79] G. Heiser, V. Uhlig, and J. LeVasseur. Are Virtual Machine Monitors Microkernels

Done Right? ACM SIGOPS Operating Systems Review, 40(1):95–99, January 2006.

[80] M. Hibler, R. Ricci, L. Stoller, J. Duerig, S. Guruprasad, T. Stack, K. Webb, and

J. Lepreau. Large-scale Virtualization in the Emulab Network Testbed. In Proc.

USENIX Annual Technical Conference, pages 113–128, Boston, MA, June 2008.

[81] M. Hibler, L. Stoller, J. Lepreau, R. Ricci, and C. Barb. Fast, Scalable Disk Imaging

with Frisbee. In Proc. of the USENIX Annual Technical Conference, pages 283–296,

San Antonio, TX, June 2003.

[82] D. Higgins, J. Thompson, and T. Gibson. CLUSTAL W: Improving the Sensi-

tivity of Progressive Multiple Sequence Alignment Through Sequence Weighting,

Position-specific Gap Penalties and Weight Matrix Choice. Nucleic Acids Research,

22(22):4673–4680, November 1994.

[83] M. Hines and K. Gopalan. Post-Copy Based Live Virtual Machine Migration Using

Adaptive Pre-Paging and Dynamic Self-Ballooning. In Proc. Conference on Virtual

Execution Environments (VEE), pages 51–60, Washington, DC, March 2009.

[84] W. Huang, Q. Gao, J. Liu, and D. K. Panda. High Performance Virtual Machine

Migration with RDMA over Modern Interconnects. In Proc. IEEE International

Conference on Cluster Computing (Cluster), Austin, TX, September 2007.

[85] G. Humphreys, M. Houston, R. Ng, R. Frank, S. Ahern, P. D. Kirchner, and J. T.

Klosowski. Chromium: a Stream-processing Framework for Interactive Rendering

on Clusters. In Proc. 29th Annual Conference on Computer Graphics and Interac-

tive Techniques, pages 693–702, San Antonio, TX, 2002.

[86] G. C. Hunt and M. L. Scott. The Coign Automatic Distributed Partitioning System.

In Proc. of the 3rd Symposium on Operating Systems Design and Implementation

(OSDI), pages 187–200, New Orleans, LA, February 1999.

[87] I. Foster, C. Kesselman, J. M. Nick and S. Tuecke. The Physiology of the Grid: An

Open Grid Services Architecture for Distributed Systems Integration, June 2002.

http://www.globus.org/alliance/publications/papers/ogsa.pdf.

[88] IBM Corporation. IBM Virtual Machine /370 Planning Guide, 1972.

[89] ID Software. Quake III Arena. http://www.idsoftware.com/games/quake/

quake3-arena/.

[90] Intel Corporation. Intel Virtualization Technology (VT). http://www.intel.com/

technology/virtualization/index.htm.

[91] A. D. Joseph, J. A. Tauber, and M. F. Kaashoek. Mobile Computing with the

Rover Toolkit. IEEE Transactions on Computers, 46(3):337–352, December 1997.

[92] E. Jul, H. Levy, N. Hutchinson, and A. Black. Fine-grained Mobility in the Emerald

System. ACM Transactions on Computer Systems, 6(1):109–133, February 1988.

[93] K. Kawachiya, K. Ogata, D. Silva, T. Onodera, H. Komatsu, and T. Nakatani.

Cloneable JVM: A New Approach to Start Isolated Java Applications Faster.

In Proc. Conference on Virtual Execution Environments (VEE), pages 1–11, San

Diego, CA, June 2007.

[94] K. Keahey, I. Foster, T. Freeman, X. Zhang, and D. Galron. Virtual Workspaces

in the Grid. Lecture Notes in Computer Science, 3648:421–431, August 2005.

[95] Kernel-based Virtual Machine. http://kvm.qumranet.com.

[96] Khronos Group. OpenGL – The Industry Standard for High Performance Graphics.

http://www.opengl.org.

[97] A. King. Inside Windows 95. Microsoft Press, 1995.

[98] Kmenc15. http://kmenc15.sourceforge.net/.

[99] E. Kotsovinos, T. Moreton, I. Pratt, R. Ross, K. Fraser, S. Hand, and T. Harris.

Global-scale Service Deployment in the XenoServer Platform. In Proc. 1st Work-

shop on Real, Large Distributed Systems (WORLDS), San Francisco, CA, December

2004.

[100] M. Kozuch and M. Satyanarayanan. Internet Suspend/Resume. In Proc. 4th IEEE

Workshop on Mobile Computing Systems and Applications (WMCSA), Callicoon,

NY, June 2002.

[101] L4Ka Virtual Machine Technology. http://l4ka.org/projects/

virtualization/.

[102] O. Laadan, D. Phung, and J. Nieh. Transparent Checkpoint-Restart of Distributed

Applications on Commodity Clusters. In Proc. IEEE International Conference on

Cluster Computing (Cluster), Boston, MA, September 2005.

[103] H. A. Lagar-Cavilla, N. Tolia, M. Satyanarayanan, and E. de Lara. VMM-

Independent Graphics Acceleration. In Proc. Conference on Virtual Execution

Environments (VEE), pages 33–43, San Diego, CA, June 2007.

[104] H. A. Lagar-Cavilla, J. Whitney, A. Scannell, S. A. Rumble, M. Brudno, E. de Lara,

and M. Satyanarayanan. Impromptu Clusters for Near-Interactive Cloud-Based

Services. Technical Report CSRG-578, University of Toronto, July 2008.

[105] H. A. Lagar-Cavilla, J. A. Whitney, A. Scannell, P. Patchin, S. M. Rumble,

E. de Lara, M. Brudno, and M. Satyanarayanan. SnowFlock: Rapid Virtual

Machine Cloning for Cloud Computing. In Proc. of Eurosys 2009, pages 1–13,

Nüremberg, Germany, April 2009.

[106] H. Andrés Lagar-Cavilla. VMGL Site. http://www.cs.toronto.edu/~andreslc/vmgl.

[107] H. Andrés Lagar-Cavilla, Niraj Tolia, Eyal de Lara, M. Satyanarayanan, and David

O’Hallaron. Interactive Resource-Intensive Applications Made Easy. In Proc. of

the ACM/IFIP/USENIX 8th International Middleware Conference, pages 143–163,

Newport Beach, CA, November 2007.

[108] J. LeVasseur, V. Uhlig, J. Stoess, and S. Götz. Unmodified Device Driver Reuse

and Improved System Dependability via Virtual Machines. In Proc. 6th Sympo-

sium on Operating Systems Design and Implementation (OSDI), pages 17–30, San

Francisco, CA, December 2004.

[109] J. Liedtke. On Microkernel Construction. In Proc. 15th Symposium on Operating

System Principles (SOSP), pages 237–250, Copper Mountain, CO, December 1995.

[110] B. Lin and P. Dinda. VSched: Mixing Batch and Interactive Virtual Machines

Using Periodic Real-time Scheduling. In Proc. ACM/IEEE Conference on Super-

computing, Seattle, WA, November 2005.

[111] Linux VServer Project. Architecture Paper. http://linux-vserver.org/Paper.

[112] L. Litty, H. A. Lagar-Cavilla, and D. Lie. Hypervisor Support for Identifying

Covertly Executing Binaries. In Proc. 17th USENIX Security Symposium, pages

243–258, San Jose, CA, July 2008.

[113] L. Litty, H. Andrés Lagar-Cavilla, and David Lie. Computer Meteorology: Monitor-

ing Compute Clouds. In Proc. 12th Workshop on Hot Topics in Operating Systems

(HotOS), Monte Verità, Switzerland, May 2009.

[114] M. Litzkow, M. Livny, and M. Mutka. Condor - A Hunter of Idle Workstations.

In Proc. 8th International Conference of Distributed Computing Systems (ICDCS),

San Jose, CA, June 1988.

[115] J. López and D. O’Hallaron. Evaluation of a Resource Selection Mechanism for

Complex Network Services. In Proc. IEEE International Symposium on High-

Performance Distributed Computing (HPDC), pages 171–180, San Francisco, CA,

August 2001.

[116] Mark Williamson. XenFS. http://wiki.xensource.com/xenwiki/XenFS.

[117] M. McNett, D. Gupta, A. Vahdat, and G. Voelker. Usher: An Extensible Frame-

work for Managing Clusters of Virtual Machines. In Proc. 21st Large Installation

System Adminstration Conference (LISA), pages 167–181, Dallas, TX, November

2007.

[118] A. Menon, A. L. Cox, and W. Zwaenepoel. Optimizing Network Virtualization in

Xen. In Proc. USENIX Annual Technical Conference, pages 15–28, Boston, MA,

May 2006.

[119] A. Menon, J. R. Santos, Y. Turner, G. Janakiraman, and W. Zwaenepoel. Diag-

nosing Performance Overheads in the Xen Virtual Machine Environment. In Proc.

Conference on Virtual Execution Environments (VEE), pages 13–23, Chicago, IL,

June 2005.

[120] The Mesa 3D Graphics Library. http://www.mesa3d.org/.

[121] D. Meyer, G. Aggarwal, B. Cully, G. Lefebvre, N. Hutchinson, M. Feeley, and

A. Warfield. Parallax: Virtual Disks for Virtual Machines. In Proc. Eurosys 2008,

pages 41–54, Glasgow, Scotland, April 2008.

[122] Microsoft Inc. Azure Services Platform. http://www.microsoft.com/azure/

default.mspx.

[123] Microsoft Inc. DirectX Home Page. http://www.microsoft.com/windows/

directx/default.mspx.

[124] D. S. Milojičić, F. Douglis, Y. Paindaveine, R. Wheeler, and S. Zhou. Process

Migration. ACM Computing Surveys, 32(3):241–299, September 2000.

[125] Mosso cloud service. http://www.mosso.com/.

[126] Motherboards.org. How to benchmark a videocard. http://www.motherboards.

org/articles/guides/1278_7.html/.

[127] Mplayer Headquarters. http://www.mplayerhq.hu/.

[128] A. Muthitacharoen, R. Morris, T. Gil, and B. Chen. Ivy: A Read/Write Peer-to-

peer File System. In Proc. 5th USENIX Symposium on Operating Systems Design

and Implementation (OSDI), pages 31–44, Boston, MA, December 2002.

[129] R. Nathuji and K. Schwan. VirtualPower: Coordinated Power Management in

Virtualized Enterprise Systems. In Proc. 21st Symposium on Operating Systems

Principles (SOSP), pages 265–278, Skamania Lodge, WA, October 2007.

[130] National Laboratory for Applied Network Research (NLANR). RTT And Loss Mea-

surements, April 2006. http://watt.nlanr.net/active/maps/ampmap_active.

php.

[131] E. B. Nightingale, K. Veeraraghavan, P. Chen, and J. Flinn. Rethink the Sync. In

Proc. 7th Symposium on Operating Systems Design and Implementation (OSDI),

pages 1–14, Seattle, WA, November 2006.

[132] H. Nishimura, N. Maruyama, and S. Matsuoka. Virtual Clusters on the Fly –

Fast, Scalable and Flexible Installation. In Proc. IEEE International Symposium

on Cluster Computing and the Grid (CCGrid), Rio de Janeiro, Brazil, May 2007.

[133] S. Osman, D. Subhraveti, G. Su, and J. Nieh. The Design and Implementation

of Zap: A System for Migrating Computing Environments. In Proc. 5th USENIX

Symposium on Operating Systems Design and Implementation (OSDI), pages 361–

376, Boston, MA, December 2002.

[134] J. K. Ousterhout. Tcl and the Tk Toolkit. Addison-Wesley, 1994.

[135] Parallels. 3D Graphics – Parallels Desktop for Mac. http://www.parallels.com/

products/desktop/features/3d/.

[136] S. Parker and C. Johnson. SCIRun: A Scientific Programming Environment for

Computational Steering. In Proc. ACM/IEEE Conference on Supercomputing, San

Diego, CA, December 1995.

[137] P. Patchin, H. A. Lagar-Cavilla, E. de Lara, and M. Brudno. Adding the Easy

Button to the Cloud with SnowFlock and MPI. In Proc. Workshop on System-level

Virtualization for High Performance Computing (HPCVirt), Nüremberg, Germany,

April 2009.

[138] Personal communications with J. Gorm-Hansen and Y. Liu.

[139] L. Peterson, A. Bavier, M. Fiuczynski, and S. Muir. Experiences Building Planet-

Lab. In Proc. 7th USENIX Symposium on Operating Systems Design and Imple-

mentation (OSDI), pages 351–366, Seattle, WA, November 2006.

[140] E. Pitt and K. McNiff. java.rmi: The Remote Method Invocation Guide. Addison-

Wesley Professional, 2001.

[141] Pixar. RenderMan. https://renderman.pixar.com/.

[142] Platform Computing, Inc. Platform Enterprise Grid Orchestra-

tor (Platform EGO), 2006. http://www.platform.com/Products/

platform-enterprise-grid-orchestrator.

[143] pNFS.com. http://www.pnfs.com/.

[144] G. J. Popek and R. P. Goldberg. Formal Requirements for Virtualizable Third

Generation Architectures. Communications of the ACM, 17(7):412–422, July 1974.

[145] S. Potter and J. Nieh. Reducing Downtime Due to System Maintenance and Up-

grades. In Proc. 19th USENIX Large Installation System Administration Conference

(LISA), pages 47–62, San Diego, CA, December 2005.

[146] M. L. Powell and B. P. Miller. Process Migration in DEMOS/MP. In Proc. 9th Sym-

posium on Operating Systems Principles (SOSP), pages 110–119, Bretton Woods,

NH, October 1983.

[147] D. Price and A. Tucker. Solaris Zones: Operating System Support for Consoli-

dating Commercial Workloads. In Proc. 18th USENIX Large Installation System

Administration (LISA) Conference, pages 241–254, Atlanta, GA, November 2004.

[148] QuantLib: a Free/Open-source Library for Quantitative Finance. http://

quantlib.org/index.shtml.

[149] C. Rapier and M. Stevens. High Performance SSH/SCP - HPN-SSH. http://www.psc.edu/networking/projects/hpn-ssh/.

[150] E. Rather, D. Colburn, and C. Moore. The Evolution of Forth. In History of

Programming Languages-II, pages 177–199, Cambridge, MA, April 1993.

[151] Red Hat. LVM2 Resource Page. http://sources.redhat.com/lvm2/.

[152] P. Reisner. DRBD - Distributed Replicated Block Device. In Proc. 9th International

Linux System Technology Conference, Cologne, Germany, September 2002.

[153] T. M. Rhyne. Computer games’ influence on scientific and information visualiza-

tion. IEEE Computer, 33(12):154–159, December 2000.

[154] T. Richardson, Q. Stafford-Fraser, K. R. Wood, and A. Hopper. Virtual Network

Computing. IEEE Internet Computing, 2(1):33–38, Jan/Feb 1998.

[155] D. Ritchie and K. Thompson. The UNIX Time-sharing System. Communications

of the ACM, 17(7):365–375, July 1974.

[156] J. Scott Robin and C. E. Irvine. Analysis of the Intel Pentium’s Ability to Support

a Secure Virtual Machine Monitor. In Proc. 9th USENIX Security Symposium,

Denver, CO, August 2000.

[157] T. Roscoe, K. Elphinstone, and G. Heiser. Hype and Virtue. In Proc. 11th Workshop

on Hot Topics in Operating Systems (HotOS), San Diego, CA, May 2007.

[158] G. Van Rossum and F. L. Drake. Python Language Reference Manual. Network

Theory Limited, 2003. http://www.network-theory.co.uk/python/language/.

[159] RPS-BLAST. http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.

shtml.

[160] S. M. Rumble, P. Lacroute, A. V. Dalca, M. Fiume, A. Sidow, and M. Brudno.

SHRiMP: Accurate Mapping of Short Color-space Reads. PLoS Computational

Biology, 5(5), May 2009.

[161] P. Ruth, P. McGachey, J. Jiang, and D. Xu. VioCluster: Virtualization for Dy-

namic Computational Domains. In Proc. IEEE International Conference on Cluster

Computing (Cluster), Boston, MA, September 2005.

[162] R. Sandberg, D. Goldberg, S. Kleiman, D. Walsh, and B. Lyon. Design and Im-

plementation of the Sun Network File System. In Proc. of the Summer USENIX

Conference, Portland, OR, June 1985.

[163] C. P. Sapuntzakis, R. Chandra, B. Pfaff, J. Chow, M. S. Lam, and M. Rosenblum.

Optimizing the Migration of Virtual Computers. In Proc. 5th USENIX Symposium

on Operating Systems Design and Implementation (OSDI), pages 377–390, Boston,

MA, December 2002.

[164] M. Satyanarayanan. The Evolution of Coda. ACM Transactions on Computer

Systems, 20(2):85–124, May 2002.

[165] M. Satyanarayanan, M. A. Kozuch, C. J. Helfrich, and D. R. O’Hallaron. To-

wards Seamless Mobility on Pervasive Hardware. Pervasive and Mobile Computing,

1(2):157–189, July 2005.

[166] D. Sayre. On Virtual Systems. Technical report, IBM Thomas J. Watson Research

Center, Yorktown Heights, NY, April 1969.

[167] C. Smowton. Secure 3D Graphics for Virtual Machines. In European Workshop on

System Security (EuroSec), Nüremberg, Germany, April 2009.

[168] S. Soltesz, H. Pötzl, M. Fiuczynski, A. C. Bavier, and L. L. Peterson. Container-

based Operating System Virtualization: A Scalable, High-performance Alternative

to Hypervisors. In Proc. Eurosys 2007, pages 275–287, Lisbon, Portugal, March

2007.

[169] Splash Damage. Enemy Territory Press Release. http://www.splashdamage.com/

?page_id=7.

[170] S. Stegmaier, M. Magallón, and T. Ertl. A Generic Solution for Hardware-

accelerated Remote Visualization. In VISSYM ’02: Proc. Symposium on Data

Visualisation 2002, pages 87–94, Barcelona, Spain, 2002.

[171] M. Steinder, I. Whalley, D. Carrera, I. Gaweda, and D. Chess. Server Virtualization

in Autonomic Management of Heterogeneous Workloads. In Proc. 10th Integrated

Network Management (IM) Conference, Munich, Germany, 2007.

[172] Sun Microsystems. Solaris ZFS - Redefining File Systems as Virtualised Storage. http://nz.sun.com/learnabout/solaris/10/ds/zfs.html.

[173] V. S. Sunderam. PVM: A Framework for Parallel Distributed Computing. Con-

currency: Practice and Experience, 2(4):315–339, December 1990.

[174] A. I. Sundararaj and P. A. Dinda. Towards Virtual Networks for Virtual Machine

Grid Computing. In Proc. 3rd Virtual Machine Research and Technology Sympo-

sium, pages 177–190, San Jose, CA, May 2004.

[175] A. Surie, H. A. Lagar-Cavilla, E. de Lara, and M. Satyanarayanan. Low-Bandwidth

VM Migration via Opportunistic Replay. In Proc. Ninth Workshop on Mobile Com-

puting Systems and Applications (HotMobile), Napa Valley, CA, February 2008.

[176] M. Swift, M. Annamalai, B. Bershad, and H. Levy. Recovering Device Drivers. In

Proc. 6th Symposium on Operating Systems Design and Implementation (OSDI),

pages 1–16, San Francisco, CA, December 2004.

[177] M. M. Swift, B. N. Bershad, and H. M. Levy. Improving the Reliability of Commod-

ity Operating Systems. In Proc. 19th Symposium on Operating Systems Principles

(SOSP), pages 207–222, Bolton Landing, NY, October 2003.

[178] SWSoft. Virtuozzo Server Virtualization. http://www.swsoft.com/en/products/

virtuozzo.

[179] R. Ta-Min, L. Litty, and D. Lie. Splitting Interfaces: Making Trust Between Ap-

plications and Operating Systems Configurable. In Proc. 7th USENIX Symposium

on Operating Systems Design and Implementation (OSDI), pages 279–292, Seattle,

WA, November 2006.

[180] G. te Velde, F. M. Bickelhaupt, E. J. Baerends, C. Fonseca Guerra, S. J. A. van

Gisbergen, J. G. Snijders, and T. Ziegler. Chemistry with ADF. Journal of Com-

putational Chemistry, 22(9):931–967, September 2001.

[181] D. Thain, T. Tannenbaum, and M. Livny. Distributed Computing in Practice:

The Condor Experience. Concurrency and Computation: Practice and Experience,

17:323–356, February-April 2005.

[182] The Hadoop Distributed File System: Architecture and Design. http://hadoop.

apache.org/core/docs/current/hdfs_design.html.

[183] M. Theimer, K. Lantz, and D. Cheriton. Preemptable Remote Execution Facili-

ties for the V-System. In Proc. 10th Symposium on Operating System Principles

(SOSP), pages 2–12, Orcas Island, WA, December 1985.

[184] S. Thibault and T. Deegan. Improving Performance by Embedding HPC Applica-

tions in Lightweight Xen Domains. In Proc. Workshop on System-level Virtualiza-

tion for High Performance Computing (HPCVirt), Glasgow, Scotland, 2008.

[185] N. Tolia, M. Kaminsky, D. G. Andersen, and S. Patil. An Architecture for In-

ternet Data Transfer. In Proc. 3rd Symposium on Networked Systems Design and

Implementation (NSDI), San Jose, CA, May 2006.

[186] N. Tolia, M. Kozuch, M. Satyanarayanan, B. Karp, A. Perrig, and T. Bressoud.

Opportunistic Use of Content Addressable Storage for Distributed File Systems.

In Proc. USENIX Annual Technical Conference, pages 127–140, San Antonio, TX,

June 2003.

[187] Tom’s Hardware. 3D Benchmarking – Understanding Frame Rate Scores. http://www.tomshardware.com/2000/07/04/3d_benchmarking_/index.html.

[188] A. Tridgell and P. Mackerras. The rsync algorithm. Technical Report TR-CS-96-05,

Department of Computer Science, The Australian National University, Canberra,

Australia, 1996.

[189] University of Toronto. SnowFlock: Swift VM Cloning for Cloud Computing. http://sysweb.cs.toronto.edu/snowflock.

[190] Unreal Tournament 2004 Demo: Assault. http://www.3dcenter.org/downloads/ut2004demo-assault.php.

[191] UT2K4 Engine Technology. Unreal Engine 2. http://www.unrealtechnology.com/html/technology/ue2.shtml/.

[192] VirtualBox. VirtualBox Changelog – Support for OpenGL Acceleration. http://www.virtualbox.org/wiki/Changelog.

[193] VirtualGL. http://virtualgl.sourceforge.net/.

[194] VMware. Virtual appliances and application virtualization. http://www.vmware.

com/appliances/.

[195] VMware. VMotion: Migrate Virtual Machines with Zero Downtime. http://www.

vmware.com/products/vi/vc/vmotion.html.

[196] VMware. VMware Workstation. http://www.vmware.com/products/ws/.

[197] VMware. Workstation 6 User’s Manual – Experimental Support for Direct3D.

http://www.vmware.com/pdf/ws6_manual.pdf.

[198] M. Vrable, J. Ma, J. Chen, D. Moore, E. Vandekieft, A. Snoeren, G. Voelker,

and S. Savage. Scalability, Fidelity and Containment in the Potemkin Virtual

Honeyfarm. In Proc. 20th Symposium on Operating Systems Principles (SOSP),

pages 148–162, Brighton, UK, October 2005.

[199] C. A. Waldspurger. Memory Resource Management in VMware ESX Server. In

Proc. 5th USENIX Symposium on Operating Systems Design and Implementation

(OSDI), pages 181–194, Boston, MA, 2002.

[200] A. Warfield, S. Hand, K. Fraser, and T. Deegan. Facilitating the Development

of Soft Devices – Short Paper. In Proc. USENIX Annual Technical Conference,

Anaheim, CA, April 2005.

[201] A. Whitaker, M. Shaw, and S. D. Gribble. Scale and Performance in the Denali

Isolation Kernel. In Proc. 5th USENIX Symposium on Operating Systems Design

and Implementation (OSDI), pages 195–209, Boston, MA, December 2002.

[202] B. White, J. Lepreau, L. Stoller, R. Ricci, S. Guruprasad, M. Newbold, M. Hibler,

C. Barb, and A. Joglekar. An Integrated Experimental Environment for Distributed

Systems and Networks. In Proc. 5th USENIX Symposium on Operating Systems

Design and Implementation (OSDI), pages 255–270, Boston, MA, December 2002.

[203] Wikipedia. Quake3 engine technology. http://en.wikipedia.org/wiki/Quake_III_Arena#Technology/.

[204] Enemy Territory Demo: Radar. http://www.3dcenter.org/downloads/

enemy-territory-radar.php.

[205] T. Wood, P. Shenoy, A. Venkataramani, and M. Yousif. Black-box and Gray-box

Strategies for Virtual Machine Migration. In Proc. 4th Symposium on Networked

Systems Design and Implementation (NSDI), Cambridge, MA, April 2007.

[206] Yahoo. Yahoo! Launches New Program to Advance Open-Source Software for

Internet Computing. http://research.yahoo.com/node/1879.

[207] E. Zayas. Attacking the Process Migration Bottleneck. In Proc. 11th ACM Sympo-

sium on Operating System Principles (SOSP), pages 13–24, Austin, TX, November

1987.

[208] N. Zeldovich and R. Chandra. Interactive Performance Measurement with VNC-

Play. In Proc. USENIX Annual Technical Conference, FREENIX Track, pages

189–198, Anaheim, CA, April 2005.

[209] X. Zhang, K. Keahey, I. Foster, and T. Freeman. Virtual Cluster Workspaces

for Grid Applications. Technical Report TR-ANL/MCS-P1246-0405, University of

Chicago, April 2005.

[210] S. Zhou, X. Zheng, J. Wang, and P. Delisle. Utopia: a Load Sharing Facility

for Large, Heterogeneous Distributed Computer Systems. Software - Practice and

Experience, 23(12):1305–1336, December 1993.