TECHNIQUES FOR COLLECTIVE PHYSICAL MEMORY

UBIQUITY WITHIN NETWORKED CLUSTERS

OF VIRTUAL MACHINES

BY

MICHAEL R. HINES

B.S., Johns Hopkins University, 2003
M.S., Florida State University, 2005

DISSERTATION

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate School of Binghamton University, State University of New York, 2009

© Copyright by Michael R. Hines 2009

All Rights Reserved

Accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate School of Binghamton University, State University of New York, 2009

July 31st, 2009

Dr. Kartik Gopalan, Department of Computer Science, Binghamton University

Prof. Kanad Ghose, Department of Computer Science, Binghamton University

Dr. Kenneth Chiu, Department of Computer Science, Binghamton University

Dr. Kobus van der Merwe, AT&T Labs Research, Florham Park, NJ.

ABSTRACT

This dissertation addresses the use of distributed memory to improve the performance of state-of-the-art virtual machines (VMs) in clusters with gigabit interconnects. Even with ever-increasing DRAM capacities, we observe a continued need to support applications that exhibit mostly memory-intensive execution patterns, such as databases, webservers, and scientific and grid applications. In this dissertation, we make four primary contributions. First, we survey the history of the solutions available for basic, transparent distributed memory support. We then document a bottom-up implementation and evaluation of a basic prototype whose goal is to move deeper into the kernel than previous application-level solutions. We choose a clean, transparent device interface capable of minimizing network latency and copying overheads. Second, we explore how recent work with VMs has brought back into question the memory management logic of the OS. VM technology provides ease and transparency for imposing order on OS memory management (using techniques like full virtualization and para-virtualization). As such, we evaluate distributed memory in this context by trying to optimize our previous prototype at different places in the Xen virtualization architecture. Third, we leverage this work to explore alternative strategies for live VM migration. A key component that determines the success of migration techniques has been exactly how memory is transmitted, and when. More specifically, this involves fine-grained page-fault management either before a VM's CPU state is migrated (the current default) or afterwards. Thus, we design and evaluate the Post-Copy live VM migration scheme and compare it to the existing (Pre-Copy) migration scheme, realizing significant improvements. Finally, we promote the ubiquity of individual page frames as a cluster resource by integrating the use of distributed memory into the hypervisor (or virtual machine monitor). We design and implement CIVIC: a system that allows unmodified VMs to oversubscribe their DRAM size beyond a given host's physical memory. We then complement this by implementing and evaluating network paging in the hypervisor for locally resident VMs. We evaluate the performance impact of CIVIC on various application workloads and show how CIVIC enables many possible VM extensions, such as better VM consolidation, multi-host caching, and better coordination with VM migration.

ACKNOWLEDGEMENTS

First, I would like to thank a few organizations responsible for providing invaluable sources of funding that allowed me to work through graduate school. The AT&T Labs Research Fellowship Program, in cooperation with Kobus van der Merwe in New Jersey, provided support for a full three years. The Clark Fellowship program at SUNY Binghamton also provided a full year of funding. The Computer Science departments at both Florida State and Binghamton made teaching assistantships available for a year. These deeds often go unsaid; without them I would not have been able to complete this degree. I would also like to thank the National Science Foundation and the Computing Innovation Fellows Project (cifellows.org). Through them, I will be continuing on an assistantship as a post-doctoral fellow for the next year.

My advisor deserves his own paragraph. Not many graduate students can say what I can: I have one of the greatest advisors on the planet. Six years ago, he took a chance on me and stood patiently through the entire process: through the transfers, the applications, the bad papers, the good papers, the leaps of faith, the happy accomplishments, and the sad ones. Not only is he a fantastic researcher, but he is also a strong teacher. I am very proud to be his student and I know many other students will be as well.

DEDICATION

To my father: for his unconditional support, and love. And for all our tribulations.

To my mother: for her strength, wisdom, and love. And for all of our struggles.

To my brother: for his continuous perseverance and happiness.

To my extended family: I stand on your shoulders.

BIOGRAPHICAL SKETCH

Michael R. Hines was born and raised in Dallas, Texas in 1983 and grew up playing classical piano. He began college in a program called the Texas Academy of Math and Science at the University of North Texas. Two years later he transferred to Johns Hopkins University in Baltimore, Maryland, and received his Bachelor of Science degree in Computer Science in 2003. Subsequently, he entered Florida State University to complete an Information Security certification in 2004 and a Master's degree in Computer Science in 2005. Immediately after that, he transferred to SUNY Binghamton University in New York state to finish working on a PhD degree in Computer Science in 2009.

Michael will begin post-doctoral research at Columbia University in late 2009. He is a recipient of multiple awards, including the Jackie Robinson Undergraduate Scholarship (2 years), the AT&T Labs Foundation Fellowship (3 years), the Clark D. Gifford Fellowship (1 year) from Binghamton University, and the CIFellows CRA/NSF Award (1 year) for post-doctoral research. He is a member of the academic honor societies Alpha Lambda Delta and Phi Eta Sigma, and of the Computer Science honor society Upsilon Pi Epsilon. His hobbies include billiards, skateboarding, and yo-yos.

Contents

List of Figures xiii

List of Tables xviii

1 Introduction and Outline 1

1.1 Distributed Memory Virtualization in Networked Clusters ...... 3

1.2 Virtual Machine Based Use for Distributed Memory ...... 3

1.3 Improvement of Live Migration for Virtual Machines ...... 4

1.4 VM Memory Over-subscription with Network Paging ...... 5

2 Area Survey 7

2.1 Distributed Memory Systems ...... 7

2.1.1 Basic Distributed Memory (Anemone) ...... 8

2.1.2 Software Distributed Shared Memory ...... 9

2.2 Virtual Machine Technology and Distributed Memory ...... 10

2.2.1 Microkernels ...... 10

2.2.2 Modern Hypervisors ...... 11

2.3 VM Migration Techniques ...... 13

2.3.1 Process Migration ...... 13

2.3.2 Pre-Paging ...... 14

2.3.3 Live Migration ...... 15

2.3.4 Non-Live Migration ...... 15

2.3.5 Self Ballooning ...... 16

2.4 Over-subscription of Virtual Machines ...... 16

3 Anemone: Distributed Memory Access 18

3.1 Introduction ...... 18

3.2 Design & Implementation ...... 20

3.2.1 Client and Server Modules ...... 23

3.2.2 Remote Memory Access Protocol (RMAP) ...... 24

3.2.3 Distributed Resource Discovery ...... 27

3.2.4 Soft-State Refresh ...... 27

3.2.5 Server Load Balancing ...... 28

3.2.6 Fault-tolerance ...... 28

3.3 Evaluation ...... 29

3.3.1 Paging Latency ...... 30

3.3.2 Application Speedup ...... 32

3.3.3 Tuning the Client RMAP Protocol ...... 36

3.3.4 Control Message Overhead ...... 37

3.4 Summary ...... 38

4 MemX: Virtual Machine Uses of Distributed Memory 39

4.1 Introduction ...... 39

4.2 Split Driver Background ...... 41

4.3 Design and Implementation ...... 43

4.3.1 MemX-Linux: MemX in Non-virtualized Linux ...... 44

4.3.2 MemX-DomU (Option 1): MemX Client Module in DomU ...... 46

4.3.3 MemX-DD (Option 2): MemX Client Module in Driver Domain . . . . 48

4.3.4 MemX-Dom0 (Option 3) ...... 51

4.3.5 Alternative Options ...... 51

4.3.6 Network Access Contention ...... 52

4.4 Evaluation ...... 53

4.4.1 Latency and Bandwidth Microbenchmarks ...... 54

4.4.2 Application Speedups ...... 61

4.4.3 Multiple Client VMs ...... 63

4.4.4 Live VM Migration ...... 65

4.5 Summary ...... 65

5 Post-Copy: Live Virtual Machine Migration 67

5.1 Introduction ...... 68

5.2 Design ...... 70

5.2.1 Pre-Copy ...... 71

5.2.2 Design of Post-Copy Live VM Migration ...... 73

5.2.3 Prepaging Strategy ...... 76

5.2.4 Dynamic Self-Ballooning ...... 78

5.2.5 Reliability ...... 80

5.2.6 Summary ...... 81

5.3 Post-Copy Implementation ...... 82

5.3.1 Page-Fault Detection ...... 83

5.3.2 MFN exchanging ...... 85

5.3.3 Xen Daemon Modifications ...... 86

5.3.4 VM-to-VM kernel-to-kernel Memory-Mapping ...... 88

5.3.5 Dynamic Self Ballooning Implementation ...... 89

5.3.6 Proactive LRU Ordering to Improve Reference Locality ...... 92

5.4 Evaluation ...... 93

5.4.1 Stress Testing ...... 94

5.4.2 Degradation, Bandwidth, and Ballooning ...... 98

5.4.3 Application Scenarios ...... 104

5.4.4 Comparison of Prepaging Strategies ...... 107

5.5 Summary ...... 109

6 CIVIC: Transparent Over-subscription of VM Memory 110

6.1 Introduction ...... 111

6.2 Design ...... 113

6.2.1 Hypervisor Memory Management ...... 113

6.2.2 Shadow Paging Review ...... 115

6.2.3 Step 1: CIVIC Memory Allocation, Caching Design ...... 116

6.2.4 Step 2: Paging Communication and The Assistant ...... 118

6.2.5 Future Work: Page Migration, Sharing and Compression ...... 124

6.3 Implementation ...... 127

6.3.1 Address Space Expansion, BIOS Tables ...... 127

6.3.2 Communication Paths ...... 129

6.3.3 Eviction and Prefetching ...... 130

6.3.4 Page-Fault Interception, Shadows, Reverse Mapping ...... 134

6.4 Evaluation ...... 138

6.4.1 Micro-Benchmarks ...... 138

6.4.2 Applications ...... 142

6.5 Summary ...... 147

7 Improvements and Closing Arguments 148

7.1 MemX Improvements ...... 148

7.1.1 Non-Volatile MemX Memory Descriptors ...... 148

7.1.2 MemX Internal Caching ...... 149

7.1.3 Server-to-Server Proactive Page Migration ...... 150

7.1.4 Increased MemX bandwidth w/ Multiple NICs ...... 150

7.2 Migration Flexibility ...... 151

7.2.1 Hybrid Migration ...... 151

7.2.2 Improved Migration of VMs Through CIVIC ...... 151

7.3 CIVIC Improvements and Ideas ...... 152

7.3.1 How high can you go?: Extreme Consolidation ...... 152

7.3.2 Improved Eviction and Shadow Optimizations ...... 152

7.4 Conclusions ...... 153

A CIVIC Screenshots 154

A.1 Small-HVM Over-subscription ...... 154

A.2 Large-HVM Oversubscription ...... 156

B The Xen Live-migration process 158

B.1 Xen Daemon ...... 158

B.2 Understanding Frame Numbering ...... 161

B.3 Memory-related Data Structures ...... 163

B.4 Page-table Management ...... 165

B.5 Actually Performing the Migration ...... 166

Bibliography 169

List of Figures

3.1 Placement of distributed memory within the classical memory hierarchy. . . 21

3.2 The components of a client...... 22

3.3 The components of a server...... 23

3.4 A view of a typical Anemone packet header. The RMAP protocol transmits these directly to the network card from the BDI device driver. ...... 26

3.5 Random read disk latency CDF ...... 30

3.6 Sequential read disk latency CDF ...... 31

3.7 Random write disk latency CDF ...... 31

3.8 Sequential write disk latency CDF ...... 32

3.9 Execution times of POV-ray for increasing problem sizes...... 33

3.10 Execution times of STL Quicksort for increasing problem sizes...... 34

3.11 Execution times of multiple concurrent processes executing POV-ray. . . . . 35

3.12 Execution times of multiple concurrent processes executing STL Quicksort. 35

3.13 Effects of varying the transmission window using Quicksort...... 36

4.1 Split Device Driver Architecture in Xen...... 42

4.2 MemX-Linux: Baseline operation of MemX in a non-virtualized Linux environment. The client can communicate with multiple memory servers across the network to satisfy the memory requirements of large memory applications. ...... 44

4.3 MemX-DomU: Inserting the MemX client module within DomU's Linux kernel. The server executes in non-virtualized Linux. ...... 47

4.4 MemX-DD: Executing a common MemX client module within the driver domain, allowing multiple DomUs to share a single client module. The server module continues to execute in non-virtualized Linux. ...... 49

4.5 I/O bandwidth, for different MemX configurations, using a custom benchmark that issues asynchronous, non-blocking 4-KB I/O requests. "DIO" refers to opening the file descriptor with direct I/O turned on, to compare against bypassing the Linux buffer cache. ...... 55

4.6 Comparison of sequential and random read latency distributions for MemX-DD and disk. Reads traverse the filesystem buffer cache. Most random read latencies are an order of magnitude smaller with MemX-DD than with disk. All sequential reads benefit from filesystem prefetching. ...... 58

4.7 Comparison of sequential and random write latency distributions for MemX-DD and disk. Writes go through the filesystem buffer cache. Consequently, all four latencies are similar due to write buffering. ...... 58

4.8 Effect of filesystem buffering on random read latency distributions for MemX-DD and disk. About 10% of random read requests (issued without the direct I/O flag) are serviced at the filesystem buffer cache, as indicated by the first knee below 10µs for both MemX-DD and disk. ...... 59

4.9 Quicksort execution times in various MemX combinations and disk. While clearly surpassing disk performance, MemX-DD trails regular Linux only slightly using a 512 MB Xen guest. ...... 60

4.10 Quicksort execution times for multiple concurrent guest VMs using MemX-DD and iSCSI configurations. ...... 62

4.11 Our multiple-client setup: five identical 4 GB dual-core machines, where one houses 20 Xen guests and the others serve as either MemX servers or iSCSI servers. ...... 63

5.1 Pseudo-code for the pre-paging algorithm employed by post-copy migration. Synchronization and locking code omitted for clarity of presentation. ...... 74

5.2 Prepaging strategies: (a) bubbling with a single pivot and (b) bubbling with multiple pivots. Each pivot represents the location of a network fault on the in-memory pseudo-paging device. Pages around the pivot are actively pushed to the target. ...... 76

5.3 Pseudo-Swapping (item 3): As pages are swapped out within the source guest itself, their MFN identifiers are exchanged and Domain 0 memory-maps those frames with the help of the hypervisor. The rest of post-copy then takes over after downtime. ...... 82

5.4 The intersection of downtime within the two migration schemes. Currently, our downtime consists of sending non-pageable memory (which can be eliminated by employing shadow paging). Pre-copy downtime consists of sending the last round of pages. ...... 85

5.5 Comparison of total migration times between post-copy and pre-copy. ...... 95

5.6 Comparison of downtimes between pre-copy and post-copy. ...... 96

5.7 Comparison of the number of pages transferred during a single migration. ...... 97

5.8 Kernel compile with back-to-back migrations using 5-second pauses. ...... 98

5.9 NetPerf run with back-to-back migrations using 5-second pauses. ...... 100

5.10 Impact of post-copy on NetPerf bandwidth. ...... 101

5.11 Impact of pre-copy on NetPerf bandwidth. ...... 102

5.12 The application degradation is inversely proportional to the ballooning interval. ...... 103

5.13 Total pages transferred for both migration schemes. ...... 105

5.14 Page-fault comparisons: Pre-paging lowers the network page faults to 17% and 21%, even for the heaviest applications. ...... 106

5.15 Total migration time for both migration schemes. ...... 106

5.16 Downtime for post-copy vs. pre-copy. Post-copy downtime can improve with better page-fault detection. ...... 107

5.17 Comparison of prepaging strategies using multi-process Quicksort workloads. ...... 108

6.1 Original Xen-based physical memory design for multiple, concurrently-running virtual machines. ...... 116

6.2 Physical memory caching design of a CIVIC-enabled hypervisor for multiple, concurrently-running virtual machines. ...... 117

6.3 Illustration of a full PPAS cache. All page accesses in the PPAS space must be brought into the cache before the HVM can use the page. If the cache is full, an old page is evicted from the FIFO maintained by the cache. ...... 119

6.4 Internal CIVIC architecture: An Assistant VM holds two kernel modules responsible for mapping and paging HVM memory. One module directly (on-demand) memory-maps portions of PPAS #2, whereas MemX does I/O. A modified, CIVIC-enabled hypervisor intercepts page-faults to shadow page tables in the RAS and delivers them to the Assistant VM. If the HVM cache is full, the Assistant also receives victim pages. ...... 121

6.5 High-level CIVIC architecture: unmodified CIVIC-enabled HVM guests have local reservations (caches) while small or large amounts of their reservations actually expand out to nearby hosts. ...... 123

6.6 Future CIVIC architecture: a large number of nodes would collectively provide global and local caches. The path of a page would potentially exhibit multiple evictions from Guest A to local to global. Furthermore, a global cache can be made to evict pages to other global caches. ...... 125

6.7 Pseudo-code for the prefetching algorithm employed by CIVIC. On every page-fault, this routine is called to adjust the window based on the spatial location of the current PFN address in the PPAS. ...... 132

6.8 Page dirtying rate for different types of virtual machines, including HVM guests and para-virtual guests, and with different types of shadow paging. This includes the overhead of creating new page tables from scratch. ...... 140

6.9 Bus-speed page dirtying rate in gigabits per second. This is line-speed hardware memory speed once page tables have already been created, and shows throughput an order of magnitude higher than the previous graph. ...... 141

6.10 Completion times for quicksort on a CIVIC-enabled virtual machine and a regular virtual machine. ...... 143

6.11 Completion times for Sparse Matrix Multiplication with a resident memory footprint of 512 MB while varying the cache sizes. ...... 144

6.12 Requests per second for the RUBiS Auction Benchmark with a resident memory footprint of 490 MB while varying the cache sizes. ...... 145

A.1 A live run of an HVM guest on top of CIVIC with a very small PPAS cache size of 64 MB. The HVM has 2 GB. (Turn the page sideways.) ...... 155

A.2 A live run of an HVM guest on top of CIVIC with a very large PPAS cache size of 2 GB. The HVM believes that it has 64 GB. (Turn the page sideways.) ...... 157

List of Tables

3.1 Average application execution times and speedups for local memory, Distributed Anemone, and Disk. N/A indicates insufficient local memory. ...... 32

4.1 I/O latency for each MemX combination, in microseconds. ...... 54

4.2 Execution time comparisons for various large memory application workloads. ...... 62

5.1 Migration algorithm design choices in order of their incremental improvements. Method #4 combines #2 and #3 with the use of pre-paging. Method #5 actually combines all of #1 through #4, by which pre-copy is only used in a single, primer iterative round. ...... 74

5.2 Percent of minor and network faults for flushing vs. pre-paging. Pre-paging greatly reduces the fraction of network faults. ...... 97

6.1 Latency of a page-fault through a CIVIC-enabled hypervisor to and from network memory at different stages. ...... 139

6.2 Number of shadow page-faults to and from network memory with CIVIC prefetching disabled and enabled. Each application has a memory footprint of 512 MB and a PPAS cache of 256 MB. ...... 146

Chapter 1

Introduction and Outline

Both the design and the use of main memory have changed dramatically over the last half-century. Because of fast-moving advances in hardware and software, the OS designer's choices have also multiplied, especially as the performance gaps between the levels of the memory hierarchy grow larger. In this dissertation, we observe that the need to support large-memory, non-parallel applications still persists; their memory access patterns are mostly independent and disjoint from one another. These continue to include many common applications like databases and webservers as well as scientific and grid applications. We describe a bottom-up attempt over the last few years to investigate solutions for these kinds of large-memory applications (LMAs) that can be applied across high-speed networked clusters of machines. The representative set of applications we benchmark in this dissertation includes:

• Large Sorting
• Graphical Ray-tracing
• Database Workloads
• Support Webserver
• E-commerce Webserver
• Kernel Compilation
• Parallel Benchmarks
• Torrent Clients
• Network Throughput
• Network Simulation


We refer to these applications as "large-memory applications". They tend to be somewhat CPU intensive. Across application boundaries (between individual running processes), they are either not necessarily parallelizable or not designed to be without explicit threading. Their computational behavior is such that when they do need to access portions of their large memory pools, they need them fast. These accesses are also usually made in a relatively "cache-oblivious" manner, such that the working set in memory eventually converges to a size that fits within memory (before moving on to a new working set).

For these kinds of applications, this work has investigated low-level memory management options across a number of different projects, and this chapter presents a high-level outline of them. The focus of this work is the virtualization of physical memory to support these applications. We categorize this chapter's high-level outline into three overarching goals:

1. Maximum Application Transparency: We want to improve the performance of these large memory applications with zero changes to the application. The last project of this dissertation extends this all the way to complete operating system transparency as well.

2. Clustered Memory Pool: We want to provide a potentially unlimited pool of cluster-wide memory to these applications with the help of distributed, low-latency communication.

3. Ubiquitous Resource Management: We ultimately want page-granular support for the arbitrary, transparent relocation of any single page frame in a cluster of machines.

This dissertation employs a combination of virtual machine technology, operating system modifications, and network protocol design to accomplish those three high-level goals for the aforementioned types of applications. The bottom-up process taken to explore the virtualization of physical memory in this dissertation is organized as follows: first we build a distributed memory virtualization system, followed by its evaluation in a virtual machine environment. Next, we develop alternative strategies for VM migration by leveraging distributed memory virtualization. Finally, we integrate these techniques to develop a system for VM memory oversubscription. We begin with a discussion of the basic distributed memory system.

1.1 Distributed Memory Virtualization in Networked Clusters

Chapter 3 begins by investigating the options available to large memory applications given basic, transparent distributed memory support in clusters of gigabit Ethernet-connected machines. Distributed memory itself is a very old idea, but our previous efforts at re-investigating it have revealed various unsolved performance issues as well as new applications. Additionally, implementing a distributed memory solution was a springboard for tackling low-level memory management issues in virtual machines. Our prototype was an effort to move further away from the application than previous work (very low into the kernel) by choosing a clean, familiar interface (a device) such that the needs of the application are still respected without any changes. It consists of a fully distributed, non-shared, Linux-based, all kernel-space distributed memory solution, including a custom networking protocol and a full performance evaluation. The solution exports an interface to any process that wants to map it, and hides the complexity of shipping those frames over gigabit Ethernet to other connected machines. It is not, however, a software distributed shared memory solution: it does not provide cache-coherent resolution protocols for simultaneous write access by parallel clients. That was not the focus of this work.

1.2 Virtual Machine Based Use for Distributed Memory

In Chapter 4, we investigate how distributed memory virtualization could benefit state-of-the-art virtual machine technology. We describe the design and implementation of a system that evaluates how distributed memory can enhance this transparency. We did this by placing (and improving upon) the aforementioned distributed memory solution at different places within the virtual machine architecture and benchmarking applications within those VMs.

At the end of 2005, a handful of virtual machine projects had already matured into both proprietary and open-source versions. We began looking into how our distributed memory implementation could apply to virtual machine technology. Recent work with VMs in the last decade is interesting in that it has brought into question once again where exactly the memory management logistics of LMAs should be placed, now that there is an extra level of indirection (called the virtual machine monitor, or "hypervisor") placed below the OS (a well-known technique). Both hardware and software advances have created many ways to impose order on the handling of OS memory management while still maintaining a high degree of transparency to applications, through techniques such as full virtualization and para-virtualization.

1.3 Improvement of Live Migration for Virtual Machines

It soon became clear that virtual machine technology has succeeded tremendously at demonstrating the utility of transparent, live OS migration. In fact, it is likely that the increasing pervasiveness of VMs would never have happened without it. It is well known that many process migration prototypes, while very well built, were unable to become widespread due to fundamental limitations relating to transparency, process portability, and residual dependencies on the host OS. Changing the unit of migration to the OS itself has taken that problem out of the picture completely, even among different hypervisor vendors. The ability to run the VM transparently has shifted the base unit of computational containment from the process to the OS without changing the semantics of the application. A key component that determines the success of migration has been exactly how the virtualization architecture migrates the VM's memory, which is what initially led us to this particular problem.

In Chapter 5, we apply some of the techniques developed for virtual machine based distributed memory use to develop alternative strategies for live migration of VMs. We design, implement, and evaluate a new migration scheme and compare it to the existing migration schemes present in today's virtualization technology. We were able to realize significant gains in migration performance with a new live-migration system for write-intensive VM workloads, as well as point out some fundamental ways in which the management of VM memory could be improved over the pre-copy approach.

1.4 VM Memory Over-subscription with Network Paging

Our experience with the previous projects exposed the need for more fine-grained policies underneath the OS, particularly when VMs are consolidated from multiple physical hosts onto a single host and compete with each other for memory resources. In situations like this, determining better runtime placement and allocation of individual page frames among the VMs becomes important. This is where the idea of the ubiquity of individual frames of memory comes from: not only does virtualization remove the constraints on a page frame as to its location in memory, it also releases a page frame from even being on the same physical machine, even when the VM is still considered to have local ownership of the frame. We believe that, given the dynamics of a virtualized environment, the OS should consider its physical memory to be a ubiquitous "resource" without worrying about its physical location. This does not mean that it should not be aware of the contiguity of the physical memory space (with respect to kernel subsystems that handle memory allocation and fragmentation). Rather, it means that the source of that contiguous resource should be more flexible. Along the same lines, the interfaces that export this resource should maintain fast, efficient memory access and do so without duplicating implementation effort or functionality.

With that, Chapter 6 presents the last contribution of this dissertation: a complete implementation and evaluation of a system that allows an unmodified VM to use more DRAM than is physically provided by the host machine. Our system is able to do this without any changes to the virtual machine. This is done through a combination of means. First, we alter the hypervisor under the VM and give the VM a view of a physical memory allocation that is larger than what is available on the host on which it is running. We then hook into the shadow paging mechanism, a feature provided by all modern hypervisors to intercept page-table modifications performed by the VM. Finally, we supplement this by implementing a network paging system at the hypervisor level to allow for victim page selection when non-resident pages are accessed. This system is implemented while preserving the traditional concepts of paging and segmentation employed by an OS and by taking a page (pardon the pun) from microkernels by continuing to keep the hypervisor as small as possible.
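To make the flow concrete, the following simplified C sketch models only the per-fault decision just described: if the faulting guest page is resident in the local cache, the shadow entry is simply repaired; otherwise a victim is chosen from a FIFO, evicted over the network, and the missing page is fetched. The names (civic_handle_fault, cache_lookup, and so on) and the user-space simulation are illustrative assumptions, not the CIVIC hypervisor code; in the real system the equivalent decision is made inside the hypervisor's shadow page-fault handler, with network I/O carried out on its behalf.

/*
 * Hypothetical sketch of the cache/evict/fetch decision described above.
 * This is not the CIVIC hypervisor code; it only models the logic in
 * plain C so the flow is easy to follow.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_SLOTS 4            /* tiny cache so eviction is easy to observe */

typedef uint64_t pfn_t;          /* guest-physical frame number (PPAS index)  */

static pfn_t fifo[CACHE_SLOTS];  /* FIFO of resident pages */
static int   head, count;

static bool cache_lookup(pfn_t pfn)
{
    for (int i = 0; i < count; i++)
        if (fifo[(head + i) % CACHE_SLOTS] == pfn)
            return true;
    return false;
}

/* Stand-ins for the real work: network paging and shadow-table repair. */
static void evict_to_network(pfn_t victim) { printf("evict pfn %llu\n", (unsigned long long)victim); }
static void fetch_from_network(pfn_t pfn)  { printf("fetch pfn %llu\n", (unsigned long long)pfn); }
static void map_into_shadow(pfn_t pfn)     { printf("map   pfn %llu\n", (unsigned long long)pfn); }

static void civic_handle_fault(pfn_t pfn)
{
    if (!cache_lookup(pfn)) {
        if (count == CACHE_SLOTS) {           /* cache full: pick the FIFO victim */
            evict_to_network(fifo[head]);
            head = (head + 1) % CACHE_SLOTS;
            count--;
        }
        fetch_from_network(pfn);              /* bring the missing page in */
        fifo[(head + count) % CACHE_SLOTS] = pfn;
        count++;
    }
    map_into_shadow(pfn);                     /* repair the shadow entry */
}

int main(void)
{
    pfn_t refs[] = { 1, 2, 3, 4, 5, 1, 6 };   /* a toy reference string */
    for (size_t i = 0; i < sizeof(refs) / sizeof(refs[0]); i++)
        civic_handle_fault(refs[i]);
    return 0;
}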

Our implementation also maintains the same transparency to the OS and its applications that all of our previous work has guaranteed. This system gives the system administrator and application programmer wide latitude: the option to arbitrarily cache, share, or move individual page frames for improved consolidation of multiple co-located VMs among physical hosts.

Chapter 2

Area Survey

Aside from the focus of this work discussed in Chapter 1, there is a great deal of related work. This chapter will present a literature survey of supporting work up to this point.

We will go through the three major steps discussed in the introduction and explain how other literature is similar to and differs from our work on the Anemone system, the MemX system, the Post-Copy migration system, and our final system, CIVIC.

2.1 Distributed Memory Systems

Our distributed memory system, Anemone [50, 51], was the first system that provided unmodified large memory applications (LMAs) with completely transparent access to cluster-wide memory over commodity gigabit Ethernet LANs. One goal of our work was to make a concerted effort to bring all components of the implementation into the Linux kernel and to optimize for network conditions in the LAN that are specific to network memory traffic: particularly the repeated flow control of 4-kilobyte page frames. As such, it can, briefly, be treated as distributed paging, distributed memory-mapping, or a remote in-memory filesystem, while the logic and design decisions are hidden behind a block device driver.


2.1.1 Basic Distributed Memory (Anemone)

The two most prominent distributed memory systems of the 1990s (both now dormant) were the NOW project [15] at Berkeley and the Global Memory System [37] at Washington. We decided to re-tackle this problem for a few reasons: (a) neither of these projects was available for use, (b) network and CPU speeds had since increased by an order of magnitude, and (c) both projects required extensive operating system support. The Global Memory System was designed to provide network-wide memory management support for paging, memory-mapped files, and file caching. It was built closely into the end-host operating system and operated over a 155 Mbps DEC Alpha ATM network. The NOW project [15] provided a broad range of services on top of the Digital Unix operating system. In the end, its solution included an OS-supported "cooperative caching" system, a type of distributed filesystem with the added responsibility of caching disk blocks (which could be memory mapped) in the memory of participating nodes. We describe cooperative caching systems later; suffice it to say that these were very large implementations that could be functionally reduced to performing distributed memory in an indirect manner. Our goal was to re-tackle just the distributed memory components of these systems, without any OS modifications, as low as possible within a device driver, in the hope that the project would become an enabling mechanism for more complicated projects in later years, which is exactly what happened. To explore these problems, we needed a working prototype in the Linux operating system that respected the design principles of current kernel development and was also capable of functioning well over gigabit Ethernet networks. For all of these reasons, Chapter 3 describes the new system as we have designed it.

Although the previously mentioned projects were the most popular, they were by no means the only projects of the 1990s. The earliest non-shared efforts [40, 21, 57] at using distributed memory aimed to improve memory management, recovery, concurrency control, and read/write performance for in-memory database and transaction processing systems. The first two distributed paging systems were presented in [28] and [38]. These projects also took the stance of incorporating extensive OS changes on both the client and the memory servers on other nodes. The Samson project [90] (of which my advisor was a member) was a dedicated memory server with a highly modified OS over a Myrinet interconnect that actively attempts to predict client page requirements. The Dodo project [59, 9] was another late-1990s attempt to provide a more end-to-end solution to the distributed memory problem. They built a user-level, library-based interface that a programmer can use to coordinate all data transfers to and from a distributed memory cache. This obviously required legacy applications to be aware of the specific API of that library. For the Anemone project, this was a deal-breaker. The work that is probably closest to our prototype was done by [68] and followed up in [39], implemented within the DEC OSF/1 operating system in 1996. They use a transparent device driver to perform paging, just as we do. Again, our primary differences are as in the NOW case: a slow network, an out-of-date operating system, and no available code from which we could build a broader research project. They do, however, have a recovery system built into their work, capable of tolerating single-node failures.

2.1.2 Software Distributed Shared Memory

For shared memory systems, typically called "Software Distributed Shared Memory" (DSM), a group of nodes participates in one of a host of different consistency protocols, not unlike the hardware requirements of cache-coherent Non-Uniform Memory Access (NUMA) shared memory machines. There are many of these systems. By its nature, the purpose of cache-coherent systems is to provide a competing paradigm for parallel execution systems that depend on Message Passing Interfaces (MPI). In general, DSM and MPI are competitors, each attempting to provide the means for parallel speedup across multiple physical host machines at different levels of the computing hierarchy. MPI attempts to provide the speedup through explicit data movement across each node through a series of calls, whereas a properly implemented DSM attempts to make this data movement inherent. This is typically done either at the language level or (like MPI) at the library level, in such a way that the DSM system handles shared writes (with proper ordering) so that the concurrently running programs on different nodes need only focus on locking critical sections that access shared data structures. As we mentioned, Anemone is not a DSM, nor are we trying to do research on parallel execution. Nevertheless, some of the more popular DSM projects of the 1990s included [35] and [14], which allow a set of independent nodes to behave as a large shared memory multi-processor, often requiring customized programming to share common data across nodes.

2.2 Virtual Machine Technology and Distributed Memory

Whole operating system VM technology, in which multiple independent, and possibly different, operating systems run simultaneously on the same machine, has been re-invented in the last decade. The modern virtual machine monitor, or hypervisor, is inspired by three different kinds of OS virtualization: (a) library operating systems, (b) microkernels (versus monolithic kernels), and (c) the commodity OS virtualization work of the early 1970s. We will briefly survey some of these ideas and how they have influenced choices in our work, resulting in a project called "MemX" [49]. When that work was completed, MemX was the first system in a VM environment that provided unmodified LMAs with completely transparent and virtualized access to cluster-wide distributed memory over commodity gigabit Ethernet LANs. We begin our survey of virtual machine technology with microkernels and then discuss modern hypervisors.

2.2.1 Microkernels

Microkernels were attempts by the operating systems community in the 1980s and 90s to shrink the core OS base and move more of the subsystems of a traditional "macro" (monolithic) OS into user-land processes or servers. This decreased the privileges of those subsystems, improving fault isolation from foreign device drivers, and required fast communication mechanisms for them to talk to each other. Other motivations for the use of microkernels included the ability to provide UNIX-compatible environments without the need to constantly port drivers to new systems and without the need to port new systems to new CPU architectures. As long as the microkernel and the supporting communication framework are kept constant as a standard, one gains a great deal of interoperability, a source of headaches that continues to exist today. The advantages provided by microkernels and virtual machines are almost identical, and, without going into too much of a philosophical debate, virtual machine designers add more hypervisor-aware code to current operating systems every year. One could almost consider modern hypervisors to be microkernels [45]. Probably the only reason that microkernels did not become more widespread was that industry support for these research prototypes never fully gained traction, whereas virtual machine technology has managed to do so. Nevertheless, the exploration of microkernels had a great deal of success beginning in the 1980s, including successful projects like Mach [8], Chorus [7], Amoeba [72], and L4 [64]. Notable work was also performed on "library" operating systems. These are based on the idea of having a root system "fork" off a smaller operating system in much the same way library code is stored and loaded on demand. These kinds of systems do not fall cleanly under the definition of a microkernel, but they are closer to microkernels than to virtual machines because they also depend on fast communication primitives and their focus is not to provide full virtualization of multiple CPU architectures. Such systems included the Exokernel [36] and Nemesis [62].

2.2.2 Modern Hypervisors

The first hypervisors (the current term for the longer "virtual machine monitor") have been around since the late 1960s [10] and were developed all the way through the late 1970s (primarily by industry), until academic research began to focus on microkernels, which dominated research until the mid-1990s. These early hypervisors were generally paired directly with specific hardware and were meant to support multiple identical copies of the same operating system. After the microkernel movement slowed down, probably the first "revival" of hypervisor technology started with Disco [23]. The context of this work was cache-coherent NUMA machines, motivated by IBM's work [10]. Its focus was similar: to support multiple commodity operating systems, but with as few changes as possible. A popular open-source attempt called "User Mode Linux" [2] also sprang up for a short while, but operated completely in userland. (We actually used it for a while to test our early distributed memory prototypes, but its developer base did not continue to grow.) At the turn of the century, two more hypervisors arrived: Denali [6] (which was later modified to be a microkernel) and the familiar VMware system.

Modern hypervisors currently fall into three categories: (a) full virtualization, (b) para-virtualization, and (c) pre-virtualization. Para-virtualization indicates that the OS has been modified to be aware that it is virtualized and to provide direct support to the underlying hypervisor to improve the speed of virtualizing memory accesses and device emulation. Full virtualization indicates that the guest operating system (the OS being virtualized by a hypervisor) has not been modified to support virtualization. Full virtualization can be supported in two ways: with or without hardware support. Both AMD [13] and Intel [3] provide hardware support for virtualization by enabling the processor to trap directly into the hypervisor when the guest attempts to execute a privileged instruction that must be emulated. Full virtualization systems like KVM [4] depend completely on hardware support. Projects like Xen [20] support both para-virtualized and fully-virtualized operating systems, with and without support from hardware. The second way to perform full virtualization is to use binary translation, as is the case with VMware. A critique of this approach is that the translation must be performed at run time, incurring execution overheads of up to 20%. Similarly, pre-virtualization [63] is a related attempt to perform these translations offline, in a layered manner or with a custom compiler, but existing prototypes have not gained much traction in the community.

Finally, para-virtualization takes the opposite approach: virtualization is performed by modifying the operating system itself. This technique met with a lot of success in the Xen project [20], which is the hypervisor platform used in this work. Recently, the Linux and Windows communities have been updating these macro-kernels with hypervisor-aware hooks to mitigate the overhead of forward-porting. Such changes will also benefit many of the aforementioned full-virtualization technologies. Other para-virtualization techniques include operating-system-level virtualization, similar to [2], in which the OS itself and all processes are isolated into individual containers without the use of a true hypervisor [1].

2.3 VM Migration Techniques

Chapter 5 targets the performance of live migration of virtual machines. The technique we use, accompanied by a handful of new optimizations, is called "Post-Copy". Live migration is a mandatory feature of modern hypervisors: it facilitates server consolidation, system maintenance, and lower power consumption. Post-copy refers to the deferral of the "copy" phase of live migration until after the virtual machine's CPU state has been migrated. Pre-copy refers to the opposite ordering and is currently the dominant way to migrate a process or virtual machine. A survey of the different units and types of migration follows.
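As a rough illustration of the two orderings (and not of any hypervisor's actual code path), the sketch below contrasts the phases: pre-copy iterates over memory first and transfers CPU state last, while post-copy transfers the CPU state first and then resolves memory afterwards. The helper functions are placeholder stubs for the operations named in the text.

/* Hypothetical sketch contrasting the two orderings described above.
 * The helpers are stubs standing in for the operations named in the
 * text; they are not hypervisor APIs. */
#include <stdbool.h>
#include <stdio.h>

static bool dirty_set_small(int round)      { return round >= 3; }  /* pretend the dirty set converges */
static void copy_dirty_pages(void)          { puts("copy dirty pages"); }
static void stop_and_copy_cpu(void)         { puts("pause VM, transfer CPU/device state"); }
static void resume_on_target(void)          { puts("resume VM on target"); }
static void serve_faults_and_prepage(void)  { puts("demand-fetch faulted pages, actively push the rest"); }

/* Pre-copy: memory first (iteratively), CPU state last. */
static void migrate_precopy(void)
{
    int round = 0;
    do {
        copy_dirty_pages();          /* whole memory in round 0, then re-dirtied pages */
    } while (!dirty_set_small(round++));
    stop_and_copy_cpu();             /* short downtime: final dirty pages + CPU state */
    resume_on_target();
}

/* Post-copy: CPU state first, memory afterwards. */
static void migrate_postcopy(void)
{
    stop_and_copy_cpu();             /* minimal state moves up front */
    resume_on_target();              /* VM runs at the target immediately */
    serve_faults_and_prepage();      /* each page crosses the network at most once */
}

int main(void)
{
    puts("-- pre-copy --");
    migrate_precopy();
    puts("-- post-copy --");
    migrate_postcopy();
    return 0;
}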

2.3.1 Process Migration

The post-copy algorithm (whose name has assumed different titles) has appeared in the context of process migration in four previous incarnations: it was first implemented as "Freeze Free" using a file server [84] in 1996, simulated in 1997 [83] (which is where the term post-copy was first coined), and later followed up by an actual Linux implementation in 2003 [74], the original creator of the "hybrid" assisted post-copy scheme, which we will summarize later. In 2008, a version under the openMosix kernel was again presented with respect to process migration in [85]. Our contributions instead address new challenges at the virtual machine level that are not seen at the process level and benchmark an array of applications affecting the different metrics of full virtual machine migration, which these approaches do not do. The closest work to post-copy is a report called SnowFlock [44]. They use a similar technique in the context of parallel computing by introducing "impromptu clusters", which clone a VM to multiple destination nodes and collect results from the new clones. They do not compare their scheme to (or optimize upon) the original pre-copy system. Their page-fault avoidance heuristics are also different in that they para-virtualize Xen guests to avoid transmitting free pages, whereas we use ballooning, as it is less invasive and transparent to kernel operations. Process migration schemes, well surveyed in [71], have not become widely pervasive, though several projects exist, including Condor [30], Mosix [19], libckpt [80], CoCheck [91], Kerrighed [58], and Sprite [34].

The migration of entire operating systems is inherently free of residual dependencies while still providing a live and clean unit of migration. Techniques also exist to migrate applications [71] or entire VMs [17, 27, 73] to nodes that have more free resources (memory, CPU) or better data access locality. Both Xen [27] and VMware [73] support migration of VMs from one physical machine to another, for example, to move a memory-hungry enterprise application from a low-memory node to a memory-rich node. However, large memory applications within each VM are still constrained to execute within the memory limits of a single physical machine at any time. In fact, we have shown that MemX can be used in conjunction with VM migration in Xen, combining the benefits of both live VM migration and distributed memory access. MOSIX [19] is a management system that uses process migration to allow sharing of computational resources among a collection of nodes, as if in a single multiprocessor machine. However, each process is still restricted to using memory resources within a single machine.

2.3.2 Pre-Paging

The post-copy algorithm does its best (as pre-copy does) to identify the collective working set of the virtual machine's processes, a concept first identified for individual processes in 1968 [32]. Pre-copy does this with shadow paging: the use of an additional read-only page table level that tracks the dirtying of pages. Post-copy does this through the reception of a page-fault. We mitigate the effect of faults on applications through the use of pre-paging, a technique that also goes by different titles. In virtual-memory and application-level solutions, it is called pre-paging. At the I/O level, or the actual paging-device level, it can also be referred to as "adaptive prefetching". For process migration and distributed memory systems it can also be referred to as "adaptive distributed paging" (whereas ordinary distributed paging suffers from the residual dependency problem, and may or may not involve the use of pre-fetching). In either case, we use the term pre-paging to refer to a migration system that adaptively "flushes" out all of the distributed pages while simultaneously trying to hide the latency of page-faults, as pre-fetching does. We do not use disks or intermediate nodes. Traditionally, the algorithms involved in pre-paging take both reactive and history-based approaches to anticipate, as best as possible, what the working set of the application may be. Pre-paging has experienced a brief resurgence this decade and goes back as far as 1968 [76]; a survey can be found in [94]. In our case, we implement a reactive approach with a few optimizations at the virtual machine level, described later. History-based approaches may benefit future work, but we do not implement them here.
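A minimal sketch of the reactive idea, expanding a pre-paging window symmetrically around the address of the most recent network fault (the pivot), is given below. The function names, the fixed bubble growth, and the user-space setting are illustrative assumptions rather than the actual implementation evaluated in Chapter 5.

/* Hypothetical sketch of reactive pre-paging around a pivot ("bubbling").
 * On each network fault we reset the pivot to the faulting page and then
 * push pages outward from it, skipping pages already sent. */
#include <stdbool.h>
#include <stdio.h>

#define NR_PAGES    32
#define BUBBLE_STEP 2            /* pages pushed on each side per fault */

static bool sent[NR_PAGES];

static void push_page(int pfn)   /* stand-in for the actual page transfer */
{
    if (pfn >= 0 && pfn < NR_PAGES && !sent[pfn]) {
        sent[pfn] = true;
        printf("push pfn %d\n", pfn);
    }
}

/* Called after servicing a network fault at 'pivot': bubble outwards. */
static void prepage_bubble(int pivot, int *lo, int *hi)
{
    *lo = *hi = pivot;
    push_page(pivot);
    for (int step = 0; step < BUBBLE_STEP; step++) {
        push_page(--(*lo));      /* grow the bubble to the left  */
        push_page(++(*hi));      /* grow the bubble to the right */
    }
}

int main(void)
{
    int lo, hi;
    int faults[] = { 10, 11, 25 };            /* a toy sequence of network faults */
    for (int i = 0; i < 3; i++)
        prepage_bubble(faults[i], &lo, &hi);
    return 0;
}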

2.3.3 Live Migration

System-level virtual machine migration has been revived by several projects, including architecture-independent approaches with VMware migration [73] and Xen migration [27], architecture-dependent projects using VT-x or VT-d chips with the KVM project in Linux [4], operating-system-level approaches that do not use hypervisors (similar to capsules/pods) with the OpenVZ system [1], and even wide-area-network approaches [22], all of which can potentially benefit from the post-copy method of VM migration presented in this dissertation. Furthermore, the self-migration of operating systems has much in common with the migration of single processes [48]. The same group built this on top of their "Nomadic Operating Systems" project [47], as well as their first prototype implementation on top of the L4 Linux microkernel using "NomadBIOS". All of these systems currently use pre-copy based migration schemes.

2.3.4 Non-Live Migration

There are several non-live approaches to migration, in which the dependent applications must be completely suspended during the entire migration. The term capsule was introduced by Schmidt in [87]. In this work, capsules were implemented by grouping together processes in Linux or Solaris operating systems and migrating all of their state as a group, as opposed to the full operating system. Along the same lines, Zap [78] uses units of migration called process domains (pods), which are essentially process groups along with their process-to-kernel interfaces such as file handles and sockets. Migration is done by suspending the pod and copying it to the target. Connections to active services are not maintained during transit. The Denali project [6, 5] dealt with migrating checkpointed VMware virtual machines across a network, incurring longer migration downtime. Chen and Noble suggested using hardware-level virtual machines for user mobility [41]. The Capsules/COW project [24] addresses user mobility and system administration by encapsulating the state of computing environments as objects that can be transferred between distinct physical hosts, citing the example of transferring an OS instance to a home computer while the user drives home from work. The OS instance is not active during the transfer. The "Internet Suspend/Resume" project [66] focuses on the capability to save and restore computing state on anonymous hardware. The execution of the virtual machine is suspended during transit. In contrast to these systems, our aim is to transfer live, active OS instances over fast networks without stopping them.

2.3.5 Self Ballooning

Ballooning is the act of changing the amount of physical memory seen by the operating system at runtime. Ballooning has already been used several times in virtual machine technology, but no implementation has yet been made continuous in production, nor has the use of ballooning been investigated across different VM migration systems, which is the purpose of this work. Prior ballooning work includes VMware's 2002 publication [96], which was inspired by "self-paging" in the Nemesis operating system [46]. It is not clear, however, how their ballooning mechanisms interact with different forms of VM migration, which is what we are trying to investigate. Xen is also capable of simple one-time ballooning during migration and at system boot time. Additionally, an effort is being made by a group within Oracle Corp. to commit a general version of self-ballooning into the Xen upstream development tree [67]. Such contributions will help standardize the use of ballooning.
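For concreteness, the sketch below shows the kind of policy such a self-ballooning driver follows: periodically read the kernel's committed-memory estimate and drive the balloon target toward it plus a safety slack. The /proc/meminfo fields are standard Linux interfaces, but the target computation, the constants, and the program structure are illustrative assumptions, not the Xen or Oracle driver code.

/* Hypothetical sketch of a self-ballooning policy: shrink the guest's
 * memory target toward what the kernel says is actually committed,
 * plus a slack reserve. Reading /proc/meminfo is standard Linux; the
 * policy itself is only an illustration of the idea described above. */
#include <stdio.h>
#include <string.h>

/* Read a field such as "Committed_AS" from /proc/meminfo, in kilobytes. */
static long meminfo_kb(const char *field)
{
    char line[256];
    long value = -1;
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, field, strlen(field)) == 0) {
            sscanf(line + strlen(field) + 1, "%ld", &value);
            break;
        }
    }
    fclose(f);
    return value;
}

int main(void)
{
    const long slack_kb = 64 * 1024;          /* illustrative 64 MB reserve */
    long committed = meminfo_kb("Committed_AS");
    long total     = meminfo_kb("MemTotal");
    if (committed < 0 || total < 0)
        return 1;

    long target = committed + slack_kb;       /* desired guest size */
    if (target > total)
        target = total;                       /* never ask for more than is present now */

    /* A real driver would now hand 'target' to the balloon mechanism;
     * here we just report it. */
    printf("current %ld kB, balloon target %ld kB\n", total, target);
    return 0;
}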

2.4 Over-subscription of Virtual Machines

The most notable attempts to oversubscribe virtual machine memory were presented in [96] for VMware and [33] for Xen. These projects work very well, but the amount of VM memory is constrained to what is available on the physical host. Additionally, a couple of DSM-level attempts to present a Single-System Image (SSI) for unmodified VMs exist in [12] and [69]. Building an SSI was not the focus of this dissertation; rather, our focus was to allow local virtual machines to gain cluster memory access. This is because we want to increase VM consolidation and migration performance rather than spread processing out into the cluster.

Thus, the processor resources available to such VMs in our work are only available on one host. Ballooning, as described in the previous section, also allows VMs to oversubscribe virtual machine memory, but it requires direct operating system participation. Ballooning also does not allow access to non-resident memory; it requires a one-to-one static memory allocation throughout the virtual machine's lifetime. To date, the CIVIC system, described in Chapter 6, is the first attempt to apply distributed memory to unmodified virtual machines running applications with large memory requirements in a low-latency environment through the use of network paging and shadow memory interception within the Xen hypervisor.

Chapter 3

Anemone: Distributed Memory Access

In this chapter, we describe our initial distributed memory work, the Anemone project, in detail. The performance of large memory applications degrades rapidly once the system hits the physical memory limit and starts paging or thrashing. We present the design, implementation, and evaluation of Distributed Anemone (Adaptive Network Memory Engine), a lightweight and distributed system that pools together the collective memory resources of multiple Linux machines across a gigabit Ethernet LAN. Anemone treats distributed memory as another level in the memory hierarchy between very fast local memory and very slow local disks. Anemone enables applications to access potentially "unlimited" network memory without any application or operating system modifications (when Anemone is used as a swap device). Our kernel-level prototype features fully distributed resource management, low-latency paging, resource discovery, load balancing, soft-state refresh, and support for 'jumbo' Ethernet frames. Anemone achieves low page-fault latencies averaging 160µs, and application speedups of up to 4 times for single processes and up to 14 times for multiple concurrent processes when compared against disk-based paging.

3.1 Introduction

Performance of large-memory applications (LMAs) can suffer from large disk access latencies when the system hits the physical memory limit and starts paging to local disk. At the same time, affordable, low-latency gigabit Ethernet is becoming commonplace with support for jumbo frames (packets larger than 1500 bytes). Consequently, instead of paging to a slow local disk, one could page over gigabit Ethernet to the unused memory of distributed machines and use the disk only when distributed memory is exhausted.

Thus, distributed memory can be viewed as another level in the traditional memory hierarchy, filling the widening performance gap between low-latency RAM and high-latency disk. In fact, distributed memory paging latencies of about 160µs or less can be easily achieved, whereas disk read latencies range anywhere from 6 to 13ms. A natural goal is to enable unmodified LMAs to transparently utilize the collective distributed memory of nodes across a gigabit Ethernet LAN. Several prior efforts [28, 38, 37, 59, 68, 39, 70, 90] have addressed this problem by relying upon expensive interconnect hardware (ATM or Myrinet switches), slow, bandwidth-limited LANs (10Mbps/100Mbps), or heavyweight software Distributed Shared Memory (DSM) systems [35, 14] that require intricate consistency/coherence techniques and, often, customized application programming interfaces. Additionally, extensive changes were often required to the LMAs, the OS kernel, or both.

Our earlier work [50] addressed the above problem through an initial prototype, called the Adaptive Network Memory Engine (Anemone) – the first attempt at demonstrating the feasibility of transparent distributed memory access for LMAs over a commodity gigabit Ethernet LAN. This was done without requiring any OS changes or recompilation, and relied upon a central node to map and exchange pages between nodes in the cluster. Here we describe the implementation and evaluation of a fully distributed Anemone architecture. Like the centralized version, distributed Anemone uses lightweight, pluggable Linux kernel modules and does not require any OS changes. Additionally, it achieves the following significant improvements over the centralized system.

1. Full distribution: Memory resource management is distributed across the whole cluster. There is no single control node.

2. Low latency: The round-trip time from one machine to another is reduced to around 160µs – over a factor of 3 lower than the centralized Anemone prototype, and far below disk access latencies.

3. Load balancing: Clients make intelligent decisions to direct distributed memory traffic across all available memory servers, taking into account their memory usage and paging load.

4. Dynamic Discovery and Release: A distributed resource discovery mechanism enables clients to discover newly available servers and track memory usage across the cluster. The protocol also has a mechanism for releasing servers and re-distributing their memory so that individual servers can be taken down for maintenance.

5. Large packet support: The distributed version incorporates the flexibility to decide whether or not ’jumbo’ frames should be used, based on the network hardware in use, allowing operation in networks with any MTU size. Our protocol is custom-built without the use of TCP. As far as the application is concerned, network transmission does not exist, so the end-to-end design of our protocol is built to satisfy the efficiency needs of code in the kernel.

We evaluated our prototype using unmodified LMAs such as ray-tracing, network simulations, in-memory sorting, and k-nearest neighbor search. Results show that the system is able to reduce average page-fault latencies from 8.3ms to 160µs. Single-process applications (including those that internally contain threads) speed up by up to a factor of 4, and multiple concurrent processes by up to a factor of 14, when compared against disk-based paging.

3.2 Design & Implementation

Distributed Anemone has two major software components: the client module on low-memory machines and the server module on machines with unused memory. The client module appears to the client system simply as a block device that can be configured in multiple ways.

• Storage: the “device” can be treated like storage. One can place any filesystem on top of it and mount it like a regular filesystem.

Figure 3.1: Placement of distributed memory within the classical memory hierarchy (registers, cache, main memory, remote memory, disk, tape).

• Memory Mapping: one can memory-map the Anemone device directly, creating the view of a linear array of addresses within the application itself. This is a standard practice in many applications, most popularly for the dynamic loading of libraries, but it can be made explicit through standard system calls.

• Paging Device: the system can be used for distributed memory paging directly by the operating system. This is the mode we use to evaluate the system later on.

Whenever an LMA needs more virtual memory, the pager (swap daemon) in the client swaps out pages from the client to the server machines. As far as the pager is concerned, the client module is just a block device, not unlike a hard disk partition. Internally, however, the client module maps swapped-out pages to distributed memory servers. At a high level, our goal was to develop a prototype that realizes the view presented in Figure 3.1, where distributed memory represents a new level of the memory access hierarchy.

The servers themselves are also regular machines that happen to have unused memory to contribute, and they can in fact switch between the roles of client and server at different times, depending on their memory requirements. Client machines discover available servers by using a simple distributed resource discovery mechanism. Servers provide regular feedback about their load information to clients, both as part of the resource discovery process and as part of the regular paging process (piggybacked on acknowledgments). Clients use this information to schedule page-out requests by choosing the least loaded server node to which to send a new page. Both the clients and servers also use a soft-state refresh protocol to maintain the liveness of pages stored at the servers. The earlier Anemone prototype [50] differed in that the page-to-server mapping logic was maintained at a central Memory Engine instead of at individual client nodes. Although simpler to implement, this centralized architecture incurred two extra round-trip times on every request, besides forcing all traffic to go through the central Memory Engine, which can become a single point of failure and a significant bottleneck.

Figure 3.2: The components of a client (large-memory application and pager, block device interface, write-back cache, RMAP mapping and protocol logic, and the NIC).

Figure 3.3: The components of a server (the page store in RAM, RMAP mapping and protocol logic, and the NIC).

3.2.1 Client and Server Modules

Figure 3.2 illustrates the client module that handles paging operations. It has four major components:

1. The Block Device Interface (BDI),

2. a basic LRU-based write-back cache,

3. mapping logic for server location of swapped-out pages, and

4. a Remote Memory Access Protocol (RMAP) layer.

The pager issues read and write requests to the BDI in 4 KB data blocks. The device driver that exports the BDI is instructed to keep page write requests aligned on 4 KB boundaries. (The usual sector size of a block device is 512 bytes.) The BDI, in turn, performs read and write operations against our write-back cache, so pages are not transmitted until eviction. When the cache is full, a page is evicted to a server using RMAP. Figure 3.3 illustrates the two major components of the server module: (1) a hash table that stores client pages along with the client’s identity (layer-2 MAC address), and (2) the RMAP layer. The server module can store and retrieve pages for any client machine. Once the server reaches capacity, it responds to the requesting client with a negative acknowledgment. It is then the client’s responsibility to select another server, if available, or to page to disk if necessary. Page-to-server mappings are kept in a standard chained hashtable. Linked lists contained within each bucket hold 64-byte entries that are managed using the Linux slab allocator (which performs fine-grained management of small, equal-sized memory objects).
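As a rough illustration of how such small, fixed-size mapping entries can be managed, the sketch below creates a dedicated slab cache with the standard Linux kmem_cache interface. The structure layout and all anemone_-prefixed names are hypothetical, not the actual Anemone code.

#include <linux/init.h>
#include <linux/slab.h>
#include <linux/list.h>
#include <linux/errno.h>
#include <linux/types.h>

/* Hypothetical per-page mapping entry; the field layout is illustrative only. */
struct anemone_map_entry {
    struct list_head bucket;   /* chaining within one hash bucket     */
    u64              offset;   /* page offset on the block device     */
    u8               mac[6];   /* layer-2 address of the peer         */
    u16              flags;
};

static struct kmem_cache *map_cache;

static int __init map_cache_init(void)
{
    /* One slab cache of small, equal-sized objects, as described above.
     * (The kmem_cache_create() constructor argument differs across 2.6
     * kernel versions; passing NULL sidesteps that here.) */
    map_cache = kmem_cache_create("anemone_map",
                                  sizeof(struct anemone_map_entry),
                                  0, 0, NULL);
    return map_cache ? 0 : -ENOMEM;
}

static struct anemone_map_entry *map_entry_alloc(void)
{
    return kmem_cache_alloc(map_cache, GFP_ATOMIC);
}

static void map_entry_free(struct anemone_map_entry *e)
{
    kmem_cache_free(map_cache, e);
}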

Standard disk block devices interact with the kernel through a request queue mechanism, which permits the kernel to group spatially consecutive block I/Os (BIOs) together into one “request” and schedule them using an elevator algorithm for seek-time minimization. Unlike disks, Anemone is essentially a random-access device with a fixed read/write latency. Thus, the BDI does not need to group sequential BIOs. It can bypass request queues, perform out-of-order transmissions, and asynchronously handle unacknowledged, outstanding RMAP messages.
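The following is a minimal sketch of what bypassing the request queue looks like for a bio-based block driver on a 2.6-era kernel. The anemone_ and rmap_ names are hypothetical, and the exact bio and bio_endio() signatures shifted across 2.6.x releases, so treat this only as the general shape of the approach.

#include <linux/blkdev.h>
#include <linux/bio.h>

/* Hypothetical hook into the RMAP layer: queue one 4 KB page for
 * asynchronous transmission and complete the bio when the ack returns. */
extern int rmap_submit_page(struct bio *bio, sector_t sector, int write);

/* Called directly for every bio; no request queue, no elevator. */
static int anemone_make_request(struct request_queue *q, struct bio *bio)
{
    int write = (bio_data_dir(bio) == WRITE);

    /* Anemone is a fixed-latency, random-access device, so each bio is
     * forwarded to the network layer as-is and completed out of order
     * when the corresponding RMAP reply (or local cache hit) arrives. */
    if (rmap_submit_page(bio, bio->bi_sector, write))
        bio_endio(bio, -EIO);  /* on some 2.6 kernels: bio_endio(bio, bytes, err) */

    return 0;
}

/* During device setup (error handling omitted):
 *   queue = blk_alloc_queue(GFP_KERNEL);
 *   blk_queue_make_request(queue, anemone_make_request);
 */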

3.2.2 Remote Memory Access Protocol (RMAP)

RMAP is a tailor-made, low-overhead communication protocol for distributed memory access within the same subnet. It implements the following features: (1) reliable packet delivery, (2) flow control, and (3) fragmentation and reassembly. While one could technically communicate over TCP, UDP, or even the IP protocol layer, this choice comes burdened with unwanted protocol processing. Instead, RMAP takes an integrated, faster approach, communicating directly with the network device driver, sending frames and handling reliability issues in a manner that suits the needs of the Anemone system. Every RMAP message is acknowledged except for soft-state and dynamic discovery messages. Timers trigger retransmissions when necessary (which is extremely rare) to guarantee reliable delivery: we cannot allow a paging request to be lost, or the application that depends on that page will fail altogether. RMAP also implements flow control to ensure that it does not overwhelm either the receiver or the intermediate network cards and switches.
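One plausible shape for this retransmission logic is a per-request kernel timer that re-sends an unacknowledged frame after a timeout and is cancelled when the matching acknowledgment arrives. The sketch below uses the 2.6-era timer API; the rmap_ structures and helpers are hypothetical.

#include <linux/timer.h>
#include <linux/jiffies.h>

#define RMAP_RETRANS_TIMEOUT  (HZ / 10)   /* illustrative: 100 ms */

/* Hypothetical per-request bookkeeping. */
struct rmap_request {
    struct timer_list retrans_timer;
    unsigned int      seq;        /* RMAP sequence number */
    unsigned int      retries;
};

extern void rmap_transmit(struct rmap_request *req);   /* (re)send the frame */

static void rmap_retrans_expired(unsigned long data)
{
    struct rmap_request *req = (struct rmap_request *)data;

    /* Losing a paging request is not an option, so keep retrying. */
    req->retries++;
    rmap_transmit(req);
    mod_timer(&req->retrans_timer, jiffies + RMAP_RETRANS_TIMEOUT);
}

static void rmap_arm_retransmit(struct rmap_request *req)
{
    setup_timer(&req->retrans_timer, rmap_retrans_expired,
                (unsigned long)req);
    mod_timer(&req->retrans_timer, jiffies + RMAP_RETRANS_TIMEOUT);
}

/* When the matching acknowledgment arrives:
 *   del_timer_sync(&req->retrans_timer);
 */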

The performance of any distributed system is heavily influenced by the networking requirements imposed on it, including both the design of the network and the application’s requirements. To minimize latency and protocol-related processing overhead, a conscious choice was made to eliminate the use of TCP/IP and write a simpler, lightweight protocol. The subset of networking functions needed by our system in the kernel was significantly smaller than the full set provided by the combination of TCP and IP in a cluster of machines. Four of the most prominent features that we do not include are:

• Port Abstraction: Our system has no use for the concept of ports, application-level socket buffers, byte streams, or in-order delivery. Since our system operates at the block-I/O level, these mostly application-driven requirements disappear.

• IP Addresses: The system does not operate across routed IP subnets, nor do we plan on supporting this feature due to its performance overheads. Routed paths detract from the distributed nature of the system and create unwanted link-congestion bottlenecks with flows from other networks, which is not the kind of problem we are trying to attack. As a result, the way one node addresses and communicates with another is simplified. We also found that a custom protocol was much easier to maintain in the kernel, because clients and servers can address each other over the network directly, without the need to juggle IP addresses and socket error handling.

• Fragmentation: With the right use of the Linux networking API, this turned out to be a far simpler problem to solve: today’s Linux provides a good enough design abstraction to deploy a non-IP-based, zero-copy fragmentation solution. Furthermore, our protocol can auto-detect the MTU of the system’s NIC and automatically send larger packets (so-called ‘jumbo’ frames) if the card supports them, especially because we have no need for multi-network ICMP MTU discovery (assuming that all hops in the network support the same MTU size).

• Segmentation Offload: The performance of 10-gigabit and higher-speed networks depends heavily on the use of TCP segmentation and checksum offloading. It is gradually becoming quite commonplace to find gigabit cards with offloading engines that the kernel can exploit, and recent 2.6 kernels have integrated the zero-copy use of segmentation into their TCP/IP APIs. We have observed that, under a highly active system, the network can easily exhibit full-speed workloads. Since we use RMAP, this potentially frees up segmentation offloading for application-level networking traffic that might be running concurrently within the same guest VM.

Figure 3.4 depicts what a typical Anemone packet header looks like.

Figure 3.4: A view of a typical Anemone packet header: the RMAP header (type, status, sequence, session ID, fragmentation flags, and a union of advertisement and page-request fields) is carried inside an Ethernet frame, followed by page data if any. The RMAP protocol transmits these directly to the network card from the BDI device driver.
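Reading the fields off Figure 3.4, the on-wire RMAP header can be pictured as a C structure along the following lines. The field widths, ordering, and packing are our own guesses for illustration, not the exact Anemone layout.

#include <linux/types.h>

/* Sketch of the RMAP header carried after the Ethernet header.
 * Field names follow Figure 3.4; sizes and packing are illustrative. */
struct rmap_header {
    u8   type;          /* page-out, page-in, ack, advertisement, ...  */
    u8   status;        /* e.g. OK / negative acknowledgment           */
    u32  seq;           /* packet sequence number                      */
    u32  session_id;    /* client session identifier (soft state)      */
    u16  frag_flags;    /* fragmentation flags / fragment index        */
    union {
        struct {                    /* resource advertisement          */
            u32 load_status;
            u32 load_capacity;
        } advert;
        struct {                    /* page request                    */
            u64 offset;             /* page offset on the block device */
            u32 size;
        } req;
    } u;
} __attribute__((packed));
/* Page data, if any, follows the header in the same Ethernet frame. */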

The last design consideration in RMAP is that, while the standard memory page size is 4KB (although it is not uncommon for an operating system to employ 4 MB super-pages for better use of the translation lookaside buffer), the maximum transmission unit (MTU) in traditional Ethernet networks is limited to 1500 bytes. RMAP therefore implements dynamic fragmentation/reassembly for paging traffic. Additionally, RMAP has the flexibility to use Jumbo frames, which are packets with sizes greater than 1500 bytes (typically between 8KB and 16KB). Jumbo frames enable RMAP to transmit complete 4KB pages to servers using a single packet, without fragmentation. Our testbed includes an 8-port switch that supports Jumbo frames (9KB packet size). We observe a 6% speedup in RMAP throughput when using Jumbo frames. However, in this Chapter, we conduct all experiments with 1500-byte MTU sizes, with fragmentation/reassembly performed by RMAP.
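The number of RMAP fragments per page follows directly from the NIC’s MTU, which the module can read from the net_device structure. A small sketch, with an illustrative header size (the real header length is defined by the layout in Figure 3.4):

#include <linux/netdevice.h>

#define RMAP_HDR_LEN   32      /* illustrative header size, see Figure 3.4 */
#define PAGE_PAYLOAD   4096

/* How many Ethernet frames does one 4 KB page need on this NIC?
 * With a 1500-byte MTU this yields 3 fragments; with 9 KB jumbo
 * frames a page fits in a single frame. */
static inline unsigned int rmap_frags_per_page(const struct net_device *dev)
{
    unsigned int payload = dev->mtu - RMAP_HDR_LEN;

    return (PAGE_PAYLOAD + payload - 1) / payload;   /* ceiling division */
}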

3.2.3 Distributed Resource Discovery

As servers constantly join or leave the network, Anemone can (a) seamlessly absorb the increase or decrease in cluster-wide memory capacity, insulating LMAs from resource fluctuations, and (b) allow any server to reclaim part or all of its contributed memory. This objective is achieved through the distributed resource discovery described below and the soft-state refresh described next in Section 3.2.4. Clients can discover newly available distributed memory in the cluster, and servers can announce their memory availability. Each server periodically broadcasts a Resource Announcement (RA) message (1 message every 10 seconds in our prototype) to advertise its identity and the amount of memory it is willing to contribute. Besides RAs, servers also piggyback their memory availability information on their page-in/page-out replies to individual clients. This distributed mechanism permits any new server in the network to dynamically announce its presence and allows existing servers to announce their up-to-date memory availability information to clients.
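On the server side, the periodic Resource Announcement can be pictured as a self-rescheduling work item. The sketch below assumes a hypothetical helper, rmap_broadcast_announcement(), that builds and broadcasts the RA frame; the workqueue calls are the standard delayed-work API of later 2.6 kernels.

#include <linux/workqueue.h>
#include <linux/jiffies.h>

#define RA_PERIOD (10 * HZ)   /* one announcement every 10 seconds */

/* Hypothetical helper: build an RMAP advertisement frame carrying this
 * server's identity and the amount of memory it is willing to contribute,
 * and broadcast it on the local segment. */
extern void rmap_broadcast_announcement(void);

static struct delayed_work ra_work;

static void ra_work_fn(struct work_struct *work)
{
    rmap_broadcast_announcement();
    schedule_delayed_work(&ra_work, RA_PERIOD);   /* re-arm for the next period */
}

static void ra_start(void)
{
    INIT_DELAYED_WORK(&ra_work, ra_work_fn);
    schedule_delayed_work(&ra_work, RA_PERIOD);
}

/* On module unload: cancel_delayed_work_sync(&ra_work); */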

3.2.4 Soft-State Refresh

Distributed Anemone also includes soft-state refresh mechanisms (keep-alives) to permit clients to track the liveness of servers and vice versa. Firstly, the RA message serves the additional purpose of informing the client that the server is alive and accepting paging requests. In the absence of any paging activity, if a client does not receive a server’s RA for three consecutive periods, it assumes that the server is offline and deletes the server’s entries from its hashtables. If the client also had pages stored on the server that went offline, it needs to recover the corresponding pages from a copy stored either on the local disk or in another server’s memory. Soft state also permits servers to track the liveness of clients whose pages they store. Each client periodically transmits a Session Refresh message to each server that hosts its pages (1 message every 10 seconds in our prototype), which carries a client-specific session ID. The client module generates a different and unique ID each time the client restarts. If a server does not receive refresh messages with matching session IDs from a client for three consecutive periods, it concludes that the client has failed or rebooted and frees up any pages stored on that client’s behalf.
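On the client side, liveness can be tracked with a per-server counter that is cleared on every announcement and incremented once per silent period. The structures and field names below are hypothetical, and the page-recovery path is deliberately elided.

#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/slab.h>
#include <linux/types.h>

#define MAX_MISSED_RA 3   /* three silent periods => server presumed offline */

/* Hypothetical per-server record kept by the client module. */
struct anemone_server {
    struct list_head list;
    u8               mac[6];
    unsigned int     missed_ra;        /* periods without an announcement   */
    unsigned long    avail_pages;      /* advertised capacity               */
    unsigned long    pages_stored;     /* our pages held by this server     */
    unsigned long    requests_served;  /* paging requests it handled for us */
};

static LIST_HEAD(server_list);
static DEFINE_SPINLOCK(server_lock);

/* Called whenever an RA (or a piggybacked load report) arrives. */
static void server_seen(struct anemone_server *s, unsigned long avail)
{
    s->missed_ra   = 0;
    s->avail_pages = avail;
}

/* Called once per announcement period from a timer or work item. */
static void server_liveness_scan(void)
{
    struct anemone_server *s, *tmp;

    spin_lock(&server_lock);
    list_for_each_entry_safe(s, tmp, &server_list, list) {
        if (++s->missed_ra >= MAX_MISSED_RA) {
            /* Presume the server offline: drop its entries and recover any
             * pages it held from disk or from another server (elided). */
            list_del(&s->list);
            kfree(s);
        }
    }
    spin_unlock(&server_lock);
}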

3.2.5 Server Load Balancing

Memory servers themselves are commodity nodes in the network that have their own processing and memory requirements. Hence, another design goal of Anemone is to avoid overloading any particular server node as far as possible by transparently distributing the paging load evenly. In the earlier centralized architecture, this function was performed by the memory engine, which kept track of server utilization levels. Distributed Anemone implements additional coordination among servers and clients to exchange accurate load information. Section 3.2.3 described the mechanism used to perform resource discovery. Clients use the server load information gathered from resource discovery to decide which server should receive each new page-out request. This decision is based upon one of two criteria: (1) the number of pages stored at each active server, and (2) the number of paging requests serviced by each active server. While (1) attempts to balance the memory usage at each server, (2) attempts to balance the request-processing overhead.
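Given per-server records like the ones sketched in Section 3.2.4 above, selecting the destination for a new page-out is a simple minimum scan over whichever of the two criteria is configured. The pages_stored and requests_served counters are the hypothetical fields from that sketch, not names from the actual implementation.

#include <linux/list.h>
#include <linux/spinlock.h>

/* Choose the target for a new page-out: either the server currently
 * storing the fewest of our pages (criterion 1) or the one that has
 * serviced the fewest requests (criterion 2). Reuses the server_list,
 * server_lock, and struct anemone_server sketched earlier. */
static struct anemone_server *anemone_pick_server(int by_requests)
{
    struct anemone_server *s, *best = NULL;
    unsigned long best_load = ~0UL;

    spin_lock(&server_lock);
    list_for_each_entry(s, &server_list, list) {
        unsigned long load = by_requests ? s->requests_served
                                         : s->pages_stored;

        if (s->avail_pages > 0 && load < best_load) {
            best_load = load;
            best = s;
        }
    }
    spin_unlock(&server_lock);

    return best;   /* NULL => no server has room, so fall back to disk */
}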

3.2.6 Fault-tolerance

The ultimate consequence of a failure in swapping to distributed memory is no worse than a failure in swapping to local disk. However, the probability of failure is greater in a LAN environment because of the multiple components involved in the process, such as network cards, connectors, and switches. Although RMAP provides reliable packet delivery at the protocol level, as described in Section 3.2.2, our future work plans to build two alternatives for tolerating server failures: (1) maintaining a local disk-based copy of every memory page swapped out over the network, which provides the same level of reliability as disk-based paging but risks performance interference from local disk activity; and (2) keeping redundant copies of each page on multiple distributed servers, which avoids disk activity and reduces recovery time but consumes bandwidth, reduces the global memory pool, and is susceptible to network failures. In an ideal implementation, the memory servers would participate in a protocol similar to RAID-5 [26].

3.3 Evaluation

The Anemone testbed consists of one 64-bit, low-memory AMD 2.0 GHz client machine containing 256 MB of main memory and nine distributed-memory servers. The DRAM on these servers is as follows: four 512 MB machines, three 1 GB machines, one 2 GB machine, and one 3 GB machine, totaling almost 9 gigabytes of distributed memory. The 512 MB servers have Intel processors ranging from 800 MHz to 1.7 GHz. The other five machines are all 2.7 GHz and above Intel Xeons, with mixed PCI and PCI Express motherboards.

For disk-based tests, we used a WD800JD 80 GB SATA disk with a 7200 RPM rotational speed, 8 MB of cache, and an 8.9ms average seek time (which is consistent with our results). This disk has a 10 GB swap partition reserved on it to match the equivalent amount of distributed memory available in the cluster, which we use exclusively when comparing our system against the disk. Each machine is equipped with an Intel PRO/1000 gigabit Ethernet card connected to one of two 8-port gigabit switches, one from Netgear and one from SMC. The performance results presented below can be summarized as follows. Distributed Anemone reduces read latencies to an average of 160µs, compared to an 8.3ms average for disk and a 500µs average for centralized Anemone. For writes, both disk and Anemone deliver similar latencies due to write caching. In our experiments, Anemone delivers a factor of 1.5 to 4 speedup for single-process LMAs, and up to a factor of 14 speedup for multiple concurrent LMAs. Our system can successfully operate with both multiple clients and multiple servers. We also ran experiments in which multiple client machines access the memory system simultaneously; these results are equally successful as the single-process cases.

Figure 3.5: Random read latency CDF for Anemone and local disk (500,000 random reads over a 6 GB space).

3.3.1 Paging Latency

To begin the experiments, we first characterize the microbenchmark behavior we observe for different types of I/O, for both read and write streams. The next four graphs present these results for both our memory system and the disk. Figures 3.5, 3.6, 3.7, and 3.8 show the distribution of observed read and write latencies for sequential and random access patterns with both Anemone and disk. Though real-world applications rarely generate purely sequential or completely random memory access patterns, these graphs provide a useful measure for understanding the underlying factors that impact application execution times. Most random read requests to disk experience a latency between 5 and 10 milliseconds. On the other hand, most requests in Anemone experience only around 160µs latency. Most sequential read requests to disk are serviced by the on-board disk cache within 3 to 5µs, because sequential read accesses fit well with the motion of the disk head. In contrast, Anemone delivers a range of latency values, most below 100µs. This is because network communication latency dominates in Anemone even for sequential requests, though it is masked to some extent by the prefetching performed by the pager and the file system within the Linux kernel. The write latency distributions for both disk and Anemone are comparable, with most latencies being close to 9µs, because writes typically return after writing to the local Linux buffer cache (which is now a unified page cache in Linux 2.6).

Figure 3.6: Sequential read latency CDF for Anemone and local disk (500,000 sequential reads over a 6 GB space).

Figure 3.7: Random write latency CDF for Anemone and local disk (500,000 random writes over a 6 GB space).

Figure 3.8: Sequential write latency CDF for Anemone and local disk (500,000 sequential writes over a 6 GB space).

Application   Size (GB)   Local Mem   Distr. Anemone   Disk    Speedup (Disk/Anemone)
POV-Ray       3.4         145         1996             8018    4.02
Quicksort     5           N/A         4913             11793   2.40
NS2           1           102         846              3962    4.08
KNN           1.5         62          7.1              2667    3.7

Table 3.1: Average application execution times (in seconds) and speedups for local memory, Distributed Anemone, and disk. N/A indicates insufficient local memory.

3.3.2 Application Speedup

Single-Process LMAs: Table 3.1 summarizes the performance improvements seen by unmodified single-process LMAs using the Anemone system. This setup, similar to the previous microbenchmarks, has a single LMA process on a single client node using the memory system, with all nine available servers at its disposal. The first application is a ray-tracing program called POV-Ray [81]. The memory consumption of POV-Ray was varied by rendering different scenes with an increasing number of colored spheres. Figure 3.9 shows the completion times of these increasingly large renderings, up to 3.4 GB of memory, versus the disk using an equal amount of local swap space. The figure clearly shows that Anemone delivers increasing application speedups with increasing memory usage and is able to improve the execution time of a single-process POV-Ray rendering by a factor of 4 at 3.4 GB of memory usage. The second application is a large in-memory Quicksort program that uses a C++ STL-based implementation [89], with a complexity of O(N log N) comparisons. We sorted randomly populated large in-memory arrays of integers. Figure 3.10 shows that Anemone delivers a factor of 2.4 speedup for a single-process Quicksort using 5 GB of memory. The third application is the popular NS2 network simulator [75]. We simulated a delay partitioning algorithm [42] on a 6-hop wide-area network path using voice-over-IP traffic traces. Factors contributing to memory usage in NS2 include the number of nodes being simulated, the amount of traffic sent between nodes, and the choice of protocols at different layers. Table 3.1 shows that, with NS2 requiring 1 GB of memory, Anemone speeds up the simulation by a factor of 4 compared to disk-based paging. The fourth application is the k-nearest neighbor (KNN) search algorithm on large 3D datasets, using code from [29]. This algorithm is useful in applications such as medical imaging, molecular biology, CAD/CAM, and multimedia databases. Table 3.1 shows that, when executing the KNN search algorithm over a dataset of 2 million points consuming 1.5 GB of memory, Anemone speeds up the search by a factor of 3.7 over disk-based paging.

Figure 3.9: Execution times of POV-Ray for increasing problem sizes (render time in seconds vs. amount of scene memory, for local memory, Anemone, and local disk).

Figure 3.10: Execution times of STL Quicksort for increasing problem sizes (sort time in seconds vs. sort size, for local memory, Anemone, and local disk).

Multiple Concurrent LMAs: In this section, we test the performance of Anemone under varying levels of concurrent application execution. Multiple concurrently executing LMAs tend to stress the system by competing for computation, memory, and I/O resources and by disrupting any sequentiality in paging activity, including competing for buffer space on the network switch itself, particularly at gigabit speeds. Figures 3.11 and 3.12 show the execution time comparison of Anemone and disk as the number of POV-Ray and Quicksort processes increases. The execution time measures the interval between the start of execution and the completion of the last process in the set. We keep each process at around 100 MB of memory. The figures show that execution times using disk-based swap increase steeply with the number of processes: paging activity loses sequentiality as the number of processes grows, making disk seek and rotational overheads dominant. On the other hand, Anemone reacts very well, as its execution time increases very slowly, due to the fact that network latencies are mostly constant regardless of sequentiality. With 12–18 concurrent LMAs, Anemone achieves speedups of a factor of 14 for POV-Ray and a factor of 6.0 for Quicksort.

Figure 3.11: Execution times of multiple concurrent processes executing POV-Ray (Anemone vs. local disk).

Figure 3.12: Execution times of multiple concurrent processes executing STL Quicksort (Anemone vs. local disk).

Figure 3.13: Effects of varying the transmission window using a 1 GB Quicksort (bandwidth achieved in Mbit/s, number of retransmissions, and completion time in seconds vs. maximum window size).

3.3.3 Tuning the Client RMAP Protocol

One of the important knobs in RMAP’s flow-control mechanism is the client’s transmission window size. Using a 1 GB Quicksort, Figure 3.13 shows the effect of changing this window size on three characteristics of Anemone’s performance: (1) the number of retransmissions, (2) paging bandwidth, represented in terms of “goodput”, i.e. the amount of bandwidth obtained after excluding retransmitted bytes and header bytes, and (3) completion time. Recall that in our implementation of the RMAP protocol we use a static window size, configured once before runtime. This means that the traditional sense of “flow control” that you would expect from a TCP-style protocol is not fully dynamic. As a result, our window size is chosen empirically to be large enough to maintain network throughput but small enough to fit within the capabilities of the NIC’s ring buffers. A complete implementation of RMAP would provide a dynamic flow-control window, but we leave that to future work.

To demonstrate this, Figure 3.13 shows that as the window size increases, the number of retransmissions increases, because the number of packets that can potentially be delivered back-to-back also increases. For larger window sizes, the paging bandwidth is also seen to increase and then saturate, because the transmission link remains busy more often, delivering higher goodput in spite of an initial increase in the number of retransmissions. However, if driven too high, the window size causes the paging bandwidth to decline considerably due to an increasing number of packet drops and retransmissions. The application completion times depend upon the paging bandwidth: initially, an increase in window size increases the paging bandwidth and lowers the completion times; similarly, if driven too high, the window size causes more packet drops, more retransmissions, lower paging bandwidth, and higher completion times.
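A minimal sketch of this static-window scheme: an atomic count of unacknowledged frames is compared against the configured maximum before each transmission. The names and the default value are illustrative; the real module must also park and later resume the block-I/O path when the window is full.

#include <linux/atomic.h>   /* <asm/atomic.h> on older 2.6 kernels */

/* Window configured once at module load time, as described above. */
static int rmap_max_window = 8;              /* illustrative default */
static atomic_t rmap_in_flight = ATOMIC_INIT(0);

/* Returns non-zero if a new frame may be transmitted right now. */
static int rmap_window_open(void)
{
    return atomic_read(&rmap_in_flight) < rmap_max_window;
}

static void rmap_frame_sent(void)      /* after handing the frame to the NIC */
{
    atomic_inc(&rmap_in_flight);
}

static void rmap_ack_received(void)    /* on every acknowledgment */
{
    atomic_dec(&rmap_in_flight);
    /* Here the real implementation would kick any queued requests. */
}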

3.3.4 Control Message Overhead

To measure the control traffic overhead due to RMAP, we measured the percentage of control bytes generated by RMAP compared to the amount of data bytes transferred while executing a 1 GB POV-Ray application. Control traffic refers to the page headers, acknowledgments, resource announcement messages, and soft-state refresh messages. We first varied the number of servers from 1 to 6, with a single client executing the POV-Ray application. Next, we varied the number of clients from 1 to 4 (each executing one instance of POV-Ray), with 3 memory servers. The percentage of control traffic overhead was consistently measured at 1.74% – a very small percentage of the total paging traffic.

3.4 Summary

In this Chapter, we presented Distributed Anemone – a system that enables unmodified large memory applications to transparently utilize the unused memory of nodes across a gigabit Ethernet LAN. Unlike its centralized predecessor, Distributed Anemone features fully distributed memory resource management, low-latency distributed memory paging, distributed resource discovery, load balancing, soft-state refresh to track the liveness of nodes, and the flexibility to use Jumbo Ethernet frames. We presented the architectural design and implementation details of a fully operational Anemone prototype. Evaluations using multiple real-world applications, including ray-tracing, large in-memory sorting, network simulations, and nearest neighbor search, show that Anemone speeds up single-process applications by up to a factor of 4 and multiple concurrent processes by up to a factor of 14, compared to disk-based paging. Average page-fault latencies are reduced from 8.3ms with disk-based paging to 160µs with Anemone.

Chapter 4

MemX: Virtual Machine Uses of Distributed Memory

In this Chapter, we present our experiences in developing a fully transparent distributed system, called MemX, within the Xen VM environment that coordinates the use of cluster-wide memory resources to support large memory workloads.

4.1 Introduction

In modern cluster-based platforms, VMs can enable functional and performance isolation across applications and services. VMs also provide greater resource allocation flexibility, improve utilization efficiency, enable seamless load balancing through VM migration, and lower the operational cost of the cluster. Consequently, VM environments are increasingly being considered for executing grid and enterprise applications over commodity high-speed clusters. However, such applications tend to have memory workloads that can stress the limited resources within a single VM by demanding more memory than the slice available to the VM. Clustered bastion hosts (mail, network-attached storage), data mining applications, scientific workloads, virtual private servers, and backend support for websites are common examples of resource-intensive workloads. I/O bottlenecks in these applications can quickly form due to frequent access to large disk-resident datasets, paging activity, flash crowds, or competing VMs on the same node. Even though virtual machines with demanding workloads are here to stay as integral parts of modern clusters, significant improvements are needed in the ability of memory-constrained VMs to handle these workloads.

I/O activity due to memory pressure can prove to be particularly expensive in a virtualized environment, where all I/O operations need to traverse an extra layer of indirection. Over-provisioning of memory resources (and, in general, any hardware resource) within a physical machine may not always be a viable solution, as it can lead to poor resource utilization efficiency, besides increasing operational costs. Although domain-specific out-of-core computation techniques [56, 65] and migration strategies [71, 17, 27] can also improve application performance up to a certain extent, they do not overcome the fundamental limitation that an application is restricted to using the memory resources within a single physical machine, particularly for the aforementioned applications that are not generally parallelized.

In this Chapter, we present the design, implementation, and evaluation of the MemX system for VMs, which bridges the I/O performance gap in a virtualized environment by exploiting low-latency access to the memory of other nodes across a gigabit cluster. MemX is fully transparent to user applications – developers do not need any specialized APIs, libraries, recompilation, or relinking for their applications, nor does the application’s dataset need any special pre-processing, such as data partitioning across nodes. We compare and contrast the three modes in which MemX can operate with Xen VMs [20]:

1. MemX-DomU: the system within individual guest virtual machines. The letter ‘U’ in “DomU” refers to the guest domain; specifically, it refers to the fact that these domains are “unprivileged” relative to Dom0.

2. MemX-DD: the system within a common driver domain (DD); in this case, Dom0 functions as the DD. However, this time the system is shared by multiple guest OSes that co-reside with the DD. We use “DD” to indicate that the client module runs in the same place as in the MemX-Dom0 case (within Domain Zero itself), except that the client module is actually used by applications located within guest VMs (DomU), rather than by applications within the driver domain (Dom0) itself.

3. MemX-Dom0: the distributed memory system within the privileged management domain, called “Dom0” in Xen terms. This represents the base virtualization overhead without the presence of other guest virtual machines.

The proposed techniques can also work with other VM technologies besides Xen. We focus on Xen mainly due to its open source availability and para-virtualization support. In the performance section, we also compare all three options to the baseline case where just a regular, non-virtualized Linux system is used as described in Chapter 3.

4.2 Split Driver Background

As we stated in Chapter 2, Xen is an open source virtualization technology that provides secure resource isolation. Xen provides close to native machine performance through the use of para-virtualization [97] – a technique by which the guest OS is co-opted into reducing the virtualization overheads via modifications to its hardware dependent components. The modifications enable the guest OS to execute over virtualized hardware and devices rather than over bare metal. In this section, we review the background of the Xen I/O subsystem as it relates to the design of MemX.

Xen exports I/O devices to each guest OS (DomU) as virtualized views of “class” devices, as opposed to real physical devices. For example, Xen exports a block device or a network device, rather than a specific hardware make and model. The actual drivers that interact with the native hardware devices execute within Dom0 – the privileged domain that can directly access all hardware in the system. Dom0 acts as the management VM that coordinates device access and privileges among all of the other guest domains. In the rest of the Chapter, we will use the terms driver domain and Dom0 interchangeably.

Physical devices (and their device drivers) can be multiplexed among multiple concurrently executing guest OSes. To enable this multiplexing, the privileged driver domain and the unprivileged guest domains (DomU) communicate by means of a split device-driver architecture, shown in Figure 4.1. The driver domain hosts the backend of the split driver for the device class and the DomU hosts the frontend. The backends and frontends interact using high-level device abstractions instead of low-level, hardware-specific mechanisms. For example, a DomU only cares that it is using a block device; it does not worry about the specific type of driver that is controlling that block device.

Figure 4.1: Split device driver architecture in Xen: a frontend driver in the guest OS and a backend driver in the driver domain communicate through event channels and grant tables provided by the Xen hypervisor, which also mediates safe access to the physical device via the native driver.

Frontends and backends communicate with each other via the grant table: an in-memory communication mechanism that enables efficient bulk data transfers across domain boundaries. The grant table enables one domain to allow another domain access to its pages in system memory. The access mechanism can include read, write, or mutual exchange of pages. The primary use of the grant table in device I/O is to provide a fast and secure mechanism for unprivileged DomU domains to receive indirect access to hardware devices. Grants enable the driver domain to set up a DMA-based data transfer directly to/from the system memory of a DomU, rather than performing the DMA to/from the driver domain’s memory with additional copying of the data between the DomU and the driver domain. In other words, the grant table enables zero-copy data transfers across domain boundaries.

The grant table can be used to either share or transfer pages between the DomU and the driver domain, depending upon whether the I/O operation is synchronous or asynchronous in nature. For example, because block devices perform synchronous data transfers, the driver domain knows at the time of I/O initiation which DomU issued the block I/O request. In this case, the frontend of the block driver in DomU notifies the Xen hypervisor (via the gnttab_grant_foreign_access hypercall) that a memory page can be shared with the driver domain. A hypercall is the hypervisor’s equivalent of a system call in the operating system. The DomU then passes a grant table reference ID via the event channel to the driver domain, which sets up a direct DMA to/from the memory page of the DomU. Once the DMA is complete, the DomU removes the grant reference (via the gnttab_end_foreign_access call). On the other hand, network devices receive data asynchronously. This means that the driver domain does not know the target DomU for an incoming packet until the entire packet has been received and its header examined. In this situation, the driver domain DMAs the packet into its own page and notifies the Xen hypervisor (via the gnttab_grant_foreign_transfer call) that the page can be transferred to the target DomU. The driver domain then transfers the received page to the target DomU and receives a free page in return from the DomU. In summary, Xen’s I/O subsystem for shared physical devices uses a split driver architecture that involves an additional level of indirection through the driver domain and the Xen hypervisor, with efficient optimizations to avoid data copying during bulk data transfers.
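A hedged sketch of the DomU side of the synchronous (block-device) grant path, using the Linux/Xen grant-table helpers named above. Error handling, the ring/event-channel notification, and the machine-frame-number lookup are elided, and the exact signatures can differ slightly between kernel versions.

#include <xen/grant_table.h>

/* DomU side of the synchronous block path: grant the driver domain
 * (backend_domid) access to the page backing one block I/O request,
 * send the returned grant reference over the ring / event channel,
 * and revoke the grant once the backend signals completion.
 * mfn is the page's machine frame number, obtained via the pv-ops
 * page translation helpers. */
static int share_page_with_backend(domid_t backend_domid,
                                   unsigned long mfn, int readonly)
{
    /* Returns a grant reference, or a negative value if none is free. */
    return gnttab_grant_foreign_access(backend_domid, mfn, readonly);
}

static void unshare_page(int gref, int readonly)
{
    /* The last argument may name a page to free once the grant ends;
     * passing 0 simply ends the foreign access. */
    gnttab_end_foreign_access(gref, readonly, 0UL);
}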

4.3 Design and Implementation

The core functionality of the MemX system partially builds upon our previous work and is encapsulated within kernel modules that do not require modifications to either the Linux kernel or the Xen hypervisor. However, the interaction of the core modules with the rest of the virtualized subsystem presents several alternatives. In this section, we briefly discuss the different design alternatives for the MemX system, justify the decisions we make, and present the implementation details.

Figure 4.2: MemX-Linux: Baseline operation of MemX in a non-virtualized Linux environment. The client can communicate with multiple memory servers across the network to satisfy the memory requirements of large memory applications.

4.3.1 MemX-Linux: MemX in Non-virtualized Linux

Figure 4.2 shows the operation of MemX in a non-virtualized (vanilla) Linux environment. An earlier variant of MemX-Linux was published in [51]; MemX-Linux includes several additional features listed later in this section. For completeness, we summarize the architecture of MemX-Linux here and use it as a baseline for comparison with the virtualized versions of MemX – the primary focus of this work.

The two main components of MemX-Linux are the client module on the low-memory machines and the server module on the machines with unused memory. The two communicate with each other using a remote memory access protocol (RMAP), described in detail in Chapter 3, Section 3.2.2. Both client and server components execute as isolated Linux kernel modules. Aside from optimizations, this code operates much the same way as described in Chapter 3. Nevertheless, there are a number of important changes to that work, and we present a brief summary of those components here.

Client and Server Modules: The client module provides a virtualized block device interface to the large dataset applications executing on the client machine. This block device can either be: a) configured as a low-latency primary swap device, b) treated as a low-latency volatile store for large data sets accessed via the standard file-system interface, or c) memory mapped into the address space of an executing large memory application. Internally, the client module maps the single linear I/O space of the block device to the unused memory of multiple distributed servers, using a memory-efficient, radix-tree-based mapping. The old system used a hashtable-based implementation, but we found it to use large amounts of memory for the table data structure (for buckets and entries), particularly as we acquired newer machines with substantially more memory than our old ones. A radix tree is a modified trie structure in which the tree is indexed by strings over an alphabet, one character at a time.

This structure works well for key types such as addresses and file offsets, which are used throughout the distributed memory system. As before, the memory system discovers and communicates with distributed server modules using a custom-designed, reliable protocol. Servers broadcast periodic resource announcement messages, which the client modules use to discover the available memory servers. Servers also include feedback about their memory availability and load during both resource announcements and regular page transfers with clients. When a server reaches capacity, it declines to serve any new write requests from clients, which then try to select another server, if available, or otherwise write the page to disk. Binding these modules together is the Remote Memory Access Protocol (RMAP); this protocol is described later in more detail than was provided in the previous Chapter. The server module is also designed to allow a server node to be taken down while live: our RMAP implementation can disperse, re-map, and load-balance an individual server’s pages to any other servers in the cluster that are capable of absorbing those pages, allowing the server to shut down without killing any of its clients’ applications. Getting this custom protocol to work properly in a virtualized environment exposed a great number of kernel bugs that were not originally present in the Anemone prototype, making the system much more robust.
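The page-offset-to-server mapping described above can be held in the kernel’s generic radix tree, keyed by the page index on the MemX block device. The memx_mapping record and the helper names below are hypothetical, not the actual MemX code.

#include <linux/radix-tree.h>
#include <linux/gfp.h>
#include <linux/types.h>

/* Hypothetical mapping record: which server holds a given page. */
struct memx_mapping {
    u8  server_mac[6];
    u32 flags;
};

/* One tree per MemX block device; GFP_ATOMIC because insertions can
 * happen from the block-I/O path. A production implementation would
 * preload the tree (radix_tree_preload()) before inserting. */
static RADIX_TREE(memx_map_tree, GFP_ATOMIC);

static int memx_map_insert(unsigned long page_index, struct memx_mapping *m)
{
    return radix_tree_insert(&memx_map_tree, page_index, m);
}

static struct memx_mapping *memx_map_lookup(unsigned long page_index)
{
    return radix_tree_lookup(&memx_map_tree, page_index);
}

static struct memx_mapping *memx_map_remove(unsigned long page_index)
{
    return radix_tree_delete(&memx_map_tree, page_index);
}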

Additional Virtualization Features: MemX also includes a couple of additional features that are not the specific focus of this work and that were not present in the original Anemone system either. The first is the ability to support named distributed memory data spaces that can be shared by multiple clients. This provides a read-only DSM system in which data stored on server nodes remains persistent through the life of the server, even when client nodes disconnect from the system altogether. When a client re-connects, all servers that have past records for that client re-forward the necessary mapping information, allowing the client to reconstruct its radix-tree mappings and begin re-accessing the same persistent data. The system does not allow multiple concurrent writers, however, as this was not the focus of our work. Two other features turned out to be very important to the memory system as a whole for virtualization-specific reasons. First, because of the split-driver design described above, the device driver needs to support multiple major and minor block numbers. The driver is then responsible for mapping one device per local virtual machine on a physical host, allowing completely seamless, transparent access by multiple VM clients on the same host. This was part of the motivation behind switching from a hashtable to a radix tree: the mapping data stored to look up page locations would be relatively large for so many virtual machines, on the order of tens of megabytes. It turned out that the worst-case lookup time for the tree was comparable to the hashtable and did not detract from the efficient performance of the system. Second, we had to optimize the fragmentation implementation designed in the Anemone system: it had to support zero-copy transmission and receipt of page fragments, which would not have worked properly in the original system.

4.3.2 MemX-DomU (Option 1): MemX Client Module in DomU

In order to support large dataset applications within a VM environment, the simplest design option is to place the MemX client module within the kernel of each guest OS (DomU), whereas the distributed server modules continue to execute within a non-virtualized Linux kernel on machines connected to the physical network. This option is illustrated in Figure 4.3. The client module exposes the block device interface to large memory applications within the DomU as in the baseline, but communicates with the distributed servers using the virtualized network interface (VNIC) exported by the network driver in the driver domain. The VNIC in Xen is also organized as a split device driver in which the frontend (residing in the guest OS) and the backend (residing in the driver domain) talk to each other using the well-defined grant table and event channel mechanisms. Two event channels are used between the backend and frontend of the VNIC – one for packet transmissions and one for packet receptions. To perform zero-copy data transfers across the domain boundaries, the VNIC performs a page exchange with the backend for every packet received or transmitted, using the grant table. All backend interfaces in the driver domain can communicate with the physical NIC, as well as with each other, via a virtual network bridge. Each VNIC is assigned its own MAC address, whereas the driver domain’s own internal VNIC in Dom0 uses the physical NIC’s MAC address. The physical NIC itself is placed in promiscuous mode by the driver domain to enable the reception of any packet addressed to any of the local virtual machines. The virtual bridge demultiplexes incoming packets directed towards the target VNIC’s backend driver.

Figure 4.3: MemX-DomU: Inserting the MemX client module within DomU’s Linux kernel. The server executes in non-virtualized Linux.

Compared to the baseline non-virtualized MemX-Linux deployment, MemX-DomU has the additional overhead of requiring every network packet to traverse domain boundaries, in addition to being multiplexed or demultiplexed at the virtual network bridge. Additionally, the client module needs to be separately inserted within each DomU that might potentially execute large memory applications. Also note that each I/O request is typically 4 KBytes in size, whereas our network hardware uses a 1500-byte MTU (maximum transmission unit), unless the underlying network supports Jumbo frames. Thus the client module needs to fragment each 4 KByte write request into (and reassemble a complete read reply from) at least 3 network packets. In MemX-DomU, each fragment needs to traverse the domain boundary to reach the backend. Due to current memory allocation policies in Xen, buffering for each fragment ends up consuming an entire 4 KByte page worth of memory, which results in three times the actual memory needed within the machine. In the non-virtualized case, each of those fragments would come from the same physical page because of the internal Linux slab allocator, but virtualization requires those fragments to be separated out. Newer Xen versions may offer solutions to this type of problem, but we leave it for now. We contrast this performance overhead in greater detail with MemX-DD (option 2) below.

4.3.3 MemX-DD (Option 2): MemX Client Module in Driver Domain

A second design option is to place the MemX client module within the driver domain (Dom0) and allow multiple DomUs to share this common client module via their virtualized block device (VBD) interfaces. This option is shown in Figure 4.4. The guest OS executing within the DomU VM does not require any MemX-specific modifications. The MemX client module executing within the driver domain exposes a block device interface, as before. Any DomU whose applications require distributed memory resources configures a split VBD. The frontend of the VBD resides in the DomU and the backend in the driver domain. The frontend and backend of each VBD communicate using event channels and the grant table, as in the earlier case of VNICs. (This splitting of interfaces is completely automated by the Xen system itself.) The MemX client module provides a separate lettered VBD slice (/dev/memx{a,b,c}, etc.) for each backend that corresponds to a distinct DomU. On the network side, the MemX client module attaches itself to the driver domain’s VNIC, which in turn talks to the physical NIC via the virtual network bridge. For performance reasons, here we assume that the VNIC and the disk are co-located – meaning both drivers are within the same privileged driver domain (Dom0). Thus the driver domain’s VNIC does not need to be organized as another split driver; rather, it is a single software construct that can attach directly to the virtual bridge. During execution within a DomU, read/write requests to distributed memory are generated in the form of synchronous I/O requests to the corresponding virtual block device frontend. These requests are sent to the MemX client module via the event channel and the grant table. The client module packages each I/O request into network packets and transmits them asynchronously to distributed memory servers using RMAP.

Figure 4.4: MemX-DD: Executing a common MemX client module within the driver domain, allowing multiple DomUs to share a single client module. The server module continues to execute in non-virtualized Linux.

Note that, although the network packets still need to traverse the virtual network bridge, they no longer need to traverse a split VNIC architecture, unlike in MemX-DomU. One consequence is that, while the client module still needs to fragment a 4 KByte I/O request into 3 network packets to fit the MTU requirements, each fragment no longer needs to occupy an entire 4 KByte buffer, unlike in MemX-DomU. As a result, only one 4 KByte I/O request needs to cross the domain boundary over the split block device driver, as opposed to three 4 KB packet buffers in Section 4.3.2. Finally, since the guest OSes within DomUs do not require any MemX-specific software components, the DomUs can potentially run any para-virtualized OS, not just XenoLinux.

However, compared to the non-virtualized baseline case, MemX-DD still has the additional overhead of using the split VBD and the virtual network bridge, though still with highly acceptable performance. Also note that, unlike MemX-DomU, MemX-DD does not currently support seamless migration of live Xen VMs using distributed memory. This is because part of the internal state of the guest OS (in the form of page-to-server mappings) resides in the driver domain of MemX-DD and is not automatically transferred by the migration mechanism in Xen. We plan to enhance Xen’s migration mechanism to transfer this internal state information in a host-independent manner to the target machine’s MemX-DD module. Furthermore, our current implementation does not support per-DomU reservation of distributed memory, which can potentially violate isolation guarantees. This reservation feature is currently being added to our prototype.

4.3.4 MemX-Dom0 (Option 3)

As we mentioned in the introduction, we will also present this scenario. Again, this is described as the distributed memory system within Dom0 (same as the driver domain), except that the applications are executed directly within this domain and not inside of a guest domain. This represents the base virtualization overhead without the presence of other guest virtual machines.

4.3.5 Alternative Options

Guest Physical Address Space Expansion: Another alternative to supporting large memory applications with direct distributed memory is to enable support for this indirectly via a larger pseudo-physical memory address space than is normally available within the physical machine. This option would require fundamental modifications to the memory management in both the Xen hypervisor as well as the guest OS. In particular, at boot time, the guest OS would believe that it has a large “physical” memory – or the so called pseudo- physical memory space. It then becomes the Xen hypervisor’s task to map each DomU’s large partly into guest-local memory, partly into distributed memory, and the rest to sec- ondary storage. This is analogous to the large conventional virtual address space available to each process that is managed transparently by traditional operating systems. The func- tionality provided by this option is essentially equivalent to that provided by MemX-DomU

and MemX-DD. However, this option requires the Xen hypervisor to take up a prominent

role in the memory address translation process, something that the original design of Xen strives

to minimize. Exploring this option is the focus of Chapter 6.
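As a rough illustration of what such a design would entail, the following is a hedged sketch (not an actual Xen or MemX data structure) of the per-DomU mapping the hypervisor would have to maintain, with each pseudo-physical frame resolving to local RAM, a remote memory server, or secondary storage.

    /* Hypothetical per-DomU mapping table for the address-space-expansion
     * option described above. All names are illustrative. */
    enum pfn_location { PFN_LOCAL, PFN_REMOTE, PFN_DISK };

    struct pfn_entry {
        enum pfn_location where;
        union {
            unsigned long mfn;                                        /* machine frame if local  */
            struct { unsigned server; unsigned long offset; } remote; /* distributed memory      */
            unsigned long block;                                      /* secondary-storage block */
        } loc;
    };

    /* On a guest page fault, the hypervisor would look up the faulting
     * pseudo-physical frame number and fetch or map it accordingly. */
    struct pfn_entry *resolve_pfn(struct pfn_entry *table, unsigned long pfn)
    {
        return &table[pfn];
    }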

MemX Server Module in DomU: Technically speaking, we can also execute the MemX server module within a guest OS, coupled with Options 1 or 2 above. This could enable one to initiate a VM solely for the purpose of providing distributed memory to other low-memory client VMs that are either across the cluster or even within the same physical machine. However, in practice, this option does not seem to provide any significant functional benefit, whereas the overheads of executing the server module within a DomU are considerable. It is also unnecessary because our system already supports the re-distribution of server memory to nearby servers, allowing a server to shut down if necessary; this obviates the need to run the module within a virtual machine. Consequently, we do not pursue this option further.

4.3.6 Network Access Contention

Handling network contention within the physical machine itself was the biggest (solvable) difficulty arising from our decision to implement RMAP without TCP/IP. Three major factors contribute to network contention in our system:

• Inter-VM Congestion: MemX generates traffic at the block-I/O level. In a virtual machine environment, each guest VM on a given node assumes that it has full control of the NIC, when in reality that NIC is generally shared among multiple VMs. We elaborate on this simple but important problem of inter-VM congestion in Section 4.4.3 while evaluating multiple VM performance.

• Flow Control: Currently, RMAP uses a static send window per MemX node. In a subnet with fairly constant round trip times, this serves us well, although a reactive approach where the receiver informs the client of the size of its receive window could be easily deployed. We have not observed a need for this feature as of yet.

• Switch/Server Congestion: MemX servers in the network can potentially be the destination for dozens of client pages. Two or more clients generating traffic towards a particular server can quickly overwhelm both the switch port and the server itself. As a partial solution to this problem, MemX clients perform load-balancing across MemX servers by dynamically selecting the least loaded server for page write operations (see the sketch after this list). Empirically, we have observed that congestion happens only when the number of clients significantly outweighs the number of servers. If MemX were scaled to hundreds of switched nodes, a cross-bar or fat-tree design in addition to more advanced switch-bound congestion control would be mandatory, but our 8-node cluster has not warranted this as of yet. We plan to handle this if our testbed scales to more nodes.
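The following is a minimal sketch of the kind of least-loaded server selection described in the last bullet. The structure fields and the idea of a per-server free-page count advertised by the servers are assumptions for illustration, not MemX's actual bookkeeping.

    /* Hypothetical client-side server selection for page writes. */
    struct memx_server {
        unsigned int  id;
        unsigned int  pending_writes;   /* outstanding, un-ACKed page writes        */
        unsigned long free_pages;       /* assumed to be advertised by the servers  */
    };

    /* Pick the server with the fewest outstanding writes that still has room. */
    struct memx_server *pick_least_loaded(struct memx_server *srv, int n)
    {
        struct memx_server *best = NULL;
        int i;

        for (i = 0; i < n; i++) {
            if (srv[i].free_pages == 0)
                continue;               /* server cannot absorb another page */
            if (!best || srv[i].pending_writes < best->pending_writes)
                best = &srv[i];
        }
        return best;                    /* NULL means all servers are full; caller must wait */
    }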

4.4 Evaluation

In this section we evaluate the performance of the different variants of MemX. Our goal is to answer the following questions:

• How do the different variants of MemX compare in terms of I/O latency and bandwidth?

• What are the overheads incurred by MemX due to virtualization in Xen?

• What type of speedups can be achieved by real large memory applications using MemX when compared to virtualized disk?

• How well does MemX perform in the presence of multiple concurrent VMs?

Our testbed consists of eight machines. Each machine has 4 GB of memory, an SMP 64-bit dual-core 2.8 GHz processor, and one gigabit Broadcom Ethernet NIC. Our Xen version is

3.0.4 and XenoLinux version 2.6.16.33. Backend MemX-servers run Vanilla Linux 2.6.20.

Collectively, this provides us with over 24GB of effectively usable cluster-wide memory after accounting for roughly 1GB of local memory usage per node. We limit the local memory of client machines to a maximum of 512 MB under all test cases. In addition to the three MemX configurations described earlier, namely MemX-Linux, MemX-DomU, and

MemX-DD, we also include a fourth configuration, MemX-Dom0, for the sole purpose of performance evaluation. This additional configuration corresponds to the MemX client module executing within Dom0 itself, but not as part of the backend for a VBD. Rather, the client module in MemX-Dom0 serves large memory applications executing within Dom0 and helps to measure the basic virtualization overhead due to Xen. Furthermore, whenever we mention the "disk" baseline, we are referring to the virtualized disk within Dom0. When

MemX-DD or MemX-DomU is compared to virtualized disk in any experiment, it means that we exported the virtualized disk as a frontend VBD to the dependent guest VM, just as we exported the block device from MemX itself to applications.

Configuration        Kernel RTT
MemX-Linux           85 usec
MemX-Dom0            95 usec
MemX-DD              95 usec
MemX-DomU            115 usec
Virtualized Disk     8.3 millisec

Table 4.1: Kernel-level I/O round-trip latency for each MemX configuration.

4.4.1 Latency and Bandwidth Microbenchmarks

Figure 4.5 and Table 4.1 characterize different MemX-combinations in terms of these two metrics. Table 4.1 shows the average round trip time (RTT) for a single 4KB read request transmitted from a client module and replied to by a server node. The RTT is measured in microseconds, using the on-chip time stamp counter (TSC) register at the kernel level in the client module immediately before transmission to the NIC and after reception of the

ACK from the NIC. Thus the measured RTT values include only MemX-related time components and exclude the variable time required to deliver the page to user-level, put that process back on the ready-queue, and perform a context switch. Moreover, this is the latency that the VFS (virtual filesystem) or the system pager would experience when sending

I/O to and from MemX. MemX-Linux, as a base case, provides an RTT of 85µs. Following close behind are MemX-Dom0, MemX-DD, and MemX-DomU, in that order. The virtualized disk base case performs as expected at an average of 8.3ms. These RTT numbers show that accessing the memory of a remote machine over the network is about two orders of magnitude faster than accessing the local virtualized disk. Also, the Xen VMM introduces a negligible overhead of 10µs in MemX-Dom0 and MemX-DD over MemX-Linux. Similarly, the split network driver architecture, which needs to transfer 3 packet fragments for each 4KB block across the domain boundaries, introduces an overhead of another 20µs in MemX-DomU over MemX-Dom0 and MemX-DD.
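As an illustration of how such TSC-based round-trip timings can be taken, here is a hedged sketch of the measurement pattern; the client module does this at the kernel level, and the cpu_khz calibration value and the two callbacks below are assumptions, not the actual MemX code.

    /* Sketch of TSC-based RTT measurement, assuming an x86 CPU and a
     * calibrated cpu_khz value (cycles per millisecond / 1000). */
    #include <stdint.h>

    static inline uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    /* Wrap one request/reply exchange and return the elapsed microseconds. */
    uint64_t measure_rtt_usec(void (*send_request)(void), void (*wait_for_ack)(void),
                              uint64_t cpu_khz)
    {
        uint64_t start = rdtsc();
        send_request();      /* hand the 4KB read request to the NIC */
        wait_for_ack();      /* block until the reply arrives from the server */
        return ((rdtsc() - start) * 1000) / cpu_khz;   /* cycles -> microseconds */
    }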

Figure 4.5 shows throughput measurements using a custom benchmark [52] that issues long streams of random/sequential, asynchronous, 4KB requests. We ensure that the range of requests is at least twice the size of the local memory of a

Figure 4.5: I/O bandwidth for different MemX configurations, using a custom benchmark that issues asynchronous, non-blocking 4-KB I/O requests. "DIO" refers to opening the file descriptor with direct I/O turned on, to measure the effect of bypassing the Linux page cache.

client node (over 1 GB). These tests give us insight during development into where bottlenecks might exist. The throughput for all of the tests is generally at its maximum, minus the effect of CPU overhead. A small loss of 50 Mbits/second naturally occurs for

MemX-DomU, which is to be expected. The only case that suffers is random reads, which all hover around 300 Mbits/second. There is a very specific reason for this that is a direct artifact of the way the VFS in the Linux kernel handles asynchronous I/O (AIO) [60].

Asynchronous I/O and Scheduling in Linux. Block devices, by nature, handle all I/O asynchronously (AIO) unless otherwise instructed by the Virtual Filesystem (VFS). In

Linux, the AIO call stack is the fundamental atomic operation to the device (through the page cache) by which other types of I/O are realized. As of 2007, the AIO hierarchy in

Linux uses a separate thread that is arranged to run in the same process context as the user application that submitted the I/O (for those file descriptors that are asynchronous).

This way, the application can continue doing other work and check for the results later.

The core problem described here involves the kernel thread that handles AIO system calls itself: it is in fact executed synchronously after the request handoff has been made. Linux

(and perhaps other kernels) is capable of accepting a submission of multiple (sparse) AIO reads/writes using a single system call. After the system call returns, the thread then synchronously issues those I/Os to the device driver one by one (blocking and removing itself from the run-queue). For devices with variable latencies (e.g., disks), this long-standing

VFS design makes sense: the I/O should block while the device is kept busy by the dynamically generated parallel I/O produced by the read-ahead (prefetching) policies of the Linux page cache. But for random-access style devices, this offers no benefit. Additionally, the

Linux I/O scheduler makes similar assumptions for devices that have request queues

(per-driver queues that re-order I/Os for better fairness and latency guarantees).

What this means for MemX, in both virtualized and non-virtualized environments, is that outbound randomly-spaced block read I/O bandwidth (not networking bandwidth) is cut by two thirds, to about one third of its normal speed. This creates a chain reaction for these kinds of randomly-spaced reads: rather than getting a read performance of a full gigabit per second over the network, the application only experiences about three hundred megabits per second. This does not significantly affect the speedups we report in the next section, but it does explain some of the microbenchmark results presented at the beginning. To solve the problem in the future, we propose a "re-plumbing" of the VFS and

I/O scheduling subsystems to dynamically detect the underlying latency characteristics of the device (specifically, whether its latency is constant or variable) in order to allow those subsystems to take alternate code-paths that can fully exploit the deliverable performance of the underlying device. The actual blocking call is lock_page(), invoked within do_generic_mapping_read() in the Linux AIO call stack. On the bright side, as of 2007, there was a patch [60] in progress (contact information for the developers can be found in linux-2.6.xx/MAINTAINERS). The patch could be modified to handle the more specific case that MemX needs, rather than be a generic solution for all users of the page cache. We also noticed that, if the user is a userland C program

(versus, say, a filesystem thread running within the kernel), then setting O_DIRECT on the

file descriptor will cause the system call to bypass the page cache and go directly to the block I/O (BIO) layer.

Maximum throughput will then be realized. We also observed that, out of the 4 I/O schedulers available in Linux, none of them have any effect whatsoever on device drivers that do not use a request queue, which is the case for our client module implementation, since it exhibits random-access style latencies when pages are accessed through network memory. Demonstrating this problem involved: (1) instrumenting the Linux AIO stack to print out TSC-based microsecond estimates, (2) logging the MemX outbound queue size, (3) recording the amount of time between successive requests being handed to the device driver, (4) forming a preliminary hypothesis from the observation that the dependent process was spending too much time idly waiting (inside mwait_idle()), and (5) finally receiving confirmation of the hypothesis from the mainline kernel developers.
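For concreteness, below is a small sketch of the direct-I/O path just described: opening a MemX slice with O_DIRECT so that reads bypass the page cache. The 4KB alignment requirement is the usual O_DIRECT constraint; error handling is kept minimal.

    /* Read one 4KB block from a block device with O_DIRECT (no page cache). */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    int read_page_direct(const char *dev, off_t offset, void **out)
    {
        void *buf;
        int fd = open(dev, O_RDONLY | O_DIRECT);
        if (fd < 0)
            return -1;
        if (posix_memalign(&buf, 4096, 4096)) {    /* O_DIRECT requires aligned buffers */
            close(fd);
            return -1;
        }
        ssize_t n = pread(fd, buf, 4096, offset);  /* one block, straight to the BIO layer */
        close(fd);
        if (n != 4096) {
            free(buf);
            return -1;
        }
        *out = buf;
        return 0;
    }

A call such as read_page_direct("/dev/memxa", 0, &page) would then fetch one block without touching the filesystem buffer cache.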

Figures 4.6 through 4.8 compare the distributions of the total RTT measured from a user level application that performs either sequential or random I/O on either MemX or the virtual disk, both with and without the O_DIRECT flag enabled. Note that these RTT values are measured from user-level synchronous read/write system calls, which adds a few tens of microseconds to the kernel-level RTTs in Table 4.1. Figure 4.6 compares the read latency distribution for MemX-DD against disk-based I/O for both random and sequential reads via the filesystem cache.

[Figure 4.6 plot: CDF of MemX-DD vs. disk read latencies (buffered); x-axis: latency in microseconds (log scale); y-axis: percent of requests; curves: MemX-DD-Rand, MemX-DD-Seq, Disk-Rand, Disk-Seq.]

Figure 4.6: Comparison of sequential and random read latency distributions for MemX-DD and disk. Reads traverse the filesystem buffer cache. Most random read latencies are an order of magnitude smaller with MemX-DD than with disk. All sequential reads benefit from filesystem prefetching.

[Figure 4.7 plot: CDF of MemX-DD vs. disk write latencies (buffered); x-axis: latency in microseconds (log scale); y-axis: percent of requests; curves: MemX-DD-Rand, MemX-DD-Seq, Disk-Rand, Disk-Seq.]

Figure 4.7: Comparison of sequential and random write latency distributions for MemX-DD and disk. Writes go through the filesystem buffer cache. Consequently, all four latencies are similar due to write buffering.

[Figure 4.8 plot: CDF of MemX-DD vs. disk random read latencies, buffered vs. direct I/O; x-axis: latency in microseconds (log scale); y-axis: percent of requests; curves: MemX-DD-Buffer, MemX-DD-Direct, Disk-Buffer, Disk-Direct.]

Figure 4.8: Effect of filesystem buffering on random read latency distributions for MemX-DD and disk. About 10% of random read requests (issued without the direct I/O flag) are serviced at the filesystem buffer cache, as indicated by the first knee below 10µs for both MemX-DD and disk.

Random read latencies are an order of magnitude smaller with MemX-DD (around 160µs) than with disk (around 9ms). Sequential read latency distributions are similar for MemX-DD and disk, primarily due to filesystem prefetching.

Figure 4.7 shows the RTT distribution for buffered write requests. Again, MemX-DD and disk show similar distributions, mostly less than 10µs, due to write buffering. Figure 4.8 demonstrates the effect of passing the O_DIRECT flag to the open() system call, which bypasses the filesystem buffer cache. The random read latency distributions without the flag display a distinct knee below 10µs, indicating that roughly 10% of the random read requests are serviced at the filesystem buffer cache and that prefetching benefits MemX as well as disk.

We observed a similar trend for sequential read distributions, with and without the flag, where the first knee indicated that about 90% of sequential reads were serviced at the

filesystem buffer cache.

[Figure 4.9 plot: Quicksort sort time (seconds) vs. sort size (GB) for MemX-DomU, MemX-DD, MemX-Linux, local memory, and local disk.]

Figure 4.9: Quicksort execution times in various MemX combinations and disk. While clearly surpassing disk performance, MemX-DD trails regular Linux only slightly using a 512 MB Xen guest.

4.4.2 Application Speedups

We now evaluate the execution times of a few large memory applications using our testbed.

Again, we include both MemX-Linux and virtual disk as base cases to illustrate the overhead imposed by Xen virtualization and the gain over the virtualized disk, respectively.

Figure 4.9 shows the performance of sorting increasingly large arrays of integers, using an in-house C implementation of the classic static-partitioning quicksort algorithm.

We stopped using the STL version because of its inability to provide more detailed runtime information about the progress of the sort. We record the execution times of the sort for each of the three cases mentioned above. We also include an "extreme" base case plot for local memory using one of the vanilla-Linux 4 GB nodes, where the sort executes purely in memory. As the figure shows, we did not run the disk case beyond 2 GB problem sizes due to the unreasonably large amount of time it takes to complete, potentially days. The sorts using MemX-DD, MemX-DomU, and MemX-Linux, however,

finished within 90 minutes, and the distinction between the different virtualization levels is very small. Table 4.2 lists execution times for some much larger problem sizes with both quicksort and a second large memory application, the same ray-tracing scene used in Chapter 3 [81]. Each row in the table describes an increasingly large problem size, as high as 13 GB. Again, both MemX cases behave similarly, while the disk lags behind. These performance numbers show that MemX provides a highly attractive option for executing large memory workloads in both virtualized and non-virtualized environments.

Furthermore, given the unquantified amount of randomized reads generated by the system's pager (which correlates with the recursive nature of the sort algorithm), the same synchronous-AIO problem described in the previous section also applies here. If a fix is applied, the observed speedups in the figure have the potential to double or triple. But for now, the throughput observed from the system pager remains around 300 to 400 Mbits/sec.

Application         Client Mem   MemX-Linux   MemX-DD   Disk
5 GB Quicksort      512 MB       65 min       93 min    several hours
6 GB Ray-tracer     512 MB       48 min       61 min    several hours
13 GB Ray-tracer    1 GB         93 min       145 min   several hours

Table 4.2: Execution time comparisons for various large memory application workloads.

[Figure 4.10 plot: MemX vs. parallel iSCSI with multiple guests; x-axis: number of VMs (1 to 20); y-axis: sort time (seconds); curves: MemX-DD and iSCSI-DD.]

Figure 4.10: Quicksort execution times for multiple concurrent guest VMs using MemX-DD and iSCSI configurations.

[Figure 4.11 diagram: Domains 1 through 20 hosted above Domain 0 and the Xen hypervisor on a 4 GB machine, connected through a GigE switch (via RMAP or iSCSI) to four machines, each acting as a 4 GB MemX server and an 80 GB iSCSI target.]

Figure 4.11: Our multiple client setup: five identical 4 GB dual-core machines, where one houses 20 Xen guests and the others serve as either MemX servers or iSCSI servers.

4.4.3 Multiple Client VMs

In this section, we evaluate the overhead of executing multiple client VMs using the MemX-DD combination. In a real data center, an iSCSI or FibreChannel network would be set up to provide backend storage for guest virtual machines. To duplicate this base case in our cluster, we use five of our dual-core 4GB memory machines to compare MemX-DD against a 4-disk parallel iSCSI setup, illustrated in Figure 4.11. For the iSCSI target software, we used the open source IET project [93], and we used the open-iscsi.org initiator software within Dom0, which acts as the driver domain for all the Xen guests. Our setup involves using one of the five machines to execute up to twenty concurrently running 100MB Xen guests. Within each guest, we run a 400MB quicksort. We vary the number of concurrent guest VMs from 1 to 20, and in each guest we run quicksort to completion. We perform the same experiment for both MemX-DD and iSCSI. Figure 4.10 shows the results of this experiment. At its highest point (about 10 GB of collective memory and 20 concurrent virtual machines), the execution time with MemX-DD is about 5 times smaller than with the iSCSI setup. Recall that we are using four remote iSCSI disks, and one can observe a stair-step behavior in the iSCSI curve where the level of parallelism wraps around at 4, 8, 12, and 16 virtual machines. Even with concurrent disks and competing virtual machine CPU activity, MemX-DD provides clear benefits in providing low-latency I/O among multiple concurrent Xen virtual machines.

Inter-VM Congestion: In Section 4.3.6, we described the phenomenon of inter-VM congestion, which arises due to the absence of explicit congestion control across multiple guests within a Xen node. Here we discuss how inter-VM congestion is handled in the different

MemX configurations.

1. MemX-Dom0 and MemX-Linux: Inter-VM congestion does not arise in the base cases of MemX-Dom0 and MemX-Linux because the only users of the client module are local application processes. These processes, controlled by a static send window, use semaphores and wait queues to put competing processes on the OS's blocked list when the client's send window is full. So there is no competition among multiple virtual machines, only between competing processes.

2. MemX-DD: Inter-VM congestion in MemX-DD is handled indirectly by Xen itself. Xen schedules block I/O backend requests in a strictly round-robin fashion. Since MemX is the destination of requests from the backend, Xen will "stop" the delivery of requests to MemX when there is a full queue (of some fixed size). This stop is performed by placing the dependent guest VMs in a blocked state, in the same way that multi-programmed processes are blocked when waiting for I/O.

3. MemX-DomU: For MemX-DomU, recall that inter-VM congestion arises from multiple network front-end drivers rather than competing block front-ends. Xen handles this type of contention by using credit-based scheduling, where each front-end is allocated a bandwidth share of the form x bytes every y microseconds. VMs that use up their credit are blocked (a rough sketch of this kind of credit-based limiting follows below).

This leaves us to handle only the network contention at the switch and server level, which we plan to address as our testbed scales to more nodes.
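As referenced in item 3 above, the following is a rough sketch of credit-based rate limiting in the spirit of Xen's network backend scheduling; the structure and function names are illustrative, not Xen's actual code.

    /* A frontend may send credit_bytes every credit_usec; when credit runs
     * out it is blocked until the next replenish point. */
    #include <stdint.h>
    #include <stdbool.h>

    struct vif_credit {
        uint64_t credit_bytes;     /* x bytes ...              */
        uint64_t credit_usec;      /* ... every y microseconds */
        uint64_t remaining;        /* credit left in this period            */
        uint64_t period_start;     /* timestamp of last replenish (usec)    */
    };

    bool vif_may_send(struct vif_credit *c, uint64_t now_usec, uint64_t pkt_bytes)
    {
        if (now_usec - c->period_start >= c->credit_usec) {
            c->remaining    = c->credit_bytes;   /* replenish at period boundary */
            c->period_start = now_usec;
        }
        if (pkt_bytes > c->remaining)
            return false;                        /* out of credit: block the VM */
        c->remaining -= pkt_bytes;
        return true;
    }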

4.4.4 Live VM Migration

The MemX-DomU configuration has a significant benefit when it comes to migrating live Xen VMs [27] at runtime, even though it has lower throughput and higher I/O latency than MemX-DD. Specifically, a VM using MemX-DomU for fast I/O

to distributed memory can be seamlessly migrated from one physical machine to another,

without disrupting the execution of any large dataset applications within the VM. There

are two specific reasons for this benefit. First, since MemX-DomU is designed as a self-contained pluggable module within the guest OS, any page-to-server mapping information is migrated along with the kernel state of the guest OS without leaving any residual dependencies behind in the original machine. The second reason is that RMAP, which is used for communicating read-write requests to distributed memory, is designed to be reliable. As

the VM carries its link layer MAC address with it during the migration process, any in-flight packets dropped during migration are safely retransmitted to the VM's

new location, thereby enabling any large memory application to continue execution without

disruption. What makes the MemX-DomU case interesting is that administrators of virtual

hosting centers can exploit live-migration features by seamlessly transferring guest VMs to

other physical machines at will to better utilize resources. Our work in Chapter 5

focuses exclusively on the optimization of virtual machine migration and will elaborate on

this in more detail.

4.5 Summary

The state of the art in virtual machine technology does not adequately address the needs of

large memory workloads that are increasingly common in modern data centers and virtual

hosting platforms. Such application workloads quickly become throttled by the disk I/O bottleneck in a virtualized environment, where the I/O subsystem includes an additional level

of indirection. In this Chapter, we presented the design, implementation, and evaluation

of the MemX system in the Xen environment that enables memory and I/O-constrained

VMs to transparently utilize the collective pool of memory within a cluster for low-latency

I/O operations. Large dataset applications using MemX do not require any specialized

APIs, libraries, or any other modifications. MemX can operate as a kernel module within non-virtualized Linux (MemX-Linux), an individual VM (MemX-DomU), or a driver domain

(MemX-DD). The latter option permits multiple VMs within a single physical machine to multiplex their memory requirements over a common distributed memory pool. Performance evaluations using our MemX prototype show that I/O latencies are reduced by an order of magnitude and that large memory applications speed up significantly when compared against virtualized disk. As an extra benefit, live Xen VMs executing large memory applications over MemX-DomU can be migrated without disrupting applications. Our future work includes the capability to provide per-VM reservations over the cluster-wide memory, developing mechanisms to control inter-VM congestion, and enabling seamless migration of VMs in the driver domain mode of operation.

Chapter 5

Post-Copy: Live Virtual Machine Migration

In this Chapter, we present the design, implementation, and evaluation of the post-copy based approach for the live migration of virtual machines (VMs) across a gigabit LAN. Live migration is a mandatory feature of modern hypervisors today. It facilitates server consolidation, system maintenance, and lower power consumption. Post-copy [53] refers to the deferral of the memory "copy" phase of live migration until after the VM's CPU state has been migrated to the target node. This is in contrast to the traditional pre-copy approach, which first copies the memory state over multiple iterations followed by the transfer of CPU execution state. The post-copy strategy provides a "win-win" by approaching the baseline total migration time achieved with the stop-and-copy approach, while maintaining the liveness and low downtime benefits of the pre-copy approach. We facilitate the use of post-copy with a specific instance of adaptive prepaging (also known as adaptive distributed paging). Pre-paging eliminates all duplicate page transmissions and quickly removes any residual dependencies for the migrating VM from the source node. Our pre-paging algorithm is able to reduce the number of page faults across the network to 17% of the VM's working set. Finally, we enhance both the original pre-copy and post-copy schemes with the use of a dynamic, periodic self-ballooning (DSB) strategy, which prevents the migration daemon from transmitting unnecessary free pages in the guest OS. DSB significantly

speeds up both migration schemes with negligible performance degradation to the processes running within the VM. We implement the post-copy approach in the Xen

VM environment and show that it significantly reduces the total migration time and network overheads across a range of VM workloads when compared against the traditional pre-copy approach.

5.1 Introduction

This Chapter addresses the problem of optimizing the live migration of system virtual machines (VMs). Live migration is a key selling point for state-of-the-art virtualization technologies. It allows administrators to consolidate system load, perform maintenance, and

flexibly reallocate cluster-wide resources on-the-fly. We focus on VM migration within a cluster environment where physical nodes are interconnected via a high-speed LAN and also employ a network-accessible storage system (such as a SAN or NAS). State-of-the-art live migration techniques [73, 27] use the pre-copy approach, where the bulk of the

VM’s memory state is migrated even as the VM continues to execute at the source node.

Once the “working set” has been identified through a number of iterative copy rounds, the

VM is suspended and its CPU execution state plus remaining dirty pages are transferred to the target host. The overriding goal of the pre-copy approach is to keep the service downtime to a bare minimum by minimizing the amount of VM state that needs to be transferred during the downtime.

We seek to demonstrate the benefits of another strategy for live VM migration, called post-copy, that was previously applied only in the context of process migration in the late

1990s, and to address the issues involved in applying it at the whole operating system level as well. We believe that modern hypervisors provide the means to employ alternative approaches without much additional complexity. At a high level, post-copy refers to the deferral of the memory "copy" phase of live migration until the virtual machine's CPU state has already been migrated to the target node. This enables the migration daemon to try different methods by which to perform the memory copy. Post-copy works by transferring a minimal amount of CPU execution state to the target node, starting the VM at the target, and then proceeding to actively push memory pages from the source to the target. This active push component, also known as pre-paging, distinguishes the post-copy approach from both pre-copy and the demand-paging approach, in which the source node would passively wait for the memory pages to be faulted in by the target node across the network.

Pre-paging is a broad term used in earlier literature [76, 94] in the context of optimizing memory-constrained disk-based paging systems, and refers to a more proactive form of page prefetching from disk. By intelligently sequencing the set of actively prefetched memory pages, the memory subsystem (or even a cache) can hide the latency of high-locality page faults or cache misses from live applications, while continuing to retrieve the rest of the address space out-of-band until the entire address space has been transferred. Modern memory subsystems do not typically employ pre-paging anymore due to the increasingly large DRAM capacities in commodity systems. However, pre-paging can play a significant role in the context of live VM migration, which involves the transfer of an entire physical address space across the network.

We design and implement a post-copy based technique for live VM migration in the

Xen VM environment. Through extensive evaluations, we demonstrate how post-copy can improve live migration performance across each of the following metrics: pages transferred, total migration time, downtime, application degradation, network bandwidth, and identification of the working set. The traditional pre-copy approach does particularly well in minimizing two metrics, application downtime and degradation, when the VM is executing a largely read-intensive workload. These two metrics are important in preserving system uptime as well as the interactive user experience. However, all the above metrics can be impacted adversely when pre-copy is confronted with even moderately write-intensive VM workloads during migration. Post-copy not only maintains VM liveness and application performance during migration, but also improves upon the other performance metrics listed above.

The two key ideas behind an effective post-copy strategy are: (a) transmitting each page across the network no more than once, in other words, avoiding the potentially non-converging iterative copying rounds in pre-copy, and (b) an adaptive pre-paging strategy

that hides the latency of fetching most pages across the network by actively pushing pages from the source before the page is faulted in at the target node, and by adapting the sequence of pushed pages using any network page-faults as hints. We show that our post-copy implementation is capable of minimizing network-bound page faults to 17% of the working set.

Additionally, we identified deficiencies in both the pre-copy and post-copy schemes with regard to the transfer of free pages in the guest VM during migration. We improved both migration schemes to avoid transmitting free pages through the use of a Dynamic Self-

Ballooning (DSB) technique in which the guest actively balloons down its memory footprint without human intervention. DSB significantly speeds up the total migration time, normalizes both approaches, and is capable of frequent ballooning, with intervals as small as 5 seconds, without adversely affecting live applications.

Both the Xen and VMware hypervisors have demonstrated that live migration itself is an essential tool. The original pre-copy algorithm does have other advantages: it employs a relatively self-contained implementation that allows the migration daemon to isolate most of the copying complexity to a single process at each node. Additionally, pre-copy provides a clean method of aborting the migration should the target node ever crash during migration, because the VM is still running at the source and not the target host (whether or not this benefit is made obvious in current virtualization technologies). Although our current post-copy implementation does not handle target node failure, we will discuss a straightforward approach in Section 5.2.5 by which post-copy can provide the same level of reliability as pre-copy. Our contribution is to demonstrate a complete way in which, with a little more help from the migration system, one can preserve the liveness and downtime benefits of pre-copy while also breaking from the non-deterministic convergence phase inherent in pre-copy, ensuring that each page of VM memory is transferred over the network at most once.

5.2 Design

We begin with a brief discussion of the performance goals of VM migration. Afterwards, we present our design of post-copy and how it improves upon those goals.

5.2.1 Pre-Copy

For a more in-depth performance summary of pre-copy migration, we refer the reader to

[27] and [73]. For completeness, pre-copy migration works as follows. Pre-copy is an eager strategy in which memory pages are actively pushed to the target machine while the migrating VM continues to run at the source machine. Pages dirtied at the source that have already been transferred to the target are re-sent through several iterations until the number of dirtied pages falls below a fixed threshold. (Note that this threshold is not dynamic; although one could imagine modern hypervisors designing a dynamic threshold, neither the vendors nor the literature have attempted to do so.) Furthermore, in all known implementations, if the threshold is never reached, an empirical "cap" on the total number of iterations is chosen (currently set to 30) by the migration implementer.

Without this cap, it is possible that pre-copy may never converge at all. After the iterations complete, the VM is then suspended and its state is transferred to the target machine, where it is restarted. This transfer of VM state is accompanied by a final flush of the remaining address space modified at the source host. The VM is then resumed at the target and the source VM copy is destroyed.
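The iterative structure just described can be summarized by the following sketch; the vm/host types, the helper functions, and the dirty_threshold parameter are placeholders standing in for the hypervisor's migration interfaces, and the cap of 30 iterations is the empirical value mentioned above.

    /* Simplified control flow of iterative pre-copy migration. */
    #include <stddef.h>

    struct vm;
    struct host;

    extern void   send_all_pages(struct vm *vm, struct host *target);
    extern size_t count_dirtied_pages(struct vm *vm);
    extern void   resend_dirtied_pages(struct vm *vm, struct host *target);
    extern void   suspend_vm(struct vm *vm);
    extern void   send_dirty_pages_and_cpu_state(struct vm *vm, struct host *target);
    extern void   resume_vm_at(struct host *target);
    extern void   destroy_vm(struct vm *vm);

    #define MAX_ITERATIONS 30   /* the empirical cap mentioned in the text */

    void precopy_migrate(struct vm *vm, struct host *target, size_t dirty_threshold)
    {
        send_all_pages(vm, target);              /* round 0: full copy while the VM runs */

        for (int i = 1; i < MAX_ITERATIONS; i++) {
            if (count_dirtied_pages(vm) <= dirty_threshold)
                break;                           /* few enough to send during downtime */
            resend_dirtied_pages(vm, target);    /* VM keeps running and dirtying pages */
        }

        suspend_vm(vm);                          /* downtime begins */
        send_dirty_pages_and_cpu_state(vm, target);
        resume_vm_at(target);                    /* downtime ends at the target */
        destroy_vm(vm);                          /* source copy discarded */
    }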

Pre-copy migration involves the following seven performance goals:

1. Transparency: The pre-copy scheme can work transparently in both fully-virtualized

and para-virtualized environments. Any new migration scheme must maintain that

ability without requiring any application changes.

2. Preparation Time: Any required CPU or network activity within either the migrating

guest VM or the maintenance VM contributes to preparation time. This includes most

of the memory copying during pre-copy rounds. There is no guarantee that this time

ever converges to a stopping round. In fact, later we show that even with mildly active

VMs, these rounds never converge.

3. Down Time: This time represents how long the migrating VM is stopped, during

which no execution progress is made. Pre-copy uses this time for dirty memory

transfer. Minimizing downtime is pre-copy's primary goal.

4. Resume Time: Any remaining cleanup required by the maintenance VM at the target

host goes into this time period. Although pre-copy has nothing to do besides re-scheduling the migrating VM, the majority of our post-copy design operates primarily

in this period. After this period is complete, regardless of which migration algorithm

is used, all dependencies on the source VM must be eliminated.

5. Pages Transferred: This performance goal consists of a total count of the number

of transferred memory pages across all of the above time periods. For pre-copy this

is dominated by preparation time.

6. Total Migration Time: For pre-copy, the total time required to complete the migration

is dominated by the preparation time. Total migration time is important because it

affects the release of resources on both sides within the individual host as well as

within the VMs on both hosts. Until completion of migration, the unused memory at

the source cannot yet be freed, and both maintenance VMs will continue to consume

network bandwidth and CPU cycles.

7. Application Degradation: This refers to the extent of slowdown experienced by

application workloads executing within the VM due to the migration event. The slowdown occurs primarily due to CPU time taken away from normal applications to carry

out the migration. Additionally, the pre-copy approach needs to track dirtied pages

across successive iterations by trapping write accesses to each page, which significantly slows down write-intensive workloads. In the case of post-copy, access to

memory pages not yet present at the target results in network page faults, potentially

slowing down the VM workloads.

One of this Chapter’s contributions is to reduce the number of pages transferred com-

pared to pre-copy: the wasteful transfer of pages that may never be used at the target

machine is likely to occur. If the threshold of the number of dirty pages chosen to termi-

nate the pre-copy phase is too small, then pre-copy may never converge and terminate. On

the other hand, if the number of pages transferred during final iteration is large, significant

downtime can result. Given that the number of pages transferred directly impacts all other CHAPTER 5. POST-COPY: LIVE VIRTUAL MACHINE MIGRATION 73 metrics, our post-copy method aims to reduce this metric.

5.2.2 Design of Post-Copy Live VM Migration

Post-copy is a strategy in which the migrating virtual machine is first suspended at the source, a minimal execution state is copied over to the target, where the virtual machine is restarted, and then the memory pages that are referenced are faulted over the network from the source. VM execution experiences a delay during this period of faults, and that delay depends on the characteristics of the network connection and how fast the source machine can serve each request. As a result, this method incurs considerable resume time. Additionally, leaving any long-term residual dependencies on the source host is not acceptable. Thus, post-copy is not useful unless two additional goals are met:

1. Post-copy must effectively anticipate page-faults from the target and allow VM execution to move forward, while hiding the latency of page-faults.

2. Post-copy must flush the remaining clean pages from the source out-of-band while

the VM is simultaneously faulting, so that no residual dependency remains on the

source.

Note that both migration schemes must be normalized with respect to the unused /

free pages within the guest VM. This must be done such that any improvement is realized

only by the treatment of pages that actually contributed to the guest VM’s working set. We

will discuss this solution momentarily. The post-copy algorithm can actually be designed in

multiple ways, each of which provides an incrementally better improvement on the previous

method across all the aforementioned performance goals. Table 5.1 illustrates how each of

these ways slightly increases in complexity from the previous one during a certain phase

of the migration, with the common goal of improving the bottom line. Method 1, the current pre-copy form of migration, heads the table.

Method 2: Post-Copy via Demand Paging: The demand paging variant of post-copy is the simplest and slowest option. Once the VM resumes at the target, its memory accesses result in page faults that can be serviced by requesting the referenced page

   Method                    Preparation                  Downtime              Resume
1  Pre-copy Only             Multiple iterative           Send dirty memory     CPU state transfer
                             memory transfers
2  Demand Paging             Pre-suspend time (if any)    CPU state transfer    Page-faults only
3  Basic Post-copy           Pre-suspend time (if any)    CPU state transfer    Flushing + page-faults
4  Pre-paging + Post-copy    Pre-suspend time (if any)    CPU state transfer    Bubbling + page-faults
5  Hybrid Pre + Post         Single pre-copy round        CPU state transfer    Bubbling + page-faults

Table 5.1: Migration algorithm design choices in order of their incremental improvements. Method #4 combines #2 and #3 with the use of pre-paging. Method #5 combines all of #1 through #4, in which pre-copy is used for only a single, priming iterative round.

 1. let N := total # of guest VM pages
 2. let page[N] := set of all guest VM pages
 3. let bitmap[N] := all zeroes
 4. let pivot := 0; bubble := 0

 5. ActivePush (Guest VM)
 6.     while bubble < max (pivot, N-pivot) do
 7.         let left := max(0, pivot - bubble)
 8.         let right := min(MAX_PAGE_NUM-1, pivot + bubble)
 9.         if bitmap[left] == 0 then
10.             set bitmap[left] := 1
11.             queue page[left] for transmission
12.         if bitmap[right] == 0 then
13.             set bitmap[right] := 1
14.             queue page[right] for transmission
15.         bubble++

16. PageFault (Guest-page X)
17.     if bitmap[X] == 0 then
18.         set bitmap[X] := 1
19.         transmit page[X] immediately
20.     discard pending queue
21.     set pivot := X   // shift pre-paging pivot
22.     set bubble := 1  // new pre-paging window

Figure 5.1: Pseudo-code for the pre-paging algorithm employed by post-copy migration. Synchronization and locking code omitted for clarity of presentation.

over the network from the source node. However, servicing each fault will significantly slow down the VM due to the network's round trip latency. Consequently, even though each page is transferred only once, this approach considerably lengthens the resume time and leaves long-term residual dependencies in the form of un-fetched pages, possibly for an indeterminate duration. Thus, post-copy performance for this variant by itself would be unacceptable from the viewpoint of total migration time and application degradation.

Method 3: Post-Copy via Active Pushing: One way to reduce the duration of residual dependencies on the source node is to proactively “push” the VM’s pages from the source to the target even as the VM continues executing at the target. Any major faults incurred by the VM can be serviced concurrently over the network via demand paging. Active push avoids transferring pages that have already been faulted in by the target VM. Thus, each page is transferred only once, either by demand paging or by an active push.

Method 4: Post-Copy via Prepaging: The goal of post-copy via prepaging is to anticipate the occurrence of major faults in advance and adapt the page pushing sequence to better reflect the VM's memory access pattern. While it is impossible to predict the VM's exact faulting behavior, our approach works by using the faulting addresses as hints to estimate the spatial locality of the VM's memory access pattern. The prepaging component then shifts the transmission window of the pages to be pushed such that the current page fault location falls within the window. This increases the probability that pushed pages would be the ones accessed by the VM in the near future, reducing the number of major faults. Various prepaging strategies are described in Section 5.2.3.

Method 5: Hybrid Live Migration: The hybrid approach was first described in [74] for process migration. It works by doing a single pre-copy round in the preparation phase of the migration. During this time, the VM continues running at the source while all its memory pages are copied to the target host. After just one iteration, the VM is suspended and its processor state and dirty non-pageable pages are copied to the target. Subsequently, the VM is resumed at the target and post-copy as described above kicks in, pushing in the remaining dirty pages from the source. As with pre-copy, this scheme can perform well for read-intensive workloads. Yet it also provides deterministic total migration time for write-intensive workloads, as with post-copy. This hybrid approach is currently being

[Figure 5.2 diagram: (a) bubbling with a single pivot, showing backward and forward bubble edges expanding from the pivot between 0 and MAX; (b) bubbling with multiple pivots P1, P2, P3 in a pivot array, where adjacent bubble edges stop when they meet.]

Figure 5.2: Prepaging strategies: (a) Bubbling with a single pivot and (b) bubbling with multiple pivots. Each pivot represents the location of a network fault on the in-memory pseudo-paging device. Pages around the pivot are actively pushed to the target.

implemented and not covered within the scope of this chapter. The rest of this chapter describes the design and implementation of post-copy via prepaging.

5.2.3 Prepaging Strategy

Prepaging refers to actively pushing the VM’s pages from the source to the target. The goal is to make pages available at the target before they are faulted on by the running VM.

The effectiveness of prepaging is measured by the percentage of VM’s page faults at the

target that require an explicit page request to be sent over the network to the source node

– also called network page faults. The smaller the percentage of network page faults, the

better the prepaging algorithm. The challenge in designing an effective prepaging strategy

is to accurately predict the pages that might be accessed by the VM in the near future, and

to push those pages before the VM faults upon them. Below we describe different design

options for prepaging strategies.

(A) Bubbling with a Single Pivot:

Figure 5.1 lists the pseudo-code for the two components of bubbling with a single pivot –

active push (lines 5–15), which executes in a kernel thread, and page fault servicing (lines

16–21), which executes in the interrupt context whenever a page-fault occurs. Figure 5.2(a)

illustrates this algorithm graphically. The VM's pages at the source are kept in an in-memory pseudo-paging device, which is similar to a traditional swap device except that it resides completely in memory (see Section 5.3 for details). The active push component starts from a pivot page in the pseudo-paging device and transmits symmetrically located pages around that pivot in each iteration. We refer to this algorithm as "bubbling" since it is akin to a bubble that grows around the pivot as its center. Even if one edge of the bubble reaches the boundary of the pseudo-paging device (0 or MAX), the other edge continues expanding in the opposite direction. To start with, the pivot is initialized to the first page in the in-memory pseudo-paging device, which means that initially the bubble expands only in the forward direction. Subsequently, whenever a network page fault occurs, the fault servicing component shifts the pivot to the location of the new fault and starts a new bubble around this new location. In this manner, the location of the pivot adapts to new network faults in order to exploit the spatial locality of reference. Pages that have already been transmitted (as recorded in a bitmap) are skipped over by the edge of the bubble.

Network faults that arrive at the source for a page that is in flight (or has just been pushed) to the target are ignored to avoid duplicate page transmissions.

(B) Bubbling with Multiple Pivots: Consider the situation where a VM has multiple processes executing concurrently. Here, a newly migrated VM would fault on pages at multiple locations in the pseudo-paging device. Consequently, a single pivot would be insufficient to capture the locality of reference across multiple processes in the VM. To address this situation, we extend the bubbling algorithm described above to operate on multiple pivots. Figure 5.2(b) illustrates this algorithm graphically. The algorithm is similar to the one outlined in Figure 5.1, except that the active push component pushes pages from multiple "bubbles" concurrently. (We omit the pseudo-code for space constraints, since it is a straightforward extension of the single-pivot case.)

Each bubble expands around an independent pivot. Whenever a new network fault occurs, the faulting location is recorded as one more pivot and a new bubble is started around that location. To save on unnecessary page transmissions, if the edge of a bubble comes across a page that is already transmitted, that edge stops progressing in the cor- responding direction. For example, the edges between bubbles around pivots P2 and P3 stop progressing when they meet, although the opposite edges continue making progress. CHAPTER 5. POST-COPY: LIVE VIRTUAL MACHINE MIGRATION 78

In practice, it is sufficient to limit the number of concurrent bubbles to those around the k most recent pivots. When a new network fault arrives, we replace the oldest pivot in a pivot array with the new network fault location. For the workloads tested in our experiments in

Section 5.4, we found that around k = 7 pivots provided the best performance.
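Since the pseudo-code for this extension is omitted above, the following is a hedged sketch of one possible realization that keeps the k = 7 most recent fault locations as pivots. The bitmap and queue_page() helper are assumed to be the same bookkeeping as in Figure 5.1, and the edge-stopping and sticky-pivot refinements described next are left out for brevity.

    /* Simplified multi-pivot bubbling: one bubble per recent fault pivot. */
    #define MAX_PIVOTS 7

    extern void queue_page(long pfn);   /* assumed transmit-queue helper */

    struct bubble {
        long pivot;      /* fault location this bubble grows around  */
        long radius;     /* current distance of the edges from pivot */
        int  active;
    };

    static struct bubble bubbles[MAX_PIVOTS];
    static int next_slot;   /* oldest pivot is replaced round-robin */

    /* Called on every network page fault at page number X. */
    void record_fault(long X)
    {
        bubbles[next_slot] = (struct bubble){ .pivot = X, .radius = 1, .active = 1 };
        next_slot = (next_slot + 1) % MAX_PIVOTS;
    }

    /* One pass of the active-push thread: grow each bubble by one page on
     * each side, queueing pages not yet transmitted (per the bitmap). */
    void push_one_round(unsigned char *bitmap, long npages)
    {
        for (int i = 0; i < MAX_PIVOTS; i++) {
            struct bubble *b = &bubbles[i];
            if (!b->active)
                continue;
            long left  = b->pivot - b->radius;
            long right = b->pivot + b->radius;
            if (left >= 0 && !bitmap[left])   { bitmap[left]  = 1; queue_page(left);  }
            if (right < npages && !bitmap[right]) { bitmap[right] = 1; queue_page(right); }
            b->radius++;
        }
    }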

(C) Direction of Bubble Expansion: We also wanted to examine whether the pattern in which the source node pushes the pages located around the pivot made a significant difference in performance. In other words, is it better to expand the bubble around a pivot in both directions, or only the forward direction, or only the backward direction? To examine this we included an option of turning off the bubble expansion in either the forward or the backward direction. Our results, detailed in Section 5.4.4, indicate that forward bubble expansion is essential, dual (bi-directional) bubble expansion performs slightly better in most cases, and backwards-only bubble expansion is counter-productive.

When expanding bubbles with multiple pivots in only a single direction (forward-only or backward-only), there is a possibility that the entire active push component could stall before transmitting all pages in the pseudo-paging device. This happens when all active bubble edges encounter already sent-pages at their edges and stop progressing. (A simple thought exercise can show that stalling of active push is not a problem for dual-direction multi-pivot bubbling.) While there are multiple ways to solve this problem, we chose a simple approach of designating the initial pivot (at the first page in pseudo-paging device) as a sticky pivot. Unlike other pivots, this sticky pivot is never replaced by another pivot.

Further, the bubble around the sticky pivot does not stall when it encounters an already transmitted page; rather, it skips such a page and keeps progressing, ensuring that the active push component never stalls.

5.2.4 Dynamic Self-Ballooning

The Free Memory Problem. As we touched on earlier, there can be an arbitrarily large number of free pages within the guest VM before migration begins, or there may be few or none. Either way, it is wasteful to send free pages, regardless of which migration algorithm is used. If we do not eliminate as many of these pages as possible from being migrated in the pre-copy algorithm, then we cannot properly compare it to post-copy. This is because there would be no way of distinguishing clean pages from free pages during each pre-copy iteration: if a clean page is freed, there is no way for the migration process to detect this. We observe that there are two ways to solve this problem.

For post-copy, this turns out to be quite easy: Method 5 in Table 5.1, the hybrid method, combines pre-copy with post-copy by doing a single pre-copy round in the preparation phase of the migration. This allows the guest VM to continue running at the source while its free pages and clean pages are copied to the target host. Subsequently, the post-copy process kicks in immediately after downtime; there is no memory transfer during downtime, and post-copy operates just as we described. The second way to solve the free memory problem is through the use of ballooning.

The hybrid scheme was first used in the literature in [74]. But since we are dealing with whole-system VM migration, this presents a problem for a performance comparison against stand-alone pre-copy migration: the hybrid scheme does not eliminate the transmission of free pages. Without eliminating them, we cannot determine the effectiveness of post-copy with respect to how well pre-paging keeps VM execution moving forward by hiding page-fault latency from the migrating guest VM. We cannot evaluate that effectiveness for two reasons. First, if a free page is transmitted (which is highly probable), it consumes bandwidth that might otherwise have been used both by pre-paging and by the iterative rounds used in pre-copy. Second, during pre-paging, if a free page is allocated by the guest VM and subsequently causes a page-fault (as the result of a copy-on-write by the virtual memory system), this will cause additional delay on the

VM at the target when there need not have been any. Therefore, we cannot do a performance analysis of post-copy without eliminating the transmission of those empty page frames.

Ballooning is the act of changing the view of physical memory (and pseudo-physical memory) such that the guest VM has a larger or smaller amount of allocatable memory than it had before. In current virtualization systems, this is typically used only at guest VM boot time, when the VM is first created and initialized. If the maintenance VM cannot "reserve" enough memory for the new guest (henceforth referred to as a reservation), it steals some from the other VMs on the host by enlarging a kind of balloon in those VMs and giving the reclaimed memory to the new one. This is done by giving the existing VMs a "target" reservation and waiting for them to release enough pages from their own reservations to satisfy that smaller target. The system administrator can re-enlarge those diminished reservations at a later time should more memory become available. This might happen as the result of either shutting down or even migration itself. What we have implemented is a way for the migrating guest VM to perform this ballooning continuously by itself, called Dynamic

Self-Ballooning (DSB). The way to make this effective for migration is two-fold: First, we

must choose an appropriate interval between consecutive DSB attempts such that the

CPU time consumed by the DSB process does not interfere with the applications running

within the VM. Second, the DSB process must ensure that it can allow the balloon to

shrink. When one or more memory-intensive applications begin to run and perform copy-

on-writes within the guest VM, there must be a way for the DSB process to detect this and

respond to it by releasing free pages from the balloon so that the applications can use

them. We’ve devised a way to do this in the next couple of sections and have chosen an

interval of about 5 seconds through some performance experiments and determined that

application performance is not adversely affected. During pre-copy migration only, DSB is

used continuously. On the other hand, post-copy only performs DSB once right before the

beginning of the downtime phase. After resume, it is disabled and the rest of post-copy

proceeds as described.
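A hedged sketch of the kind of self-ballooning loop this implies is shown below. balloon_set_target(), guest_total_pages(), and guest_free_pages() are placeholder names for the guest's balloon interface and memory statistics, not the functions used in our implementation; the 5-second interval follows the discussion above.

    /* Sketch of a dynamic self-ballooning kernel thread inside the guest. */
    #include <linux/kthread.h>
    #include <linux/delay.h>

    #define DSB_INTERVAL_MS    5000
    #define DSB_HEADROOM_PAGES 4096    /* keep ~16 MB of 4KB pages free for COW bursts */

    extern unsigned long guest_total_pages(void);
    extern unsigned long guest_free_pages(void);
    extern void balloon_set_target(unsigned long pages);

    static int dsb_thread(void *unused)
    {
        while (!kthread_should_stop()) {
            unsigned long in_use = guest_total_pages() - guest_free_pages();

            /* Give back everything except what is in use plus some headroom.
             * If applications start allocating, free pages shrink and the next
             * pass raises the target, letting the balloon deflate. */
            balloon_set_target(in_use + DSB_HEADROOM_PAGES);

            msleep(DSB_INTERVAL_MS);
        }
        return 0;
    }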

5.2.5 Reliability

As we touched on in the introduction, post-copy has a drawback with respect to the reliability of the target node. Either the source or destination node can fail in the middle of

VM migration. In both pre-copy and post-copy migration, failure of the source node implies permanent loss of the VM itself. Failure of the destination node has different implications in

the two cases. For pre-copy, failure of the destination node does not matter because the

source node still holds an entire up-to-date copy of the VM’s memory and CPU execution

state and the VM can be revived if necessary from this copy. However, with post-copy, the

VM begins execution at the target node as soon as minimal CPU execution state is transferred, which implies that the destination node has the more up-to-date version of the VM state while the copy at the source is stale, except for pages not yet modified at the destination. Thus, failure of the destination node constitutes a critical failure of the VM during post-copy migration.

We plan to address this problem by developing mechanisms to incrementally checkpoint the VM state from the destination node back at the source node, an approach taken by the Xen-based Remus system [18]. Based on their results, we believe that the increased network overhead of doing this is negligible, but a thorough evaluation would first be required. One approach is as follows: while the active push of pages is in progress from the source node to the destination, we also propagate incremental changes to memory pages and execution state in the VM at the destination back to the source node. We do not need to propagate the changes from the destination on a continuous basis, but only at discrete points such as when interacting with a remote client over the network, or committing an I/O operation to storage. This mechanism can provide a consistent backup image at the source node that we can fall back on in case the destination node fails in the middle of post-copy migration, although at the expense of some increase in reverse network traffic.

Further, once the migration is over, the backup state at the source node can be discarded safely. The performance of this mechanism would depend upon the additional overhead imposed by reverse network traffic from the destination to the source. In a different context, similar incremental checkpointing mechanisms have been used to provide high availability in the Remus project [18].

5.2.6 Summary

We have described Post-Copy and addressed four problems that are important for the improved migration of system virtual machines. To reduce the total number of pages transferred, we combine the following approaches: demand paging, flushing, pre-paging through what we call “bubbling”, and dynamic self-ballooning (DSB), all working together at the same time.

Figure 5.3: Pseudo-Swapping (item 3): As pages are swapped out within the source guest itself, their MFN identifiers are exchanged and Domain 0 memory maps those frames with the help of the hypervisor. The rest of post-copy then takes over after downtime.

Demand paging ensures that we eliminate the non-deterministic copying iterations involved in pre-copy. Flushing ensures that no residual dependencies are left on the source host.

Bubbling helps minimize the number of page faults as well as the length of time spent in the resume phase. Self-ballooning allows us to normalize the two migration schemes for comparison by eliminating the transmission of free pages. Note that we do not implement the Hybrid scheme that we mentioned earlier as it does not directly contribute to the comparison of the two schemes, but would nonetheless still significantly improve the treatment of clean pages during post-copy migration. We leave that to future work.

5.3 Post-Copy Implementation

We’ve implemented post-copy on top of Xen 3.2.1 along with all of the optimizations introduced in Section 5.2. We use the para-virtualized version of Linux 2.6.18.8 as our base.

We begin by discussing the different ways of trapping page-faults within the Xen / Linux architecture and their trade-offs. Then we will discuss our implementation of dynamic self-ballooning.

5.3.1 Page-Fault Detection

The working set of the Virtual Machine can (and will) span multiple user applications and in-kernel data structures. We propose three different ways by which the demand-paging component of post-copy at the system virtual machine level can trap accesses to the WWS.

These include:

1. Shadow Paging: Through the pre-existing, well designed use of an extra, read-only

set of page tables underneath the VM, shadow paging provides multiple benefits to

virtual machines in modern hypervisors. Support for shadow paging contributes to

the use of both fully-virtualized VMs and para-virtualized VMs as well as the facili-

tation of pre-copy migration by detecting page dirtying. In the post-copy case, each

attempt to access a not-yet-received page at the target would be trapped by shadow paging. The

migration daemon would then use this information to retrieve that page before the

read or write can proceed.

2. Page Tracking: The idea here is to use the downtime phase to mark all of the res-

ident pages in the VM as not present within the corresponding page-table-entries

(PTEs) for each page. This has the effect of forcing a real page-fault exception on

the CPU. The hypervisor would then be responsible for propagating that fault to Do-

main 0 to be fixed up. The migration process would then bring in the page and fixup

the page-table entry back to normal. x86 PTEs currently have 2 or 3 unused bits in

their lower order bits that can be used to track this information for fixup.

3. Pseudo Swapping: This solution preserves the spirit of para-virtualization, but re-

mains transparent to applications. The idea is to take the set of all pageable appli-

cation and page cache memory within the guest VM and make it “suddenly appear”

that it has been swapped out but without the actual cost of doing so - and without

the use of any disks whatsoever. Although this sounds strange, recall that the source

VM is not running during post-copy. Only the target VM is running. So the memory

reservation that the source VM is occupying is essentially acting like a limited swap

device. During resume time, the guest VM itself can be paravirtualized to request

those pages from a sort of pseudo swap device.

In the end, we chose to use Pseudo Swapping because it was the quickest to im-

plement, which is illustrated in Figure 5.3. Initially, we actually started with Page Tracking,

but stopped working on it. We believe that Page Tracking is the fastest, most efficient form

of demand paging at the system VM level. This is because faults are true CPU exceptions.

We started writing this by trapping those exceptions directly within the Hypervisor and then

propagating a new Virtual Interrupt to Domain 0. The major problem with this scheme is

that there exists no way in modern operating systems to detect when a physical page frame

is no longer in use by the operating system. Ideally one could imagine an architecturally

defined bitmap structure that is managed by the OS, not unlike the way a page table is

architecturally defined. This bitmap would allow the hardware to know which page frames

actually contain real bytes and which were free. Once page tracking was initiated, Domain

0 could use this bitmap in combination with the aforementioned page table modifications

to determine whether or not it was still necessary to fixup the PTE at the given time. Page

Tracking is not feasible without this feature. On the other hand, Shadow Paging provides a

clear middle ground. Although it would be slower than Page Tracking (due to the extra level

of PTE propagation) it is more transparent than Pseudo Swapping. For the most part, such

an implementation would remain relatively unchanged except for making a hook available

for trapping into Domain 0. Recently, a version of this type of demand-paging for use in

parallel cloud computing was demonstrated in a tech report [44] based on top of the Xen

hypervisor.
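To make the Page Tracking idea concrete, the following is a minimal sketch of the kind of PTE manipulation it would require. Since we did not complete this design, every identifier here (PTE_TRACKED, track_pte, fixup_pte) is hypothetical and purely illustrative; only the use of the software-available low-order bits of an x86 PTE reflects the description above.

    #include <stdint.h>

    #define PTE_PRESENT  0x1ULL
    #define PTE_TRACKED  0x200ULL  /* bit 9: one of the software-available PTE bits */

    /* Applied once per resident page during the downtime phase: force a real
     * CPU page-fault on the next access while remembering why. */
    static inline uint64_t track_pte(uint64_t pte)
    {
        return (pte & ~PTE_PRESENT) | PTE_TRACKED;
    }

    /* Applied from the fault path once the page contents have arrived. */
    static inline uint64_t fixup_pte(uint64_t pte)
    {
        return (pte & ~PTE_TRACKED) | PTE_PRESENT;
    }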

Our page-fault detection is implemented through the use of two loadable kernel mod-

ules. One sits inside the migrating VM and one sits inside Domain 0. These modules

leverage our prior work called MemX [49], which provides distributed paging support for

both Xen VMs and Linux machines at the kernel level. Once the target is ready to begin

pre-paging in the post-copy algorithm, MemX is invoked to service page faults through

the use of pseudo swapping as described. Figure 5.4 illustrates a high-level overlay of

how both pre-copy and post-copy relate to each other. Recall that in order to use Pseudo

Swapping to implement demand paging, one can only apply this to the set of all pageable memory in the system. Thus, the remaining memory (which is typically made up of small in-kernel caches or pinned pages) must be sent over to the target host during downtime.

Figure 5.4: The intersection of downtime within the two migration schemes. Currently, our downtime consists of sending non-pageable memory (which can be eliminated by employing the use of shadow paging). Pre-copy downtime consists of sending the last round of pages.

The drawback of Pseudo Swapping is that it puts a small lower bound on the achievable downtime experienced by our implementation of Post-Copy, but this is not a fundamental limitation of the post-copy method of migration by any means. In future work, we plan to switch to Shadow Paging as a means to implement the demand paging component of post-copy.

This will eliminate that drawback. Nonetheless, we report the resulting (higher) downtime values later in our performance experiments. These downtimes typically range from 600 ms to a little over one second.

5.3.2 MFN exchanging

Because of our expedient implementation choice, it was necessary to devise a way of making it appear that the set of all pageable memory in the guest VM had been swapped out without actually moving those pages anywhere. This can be accomplished in two ways: we can either transfer the pages out of the guest VM (and into the maintenance VM) or we can alter the location of the physical frame within the VM itself to a new location (with zero copying). We chose the latter because it does not place any extra dependencies on the maintenance VM. We accomplish this by performing what we call an “mfn exchange”. This works by first doubling the memory reservation of the VM, allocating free pages from the new memory, and briefly suspending all of the running processes in the system. We then instruct the kernel to swap out each pageable frame. Each time a used frame is paged, we re-write the hypervisor’s PFN to MFN mapping table (called a “physmap”) and exchange the two physical frames without actually copying them. We also do the same thing for the kernel-level page table entries of both physical frames. This is efficient because we batch the hypercalls necessary to perform these operations within the hypervisor. Once downtime has completed, we restart the applications and wait for page-faults to the pseudo swap device to arrive.
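The sketch below illustrates a single exchange step, assuming the Xen-Linux 2.6.18 paravirtual helpers pfn_to_mfn() and set_phys_to_machine(); the function name exchange_mfns() and the surrounding structure are ours, and the real code batches many such updates (together with the matching kernel page-table rewrites) into a single set of hypercalls.

    /* Sketch of one "mfn exchange": pfn_used backs a pageable frame,
     * pfn_spare comes from the doubled reservation.  Includes, error
     * handling and batching are omitted. */
    static void exchange_mfns(unsigned long pfn_used, unsigned long pfn_spare)
    {
        unsigned long mfn_used  = pfn_to_mfn(pfn_used);
        unsigned long mfn_spare = pfn_to_mfn(pfn_spare);

        /* Swap the two entries in the guest's physmap (P2M table). */
        set_phys_to_machine(pfn_used,  mfn_spare);
        set_phys_to_machine(pfn_spare, mfn_used);

        /* The kernel-level page table entries that map these two frames
         * must be rewritten as well; the real implementation batches these
         * updates into one multicall to the hypervisor. */
    }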

5.3.3 Xen Daemon Modifications

A handful of modifications to the Xen Daemon were made to support page-fault detection. The Xen daemon has the responsibility of initializing the migration and initial memory transfer process, including page tables and CPU state. For our system, the only memory transfer that the daemon is responsible for is the transfer of non-pageable memory. All other pages are ignored until later as usual. Additionally, the set of pages that are eliminated through self-ballooning must also be ignored. By default, however, the Xen Daemon has no way of knowing whether a particular memory page actually belongs to any of those 3 categories (pageable, non-pageable, or ballooned) because of the strict memory reservation policy employed by Xen (as it should be). This presents a problem for Post-Copy: the way non-pageable memory is transferred in our system is implemented by using the same code that runs when the daemon is ready to execute a Pre-Copy iteration in the original system. Thus, to support our system, we patch this code to check a new bitmap data structure that indicates whether or not a particular frame should be sent (rather than just treating all pages as dirty or not dirty as in the original system). This bitmap is populated by the kernel module running inside the guest VM at the source (before downtime begins).
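As a rough illustration of the patch, the loop below shows the shape of the check; skip_bitmap, nr_pfns, and send_page() are illustrative stand-ins rather than the actual symbols in the Xen daemon's save code.

    #include <limits.h>

    /* Stand-in for the daemon's existing transmit path. */
    extern void send_page(unsigned long pfn);

    /* A set bit means the guest marked the frame pageable or ballooned,
     * so it must not be transmitted during downtime. */
    static int frame_is_skipped(const unsigned long *bitmap, unsigned long pfn)
    {
        const unsigned long bits = sizeof(unsigned long) * CHAR_BIT;
        return (bitmap[pfn / bits] >> (pfn % bits)) & 1UL;
    }

    static void send_nonpageable_memory(const unsigned long *skip_bitmap,
                                        unsigned long nr_pfns)
    {
        unsigned long pfn;

        for (pfn = 0; pfn < nr_pfns; pfn++) {
            if (frame_is_skipped(skip_bitmap, pfn))
                continue;        /* pageable or ballooned: handled post-resume */
            send_page(pfn);
        }
    }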

The next part is not so obvious upon first examination: the Xen Daemon (the management process running inside the co-located Domain 0 on the same host) needs to be able to read this bitmap from user-space. Thus, we perform a memory-mapping from the kernel space of the guest VM to the user space of the Xen Daemon in the other virtual machine.

Furthermore, in order to perform a successful memory-mapping, the Xen Daemon needs to know in advance the physical frame numbers (the MFNs) of each page frame that physically backs that bitmap data structure. This is required because of the nature of performing a memory mapping: the Xen Daemon’s page tables must be populated with physical mfn numbers, not virtual ones. As a result, we discover these MFNs through an existing mapping: the physical-to-machine (p2m) table, which translates every PFN of a guest (from 0 to max) into a physical frame number (mfn) owned by the guest virtual machine. To complete the memory mapping, the Daemon then only needs to know two pieces of information: the location of the *first* virtual frame number and the total number of frames. Thus, the guest VM needs only to transmit these two values to the Xen Daemon before downtime begins.

We accomplish this by exporting the address (the PFN, specifically) of the (virtually contiguous) first frame of the bitmap inside of the p2m table into the “Xen Store”. The

Xen Store is a messaging abstraction that allows Xen virtual machines to communicate small pieces of information to each other and is organized into a directory structure for each co-located virtual machine on the same host. Recall that we also have a kernel module running inside the management VM acting as the retrieval entity for the whole post-copy process; it is responsible for facilitating pseudo-paging. This module reads the first bitmap frame address from the Xen Store and then communicates that information upwards to the Xen Daemon running inside the same virtual machine. The daemon then performs a memory mapping of this bitmap by grabbing each mfn out of the p2m table, one by one, based on this first frame number. Finally, once the bitmap and the physical frames are mapped, the Daemon can determine which frames should be transmitted to the target host and which ones can be ignored by simply checking the bitmap.
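The following is a minimal sketch of the guest-side export, assuming the 2.6.18-era xenbus_printf() interface; the store path "data/postcopy" and the key names are our own illustrative choices, not fixed parts of the system.

    #include <xen/xenbus.h>

    /* Advertise the bitmap to Domain 0: the PFN of its first frame and its
     * length in frames.  Domain 0 walks the p2m table from there. */
    static int export_bitmap_location(unsigned long first_pfn,
                                      unsigned long nr_frames)
    {
        int err;

        err = xenbus_printf(XBT_NIL, "data/postcopy", "bitmap-first-pfn",
                            "%lu", first_pfn);
        if (err)
            return err;

        return xenbus_printf(XBT_NIL, "data/postcopy", "bitmap-nr-frames",
                             "%lu", nr_frames);
    }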

5.3.4 VM-to-VM kernel-to-kernel Memory-Mapping

During downtime, in addition to the Xen Daemon modifications in the previous section, the third-party module running within Domain 0 that is responsible for transmitting faulted pages to the target host also has the responsibility of memory-mapping the entirety of the guest VM’s memory footprint. This is to avoid copying memory. A problem arises, however, which is similar to the one presented in Section 5.3.3: in order to complete this memory mapping, we must again know the addresses of each page frame owned by the migrating guest VM. This is a much larger task, however, because we are not just exporting a bitmap to another virtual machine (where the total mapped data is one bit per page). Instead, we are memory-mapping 8 bytes per page owned by the guest. Thus, for a common 512 MB guest virtual machine, this means we have a megabyte of data to transmit to the other virtual machine. (512 MB constitutes 128K pages, so 64-bit page frame identifiers would require a megabyte of memory to store all of the physical frame numbers).

The problem with this megabyte is that you cannot simply allocate a contiguous megabyte of memory in kernel space with any certainty. Slab caches and kmalloc are not meant for that. So, this leaves the alloc_pages() family of routines in Linux.

These routines allocate memory in power-of-two orders of 4 KB pages. The largest contiguous order allowed by Linux is 12 (and that is under ideal circumstances). Even a simple

1-MB allocation requires an “order-8” memory allocation. Larger VM memory sizes would thus approach 9 and 10. Under a heavily utilized system it is highly unlikely the Linux buddy system would return success on such requests. This requires us to find another way to send this 1-MB of data to the third party module inside Domain 0: through a second-level memory mapping.

This solution involves constructing a kind of “impromptu” page-table structure. This structure has the same 3-level hierarchy as a regular page table except that it is not architecturally defined; it still places the required mfn data at the leaves of the tree. We create this structure very quickly and pass the root of the table to the third-party module through the use of the Xen Store, as was done in the previous section. During downtime, the receiving module maps each frame of the page table structure itself in a recursive fashion beginning at the root. Once that is complete, it maps all of the page frame mfn numbers stored at the leaves of the table. These leaves collectively store the addresses of only those page frames that can potentially incur page faults. Thus, when a page fault actually occurs at the target host, the module only needs to consult this table and snap up that page to be ready for transmission without any copying whatsoever.
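A sketch of the layout is shown below. Each level occupies exactly one 4 KB frame so that the receiving module can map the tree frame-by-frame starting from the root; the structure names are illustrative, chosen only to mirror a 3-level page table in software.

    #include <stdint.h>

    #define ENTRIES_PER_FRAME (4096 / sizeof(uint64_t))   /* 512 entries */

    struct mfn_leaf   { uint64_t mfn[ENTRIES_PER_FRAME];      }; /* level 3 */
    struct mfn_middle { uint64_t leaf_mfn[ENTRIES_PER_FRAME]; }; /* level 2 */
    struct mfn_root   { uint64_t mid_mfn[ENTRIES_PER_FRAME];  }; /* level 1 */

    /* With 512 entries per level, a single root frame can describe up to
     * 512 * 512 * 512 pages -- far more than the 128K pages of a 512 MB
     * guest -- without ever requiring a contiguous allocation above 4 KB. */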

5.3.5 Dynamic Self Ballooning Implementation

The Xen Hypervisor has a set of hypercalls responsible for allowing a guest to change its memory reservation on-demand. The general idea of implementing DSB under Xen is three-fold (a minimal sketch of the underlying hypercalls follows the list). We will discuss how each of these steps is implemented within our version of the post-copy system and also how we modified it to be used in the original Xen implementation of pre-copy:

1. Inflate the balloon: For migration, this is accomplished by allocating as much free

memory as possible and handing those pages over to the “decrease reservation”

hypercall. This results in those pages being placed under the ownership of the hy-

pervisor.

2. Detect memory pressure: There are a few ways of doing this within Linux, which

we will describe shortly. Memory pressure indicates that either an application or the

kernel needs a page frame right now. In response, the DSB process must deflate

the balloon by the corresponding amount of memory pressure (but it need not destroy

the balloon completely).

3. Deflate the balloon: Again, this is accomplished by performing the reverse of step 1:

first the DSB process invokes the “increase reservation” hypercall. Then it proceeds

to release the list of free pages that were previously allocated (and handed to the

hypervisor for re-use) and actually give them back to the kernel’s free pool.
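A minimal sketch of the batched reservation hypercalls behind steps 1 and 3 is shown below. It is modeled on the stock Xen balloon driver from the Xen-Linux 2.6.18 tree; frame_list[], the helper name, and the omission of all error handling are our simplifications.

    #include <xen/interface/memory.h>
    #include <asm/hypervisor.h>

    /* frame_list[] holds the frame numbers being given up (inflate) or
     * reclaimed (deflate).  One hypercall covers the whole batch instead
     * of one call per page. */
    static long balloon_batch(unsigned long *frame_list, unsigned long nr,
                              int inflate)
    {
        struct xen_memory_reservation reservation = {
            .nr_extents   = nr,
            .extent_order = 0,
            .domid        = DOMID_SELF,
        };

        set_xen_guest_handle(reservation.extent_start, frame_list);

        return HYPERVISOR_memory_op(inflate ? XENMEM_decrease_reservation
                                            : XENMEM_increase_reservation,
                                    &reservation);
    }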

In order to rapidly inflate and deflate the balloon, we first had to determine where to initiate these operations. One can either place the DSB process within Domain 0 and communicate the intent to modify the balloon to the migrating VM through Xen’s internal communication mechanisms (the XenStore), or one can place the DSB process within the VM itself. Because the nature of performing ballooning requires internal knowledge of the kernel anyway, we decided to go with the latter. But the deciding factor on this placement actually dealt with the balloon driver that ships with the Xen source code. We found this driver to be a little slow. This driver does not batch the hypercalls required to perform ballooning, but instead executes hypercalls one-by-one. In our laboratory, we observed that a single hypercall can take as long as 2-3 microseconds. If we expect to perform DSB rapidly, these hypercalls must be batched together into a single hypercall - a feature which Xen already provides. It simply needed a little kick forward.

Thus we placed the DSB process within the guest VM itself and updated the existing driver to perform this batching.

Memory Overcommitment. Memory over-commitment within an individual operating system is a method by which the virtual memory subsystem can provide an illusion to an application that the physical memory in the machine is larger than what is true. However, there are multiple operating modes of over-commitment within the Linux kernel - and these modes can be enabled or disabled at runtime. By default, Linux disables this feature.

This has the effect of causing application-level memory allocations to be refused up front by returning a failure. So, by default (within Linux), if an application submits a memory allocation request without sufficient physical memory available, Linux will return an

error. However if you enable over-commitment, the kernel will truly view the set of physical

memory as infinite. One could spend an entire paper arguing that the over-commitment

feature should be enabled by default, but the Linux community has instead chosen to “err

on the side of caution” and defer such a decision to experienced system administrators.

Over-commitment is required for the transparent detection of memory pressure that we

have developed for our version of the DSB process, which we describe next.

Detecting Memory Pressure. Surprisingly enough, the Linux kernel already provides

a transparent way of doing this: through the filesystem interface. When a new filesystem

is registered with the kernel, one of the function pointers provided includes a callback to

request that the filesystem free any in-kernel data caches that the filesystem may have pinned in memory. Such caches typically include things like inode and directory entry caches. These callbacks are driven by the virtual memory system and are invoked when applications ask for more memory. Indirectly, the virtual memory system makes this determination when it is time to perform a copy-on-write on behalf of an application that has allocated a large amount of memory but has only recently decided to write to it for the first time. Consequently, the DSB process does not register a new filesystem, but we are still allowed to register a callback function that the virtual memory system can invoke. This worked remarkably well and does indeed provide very precise feedback to the DSB process on exactly when a memory-intensive application has become active. This Linux function is called set_shrinker(). Alternatively, one could periodically wake up the

DSB process at an interval and scan the /proc/meminfo and /proc/vmstat files to determine this information by hand. We found the filesystem interface to be more direct as well as accurate. Whenever we get a callback, the callback already contains a numeric value of ex- actly how many pages it wants the DSB process to release at once. The size of this batch is typically 128 pages. The callback can happen very frequently in a back-to-back manner on behalf of active user applications. Each time the callback occurs the DSB process will deflate the balloon as described by the requested amount and go back to sleep.
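A sketch of the hook is shown below, assuming the 2.6.18-era set_shrinker() interface referred to above; dsb_deflate() and dsb_balloon_size() are illustrative stand-ins for our module's internal routines.

    #include <linux/mm.h>
    #include <linux/module.h>
    #include <linux/errno.h>

    extern unsigned long dsb_balloon_size(void);       /* pages currently ballooned */
    extern unsigned long dsb_deflate(unsigned long nr); /* release up to nr pages   */

    /* Invoked by the VM system under memory pressure; nr_to_scan is the
     * number of pages it would like back (typically ~128 per callback). */
    static int dsb_shrink(int nr_to_scan, gfp_t gfp_mask)
    {
        if (nr_to_scan > 0)
            dsb_deflate(nr_to_scan);

        /* Report how much the balloon could still give back if asked again. */
        return (int)dsb_balloon_size();
    }

    static struct shrinker *dsb_shrinker;

    static int __init dsb_pressure_init(void)
    {
        dsb_shrinker = set_shrinker(DEFAULT_SEEKS, dsb_shrink);
        return dsb_shrinker ? 0 : -ENOMEM;
    }

    static void __exit dsb_pressure_exit(void)
    {
        remove_shrinker(dsb_shrinker);
    }

    module_init(dsb_pressure_init);
    module_exit(dsb_pressure_exit);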

Completing the DSB process. Finally, the DSB process, with the ability to detect memory pressure, must periodically reclaim free pages that may or may not have been released by running applications or the kernel itself. We perform this sort of “garbage collection” within a kernel thread. Note: this is not true garbage collection - that is not our intention. The kernel thread will wake up at periodic intervals and attempt to re-inflate the balloon as much as possible and then go back to sleep. If memory pressure is detected during this time, the thread will preempt itself and cease inflation and go back to sleep.

The only thing required to complete this is a 200-line patch to the Xen migration daemon running within Domain 0. Recall the operation of the DSB process with respect to pre-copy and post-copy. Post-copy uses DSB only once: the kernel thread will balloon a single time before downtime occurs and go back to sleep, whereas DSB runs continuously for pre-copy. The migration daemon has a policy to which it strictly adheres: if a page frame has never been mapped before, it will not be migrated or transmitted. Note: this is not the same as detecting whether or not a page frame has been allocated and subsequently freed - it only detects when a page has been allocated for the first time (by assigning a machine frame number to the corresponding pseudo-physical frame number). This information is stored in what Xen calls a “physmap”, which we discussed earlier in the mfn exchanging section. A property of this physmap is that the total number of valid entries in this map is monotonically increasing; it will never decrease on the same host. This means that if the DSB process has inflated the balloon and the balloon contains a page frame that is mapped inside the physmap table, then the migration daemon will transmit that frame regardless.

That defeats the purpose of the DSB process. So we modify the migration daemon by

exposing to it the list of ballooned pages. As a result, whenever the migration daemon is

ready to transmit a particular page, it first consults that list and skips transmission if it is in

the list. (This list is actually a bitmap). Our suggestion to the Xen community is to develop

a sort of watermarked “dynamic physmap garbage collection” such that the kernel would

be responsible for clearing the physmap when it is no longer using a page. This is almost

identical to the earlier suggestion in the Page Tracking scheme we devised, except that

such use of the physmap would not be architecturally defined - nor would it necessarily

be visible to the hardware. We believe that a garbage-collected physmap would allow

for both the seamless implementation of Dynamic Self-Ballooning as well as the ability to

implement Page Tracking without hardware support. But for now, we are using the cards

we have been dealt.

5.3.6 Proactive LRU Ordering to Improve Reference Locality

During normal operation, the guest kernel maintains the age of each allocated page frame

in its page cache. Linux, for example, maintains two linked lists in which pages are main-

tained in Least Recently Used (LRU) order: one for active pages and one for inactive

pages. A kernel daemon periodically ages and transfers these pages between the two

lists. The inactive list is subsequently used by the paging system to reclaim pages and

write to the swap device. As a result, the order in which pages are written to the swap

device reflects the historical locality of access by processes in the VM. Ideally, the active push component of post-copy could simply use this ordering of pages in its pseudo-paging device to predict the page access pattern in the migrated VM and push pages just in time to avoid network faults. However, Linux does not actively maintain the LRU ordering in these lists until a swap device is enabled. Since a pseudo-paging device is enabled just before migration, post-copy would not automatically see pages in the swap device ordered in the

LRU order. To address this problem, we implemented a kernel thread which periodically scans and reorders the active and inactive lists in LRU order, without modifying the core kernel itself. In each scan, the thread examines the referenced bit of each page. Pages with their referenced bit set are moved to the most recently used end of the list and their referenced bit is reset. This mechanism supplements the kernel’s existing aging support without the requirement that a real paging device be turned on. Section 5.4.4 shows that such a proactive LRU ordering plays a positive role in reducing network faults.
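The scan itself can be sketched as follows. This is only an illustration under the assumption of the 2.6.18 zone layout (zone->lru_lock plus per-zone active and inactive lists); how the module locates the zones and the one-second scan interval shown here are simplifications, not the exact parameters of our implementation.

    #include <linux/kthread.h>
    #include <linux/mm.h>
    #include <linux/mmzone.h>
    #include <linux/delay.h>

    static void reorder_list(struct zone *zone, struct list_head *list)
    {
        struct page *page, *next;

        spin_lock_irq(&zone->lru_lock);
        list_for_each_entry_safe(page, next, list, lru) {
            /* Recently referenced pages float to the MRU (head) end and
             * have their referenced bit cleared for the next scan. */
            if (TestClearPageReferenced(page))
                list_move(&page->lru, list);
        }
        spin_unlock_irq(&zone->lru_lock);
    }

    static int lru_reorder_thread(void *arg)
    {
        struct zone *zone = arg;

        while (!kthread_should_stop()) {
            reorder_list(zone, &zone->active_list);
            reorder_list(zone, &zone->inactive_list);
            msleep(1000);   /* illustrative scan interval */
        }
        return 0;
    }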

Lines of Code. The kernel-level implementation of Post-Copy, which leverages the

MemX system, is about 7000 lines of code within pluggable kernel modules. About 4000 of those lines are part of the MemX system that is invoked during demand-paging; the other 3000 lines contribute to the pre-paging component, the flushing component, and the ballooning component combined. (The DSB implementation also operates within the aforementioned kernel modules and runs inside the guest OS itself as a kernel thread. There is no dom0 interaction with the DSB process). A 200-line patch is applied to the migration daemon to support ballooning and a 300-line patch is applied to the guest kernel so that the initiation of pseudo swapping can begin. When all is said and done, the system remains completely transparent to applications and approaches about 8000 lines. Neither the original pre-copy algorithm code nor the hypervisor itself is changed at all. (As discussed before, alternative page-fault detection methods would require additional hypervisor support).

5.4 Evaluation

In this section, we present the detailed evaluation of our post-copy implementation and compare it against Xen’s original pre-copy migration. Our test environment consists of two 2.8 GHz dual-core Intel machines connected via a Gigabit Ethernet switch. Each machine has 4 GB of memory. Both the guest VM in each experiment and the Domain

0 are configured to use two virtual CPUs. Guest VM sizes range from 128 MB to 1024

MB. Unless otherwise specified, the default guest VM size is 512 MB. In addition to the performance metrics mentioned in Section 5.2, we evaluate post-copy against an additional metric. Recall that post-copy is effective only when a large majority of the pages reach the target node before they are faulted upon by the VM at the target, in which case they become minor page faults rather than network-bound page faults. Thus the fraction of network page faults compared to minor page faults is another indication of the effectiveness of our post-copy approach. Secondly, we quantify the pages transferred during pre-copy by scripting the extraction of those numbers from the Xen logs. For post-copy we output this information to procfiles.

That value is then added to the number of pages that contribute to “non-pageable memory” for a grand total.

5.4.1 Stress Testing

We start with a stress test of both migration schemes using a simple, highly sequential, memory-intensive C program. This program accepts a parameter to change the working set of memory accesses and a second parameter to control whether it performs memory reads or writes during the test. The experiment is performed in a 1024 MB VM with its working set ranging from 8 MB to 512 MB. The rest is simply free memory.
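For reference, the stress program is essentially of the following form (an illustrative reconstruction, not the exact source): it touches a working set of the requested size in a tight sequential loop, either reading or writing one byte per 4 KB page.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Usage (illustrative): ./stress <working-set-MB> <r|w> */
    int main(int argc, char **argv)
    {
        size_t mb = (argc > 1) ? strtoul(argv[1], NULL, 10) : 8;
        int write_mode = (argc > 2 && argv[2][0] == 'w');
        size_t len = mb << 20;
        volatile char *buf = malloc(len);
        unsigned long sum = 0;
        size_t i;

        if (buf == NULL)
            return 1;
        memset((char *)buf, 0xAA, len);        /* fault in every page up front */

        for (;;) {                             /* run until the harness kills us */
            for (i = 0; i < len; i += 4096) {  /* one access per 4 KB page */
                if (write_mode)
                    buf[i] = (char)i;
                else
                    sum += buf[i];
            }
        }
        return (int)sum;                       /* not reached */
    }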

We perform the experiment with seven different test configurations:

1. Stop-and-copy Migration: This is a non-live VM migration scenario which provides

a baseline to compare the total migration time and number of pages transferred by

post-copy.

2. Read-intensive Pre-Copy: This configuration provides the best-case workload for

pre-copy. The performance on the total migration time metric is expected to be roughly

similar to pure stop-and-copy migration.

3. Write-intensive Pre-Copy: This configuration provides the worst-case workload for

pre-copy and causes worsening of all performance metrics.

4. Read-intensive Post-Copy:

Figure 5.5: Comparison of total migration times (in seconds) between post-copy and pre-copy as the working set size grows from 8 MB to 512 MB.

5. Write-intensive Post-Copy: These two configurations will stress our pre-paging al-

gorithm and flushing implementations and are expected to perform almost identically.

6. Read-intensive Pre-Copy without DSB:

7. Write-intensive Pre-Copy without DSB: These two configurations test the default

implementation of pre-copy in Xen that does not use DSB. Unless we specify other-

wise, the reader should assume that DSB is turned on for pre-copy. Post-copy always

uses DSB.

For each figure, the plots in the legend are in the same order as you see them from top to bottom in the figure.

Total Migration Time: Figure 5.5 shows the variation of total migration time with in- creasing working set size. Notice that both post-copy plots for total time are at the bottom, surpassed only by read-intensive pre-copy. Our first observation is that both the read and write intensive tests of post-copy perform very similarly. Thus our post-copy algorithm’s performance is agnostic to the read or write-intensive nature of the application workload.

Future work might involve giving more priority to page fault writes over reads. Furthermore, we observe that without DSB activated, as in the default Xen implementation, the total migration time for read-intensive pre-copy is very high due to unnecessary transmission of free guest pages over the network. This conclusion demonstrates itself in the three remaining plots as well.

Figure 5.6: Comparison of downtimes (in milliseconds) between pre-copy and post-copy as the working set size grows from 8 MB to 512 MB.

Downtime: Figure 5.6 exhibits similar behavior for the downtime metric as the working set size increases. Recall that our choice of page fault detection in Section 5.3 increases the base downtime in post-copy. Thus, the figure shows a roughly constant downtime that ranges from 600 milliseconds to over one second. As expected, the downtime for the write-intensive pre-copy test increases significantly as the size of the writable working set increases.

Pages Transferred and Page Faults: Figure 5.7 and Table 5.2 illustrate the utility of our pre-paging algorithm in post-copy across increasingly large working set sizes. Figure 5.7 plots the total number of pages transferred. As expected, post-copy transfers fewer pages than write-intensive pre-copy as well as pre-copy without DSB, the reduction being as much as 85%. It performs on par with read-intensive pre-copy with DSB and stop-and-copy, all of which transfer each page only once over the network. Table 5.2 compares the fraction of network and minor faults in post-copy. We see that pre-paging reduces the fraction of network faults to between 2% and 4%, compared to 9% to 15% when flushing alone is used. To be fair, the stress-test is highly sequential in nature and consequently, pre-paging predicts this behavior almost perfectly. We expect the real applications in the next section to do worse than this optimal case.

Figure 5.7: Comparison of the number of pages transferred during a single migration.

                    Pre-Paging          Flushing
    Working Set     Net     Minor       Net     Minor
    8 MB            2%      98%         15%     85%
    16 MB           4%      96%         13%     87%
    32 MB           4%      96%         13%     87%
    64 MB           3%      97%         10%     90%
    128 MB          3%      97%          9%     91%
    256 MB          3%      98%         10%     90%

Table 5.2: Percent of minor and network faults for flushing vs. pre-paging. Pre-paging greatly reduces the fraction of network faults.

Figure 5.8: Kernel compile with back-to-back migrations using 5 second pauses.

5.4.2 Degradation, Bandwidth, and Ballooning

Next, we quantify the side effects of migration on a couple of sample applications. We want to answer the following questions: What kind of slow-down do VM workloads experience during pre-copy versus post-copy migration? What is their impact on network bandwidth re- ceived by applications? And finally, what kind of balloon inflation interval should we choose to minimize the impact of DSB on running applications? For application degradation and

DSB interval, we use Linux kernel compilation. For bandwidth testing we use the NetPerf

TCP benchmark.

Degradation Time: Figure 5.8 depicts a repeat of an interesting experiment from [73].

We initiate a kernel compile inside the VM and then migrate the VM repeatedly between

two hosts. We script the migrations to pause for 5 seconds each time. Although there

is no exact way to quantify degradation time (due to scheduling and context switching),

this experiment provides an approximate measure. As far as memory is concerned, we

observe that kernel compilation tends not to exhibit too many memory writes. (Once gcc

forks and compiles, the OS page cache will only be used once more at the end to link

the kernel object files together). As a result, this experiment is good for post-copy comparison

because it represents the best case for the original pre-copy approach when there is not much repeated dirtying of pages. This experiment is also a good worst-case tester for our implementation of Dynamic Self Ballooning due to the repeated fork-and-exit behavior of the kernel compile as each object file is created over time. (Interestingly enough, this experiment also gave us a headache, because it exposed the bugs in our code!) We were surprised to see how many additional seconds were added to the kernel compilation in

Figure 5.8 just by executing back-to-back invocations of pre-copy migration. Nevertheless, we observe that post-copy tends to match pre-copy, exhibiting roughly the same amount of degradation.

Although we would have preferred to see less degradation than pre-copy, we can at least rest assured that we’re not doing worse. This is in line with the competitive performance of post-copy with read-intensive pre-copy tests in Figures 5.5 and 5.7. We suspect that a shadow-paging based implementation of post-copy would perform much better due to the significantly reduced downtime it would provide.

Additionally, Figure 5.9 shows the same experiment using NetPerf. A sustained, high-bandwidth stream of network traffic causes slightly more page-dirtying than the compilation does. The setup involves placing the NetPerf sender inside the guest VM and the receiver on an external node on the same switch. Consequently, regardless of VM size, post-copy actually performs slightly better and reduces the degradation time experienced by

NetPerf. The figure also indicates an example of severe degradation without DSB due to transmission of free pages.

Effect on Bandwidth: In their paper [27], the Xen project proposed a solution called “adaptive rate limiting” to control the bandwidth overhead due to migration. However, this feature is not enabled in the currently released version of Xen. In fact, it is compiled out without any runtime options or pre-processor directives. This is likely because it is difficult, if not impossible, to predict beforehand the bandwidth requirement of any single guest in order to guide the behavior of adaptive rate limiting. Hence, there is no explicit arbitration of network bandwidth contention between simultaneous operation of the migration daemon and a network-heavy application. With that in mind, Figures 5.10 and 5.11 show a visual representation of the reduction in bandwidth experienced by a high-throughput NetPerf session. We conduct this experiment by measuring bandwidth values rapidly and invoking VM migration in between. The impact of migration can be seen in both figures as a sudden reduction in the observed bandwidth during migration. This reduction is more sustained, and greater, for the pre-copy approach than for post-copy because the total number of pages transferred in pre-copy is much higher. This is exactly the bottom line that we were targeting for improvement.

Figure 5.9: NetPerf run with back-to-back migrations using 5 second pauses.

Figure 5.10: Impact of post-copy on NetPerf bandwidth (phases: 1. normal operation, 2. DSB + pre-paging invocation, 3. CPU and non-paged memory transfer, 4. resume, 5. migration complete).

Figure 5.11: Impact of pre-copy on NetPerf bandwidth (phases: 1. normal operation, 2. DSB invocation, 3. iterative memory copies, 4. CPU-state transfer, 5. migration complete).

Figure 5.12: The application degradation is inversely proportional to the ballooning interval (kernel compile, 439 sec baseline; slowdown in seconds vs. balloon interval in jiffies, for 128 MB and 512 MB guests).

Each experiment henceforth operates in this mode (with rate limiting compiled out). We believe this choice does make sense, however: the migration daemon really cannot guess whether the guest is hosting, say, a webserver, and it is likely the webserver will take whatever size pipe it can get its hands on, which would suggest that the migration daemon should just let TCP do what it normally does. On the other hand, the daemon might use up CPU cycles that might otherwise be granted to the guest itself. The point is that it is all guesswork without some kind of signal from the guest. In fact, while strolling through the Xen daemon’s code, we found that the end of the pre-copy iteration process is guided by only two factors: a 30-iteration maximum constant combined with a minimum page dirtying rate of 50 pages per pre-copy round. The daemon will keep iterating until either one of those conditions is met, which is why even mildly write-intensive applications never converge.
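Paraphrased in code, the stop condition amounts to the check below; the constants come from the description above, while the names are ours rather than the actual symbols in the Xen save code.

    #define MAX_PRECOPY_ITERATIONS  30   /* hard cap on pre-copy rounds */
    #define DIRTY_PAGE_THRESHOLD    50   /* stop when a round dirties fewer pages */

    static int precopy_should_stop(int iteration, unsigned long pages_dirtied)
    {
        return iteration >= MAX_PRECOPY_ITERATIONS ||
               pages_dirtied < DIRTY_PAGE_THRESHOLD;
    }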

Dynamic Ballooning Interval: Figure 5.12 shows how we chose the DSB interval, the period at which the DSB process wakes up to reclaim available free memory. With the kernel compile as the test application, we execute the DSB process at intervals ranging from 10 ms to 10 s. At every interval, we script the kernel compile to run multiple times and output the average completion time. The difference between that number and the base case is the degradation time added to the application by the DSB process due to its CPU usage. As expected, the application degradation is inversely proportional to the choice of ballooning interval: the more often you balloon, the more it affects the VM workload. The graph indicates that we should choose an interval between 4 and 10 seconds to balance between frequently reclaiming free pages and avoiding a significant impact on applications. Note that this graph only represents one type of mixed application. For more CPU-intensive workloads, it will be necessary to make the ballooning interval dynamic enough that it could increase for CPU-intensive applications or applications that perform rapid memory allocation.

5.4.3 Application Scenarios

The last part of our evaluation is to re-visit the aforementioned performance metrics across four real applications:

1. SPECWeb 2005: This is our largest application. It is a well-known webserver bench-

mark involving two or more physical hosts. We place the system under test

within the guest VM, while six separate client nodes bombard the VM with connec-

tions.

2. Bit Torrent Client: Although this is not a typical server application, we chose it

because it is a simple representative of a multi-peer distributed application. It is easy

to initiate and does not immediately saturate a Gigabit Ethernet pipe. Instead it fills

up the network pipe gradually, is slightly CPU intensive, and involves a somewhat

more complex mix of page-dirtying and disk I/O than just a kernel compile.

3. Linux Kernel Compile: We consider this again for consistency.

4. NetPerf: Once more, as in the previous experiments, the NetPerf sender is placed

inside the guest VM.

Using these applications, we evaluate the same four primary metrics that we covered in

Section 5.4.1: Downtime, Total Migration Time, Pages Transferred, and Page Faults. Each

figure for these applications represents one of the four metrics and contains results for a constant, 512 MB virtual machine in the form of a bar graph for both migration schemes across each application. Each data point is the average of 20 samples. And just as before, the guest VM is configured to have two virtual CPUs. All of these experiments have the DSB activated.

Figure 5.13: Total pages transferred for both migration schemes.

Pages Transferred and Page Faults. The experiments in Figures 5.13 and 5.14 illustrate these results. For all of the applications except SPECWeb, post-copy reduces the total pages transferred by more than half. The most significant result we’ve seen so far is in Figure 5.14, where post-copy’s pre-paging algorithm is able to avoid 79% and 83% of the network page faults (which become minor faults) for the largest applications (SPECWeb,

Bittorrent). For the smaller applications (Kernel, NetPerf), we still manage to save 41% and 43% of network page faults. There is a significant amount of additional prior work in the literature aimed at working-set identification, and we believe that these improvements can be even better if we employ both knowledge-based and history-based predictors in our pre-paging algorithm. But even with a reactive approach, post-copy appears to be a strong competitor.

Total Time and Downtime. Figure 5.15 shows that post-copy reduces the total migration time for all applications, when compared to pre-copy, in some cases by more than 50%. However, the downtime in Figure 5.16 is currently much higher for post-copy than for pre-copy. As we explained earlier, the relatively high downtime is due to our expedient choice of pseudo-paging for page fault detection, which we plan to reduce through the use of shadow paging. Nevertheless, this tradeoff between total migration time and downtime may be acceptable in situations where network overhead needs to be kept low and the entire migration needs to be completed quickly.

Figure 5.14: Page-fault comparisons: Pre-paging lowers the network page faults to 17% and 21%, even for the heaviest applications.

Figure 5.15: Total migration time for both migration schemes.

Figure 5.16: Downtime for post-copy vs. pre-copy. Post-copy downtime can improve with better page-fault detection.

5.4.4 Comparison of Prepaging Strategies

This section compares the effectiveness of different prepaging strategies. The VM workload is a Quicksort application that sorts a randomly populated array of user-defined size. We vary the number of processes running Quicksort from 1 to 128, such that 512 MB of memory is collectively used among all processes. We migrate the VM in the middle of its workload execution and measure the number of network faults during migration. A smaller network fault count indicates better prepaging performance.

Figure 5.17: Comparison of prepaging strategies using multi-process Quicksort workloads.

We compare a number of prepaging combinations by varying the following factors (a sketch of dual-direction bubbling appears after the list):

1. whether or not some form of bubbling is used;

2. whether the bubbling occurs in forward-only or dual directions;

3. whether single or multiple pivots are used; and

4. whether the page-cache is maintained in LRU order.
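The sketch below illustrates dual-direction bubbling around a single pivot, where the pivot is the position in the pseudo-paging device of the most recent network fault; the multi-pivot variant simply runs one such bubble per recent fault. The helper names and the per-invocation page budget are illustrative simplifications of our actual pre-paging component.

    extern int  page_already_sent(unsigned long idx);   /* illustrative */
    extern void push_page(unsigned long idx);           /* illustrative */

    /* Push not-yet-sent pages outward from the pivot in both directions,
     * stopping once the budget is exhausted or both ends are reached. */
    static void bubble(unsigned long pivot, unsigned long nr_pages,
                       unsigned long budget)
    {
        long lo = (long)pivot - 1;
        unsigned long hi = pivot + 1;

        while (budget > 0 && (lo >= 0 || hi < nr_pages)) {
            if (lo >= 0) {
                if (!page_already_sent((unsigned long)lo)) {
                    push_page((unsigned long)lo);
                    budget--;
                }
                lo--;
            }
            if (budget > 0 && hi < nr_pages) {
                if (!page_already_sent(hi)) {
                    push_page(hi);
                    budget--;
                }
                hi++;
            }
        }
    }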

Figure 5.17 shows the results. Each vertical bar represents an average over 20 experimental runs. Our first observation is that bubbling, in any form, performs better than push-only prepaging. Secondly, sorting the page-cache in LRU order performs better than the non-LRU cases by improving the locality of reference of neighboring pages in the pseudo-paging device. Thirdly, dual-directional bubbling improves performance over forward-only bubbling in most cases, and never performs significantly worse, which indicates that it is always preferable to use dual-directional bubbling. (The performance of reverse-only bubbling was found to be much worse than even push-only prepaging, hence its results are omitted). Finally, dual multi-pivot bubbling is found to consistently improve the performance over single-pivot bubbling since it exploits locality of reference at multiple locations in the pseudo-paging device.

5.5 Summary

We have presented post-copy based live virtual machine migration using adaptive pre-paging and dynamic self-ballooning. Post-copy is a combination solution consisting of four pieces: demand paging, our pre-paging algorithm called “bubbling”, flushing, and the use of dynamic self-ballooning. We have implemented and evaluated this system and shown that it is able to effect significant performance improvements over the pre-copy based migration of system virtual machines by reducing the number of pages transferred between the source and target hosts. Our future work will explore the use of alternative page-fault detection mechanisms as well as further applications of dynamic self-ballooning. There is a great deal of additional work that remains to be done. As we mentioned in Section 5.3, there are three different methods by which one can implement page-fault detection to support demand paging at the Virtual Machine level. We would like to set aside our expedient choice of pseudo-swapping in favor of a shadow paging based method of detection, and if possible investigate extensions to the Xen physmap (the array of mappings between pseudo-physical and real page frames), with the goal of implementing the more efficient use of real CPU exceptions, which we called “page tracking”.

Second, as stated in Section 5.2, we must take care to address the reliability issue for post-copy so that we may provide the same level of reliability that the original pre-copy scheme provides.

Chapter 6

CIVIC: Transparent Over-subscription of VM Memory

In this chapter, we describe the design, implementation and evaluation of Collective Indirect

Virtual Caching, or CIVIC for short. CIVIC provides significantly lower-level support for access to virtual cluster-wide memory than MemX. CIVIC is a memory oversubscription system for VMs, designed to integrate the techniques from the previous three systems described in this dissertation, by which the hypervisor can multiplex individual page frames of unmodified

Virtual Machines in a fine-grained manner.

Three primary uses of CIVIC are:

1. Higher Consolidation: to oversubscribe the limited memory of a single physical

host for the purpose of running higher numbers of consolidated Virtual Machines

with greater use of the hardware and without depending on para-virtualization or

ballooning.

2. Large-Memory Pool: to provide large-memory applications transparent access to

a cluster-wide, low-latency memory-pool without any additional binary or operating

system interfaces, and

3. Improved Migration: to reduce the amount of resident main memory when the time

comes to migrate individual Virtual Machines across the network. Due to time constraints, this

feature has not been implemented, but CIVIC is designed for it.


The motivation for this work derives directly from the last few chapters: Now that we have a number of systems for both distributing and migrating individual page frames, the

final step is to fully utilize the power of virtualization technology to support VMs with more transparency and ubiquity - to use unmodified, commodity operating systems in such a way that they have access to a (potentially) unlimited memory resource with the participation of an entire cluster. The end goal of this chapter is to build a system underneath any commodity OS capable of giving a systems programmer arbitrary access to individual page frames located anywhere in the entire cluster in such a way that new techniques can be designed for VM memory management with ease and efficiency. The CIVIC system does just that: it transparently allows a Virtual Machine to oversubscribe (or overcommit) its physical memory space without any participation from the VM operating system whatsoever. Any non-committed memory is then paged out. In our case, it is paged out to MemX.

6.1 Introduction

One of the great rules in system design, applied frequently across computer science (though not always acknowledged explicitly), is that if a piece of data will likely be used again in the future, you will probably succeed wildly by going out of your way to design your algorithm or data structure such that it caches or preserves that data. It is remarkable how often that rule shows itself. The transparency afforded to VMs by hypervisors provides some good opportunities to exploit caching. Virtual Machines are almost entirely unaware that their low-level view of physical memory is being “toyed” with in significant ways. So, in order to achieve the kind of memory ubiquity that we described, we propose to combine the ability to do more fine-grained caching underneath VMs with the ability to virtualize cluster-wide memory (which was covered in earlier chapters). With CIVIC, we propose to allow the hosts in the cluster to cooperate with each other in order to transparently support VMs whose physical memory footprints can span multiple machines in the cluster. To re-iterate: CIVIC is not a Distributed

Shared Memory system (DSM). There are already two existing hypervisor-level DSM attempts: one by Virtual Iron in 2005 [12] and one at the Open Kernel Labs in 2009 [69].

The purpose of these systems is to build a single-system image (SSI). Building an SSI is not the focus of this dissertation. Rather, our goal is to allow unmodified virtual machines to gain access to cluster memory. Our focus is to enable greater VM consolidation and migration performance rather than to spread processing out into the cluster. Thus, VMs in our work use only local processors.

A simple view of CIVIC's role is that it does for VMs precisely what modern operating systems already do for processes in their virtual memory sub-systems: give a running process (nearly) unlimited access to virtual memory. The OS has a well-established mechanism for multiplexing virtual to physical memory accesses - the page table. We leverage a similar mechanism to manipulate a VM's view of physical memory, namely the "pseudo"-physical address space, hereafter referred to as the PPAS. (The "real" address space seen by the processor is correspondingly referred to as the RAS.) The hypervisor undertakes the responsibility of mapping pages in the PPAS to pages in the RAS. Technically, one could of course use a disk-based swap device to page the unused portions of the PPAS in and out, but that would lead to a significant slowdown in VM performance, as we have explored extensively in this dissertation. Instead, we use MemX to expand a VM's PPAS into the cluster-wide memory pool and minimize the performance impact that a disk would otherwise incur, without changing the operating system at all. The hypervisor plays the role of an intermediary by (1) providing the VM with the view of an expanded PPAS, (2) intercepting memory accesses by the VM to non-resident PPAS pages, and (3) efficiently redirecting these memory accesses for servicing by MemX, which executes in a separate virtual machine.

6.2 Design

The design of CIVIC depends heavily on the virtualization platform, which in our case is Xen. Although we have covered the design of Xen frequently in previous chapters, none of those systems operated strictly at the hypervisor level. This requires a brief discussion of the hypervisor's memory management schemes, including memory allocation and shadow paging. After this discussion, we present the design choices for CIVIC within the hypervisor itself and its interactions with higher-level services, followed by implementation-specific details.

6.2.1 Hypervisor Memory Management

VM memory management is fairly straightforward, with an extra level of indirection through the PPAS. This address space sits in between the virtual address space and the real physical address space (RAS) seen by the processor. Since the processor is no longer singly owned by one operating system, this extra level allows multiple PPASes to be multiplexed on top of a single RAS. Additionally, from here on out, the frame numbers associated with the PPAS (in Xen terminology) are called "pseudo-physical" frame numbers, or PFNs. Similarly, real frame numbers are called "machine" frame numbers, or MFNs. PFNs are contiguously numbered, whereas the MFNs allocated to a VM in the RAS are almost guaranteed to be sparse. In modern VM technology, there are three ways to manage the PPAS:

1. Para-virtualization: A para-virtual VM (or guest) is one that has been modified in such a way that the VM is aware of the hypervisor. It has been patched directly to inform the hypervisor explicitly when it intends to update any given page table in its ownership. In such a guest, the OS maps page frames using machine frame numbers (MFNs) and has no actual concept of the PPAS (except for memory allocation and VM migration, discussed in the last chapter). Thus, frame identifiers in a para-virtual guest's page table entries are the same ones seen by the processor. This has performance advantages because the guest OS can "batch" a number of page table updates into one hypercall (but only up to a limit, as we will see in option #3). Para-virtual support has recently been merged upstream and built into both Linux and Windows, which mitigates some of the problems with this approach that relate to maintaining compatibility with newly released operating system versions. Thus, para-virtualization is no longer a technological obstacle.

2. Shadow-paging: When modifying the guest is unacceptable (for example, for older OS kernels), hypervisors do not place real MFNs into guest OS page tables. Instead, "pseudo" PFNs are used, so that the guest's page tables map virtual page numbers to PFNs. The hypervisor then traps write accesses to those tables (using CR3 register virtualization and by marking them read-only) while maintaining another set of "shadow" page tables underneath the virtual machine that map the virtual page numbers to MFNs. These shadow page tables are the ones exposed to the processor. Thus, memory virtualization and device emulation can be done for arbitrary, unmodified operating systems. When this kind of memory management is used, we refer to the guest OS as a hardware virtual machine, or "HVM", as opposed to a para-virtual guest. An elaboration of shadow paging is given next, in Section 6.2.2.

3. Hardware-assisted Paging: This approach improves upon shadow-paging by moving the translation logic of shadow paging from the hypervisor into the processor. Essentially this is an MMU expansion - making the MMU do a little more of what it is already doing. With this support, it is no longer necessary to trap into the hypervisor as frequently, allowing page-fault exceptions to be delivered directly to the guest OS. Such guests are also called HVM guests, with the internal distinction of hardware-assisted paging.

As of this writing, CIVIC depends exclusively on the hypervisor's ability to perform shadow-paging for unmodified HVM guest operating systems. The most basic ability required by CIVIC is to both create and intercept page-fault exceptions - exceptions that would not normally be seen by the guest OS - before they are propagated to the guest virtual machine. An unmodified HVM running on top of a CIVIC-enabled hypervisor that used hardware-assisted paging (instead of shadow-paging) would require additional logic to force the processor to trap into the hypervisor during such CPU exceptions when a page is owned by CIVIC (a non-resident page frame). So, as of this writing, CIVIC depends on shadow paging alone, without the assistance of hardware-assisted paging. Section 6.4 describes how the use of shadow paging affects the baseline performance of a virtual machine running on top of a CIVIC-enabled hypervisor.

6.2.2 Shadow Paging Review

Next, it is necessary to elaborate on the use of shadow-paging and some of the common Xen-specific data structures. All of the machines in our particular Xen cluster are 64-bit machines. Thus, the assumption of this discussion is that our HVM guests are also 64-bit virtual machines, requiring a standard 4-level page table hierarchy.

When we say "L1" page tables, we mean the standard definition, where pointers to data pages are contained at the lowest level of the hierarchy (the leaves) and the root of the page table is at level L4. All L4 tables are pointed to by "Control Register #3", or CR3, sometimes called the page-table base pointer. As usual, for any given process running on the CPU, the value of CR3 will only point to the root L4 table of a single process at a time - or to the kernel's page tables. A "resident" page table entry (PTE) at any level of the page table hierarchy is one whose lowest-order (present) bit is set, indicating that the page beneath it (either data or a page table) is actually sitting in memory. During the shadow paging process, three things can happen:

1. Shadow-Walk: The MMU, with access to a virtualized CR3 base pointer, attempts to walk the shadow page table hierarchy of a particular virtual machine. For every HVM page table, there is a corresponding shadow page table at each level of the PT hierarchy. If the MMU does not find a shadow PTE, a trap into the hypervisor occurs.

2. Guest-walk: The hypervisor then performs a manual walk of the real HVM tables, starting at what the HVM thinks is the true CR3 base pointer. If the hypervisor finds the appropriate PTE, then the whole page table is copied and control returns to the CPU for that virtual machine.

Figure 6.1: Original Xen-based physical memory design for multiple, concurrently-running virtual machines. (Shown: a para-virtual VM, several HVMs, and Domain 0, each with its own Pseudo-Physical Address Space (PPAS) mapped onto the Real Address Space (RAS) by the hypervisor.)

3. Guest-walk-miss: Otherwise, a missing PTE in the guest signifies a real CPU exception, and the fault is propagated to the HVM. At that point, it is the HVM's responsibility to service the fault and proceed as normal.

Furthermore, during the shadow-paging process there are upwards of a dozen "shadow optimizations" employed by Xen on top of this basic design, used to reduce memory access latency when going through the shadows (with respect to Windows virtual machines, HVMs, and more). For the current version of CIVIC, these optimizations are disabled; doing so was necessary to get an initial version of CIVIC working. Future versions of CIVIC can be made to take advantage of these optimizations. Thus, the rest of this chapter discusses our implementation under the assumption that these optimizations are disabled. This assumption also constitutes our base case for the benchmarking in our evaluation.
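To make the three shadow-paging outcomes above concrete, the following is a minimal, self-contained C sketch of the dispatch logic, using toy one-level arrays in place of real 4-level page tables. Every name in it (guest_pt, shadow_pt, handle_shadow_fault) is hypothetical and chosen only for illustration; it does not reflect Xen's actual data structures or entry points.

#include <stdio.h>
#include <stdbool.h>

#define NUM_PAGES 16

/* Toy "page tables": pfn[i] is the mapping for virtual page i, and
 * present[i] says whether that mapping exists.  A real guest and its
 * shadow are 4-level hierarchies; one level is enough to show the idea. */
struct toy_pt {
    int  pfn[NUM_PAGES];
    bool present[NUM_PAGES];
};

static struct toy_pt guest_pt;   /* what the HVM thinks it has   */
static struct toy_pt shadow_pt;  /* what the MMU actually walks  */

/* Called when the MMU misses in the shadow tables (case 1). */
static void handle_shadow_fault(int vpn)
{
    if (guest_pt.present[vpn]) {
        /* Case 2 (guest-walk hit): mirror the guest's mapping into the
         * shadow and let the virtual machine resume.                   */
        shadow_pt.pfn[vpn]     = guest_pt.pfn[vpn];
        shadow_pt.present[vpn] = true;
        printf("vpn %d: synced shadow -> pfn %d\n", vpn, guest_pt.pfn[vpn]);
    } else {
        /* Case 3 (guest-walk miss): a genuine fault that must be
         * delivered to, and serviced by, the HVM's own kernel.         */
        printf("vpn %d: injecting page fault into guest\n", vpn);
    }
}

int main(void)
{
    guest_pt.pfn[3]     = 42;     /* the guest mapped vpn 3       */
    guest_pt.present[3] = true;

    handle_shadow_fault(3);       /* shadow miss, guest hit       */
    handle_shadow_fault(7);       /* shadow miss, guest miss      */
    return 0;
}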

6.2.3 Step 1: CIVIC Memory Allocation, Caching Design

Figure 6.1 illustrates how memory is allocated to virtual machines in a typical virtualization architecture. Each VM gets a statically-allocated region of physical memory on the host (subject to ballooning). During normal operation, the size of the PPAS for each virtual machine does not change. Any number of VMs (depending on the amount of memory available) can be created by the administrator side by side, and the OS of each virtual machine will manage the PPAS given to it (since the PPAS is contiguous) without interruption. In this default design, if an operating system places a reference (PFN) to a page in one of its page tables that it expects to be physically resident in memory, then it will be there - no questions asked. All VM technology currently works this way (except for our previous VM migration work in the last chapter, where dynamic ballooning is used). In Figure 6.1, we have four virtual machines, three of which are HVMs and one that uses para-virtualization. Regardless, the PPAS of all four virtual machines is static: from the moment those VMs are booted up to the time they shut down, their PPAS is fixed.

Figure 6.2: Physical memory caching design of a CIVIC-enabled Hypervisor for multiple, concurrently-running virtual machines.

CIVIC relaxes the assumption that a page actually exists when the VM asks for it. The first step in designing CIVIC involves taking an unmodified operating system in an arbitrary Virtual Machine and growing its PPAS by some amount. Afterwards, we add another level of indirection within the hypervisor that recognizes this expanded PPAS (by intercepting accesses through shadow-paging). Figure 6.2 illustrates how the hypervisor has been modified to change the memory allocation strategy in a CIVIC-enabled hypervisor. VMs #2 and #3 each get a statically allocated cache in which only a subset of their total PPAS is actually resident; the rest is out on the network. Cache hits are served from the RAS, whereas accesses to the remainder of the PPAS go to the network.

Take note of the difference between HVM #2 and HVM #3: the PPAS of an unmodified virtual machine need not be larger than the RAS of the physical host. This gives the administrator a choice: either grow the PPAS to be very large, or simply provide higher levels of consolidation by running more VMs on one host. However, the PPAS should be at least as large as the cache that CIVIC provides to each PPAS; it cannot be smaller, or that would obviate the need for CIVIC. Notice that Figure 6.2 also has two simultaneously running paravirtualized VMs. The current CIVIC implementation supports multiple PPAS strategies and does not require any VM to use CIVIC. You may choose to grow the PPAS of a virtual machine at boot time or leave it unchanged in its default mode of operation.

Figure 6.3 demonstrates the operation of an example CIVIC cache underneath an HVM. This HVM has 3 working sets (perhaps from three different processes or three different data structures within one process). The figure represents the common case, where the cache is fully populated with accessed memory. In this example, two of the working sets are in the cache, and a page fault to frame #6 occurs in the {4, 5, 6} set. Since the {8, 9} set is older according to the FIFO, frame #9 is evicted to MemX. An old copy of page #9 may or may not actually exist yet on MemX, but it will likely be there if the HVM has been running for a long time. The next section will use the same HVM and describe the hypervisor-level interactions between the cache and MemX.
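The FIFO behavior in Figure 6.3 can be sketched with a few lines of C. This is a toy model, not CIVIC's implementation: the cache is an array of PFNs, a fault on a non-resident PFN evicts the oldest entry when the cache is full (which, in CIVIC, would mean writing it out to MemX), and names such as fault_in are hypothetical.

#include <stdio.h>
#include <stdbool.h>

#define CACHE_SLOTS 4            /* a tiny cache, like the example above */
#define PPAS_PAGES  16

static int  fifo[CACHE_SLOTS];   /* PFNs currently resident, FIFO order  */
static int  fifo_len = 0;
static bool resident[PPAS_PAGES];

/* Bring 'pfn' into the cache; evict the oldest resident page first if
 * the cache is full.  Eviction is where CIVIC hands the victim to MemX. */
static void fault_in(int pfn)
{
    if (resident[pfn])
        return;                              /* cache hit: nothing to do */

    if (fifo_len == CACHE_SLOTS) {
        int victim = fifo[0];                /* oldest page */
        for (int i = 1; i < CACHE_SLOTS; i++)
            fifo[i - 1] = fifo[i];
        fifo_len--;
        resident[victim] = false;
        printf("evict pfn %d to network\n", victim);
    }

    fifo[fifo_len++] = pfn;                  /* newest page at the tail */
    resident[pfn] = true;
    printf("fetch pfn %d into cache\n", pfn);
}

int main(void)
{
    /* One possible access order consistent with Figure 6.3: the final
     * fault on frame 6 evicts frame 9, the oldest page in the FIFO.    */
    int accesses[] = { 9, 8, 4, 5, 6 };
    for (int i = 0; i < 5; i++)
        fault_in(accesses[i]);
    return 0;
}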

6.2.4 Step 2: Paging Communication and The Assistant

Devices and drivers in the modern virtualization stack that service popular devices for Virtual Machines are typically bundled into a VM commonly called "Domain 0", or "Dom0" for short. From here on out we will not refer to this VM except to acknowledge its presence. During runtime, this VM always exists; it typically hosts various drivers, has direct access to the corresponding devices, and acts as a relay for co-located virtual machines. There is a movement to break away from this unified, "monolithic" design, and CIVIC follows that philosophy [45]. Dom0 is not only a single point of failure during the development process, but also a performance bottleneck for the hypervisor's CPU scheduler, due to the fact that all I/O must go through Dom0 while the dependent VMs block. Thus, CIVIC introduces a second VM that assists exclusively in the paging process and nothing else. We refer to this domain as the "Assistant".

Figure 6.3: Illustration of a full PPAS cache. All page accesses in the PPAS space must be brought into the cache before the HVM can use the page. If the cache is full, an old page is evicted from the FIFO maintained by the cache.

Figure 6.4 illustrates the host-level internal design of CIVIC and the placement of the Assistant within the virtualization stack. Observe that Dom0 still exists, except that it is very thin. It still hosts device drivers for all the virtual machines running on the host, but the Assistant handles the most critically time-sensitive component of the modified virtualization stack: transferring pages into and out of the PPAS. Another motivation behind this design is that the machines in our cluster have two Network Interfaces, so we can dedicate one interface to the Assistant and one to Dom0. Dom0 still handles regular network traffic out of individual virtual machines. Thus, the Assistant can be scheduled independently by the hypervisor and requires far fewer context switches into Dom0.

The final piece in Figure 6.4 is the design of the page-delivery and page-fault communication paths. The sequence of steps taken by CIVIC at this level is as follows:

1. When the host first boots up, Dom0 will start up the Assistant.

2. If there are any non-CIVIC dependent VMs, they can be started simultaneously as

well.

3. Next, one or more HVMs are created. Almost immediately, they will begin filling up

their respective caches.

4. At some point a shadow-level page-fault exception occurs to a page that is not in the

cache during runtime. A page is allocated for the missing page in the HVM.

5. The CIVIC-enabled hypervisor induces the faulting HVM to block execution on the Virtual CPU that caused the exception by de-scheduling that Virtual CPU.

6. The hypervisor puts an entry into a piece of memory that is shared with the Assistant

and delivers a Virtual Interrupt to the Assistant. If the cache for the faulting HVM is

full, then a victim is chosen from the cache and one or more additional entries are

put into the shared memory.

Figure 6.4: Internal CIVIC architecture: An Assistant VM holds two kernel modules responsible for mapping and paging HVM memory. One module directly (on-demand) memory-maps portions of PPAS #2, whereas MemX does I/O. A modified, CIVIC-enabled hypervisor intercepts page-faults to shadow page tables in the RAS and delivers them to the Assistant VM. If the HVM cache is full, the Assistant also receives victim pages.

7. The mmap() module in the Assistant receives the interrupt through a kernel-level

interrupt handler and proceeds to memory-map the faulting pages and victim pages

on-demand.

8. The mmap() kernel module submits one or more I/O operations to the MemX client

kernel module, which then uses the RMAP protocol to read or write the corresponding

page frames to and from the network.

9. When the I/O is complete, the Assistant invokes a CIVIC-specific hypercall to notify

the hypervisor that the fault exception has been fixed up.

10. The hypervisor un-blocks the faulting virtual CPU and schedules it for execution and

the HVM continues until the cycle repeats itself.

All in all, there are four primary sources of latency in the path of an individual page frame: 1) the Virtual IRQ notification to the Assistant, 2) the time it takes for MemX to store/retrieve pages to and from the network, 3) the time it takes to fix up the exception and re-schedule the Virtual Machine after the Assistant notifies the hypervisor, and 4) the time it takes to evict pages out of the PPAS cache.

One additional thing to note regards the design of the mmap() module inside the Assistant: this is a kernel module responsible for directly mapping page frames located within the cache of a CIVIC-dependent HVM guest (in order to hand them over to MemX).

Recall that the page numbers located within the virtual machine’s PPAS are contiguous: all page frame numbers in the PPAS are sequentially chosen at startup and do not change

(unless ballooning is activated). At first glance, one might simply choose to memory-map the entire PPAS of the virtual machine during startup and get rid of this module altogether. This is not possible because, despite the fact that the PPAS is contiguous, the frames backing the PPAS are not all resident - only a subset of the PPAS is actually in the cache. As a result, at any given time during the execution of an HVM, pages are evicted from and populated into the HVM's cache. When pages are re-populated, new RAS-level frame numbers (MFNs) are chosen. There is absolutely no guarantee that the MFN for the corresponding PFN (in the PPAS) is the same as it was before the page was evicted; in fact, we can almost guarantee that it will not be the same frame number. Thus, if we were to memory-map the entire PPAS in the beginning, the majority of those mappings would be invalidated in the near future as each page in the PPAS is victimized. So, the mmap() module in the Assistant maps those pages in an on-demand fashion at fault time. We have optimized this module to batch as many pages as possible when this occurs (since these mappings require an additional hypercall to complete). There are two opportunities for batching memory mappings: first, any N faults on N virtual CPUs (across all HVMs) can be batched simultaneously; second, all pages that are pre-fetched into the cache (which we will discuss later) can also be batched. This allows us to perform this mapping very quickly, with little overhead during the paging process.
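The batching of on-demand mappings can be sketched as follows. This is a toy illustration under stated assumptions: map_batch and the single "hypercall" counter are hypothetical, and the real module maps foreign HVM frames through Xen-specific interfaces rather than a plain array.

#include <stdio.h>

#define MAX_BATCH 64

static int hypercalls_issued = 0;

/* Map a whole batch of PFNs with one (simulated) hypercall instead of
 * one hypercall per page.                                             */
static void map_batch(const unsigned long *pfns, int count)
{
    hypercalls_issued++;                 /* one hypercall for the batch */
    for (int i = 0; i < count; i++)
        printf("mapped pfn %lu on demand\n", pfns[i]);
}

int main(void)
{
    unsigned long batch[MAX_BATCH];
    int n = 0;

    batch[n++] = 100;                    /* the faulting page            */
    for (unsigned long p = 101; p <= 108; p++)
        batch[n++] = p;                  /* pages chosen by prefetching  */

    map_batch(batch, n);
    printf("%d pages mapped with %d hypercall(s)\n", n, hypercalls_issued);
    return 0;
}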

At the cluster level, we employ MemX, as described in Chapter 4. MemX is a kernel-to-kernel distributed memory system designed for low-latency memory access. The same source code for the MemX kernel module is loaded into the Assistant and automatically detects available memory servers in the cluster.

Once a CIVIC host is up and running, the administrator can choose any cluster design they like, such as the one illustrated in Figure 6.5. In this example, we have a cluster of virtual machines, where Hypervisors B and C host MemX servers. MemX is flexible enough that it can be loaded anywhere; the servers need not be virtualized at all, but we illustrate it this way for completeness.

Figure 6.5: High-level CIVIC architecture: unmodified CIVIC-enabled HVM guests have local reservations (caches) while small or large amounts of their reservations actually expand out to nearby hosts.

6.2.5 Future Work: Page Migration, Sharing and Compression

This design of CIVIC has many areas of possible improvement. In the next chapter, we provide a discussion of several interesting new systems that can be built on top of CIVIC. Here, we discuss some of the more obvious improvements to the base system implemented in this dissertation:

At some future time, one or more VMs will run a large-memory application and become active. At that point, the collective unused memory of various nodes will become a partial global cache for the pages from the active node. (Caching nodes may well become active themselves.) This next stage for CIVIC might involve migrating globally cached pages out over the network into global caches on other hosts - should the cache's host node need that space for its own local cache. There are a handful of algorithms to support this type of page migration, involving both greedy approaches [31] and approaches that use hints about access behavior [86]. These approaches are typically applied in the context of a file system, where they are called "cooperative caching", and some of these "eviction" techniques could be applied to virtual machine memory just as well. A survey of them can be found in [82]. This kind of caching involves allowing neighboring nodes to cache potentially stale page frames in local memory. The intelligence in such a system involves developing a coordinated algorithm that allows multiple nodes to decide which pages to keep in the global cache and which pages to evict. Example systems from the 1990s include ds-RAID-x [55], TickerTAIP [25], xFS [16], and Petal [61].

Page-granular cache migration. Consider the case of a single oversubscribed HVM Guest (A), as depicted in Figure 6.6, a portion of whose physical memory is backed by one remote MemX server on another host across the network. We propose the following multi-stage path for an individual page frame:

1. VM to local: A third party (such as the Assistant) decides to evict a used page out of

the VM’s cache into a local cache. Such a local cache does not exist in CIVIC right

now, but would be easily implemented by putting a central node-local cache within

the Assistant itself that held evicted cache pages from the PPAS caches of individual

HVMs.

Figure 6.6: Future CIVIC architecture: a large number of nodes would collectively provide global and local caches. The path of a page would potentially exhibit multiple evictions from Guest A to local to global. Furthermore, a global cache can be made to evict pages to other global caches.

2. Local to global: Potentially, the Assistant-cached page, based on some recency

information, can then be evicted into the global cache on another physical host simply

by pushing it out of the MemX client.

3. Global to global: Next, based on page-migration heuristics mentioned in related

work, the system may potentially move the page from one global cache to another

global cache, depending on how it is aged. This is already partially supported by

MemX, based on existing support we wrote that allows MemX servers to shut themselves down for maintenance by re-distributing their memory to nearby servers.

4. Global to backing store: Should the sum total of all of the local and global caches

max out the available physical memory in the cluster, a third-party disk (either centralized or distributed) should be available as a backing store.

5. Page Fault: Eventually at some time, a page fault will occur on a cached page,

at which point the page must be located and brought back in to the VM’s physical

address space. When this happens, the Assistant would be responsible for invoking

MemX to bring the page back in from one or more caches.

In fact, a page-fault can happen on a page that has been evicted at nearly any level of this caching hierarchy. Recent work that relates to the functionality of the local cache component of CIVIC was published in [79], where they perform “hypervisor caching”. In this work, they evaluate the efficiency and performance of using the hypervisor to present a local pool of reserve pages to the VM. One could think of CIVIC as an extension of this work into the cluster. This completes the basic operation of CIVIC.

Page Sharing and Compression. An ideal cluster-level use of CIVIC would be to employ two techniques for reducing data duplication throughout the cluster. The first technique is called content-based page sharing, initially proposed in the Disco system [23] and used in the context of VMware's ESX server [96] as well as Difference Engine [33] and the Satori project [43]. These systems allow multiple virtual addresses (or virtual page frame numbers in the context of VMs) to refer to the same physical location in memory. Reductions in memory utilization of up to 40% have been reported. This is not surprising in the context of VMs, because even an under-provisioned physical server with a handful of guest VMs will share many copies of binary executables, common libraries, and potentially similar parts of network filesystem data that gets cached on access. In the context of CIVIC, the opportunities for page sharing are increased even further due to our collective use of indirect caching across multiple hosts. Furthermore, some recent work in 2005 [95] provides a compelling case for the advantages of compressing physical page frames in an operating system. We propose to apply these techniques at the VM level in combination with CIVIC.

6.3 Implementation

Here we describe the low-level hurdles that were overcome to get CIVIC working and integrated with MemX. In the next chapter, we provide a better outline of the missing features that could be expanded into more widespread systems. All in all, the CIVIC code base comprises about 5000 lines in the Xen hypervisor, which effectively doubles the size of the entire MemX code base. It does not require any patches whatsoever to the Dom0 kernel or the HVM guest. There are about 400 lines of new code to support CIVIC in the Xen Daemon, and the rest of the code is entirely inside the hypervisor.

We've implemented the system within Xen 3.3.0. The Assistant runs XenLinux (para-virtualized) 2.6.18 as usual, but the HVM guests in the performance section are completely unmodified; they run out-of-the-box Fedora Core 10 installations. We've also run HVM guests running OpenSolaris. Microsoft Windows will actually start up, but due to some shadow-paging-related bugs in the hypervisor, it stops at the login screen. All of the HVM guests in this chapter use a single virtual CPU; due to time and manpower constraints, an SMP implementation of CIVIC is not yet complete. Currently, the system is fully implemented except for the more contemporary features described in the previous section's Future Work.

6.3.1 Address Space Expansion, BIOS Tables

In order to transparently provide an "oversubscribed" view of the PPAS to the Virtual Machine when it first boots up, we must "lie" about the actual amount of DRAM that the HVM thinks is available, by significantly expanding the size of the PPAS while keeping the size of the RAS cache small. With modern hypervisor technology there are actually multiple ways to lie to virtual machines, but none of them are completely transparent to HVM operation. Furthermore, none of them allow the size of the RAS to be different from the size of the PPAS.

One way to partially expand the PPAS is to use ballooning, which we discussed previously in Chapter 5. Ballooning allows you to increase and decrease the physical memory of an HVM, but this still requires that an equivalent amount of actual DRAM inside the RAS be statically mapped to that memory. Another way to expand the PPAS is to use memory hot-plugging. Some operating systems are capable of receiving ACPI upcalls when new DRAM is available; they can then add the new DRAM to the kernel and make it available for allocation by processes. A summary of how this could be supported in Xen can be found in [88]. Similarly, memory must be removed in a physically contiguous manner within the PPAS. Both of these solutions, however, require direct participation and modification of the virtual machine.

In order to avoid these difficulties, CIVIC instead oversubscribes the PPAS at boot time of the HVM. Normally, a physical machine determines the amount of available DRAM by reading a page of memory populated by the BIOS during boot up, called an ”e820” page.

This page contains a list of the various usable and reserved areas of physical memory that the Operating System must manage. After the BIOS populates this list, the OS reads the list and initializes all of its own data structures during start of day before other processes in the system begin to run.

For a virtual machine, there is no longer a physical BIOS - but a virtual one. Thus, the e820 page containing the list of usable memory ranges is virtualized. To do this, CIVIC constructs the e820 page with an artificial list of memory ranges that are available at the time the HVM is started, while taking into consideration how much memory is actually available on the host. Normally, the amount of memory listed in the e820 page is equal to the cache size that Xen allocates for the HVM (meaning that what would have been the RAS cache in our system is normally just a flat RAS that is equal in size to the PPAS). To oversubscribe the HVM, we patch the Xen Daemon to increase the size of the usable memory ranges in the e820 page based on an additional configuration parameter in the HVM's guest configuration file. This modification immediately takes effect for any kind of operating system, since reading the e820 map is a standard requirement for an OS to boot up. This is how CIVIC bootstraps the oversubscription process, and it works quite well. Furthermore, the semantics of memory seen by the hypervisor are preserved, including the initial pre-allocation of memory: the Xen Daemon instructs the hypervisor to allocate only as much memory as the configuration file specifies for the cache size. The HVM then proceeds to boot up and begins filling its cache as it faults on pages that are not yet present in the cache.
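The idea of inflating the advertised memory can be illustrated with a short, self-contained sketch. The entry layout below follows the standard e820 record format (base, length, type), but the builder function and its simple split around the legacy 640K-1M hole are assumptions made for illustration; this is not the actual Xen Daemon patch.

#include <stdio.h>
#include <stdint.h>

#define E820_RAM      1
#define E820_RESERVED 2

struct e820_entry {              /* standard e820 record layout */
    uint64_t base;
    uint64_t length;
    uint32_t type;
};

/* Build a toy e820 map that advertises 'advertised_mb' of RAM to the
 * HVM, regardless of how small its resident cache allocation is.  The
 * split around the legacy 640K-1M hole is a simplification.           */
static int build_e820(struct e820_entry *map, uint64_t advertised_mb)
{
    uint64_t top = advertised_mb << 20;
    int n = 0;

    map[n++] = (struct e820_entry){ 0x00000000, 0x0009F000, E820_RAM };
    map[n++] = (struct e820_entry){ 0x0009F000, 0x00061000, E820_RESERVED };
    map[n++] = (struct e820_entry){ 0x00100000, top - 0x00100000, E820_RAM };
    return n;
}

int main(void)
{
    struct e820_entry map[8];
    /* Hypothetical guest config: 512 MB resident cache, 4096 MB advertised. */
    int n = build_e820(map, 4096);

    for (int i = 0; i < n; i++)
        printf("e820: %#010llx - %#010llx  type %u\n",
               (unsigned long long)map[i].base,
               (unsigned long long)(map[i].base + map[i].length),
               map[i].type);
    return 0;
}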

6.3.2 Communication Paths

The hypervisor's primary interactions during the slow path for page-fault handling and cache evictions are with the Assistant VM. In general, there are two ways to communicate between the hypervisor and a para-virtualized VM such as the Assistant or Dom0:

• VM-to-hypervisor, Hypercalls: The hypercall API available to a VM is vast. Hypercalls can be invoked by any code running with kernel-level privileges in a guest virtual machine.

• Hypervisor-to-VM, Virtual IRQs: These kinds of IRQs are the virtualized equivalent of real ones, with the exception that a few "new" IRQs are imposed by the hypervisor for other reasons, such as alternate consoles, VM-to-VM communication, inter-processor messages, and more.

CIVIC uses a combination of both when talking to the Assistant. During the start-of-day, the Assistant is asked to set up a common piece of shared memory for each HVM guest.

This memory is shared only between the Assistant and the hypervisor and stores a fixed number of descriptors that represent which pages are to be evicted from or faulted into the cache. At the moment, this memory holds around 2048 descriptors, using 32 Kilobytes of memory. Through empirical experimentation, this was a sufficient number of descriptors required to maintain maximum throughput on a Gigabit Ethernet switched network.

CIVIC maintains this memory using a one-way, half-duplex producer consumer rela- tionship. The hypervisor is the sole producer of descriptors and the Assistant is the only consumer - it does not pro-actively bring in pages in our out of the HVM’s cache unless it is instructed to do so by reading a descriptor from shared memory. We later experimented with a circular ring model with the assumption that we would get added concurrency, by allowing the Assistant to asynchronously remove descriptors from shared memory while adding completed descriptors back onto the ring after MemX had completed the I/O. Due to the fact that paging is only half-duplex, this proved to be equivalent to the simpler im- plementation we used initially because of our use of prefetching, described in the next section. Although a complete proof would be required, we empirically observed no differ- CHAPTER 6. CIVIC: TRANSPARENT OVER-SUBSCRIPTION OF VM MEMORY 130 ence between the use of an asynchronous, circular request/response notification versus a synchronous one, so we stuck with the synchronous model. If CIVIC were to allow new pages into the HVM cache that it did not initiate (say, due to our previous future work design where other hosts could initiate global-to-global page cache transfers inde- pendently), then an asynchronous model would be mandatory. This is similar to the way the netfront/netback asynchronous rings work in Xen right now discussed in the last chap- ter, since the network has a natural full-duplex relationship where a receiver on the other end of a socket can independently send data at the same time the sender does. Thus, for

CIVIC, once the hypervisor has added descriptors to the ring, a Virtual IRQ is delivered to the Assistant. Once the Assistant is done, a hypercall is made to the hypervisor to signal completion of those descriptors. With more man-power, a fully asynchronous design can be implemented in the future, for example, if we were to go to a future 10-Gigabit ethernet network.
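As a rough consistency check on the numbers above, 32 Kilobytes divided by 2048 descriptors leaves 16 bytes per descriptor. The sketch below models the synchronous, half-duplex cycle with such a 16-byte descriptor; the field layout and the function names are assumptions made for illustration, not CIVIC's actual format.

#include <stdio.h>
#include <stdint.h>
#include <assert.h>

/* A 16-byte descriptor: 2048 of these fit in the 32 KB shared area. */
struct desc {
    uint64_t pfn;        /* which PPAS frame                           */
    uint32_t op;         /* 0 = fault-in, 1 = evict                    */
    uint32_t hvm_id;     /* which guest the frame belongs to           */
};

#define NUM_DESC 2048
static struct desc shared[NUM_DESC];    /* hypervisor <-> Assistant    */
static int produced = 0;                /* hypervisor is sole producer */

static void hypervisor_produce(uint64_t pfn, uint32_t op, uint32_t hvm)
{
    if (produced < NUM_DESC)
        shared[produced++] = (struct desc){ pfn, op, hvm };
}

/* The Assistant only acts when told to: it drains every pending
 * descriptor (handing each to MemX) and then signals completion with
 * a single hypercall - the synchronous, half-duplex model.            */
static void assistant_consume_and_complete(void)
{
    for (int i = 0; i < produced; i++)
        printf("memx: %s pfn %llu (hvm %u)\n",
               shared[i].op ? "write" : "read",
               (unsigned long long)shared[i].pfn, shared[i].hvm_id);
    produced = 0;                       /* the "completion hypercall"  */
}

int main(void)
{
    /* Sanity check of the 16-byte / 32 KB arithmetic (typical ABI).   */
    assert(sizeof(struct desc) == 16 && sizeof(shared) == 32 * 1024);

    hypervisor_produce(4096, 0, 2);     /* fault-in for HVM #2  */
    hypervisor_produce(777, 1, 2);      /* evict a victim page  */
    assistant_consume_and_complete();
    return 0;
}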

6.3.3 Cache Eviction and Prefetching

Central to maintaining HVM performance is CIVIC's ability to manage the PPAS caches scalably. Some papers mentioned earlier [79, 44] present approaches (although not always obvious ones) to doing recency detection: they use a para-virtual approach. The basic idea is to modify the kernel in a minimal fashion such that a third party is notified when a page's allocation status changes (from used to free or back again). This could be useful for us in that once a third party (say, the Assistant) chooses to evict a page from the VM, it can use the modified kernel interface to receive notifications of future deallocations of that page. This allows the cache to know when it is safe to clear that page once it is no longer used. There is, however, one remaining problem with the approach: pinning evicted pages for future page faults (or cache hits). Note that an evicted page is a used page; an evicted page must be pinned in such a manner that the cache is notified when the kernel needs the page back. [79] proposes the following: during eviction from the VM address space, the page is locked for I/O, meaning that from the kernel's perspective, someone else is using the page. After eviction from the PPAS cache, when the kernel needs the page back in the future it will probe the lock (which is generally a semaphore). Attempts to access that lock are delivered to the hypervisor and the page is released from I/O status. For CIVIC, however, our goal is to maintain maximum transparency, and since we now have a page-fault interception system in place, we no longer need para-virtual support from the operating system.

Recency Detection. When it is time to choose victim pages, however, it is not simply a matter of maintaining one or more Least-Recently-Used lists within the hypervisor and keeping them updated. In this dissertation we do not explore complex page-frame reclamation policies; there is a large body of work in the literature that does this already. The eviction policy used for CIVIC at the moment is a simple first-in, first-out (FIFO) queue. When the cache is full, pages are evicted from the front of the FIFO; when page-faults occur, pages are allocated onto the end of the FIFO. A proper characterization of the type of eviction scheme required to improve upon a FIFO in the context of multiple, concurrently-running virtual machines will be mandatory in a future incarnation of CIVIC.

Prefetching. During implementation, one of the bigger obstacles to a stable implementation was "capacity" cache misses. These types of cache misses occur when the HVM is just booting up for the first time or when the HVM has just forked off a large application that is (mostly) sequentially touching large amounts of memory at once. These two common usage scenarios required a basic prefetching implementation to be placed into CIVIC. Pre-fetching (a survey of which can be found in [77]) and pre-paging (from Chapter 5 on VM migration) are well-explored concepts in computer science. Our pre-fetching implementation is a simple stride-prefetching algorithm. We will not cover all 4 states of stride prefetching, but describe the pseudo-code of our algorithm for detecting page-fault behavior.

Pseudo-code for our CIVIC-level prefetching algorithm is shown in Figure 6.7. CIVIC's prefetching first maintains a "window size" of allowed page faults in powers of two, starting with an initial value of one. Each time the prefetching algorithm is invoked, we update the window based on the location of the last page-fault, to adapt to how large or small the current stride of pages actually is. During any given page-fault, we ask the following question:

let miss_count := 0
let window := 1

AdjustWindow(PFN):                               // PFN is in the PPAS
    let last_fault_pfn := current_fault_pfn
    let current_fault_pfn := PFN
    let last_miss_count := miss_count

    if PFN <= last_fault_pfn                     // precedes the window
        miss_count++
    else if PFN > (last_fault_pfn + window * 2)  // past a future window
        if PFN <= (last_fault_pfn + window * 4)
            miss_count := 0                      // "tolerated" hit
        else
            miss_count++
    else                                         // hit: double the window
        window := min(window * 2, max_shared_memory / 2)
        miss_count := 0

    if miss_count != 0 and last_miss_count != miss_count
        if miss_count > 1                        // halve the window
            window := max(window / 2, 1)
        return 1                                 // do not prefetch

    return 0                                     // go ahead and prefetch

PrefetchFault(PFN):
    let next_pfn := PFN + 1
    let space := max_shared_memory - available_shared_memory

    if AdjustWindow(PFN) == 1                    // adjust the window
        return

    while (space >= 2) and ((next_pfn - PFN) <= window)
        if next_pfn is resident
            next_pfn++                           // skip pages already cached
            continue
        if cache is full                         // need space to prefetch
            kick out a victim page
            space--
        fetch(next_pfn)                          // bring page into cache
        space--
        next_pfn++

Figure 6.7: Pseudo-code for the prefetching algorithm employed by CIVIC. On every page-fault, this routine is called to adjust the window based on the spatial location of the current PFN address in the PPAS.

If we were to double the size of the prefetching window, would the current fault location be spatially close to the last fault?

This question has several different cases:

1. If the current fault is spatially forward (determined by PPAS frame number) of the last

fault but still outside of the current PPAS window, BUT spatially within two future

window increases (i.e. window size times four), then we will accept the fault for now

but signal a cache miss. The window will stay the same in this case. We call this a "tolerated" hit.

2. If we get a hit or a tolerated hit, then we reset the number of sequential misses.

The number of sequential misses is simply a counter of how many page-faults have

landed outside of the PPAS window consecutively.

3. If the fault is outside of two proposed window increases, then we consider that a miss

and we cut the window in half. This contributes to a possible future sequential miss.

4. If the fault is spatially behind the last fault, then we consider that a non-sequential

miss, but we do not cut the window in half (yet).

5. If we miss once (non-sequentially), we still prefetch within the current window.

6. If we miss twice or more in a row (in both the forward direction and backwards di-

rection), then we consider that a sequential miss, and we halve the window and avoid

servicing the fault at all.

7. Never (at any time) do we reset the window completely.

The main purpose of this algorithm is to be able to capture very highly sequential page-faults or very highly random page faults and adapt accordingly:

• If the next fault is purely sequential, then it will appear to be in the proposed window

because the previous fault did not actually fill the window with pre-fetches since the

last fault itself was part of the window - which is a nice side-effect of the algorithm.

• If the next fault is purely random, it will also get caught because the miss count

will continue to increase and will only stop increasing when there is a hit within a

proposed window increase (times two) as if it were a ”tolerated” prefetch.

All in all, this algorithm "slides" the window up and down but never resets it altogether, effectively implementing a sort of rough hysteresis. Empirically, the observed prefetching window tends to hover around 256 pages. The maximum value of the prefetching window is capped at the size of the I/O communication memory that is shared with the Assistant VM, described in the previous section.

6.3.4 Page-Fault Interception, Shadows, Reverse Mapping

There are two primary data structures used by CIVIC for page-fault interception:

(1) The PPAS table (or "p2m" map in Xen terminology): This table is also a page-table-like structure which provides a direct translation between the PPAS and the RAS, whereas the shadow tables provide translations between process virtual address spaces and the RAS.

(2) The PPAS bitmap: This is a CIVIC-owned structure for determining if pages are resident in RAM or not. CIVIC maintains this as a contiguously allocated bitmap, with one bit for every 4-kilobyte page in the HVM's entire PPAS. The Assistant does not see this structure. At the moment, the PPAS table is being more heavily used for other purposes in the Xen hypervisor and cannot yet be overloaded to hold the PPAS bitmap, but we anticipate using the PPAS table exclusively in the future to store page residency.
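A minimal sketch of such a residency bitmap is shown below. The bit polarity is an assumption on our part (a set bit marks a page as paged out, which matches the fault and eviction sequences that follow, where faults clear the bit and evictions set it); the helper names are likewise hypothetical.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per 4 KB PPAS page.  Assumed polarity: bit set => the page
 * has been paged out to MemX; bit clear => resident (or "special").   */
struct ppas_bitmap {
    uint8_t *bits;
    unsigned long npages;
};

static struct ppas_bitmap bitmap_alloc(unsigned long npages)
{
    struct ppas_bitmap b = { calloc((npages + 7) / 8, 1), npages };
    return b;
}

static void mark_paged_out(struct ppas_bitmap *b, unsigned long pfn)
{
    b->bits[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

static void mark_resident(struct ppas_bitmap *b, unsigned long pfn)
{
    b->bits[pfn / 8] &= (uint8_t)~(1u << (pfn % 8));
}

static bool is_resident(const struct ppas_bitmap *b, unsigned long pfn)
{
    return !(b->bits[pfn / 8] & (1u << (pfn % 8)));
}

int main(void)
{
    /* A 4 GB PPAS needs 4 GB / 4 KB = 1M bits = 128 KB of bitmap. */
    struct ppas_bitmap b = bitmap_alloc(1u << 20);

    mark_paged_out(&b, 12345);           /* eviction sets the bit    */
    printf("pfn 12345 resident? %d\n", is_resident(&b, 12345));
    mark_resident(&b, 12345);            /* fault handling clears it */
    printf("pfn 12345 resident? %d\n", is_resident(&b, 12345));
    free(b.bits);
    return 0;
}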

The tricky part in determining whether pages are resident in RAM is that there are a number of "special" memory ranges that Xen puts into the guest's memory space that are reserved and only used by Xen, including things like IOAPIC pages, the MMIO space used for device emulation through the QEMU emulation engine (whose location changes depending on whether the HVM has more than 4GB of memory), pages for VGA support, and a handful of pages used for Xen-specific purposes. These "special" pages represent memory that does not actually exist in the RAS of the HVM itself, but is still reserved in the PPAS of the HVM. Even though these pages do not belong to the HVM, they are still accessed by the HVM. Thus, CIVIC must be very careful not to evict these pages or allow page-faults to occur on them. To address this, all of the special pages are marked "present" (resident) in the PPAS bitmap at the HVM's start-of-day.

All other pages are assumed to be not present in this bitmap and will immediately result in page-faults at least once. When victim pages are chosen, CIVIC takes care not to evict these special pages that are not originally owned by the HVM.

The lower-level sequence of events during a page fault, using these three data structures (the shadow tables, the PPAS table, and the PPAS bitmap), is as follows:

1. The CPU generates a page-fault exception.

2. The hypervisor consults the shadow-paging API.

3. First, CIVIC analyzes the faulting PFN and checks the PPAS bitmap to see if we were

responsible for the exception due to having swapped out that page.

4. If the page is resident in the PPAS bitmap, then we’re done. Otherwise, we allocate

a free page and contact the Assistant for retrieval of the page.

5. When the Assistant is done, we clear the bit in the PPAS bitmap and insert the

missing page into the PPAS table used by Xen. This has the effect of increasing the

cache size as seen by Xen and as seen by dom0 for the purposes of determining

how much memory is available on the host when creating new Virtual Machines.

6. Afterwards, the regular Shadow-Paging process kicks in and Xen retrieves the appropriate shadow copy of the faulting page table entry owned by the HVM.

7. Execution continues.

The lower-level sequence of events during a page eviction is as follows:

1. When CIVIC determines that the cache is full (say, during an eviction or during

bootup), a reference to a victim page is removed from the PPAS table. This has the

effect of decreasing the cache size and freeing the actual page in the RAS pointed to

in the PPAS cache. Dom0 will also see at this point that the amount of free memory has increased.

2. The bit for the PFN of the victim page is set in the PPAS bitmap. This will signify to

CIVIC in the future that the page is no longer resident.

3. The Assistant is contacted and instructed to write the page to the network. Once the Assistant is done, the hypervisor reschedules the HVM for execution.

4. Execution continues.

Gotchas: Note that when CIVIC evicts a page, the HVM may *free* that page in its own PPAS without necessarily writing to it (which would normally cause a page-fault). What tends to happen subsequently is that the HVM kernel reallocates that page in the PPAS for some new process or kernel thread, which causes a page-fault in the near future at a completely different RAS page location. This happens for both data pages and page-table pages. Not only does it happen often, but for page-table pages it can happen at any level of the 4-level page table hierarchy. Thus, we have to make sure to detect this sort of "recursive" page fault and re-populate the PPAS table appropriately. Furthermore, there are also times when Dom0 accesses page frames of the HVM guest asynchronously

(when setting up DMA accesses). These kinds of accesses do not cause shadow-paging faults on the CPU that would normally be recognized by CIVIC. Instead, they are seen as para-virtualized accesses to non-resident CIVIC pages. When we see these kinds of accesses in the hypervisor, we simply allocate a free page for the missing page and return control. Our observation is that 100% of these accesses are fresh writes for I/O and that any old data that might have been paged out is actually old DMA memory that is being reused. A proper solution would be to further paravirtualize Dom0 to tell CIVIC what these pages are so that they do not get paged out in the first place, but our current approach maintains transparency even from Dom0 itself, since it does not require any further kernel modifications.

Reverse-Shadow Page Table Mappings. A surprisingly sneaky problem we observed during the implementation of CIVIC was one experienced by nearly all paging implementations in modern operating systems: the reverse mapping problem. In an operating system, all processes get their own set of page tables, but are allowed to memory-map shared pieces of memory and shared files for a multitude of reasons during regular operation.

When a page is being evicted, the OS must make sure that multiple references to the same page across all page-table entries in the OS are removed before the page can be safely freed. CIVIC has exactly the same problem, but at the shadow paging level.

Normally, a solution to this would require searching every page table entry in every shadow of the HVM. Linux solves this by maintaining reverse pointers to those page table entries on an object-level basis (to reduce memory usage of the reverse pointers).

Similarly, we implemented a (maximum) 2-PTE reverse mapping solution within the hypervisor: when a new shadow page table is created, the page structure for the referenced frame (maintained in a global list) is populated with the address of the parent shadow page-table entry. Up to two parent shadow page table reverse pointers are maintained in the global page structures owned by Xen (any more than that are ignored). During page eviction, if the reference count of the evicted page is less than or equal to 2, then CIVIC uses the reverse pointers to clear out the PTEs pointing to the victim page. If the reference count is greater than 2, then Xen goes forward with a brute-force search of all the remaining shadows not covered by the reverse pointers. Empirically, we observed a significant improvement in page eviction speed for highly sequential workloads where new pages are being written frequently; without this optimization, CIVIC eviction slowed almost to a crawl relative to the base case where the HVM did not use CIVIC at all. Note that this optimization does not improve upon regular HVM performance, but only normalizes CIVIC performance to be equal to the base case where all of the HVM's pages are always in RAM.
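The essence of the bounded reverse map can be captured in a few lines of C. The structure below is a simplified model, not Xen's real page bookkeeping: each frame tracks up to two parent shadow PTE locations plus a total reference count, and eviction falls back to a brute-force scan (stubbed out here) when the count exceeds two.

#include <stdio.h>

#define MAX_RMAP 2                      /* at most two back-pointers kept */

struct frame_info {
    unsigned long *rmap[MAX_RMAP];      /* parent shadow PTE locations    */
    int refs;                           /* total references to this frame */
};

static void rmap_add(struct frame_info *f, unsigned long *spte)
{
    if (f->refs < MAX_RMAP)
        f->rmap[f->refs] = spte;        /* remember the parent PTE        */
    f->refs++;                          /* beyond two, only count them    */
}

/* Stub for the slow path: scan every shadow table for stray PTEs.      */
static void brute_force_unmap(struct frame_info *f)
{
    printf("brute-force scan of all shadows (%d refs)\n", f->refs);
}

static void evict_frame(struct frame_info *f)
{
    if (f->refs <= MAX_RMAP) {
        for (int i = 0; i < f->refs; i++)
            *f->rmap[i] = 0;            /* clear the known shadow PTEs    */
        printf("fast eviction via %d reverse pointer(s)\n", f->refs);
    } else {
        brute_force_unmap(f);
    }
    f->refs = 0;
}

int main(void)
{
    unsigned long spte_a = 0xdead, spte_b = 0xbeef;
    struct frame_info f = { { 0 }, 0 };

    rmap_add(&f, &spte_a);
    rmap_add(&f, &spte_b);
    evict_frame(&f);                    /* fast path: both PTEs cleared   */
    return 0;
}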

6.4 Evaluation

An evaluation of CIVIC was done on our cluster of about 12 machines, each containing two AMD quad-core sockets with each core running at 1.7 GHz. For the curious, we have provided screenshots of a live CIVIC-enabled machine in Appendix A. Each of the machines in our cluster used to support CIVIC has between 12 and 16 GB of DRAM. The largest unmodified HVM guest we have run on a single host is upwards of 64 Gigabytes, spanning the entire 12-node cluster. This evaluation first provides a micro-benchmark breakdown of the communication paths in CIVIC, followed by a benchmark of 3 applications: the RUBiS Auction Benchmark [11], Matrix Multiplication (JASPA) [54], and Quicksort. RUBiS is a very popular Apache webserver benchmark that generates auction-like transactions in an eBay-like system; it is similar to the heavy benchmark, SpecWeb, used in Chapter 5. Quicksort and Matrix Multiplication are implemented as individual C programs; as in the previous chapter, one sorts a large array and the other multiplies a large, sparse matrix by itself. With these 3 applications we want to answer the following questions:

1. What is the latency of a page-fault in CIVIC?

2. During peak load, what kind of paging throughput can CIVIC deliver?

3. How much does an application slow down when using CIVIC?

4. What acceptable fraction of the PPAS can be cached without significant slowdown?

5. How well does prefetching mitigate the effects of caching?

6.4.1 Micro-Benchmarks

Page-Fault Latency. First, we begin with a few microbenchmarks while CIVIC is in operation for a Virtual Machine. Table 6.1 shows a breakdown of the latency of a page-fault in a CIVIC-enabled hypervisor (as illustrated in Figure 6.4). All of the latency measurements in the table were taken with the on-chip TSC register to provide microsecond-granular values. As expected, the largest component of the page-fault latency is the delivery to MemX within the Assistant VM; this latency is about 170 microseconds.

    Communication Path                   Latency
  1 Page is evicted on CPU fault          16 µs
  2 Virtual Interrupt to Assistant        10 µs
  3 Round-trip I/O to MemX               170 µs
  4 Hypercall completion to Xen           30 µs
  5 Rescheduling of the HVM                5 µs
    Total                                231 µs

Table 6.1: Latency of a page-fault through a CIVIC-enabled hypervisor to and from network memory at different stages.

In a non-virtualized Linux environment on our hardware, this round trip is typically around 130 microseconds, which is 30 microseconds faster than we achieved in Chapter 4 due to the use of newer hardware. On top of the base 130-microsecond non-virtualized RTT incurred by MemX, the para-virtualized Assistant VM adds an additional 40 microseconds due to the need to go through Domain 0. There is a clear motivation here to set up the pre-existing dual-NIC hardware available in our cluster to get rid of this additional latency.

However, there is also a noticeably large amount of latency in three other places: (1) the hypercall (downcall) used to signal that one or more completed page-fault acknowledgments have been placed into the shared memory area - just the time to invoke the hypercall for completion is around 30 microseconds; (2) the time it takes for the Assistant to be asynchronously scheduled and pick up the newly evicted page (or the page-fault notification) from shared memory is large; and (3) the first component - eviction time (the first row) - also takes a surprisingly long time. However, this is a significant improvement over what it used to take (around 1 millisecond) when the shadow paging code had to search through all the shadow tables before evicting a page out of the PPAS cache. Our reverse mapping implementation brought this down significantly. There is clearly some room for improvement here; we leave it to future work to bring these numbers down.

Page-Dirtying Throughput. Next, it is important to ensure that if an HVM's memory access behavior is heavy enough (whether sequential, random, or otherwise), CIVIC can at least sustain the maximum throughput of a gigabit ethernet NIC. To do this, we must establish the baseline VM memory access performance we expect from a non-CIVIC-enabled hypervisor. We establish this baseline by writing a C program that allocates a large amount of memory and passes over that memory in page-size increments. At each page, the program dirties one or more bytes of that page. We parameterize the program to dirty increasingly larger amounts of memory per page and graph the front-side-bus throughput that the HVM is able to sustain while dirtying memory.

Figure 6.8: Page Dirtying Rate (Mbps, versus bytes dirtied per page) for different types of Virtual Machines, including HVM Guests, Para-virtual Guests, and different types of shadow paging. This includes the overhead of creating new page tables from scratch.
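A minimal sketch of such a dirtying loop is given below. The buffer size, fill value, and exact throughput metric are simplifications, not the actual benchmark program used for these runs:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define PAGE_SIZE 4096

/* Touch 'bytes_per_page' bytes in every page of a 'size'-byte buffer and
 * report an approximate dirtying rate for the bytes actually written. */
static void dirty_pass(size_t size, size_t bytes_per_page)
{
    char *buf = malloc(size);
    if (!buf)
        return;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t off = 0; off < size; off += PAGE_SIZE)
        memset(buf + off, 0xAB, bytes_per_page);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs  = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double mbits = (double)(size / PAGE_SIZE) * bytes_per_page * 8.0 / 1e6;
    printf("%4zu bytes/page: %.1f Mbps\n", bytes_per_page, mbits / secs);
    free(buf);
}

int main(void)
{
    for (size_t b = 8; b <= PAGE_SIZE; b *= 2)
        dirty_pass(512UL << 20, b);    /* one 512 MB pass per parameter */
    return 0;
}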

Figure 6.8 presents the four scenarios that we are interested in for this program. The first plot is the fastest dirtying rate we can expect from an HVM guest without the use of CIVIC. (The only CIVIC plot is at the bottom.) The first plot employs both the shadow optimizations (level 3) discussed at the beginning of this chapter as well as processor support for nested page tables. Surprisingly, this configuration actually does better than a para-virtualized operating system, shown in the second plot. The baseline that we will compare against in the next section is shown in the third plot, at shadow level 1 (SH1).

This was the first shadow paging implementation released by the Xen community. We removed the later optimizations to show how CIVIC performs against just the basic shadow paging code, with none of those efficiencies and without processor support for walking page tables without trapping into the hypervisor. This plot's paging rate is just slightly above that provided by gigabit ethernet. And finally, the fourth plot is the same as the third plot except that we are now using a CIVIC-enabled hypervisor; thus, the paging rate is limited to that of the network card. An interesting note across all plots is that they converge to the same throughput when all of the bytes of each page are touched. So, if the HVM guest is heavily memory intensive, it does not matter which configuration you choose - and in that case CIVIC will also provide comparable performance.

Figure 6.9: Bus-speed Page Dirtying Rate in gigabits-per-second (versus bytes dirtied per page). This is line-speed hardware memory speed once page tables have already been created, and shows throughput an order of magnitude higher than the previous graph.

Figure 6.9 presents the same graph as the previous one, except that the page tables backing the memory being dirtied have already been created. In this case, the para-virtualized guest wins over the SH3/HAP virtual machine, and the SH1 and SH3 virtual machines perform about the same. Nevertheless, CIVIC throughput does not change, since we are still using a 1-gigabit NIC.

There is, however, a case here for investing in 10-gigabit ethernet hardware, perhaps in a production environment. If the Assistant could transmit at 10 GigE, then we would (theoretically) see CIVIC performance rise accordingly to match the other three plots, no matter how memory intensive the application within the HVM actually became. Given that one of the primary goals of this system is higher consolidation, a single 10 GigE NIC would not only be available for the Assistant, but would simultaneously be available to the potentially higher number of virtual machines sharing the same physical host.

6.4.2 Applications

Next we will show performance results for each of the aforementioned applications. The HVM that runs these applications is configured to be relatively small in comparison to the hardware we have available. To give some perspective, the HVM that we use is similar in configuration to the one labeled "HVM #3" depicted in Figure 6.2. The HVM in our experiments is a stock 64-bit Fedora 10 Linux distribution, with no changes, running on top of a CIVIC-enabled Xen hypervisor. It is configured with 4 GB of DRAM on a 12 GB host, but the cache size used underneath the HVM varies by experiment, ranging from 128 MB to 512 MB.

Evaluating how big this cache size should be is much different than the ad-hoc rules used when choosing the size of, say, your swap partition in a vanilla Linux distribution. Our HVM has internal swap disabled and does not do any internal paging. This means that all of the HVM's page frames are either in the PPAS cache or out on the network, but not both (since we do not have a second-level cache yet). We would expect that choosing the cache size would ideally be an automated decision: for example, on our 12 GB host, if we had twelve 1 GB caches (one HVM per cache) and we wanted to create a 13th HVM, then we would modify the Assistant to instruct the hypervisor to evict enough memory out of the other 12 caches so that a 13th cache could be created for the new virtual machine. With that said, we begin with our first application:

Quicksort: Fixed 512 MB PPAS cache, Variable Working Set. Since this is a C program, it is easily parameterized to sort larger and larger amounts of data. We choose a fixed 512 MB cache size and then instruct the sort to work on increasingly larger, randomly populated arrays. At each sort size, we take 5 runs and compute the average completion time of the sort. Figure 6.10 shows that quicksort slowdown remains relatively unchanged: as expected, the baseline CIVIC performance matches that of the case without CIVIC up to 512 MB sort sizes, and after that only slight application slowdown is observed - a few seconds' worth.

Figure 6.10: Completion times (seconds, versus array size) for quicksort on a CIVIC-enabled virtual machine and a regular virtual machine, with a pre-populated 512 MB local cache.
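The sort itself is straightforward; a sketch of the kind of parameterized benchmark used here (array size taken from the command line, a fixed random seed, and wall-clock timing) is shown below. This is an illustration, not the exact program used in our runs:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

int main(int argc, char **argv)
{
    /* Array size in bytes, e.g. 536870912 for a 512 MB sort. */
    size_t bytes = (argc > 1) ? strtoull(argv[1], NULL, 10) : (64UL << 20);
    size_t n = bytes / sizeof(long);
    long *a = malloc(n * sizeof(long));
    if (!a)
        return 1;

    srandom(42);
    for (size_t i = 0; i < n; i++)
        a[i] = random();                    /* randomly populate the array */

    time_t start = time(NULL);
    qsort(a, n, sizeof(long), cmp_long);
    printf("sorted %zu longs in %ld seconds\n", n, (long)(time(NULL) - start));
    free(a);
    return 0;
}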

Matrix Multiply: Variable PPAS cache, 512 MB Working Set. We downloaded the JASPA sparse matrix multiplication program [54] to compare application slowdown on a fixed working set. The matrix we use for JASPA has a density of 20%. While running this multiplication multiple times, we vary the cache size underneath the virtual machine between 128 MB and 512 MB. (The virtual machine in all of our experiments has an existing buffer cache of about 100 MB when the machine boots up, in addition to the application's own memory footprint.) The resulting graph has two plots: one shows the performance of CIVIC with prefetching turned on and one with it turned off. Figure 6.11 shows the time required to complete this matrix multiplication.

Figure 6.11: Completion times (seconds) for Sparse Matrix Multiplication with a resident memory footprint of 512 MB while varying the PPAS cache size, with and without prefetching.

The right-most bar in Figure 6.11 shows the baseline case on a non-CIVIC-enabled hypervisor, which completes in 37 seconds. This is the bar we would like to match. Out of the box, a 512 MB cache underneath the VM slows the application down to 39 seconds, and to 41 seconds when prefetching is disabled. This is a fairly acceptable slowdown (for now) of about 5%. When the cache size is reduced to as low as a quarter of the application's memory footprint (128 MB), the slowdown grows to 43 seconds, or about 10%; if prefetching is disabled, it becomes 47 seconds (18%).

Figure 6.12: Requests Per Second for the RUBiS Auction Benchmark with a resident memory footprint of 490 MB while varying the PPAS cache size, with and without prefetching.

RUBiS. Figure 6.12 for the RUBiS Auction Webserver benchmark is structured in a similar manner to Figure 6.11. The CIVIC configuration uses the PHP version of RUBiS. We configure the client side to use 8 client simulators (on one of our machines that has 8 cores); each of the 8 clients simulates 100 connections per core, all aimed at the same test HVM as before, which runs an Apache webserver. During peak load, this configuration of RUBiS imposes a memory footprint of about 490 MB on the Virtual Machine. For this application, webserver slowdown is represented by requests per second in Figure 6.12. Slowdown for this application is more significant, although the effects of prefetching are similar to those seen for the previous application. The baseline Requests Per Second (RPS) with CIVIC disabled is 16. Simply enabling a cache size equal to the memory footprint of the webserver brings this down to 12, a 25% reduction in performance. When the cache size is half of the memory footprint, the slowdown becomes almost 40%. There is clearly some room for optimization here in the proof-of-concept version of CIVIC.

                Baseline      w/ Prefetch            w/o Prefetch
Application     Demand        Demand       Net       Demand       Net
Quicksort         1643        455766      8289         5008    648227
Matrix          107280        214159      3399         3729    220858
RUBiS           123217        196613     15041       106301     45736

Table 6.2: Number of shadow page-faults to and from network memory with CIVIC prefetching disabled and enabled. Each application has a memory footprint of 512 MB and a PPAS cache of 256 MB.

Prefetching Behavior. Finally, one of the more interesting evaluations of CIVIC performance comes within the hypervisor when determining the effects of prefetching. It is important to understand what effect our basic prefetching implementation has on Virtual Machine performance. Table 6.2 illustrates how we calculate this effect for all three of the applications discussed above. The table provides two types of information: the increase in shadow page-faults incurred due to the use of CIVIC, as well as the decrease in network-bound major page-faults achieved by our prefetching implementation.

The second column, labeled "baseline", refers to the number of shadow page-fault exceptions taken on the processor when CIVIC is not used on the HVM guest for each application. These are typically "demand" faults. When the HVM guest is fresh - or, in our case, was just booted - the operating system has created relatively few page tables. When the application starts allocating and using memory, the hypervisor creates shadow page-tables for the HVM's page-tables only in an on-demand fashion: that is, only when the OS incurs a copy-on-write fault for that new memory does the hypervisor shadow the corresponding page tables.

When CIVIC is used, the number of demand page faults goes up significantly because the hypervisor is frequently tearing down and reinstalling shadow pages as it interacts with the PPAS cache. For each of the three applications, one can clearly see that the effects of prefetching are significant. There is a clear correlation between the order-of-magnitude increase in network-bound major faults when prefetching is disabled and the increase in local, minor shadow page-faults when prefetching is used.

It remains to be seen whether some of the prepaging strategies we used in Chapter 5 to perform faster VM migration can also be used in this context to further reduce the number of major, network-bound page faults incurred by the virtual machine, and thereby reduce the slowdown seen in the previous figures. We suspect this will work quite well.

6.5 Summary

We have presented the design, implementation, and evaluation of a complete system for the transparent oversubscription of unmodified Virtual Machines and their applications. We have also applied previously learned distributed memory techniques to aid in promoting the ubiquity of access to VM memory as a virtualized resource. Although the manipulation of Virtual Machine memory can be complex, we have shown that the advantages of doing so outweigh those complexities. CIVIC provides an expandable framework for a number of other ways to improve upon Virtual Machine technology and for more consolidated use of machines in a clustered environment. Future directions for this system include both VM migration improvements and performance improvements. Furthermore, CIVIC can benefit even more from a continued effort to maintain the performance guarantees of the underlying MemX system, as that is critical to minimizing application slowdown.

Chapter 7

Improvements and Closing Arguments

In this chapter, we discuss a number of projects into which this dissertation could be expanded, as well as some system design and implementation changes to the solutions we have presented.

7.1 MemX Improvements

In its current incarnation, MemX is well tested in its ability to provide low-latency access to DRAM on large numbers of computers, but it has deficiencies that limit the kind of ubiquity we would ultimately like to see in a distributed memory system built to support virtual machines.

7.1.1 Non-Volatile MemX Memory Descriptors

One drawback to MemX is that it is volatile: whenever you shut down a MemX client, its memory is lost. In networked systems, there is a common push and pull between statelessness and statefulness. The client side of a network protocol has two high-level choices: the more state you put into the client, the faster it can access its own data structures (for files, memory, or any other resource); however, the more stateless you design your clients, the more flexibility you gain. In the case of MemX, our clients are as full of state as they can possibly be - the protocol is entirely state-driven. This poses problems for Virtual Machine transparency, particularly when VMs are being migrated, since that state must either reside within the VM itself or be transmitted out-of-band using a separate framework.

We are currently designing a major new revision of MemX in which the protocol is still client-driven, but both clients and servers can become more stateless. The high-level design is that clients probe memory servers, via lookups, for the descriptors of memory pages that the servers already hold - allowing clients to shut themselves down or detach from the memory pool and restart at other locations. This has several advantages for VMs: VMs that use MemX as a filesystem can now be assured that their files do not disappear. VMs that use MemX for migration can avoid loading MemX into their own address spaces and leave the client module untouched within Dom0 of individual host systems. VMs that use MemX for CIVIC memory overcommitment can now make the Assistant VM stateless: one can snapshot a CIVIC virtual machine and shut it down, and days later restart that VM without the Assistant needing to know anything about the HVM guest except that its pages are out on the network somewhere.
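As a purely illustrative sketch of the kind of lookup message such a stateless protocol might carry - none of these structure or field names exist in the current MemX implementation - a client could name a page by a persistent (owner, page-index) descriptor rather than by per-connection state:

#include <stdint.h>

/* Hypothetical wire format for a stateless descriptor lookup. The client
 * identifies a page by a persistent (owner, page-index) pair and the server
 * answers with the page contents if it holds that descriptor. All names
 * here are illustrative, not part of the existing MemX protocol. */
struct memx_lookup_req {
    uint64_t owner_id;     /* persistent identity of the VM or client       */
    uint64_t page_index;   /* page number within that owner's address space */
    uint16_t op;           /* e.g. LOOKUP, READ, or WRITE                   */
} __attribute__((packed));

struct memx_lookup_resp {
    uint64_t owner_id;
    uint64_t page_index;
    uint16_t status;       /* FOUND, NOT_FOUND, or MOVED                    */
    uint8_t  data[4096];   /* page contents when status == FOUND            */
} __attribute__((packed));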

7.1.2 MemX Internal Caching

The next drawback to MemX is that it has no internal page caching system and no prioritization of servers. When MemX is used as a swap device, caching is irrelevant because of the cache already implemented by the operating system, and the same argument applies for filesystem use. But when MemX is used by CIVIC, or by a process through mmap() or other future virtualizable interfaces, the lack of an internal MemX cache is a serious performance bottleneck. This is particularly true if MemX gains non-volatile support and if CIVIC is used on top of MemX. A transparent deployment of MemX cannot always require the administrator to depend on operating-system support for caching of swapped pages, which means we are shifting the caching burden to MemX. Caches also serve the purpose of "write buffering" small, bursty write patterns that need not be sent over the network. Users of MemX that do not have their own caches currently lose the advantages of write buffering, since MemX has no internal data cache.

7.1.3 Server-to-Server Proactive Page Migration

One of the more interesting features that MemX does provide is server-to-server granular page migration. In our stable version, servers will automatically shut themselves down upon unforeseen events such as a network restart script or a poweroff command issued by an administrator; MemX knows how to block these events until the server module has completely transferred its memory to nearby servers.

Right now this feature is only used for management. But an entire project could be developed out of it, in the same way it was done for cooperative caching filesystems, by having a third-party daemon move data around the cluster as the load on individual machines goes up or down. The killer application of this feature would obviously be page sharing and compression, but it would also greatly add flexibility for client statelessness as well as for the migration of CIVIC virtual machines, as we'll discuss later.

7.1.4 Increased MemX bandwidth w/ Multiple NICs

A more subtle drawback to CIVIC right now is network throughput. Disks have the advantage of very high-throughput PCI buses connecting them. For the same reason that SANs (storage area networks) exist over fiber-channel, MemX is limited by the maximum speed of our ethernet. All of the machines in our virtualization cluster have dual network interfaces. A nice side-expansion of MemX would be to extend the protocol to send traffic over any available network interface while still maintaining transparency to the application. MemX has an advantage in doing this because it does not depend on IP and does not require routing except at the switch level, so extending the protocol over the same client session identifier would be a relatively straightforward incremental improvement for quite a large gain in throughput.

7.2 Migration Flexibility

7.2.1 Hybrid Migration

Currently in the works is an improvement to Post-Copy VM migration that merges Post-Copy and Pre-Copy together. The idea is that clean data pages that do not get changed during the post-copy process should be sent in advance, before post-copy actually takes effect, rather than relying on pre-paging to handle them. This also has the nice side-effect of solving the "non-pageable memory" problem, where a MemX-based Post-Copy implementation is incapable of paging out data pages in the system that cannot be swapped out. A hybrid solution solves this problem by doing a single pre-copy iteration before post-copy begins.

On the other hand, current experience with CIVIC shows that hooking into the Shadow-Paging API of the hypervisor is not such a bad idea. The current post-copy implementation requires a lot of knowledge about the Operating System, whereas a shadow-based solution could be deployed for any non-Linux operating system. It may well be worth the effort to do this for reasons of both performance and transparency. Until then, the current post-copy evaluation has shown that the solution itself is very viable.

7.2.2 Improved Migration of VMs Through CIVIC

More intriguing, however, is the possibility of combining CIVIC and Post-Copy together. For CIVIC to be usable, it will probably need to support migration. As demonstrated in related work [92], migration performance will become critical during automated or manual migration requests - not just from a liveness standpoint: the local cache belonging to each VM will need to be synchronized with the VM and perhaps migrated along with it.

However, now that CIVIC is in place, the design is clean enough that the normal Xen migration or Post-Copy migration systems can be executed for the HVM such that they only migrate cached pages, while the networked components of the PPAS remain on the servers without interference. Even more interesting would be the application of migration, MemX, and CIVIC in a hierarchically-switched LAN, where different latencies are imposed. If a MemX-based, page-granular migration system were available, MemX could be instructed to proactively perform global-to-global cache migrations, both for predicting cache hits and for super-fast VM migration.

7.3 CIVIC Improvements and Ideas

Due to CIVIC's high level of transparency, we propose a number of ways to further investigate the usefulness of VM memory oversubscription itself:

7.3.1 How high can you go?: Extreme Consolidation

Assuming adequate CPU resources are available, just how many VMs can you overcommit on a single host? Furthermore, as was explored in related cooperative caching work, can we use existing algorithms to do cluster-wide, content-based page sharing, as is already available in VMware and in projects like the "Difference Engine" [33]?
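The core of such content-based sharing - hashing page contents to find duplicate candidates and confirming with a byte-wise compare before collapsing the copies onto a single copy-on-write frame - can be sketched as follows. This illustrates the general technique only, not code from VMware, the Difference Engine, or CIVIC:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Illustrative 64-bit FNV-1a hash over a page's contents. */
static uint64_t page_hash(const uint8_t *page)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        h ^= page[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Candidate check: equal hashes plus a full byte-wise compare; only then
 * would a real implementation map both guests to one shared,
 * copy-on-write machine frame. */
static int pages_identical(const uint8_t *a, const uint8_t *b)
{
    return page_hash(a) == page_hash(b) && memcmp(a, b, PAGE_SIZE) == 0;
}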

7.3.2 Improved Eviction and Shadow Optimizations

CIVIC currently has a very basic mechanism for evicting pages of the virtual machine from the PPAS cache. Straightforward projects for student learning include applying existing operating-systems algorithms, such as CLOCK and multi-LRU, to evict pages that are no longer in the VM's writable working set.
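As a sketch of the textbook CLOCK (second-chance) policy just mentioned - not CIVIC's current eviction code - applied to an array of cached frames whose referenced bits are assumed to be fed from the accessed bits that the shadow-paging code already observes:

#include <stdbool.h>
#include <stddef.h>

/* Minimal CLOCK (second-chance) sweep over a cache of nframes entries.
 * The structure and function names here are illustrative only. */
struct cache_frame {
    unsigned long pfn;     /* which guest page this cache slot holds  */
    bool referenced;       /* set when the page was recently accessed */
};

static size_t clock_hand;  /* persists across calls */

/* Return the index of the next frame to evict from the PPAS cache. */
size_t clock_pick_victim(struct cache_frame *frames, size_t nframes)
{
    for (;;) {
        struct cache_frame *f = &frames[clock_hand];
        size_t idx = clock_hand;
        clock_hand = (clock_hand + 1) % nframes;

        if (!f->referenced)
            return idx;            /* no second chance left: evict this one */
        f->referenced = false;     /* clear the bit and give it another pass */
    }
}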

Furthermore, recall that CIVIC currently depends on removing the shadow optimizations present in Xen. A production implementation will require re-enabling these, and perhaps going further by taking advantage of Nested Page-Table support on modern CPUs. It should be possible to make CIVIC performance equal to the state-of-the-art baseline when those optimizations are combined with hardware-assisted paging.

7.4 Conclusions

Virtual Machines have come alive again, but they have a long way to go. This dissertation explores just one of the many resources available to the modern operating system. Hopefully, more system-level ideas can be built out of the transparent, ubiquitous management of Virtual Machine memory to advance the state-of-the-art in this technology. The three primary projects presented in this dissertation can be improved upon in very similar ways to provide a much more robust memory resource virtualization system for future virtual machines. The most basic components of ubiquitous memory virtualization are covered by this work across three areas: (1) MemX, the core distributed memory system; (2) Post-Copy, an improvement to basic VM migration at the whole-system level leveraging MemX; and finally (3) CIVIC, a page-granular system for memory overcommitment that is completely transparent to the operating system. With these three frameworks in place, there are a great many improvements to be made as well as new ideas that can be created.

Appendix A

CIVIC Screenshots

To give the reader a closer sense of what it is like to run a large, unmodified HVM guest on top of CIVIC, we have included two screenshots in a printable format. They provide a complete picture of what the administrator sees in a Linux environment with our implementation, for two different configurations of CIVIC.

A.1 Small-HVM Over-subscription

The first screenshot, in Figure A.1, shows a live session running an unmodified version of 64-bit Fedora 10 Linux on top of a CIVIC-enabled Xen hypervisor. The main window is a remote VNC session. Inside this session is the actual node running the HVM. The HVM is viewed by another VNC session exported by the QEMU binary emulator that helps Xen take care of hardware device emulation. The HVM is running in the upper-right corner and is allocated a very small cache size: 64 megabytes. In this session, we are running a 2 GB sort, which is in the beginning stages of filling up the array with random numbers. The Assistant console is in the bottom-right corner. It shows the output from MemX about which servers are available for network paging. Of the 16 nodes in the list, only the 3 largest ones are actually being used.


Figure A.1: A live run of an HVM guest on top of CIVIC with a very small PPAS cache size of 64 MB. The HVM has 2 GB. (Turn the page sideways)

A.2 Large-HVM Oversubscription

This screenshot is similar in structure to the first one, except that we are running a much larger guest virtual machine. For both screenshots, notice the Assistant output describing the structure of each HVM's cache and memory size. The size of the cache is labeled in the "Mem" column; the actual size of the HVM's PPAS is labeled in the "Civic" column. Figure A.2 shows a 64 GB HVM guest with a 2 GB cache size. We have also set up a large quicksort to run. In fact, the HVM guest running in this figure is the exact same guest operating system and filesystem image used in the previous screenshot; the only difference is in the Xen configuration file, where we change the size of the cache and memory. The BIOS modifications CIVIC makes in the Xen Daemon automatically ensure that the guest OS reconfigures its physical memory to recognize the new memory available for process allocation. This particular configuration of the HVM depends on five simultaneous memory servers, whereas the previous one used only three. There are absolutely no changes whatsoever made to the HVM guest filesystem image or kernel in either of these screenshots.

Figure A.2: A live run of an HVM guest on top of CIVIC with a very large PPAS cache size of 2 GB. The HVM believes that it has 64 GB. (Turn the page sideways)

Appendix B

The Xen Live-migration process

This appendix is a bit of useful reading we composed in our early investigations of virtual machines; it may significantly help the reader get started digging into the Xen hypervisor, which can seem daunting at first glance. Here, I clarify exactly what shadow page tables are used for in Xen. Although it is long and technical, I strongly recommend you finish the whole thing. An equally important reason for reading it is its documentary aspect: it explains a lot of the Xen terminology and in particular exposes some of the steps in both live migration and shadow paging.

This appendix uses references to the Online Cross Reference, typically found on the Xen website. These links are circa version 3.2 of Xen but are still highly relevant, so simply adapt the embedded links in your own browser as you go through the appendix. (The line numbers will definitely be a few lines off.)

B.1 Xen Daemon

OK, Let’s begin:

We start with live migration because, in order to migrate, you must understand the way page frames are referenced in a virtual machine architecture, which requires our entry point to be the Xen Daemon (i.e. xen-3.*/tools). Looking through the cross reference (http://lxr.xensource.com/) is interesting. Python is strange: no braces, and it kinda looks like a functional language. I have also noticed that you learn *a lot more* by reading their header files than their function definitions - most of their comments are in headers.

To begin a migration (also called a ”checkpointed save”), Xend calls a python function called ”save”: http://lxr.xensource.com/xen/source/tools/python/xen/xend/XendCheckpoint.py#054

It’s called a checkpoint because it’s possible that any round of pre-copy may potentially have satisfied the needs of the entire migration (depending on how active the guest is).

This function in turn forks off the main routine in the Xen-Control (xc) library in this entry point: http://lxr.xensource.com/xen/source/tools/xcutils/xc_save.c#054

Which of course calls the ENORMOUS xc_domain_save function.

Now for shadow control. Let’s start with that:

The first live shadow control command turns on shadow dirty logging.

To do this, a hypercall is prepared in user-space. The hypercall is initiated by sending an ioctl with the locked hypercall memory to the domain zero Linux kernel with this call: http://lxr.xensource.com/xen/source/tools/libxc/xc_linux.c#124

Next, the handler for xenLinux ioctls (that in turn fires the hypercall) is located here: http://lxr.xensource.com/xen/source/linux-2.6.16.33-xen/drivers/xen/privcmd/privcmd.c#042

This ioctl makes a hypercall whose entry point "do_domctl" into Xen is located here (we are now inside Xen itself): http://lxr.xensource.com/xen/source/xen/common/domctl.c#172

Then we make it to a switch statement that handles shadow-mode activation in a function called "arch_do_domctl": http://lxr.xensource.com/xen/source/xen/arch/x86/domctl.c#027

In the Hypervisor code, you'll notice lots of "XEN_GUEST_HANDLE" or TYPE_SAFE statements. Don't be too confused by these. They are simply defined as:

#define __DEFINE_GUEST_HANDLE(name, type) \

typedef struct { type *p; } __guest_handle_ ## name

In layman's terms, this means: "re-define the type A as __guest_handle_A, wrapped in a struct".

To use those types, they have:

#define GUEST_HANDLE(name) __guest_handle_ ## name

So, if they say:

GUEST_HANDLE(int) variablename;

That really means:

__guest_handle_int variablename;

They do this to force explicit type safety. For example, in the Xen shadow headers, they have a typedef macro that does the exact same thing:

#define TYPE_SAFE(_type,_name) \
    typedef struct { _type _name; } _name##_t;

This is the exact same macro, in a different location, except that the type-safe marker is now a suffix ("_t") rather than a prefix.
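For instance (an illustrative expansion, not a quotation from the Xen headers), applying the macro to machine frame numbers yields a distinct struct type that the compiler will refuse to mix with a plain unsigned long:

/* Illustrative expansion only - not quoted from the Xen source. */
#define TYPE_SAFE(_type, _name) \
    typedef struct { _type _name; } _name##_t;

TYPE_SAFE(unsigned long, mfn)
/* expands to: typedef struct { unsigned long mfn; } mfn_t; */

/* Passing a raw unsigned long where an mfn_t is expected is now a
 * compile-time error instead of a silent mix-up between address spaces. */
static unsigned long mfn_value(mfn_t m) { return m.mfn; }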

The biggest reason for this is distinguishing between guest machine addresses and hypervisor machine addresses (and other values). By wrapping the value in its own type and using it universally, you can debug errors more quickly and avoid coding mistakes that mix values between Xen and Xen domains.

Moving along....

B.2 Understanding Frame Numbering

The very first case in the switch statement provides the function handler that takes care of shadow control operations: http://lxr.xensource.com/xen/source/xen/arch/x86/mm/shadow/common.c#3188

Again, you should read xen/include/asm-x86/shadow.h, as it provides more comments about its facilities than xen/arch/x86/mm/shadow.c does. For example, they have the following very important comment in that file:

”Guest frame numbers (gfns) are the entries that the guest puts in its pagetables. For normal paravirtual guests, they are actual frame numbers, with the translation done by the guest. Machine frame numbers (mfns) are the entries that the hypervisor puts in the shadow page tables. ”

It’s important not to over-think the core of this description:

1. Page-tables require memory. It's easy to forget that. Both pages and page-table entries belong to machine frames (the real ones) and have real machine-frame numbers. So, when they say "gfn", they mean that guest frame numbers are machine frame numbers too, allocated by the guest to create entries in the guest's own page-table - they are not machine frames to be used for data storage.

2. With that being said, what is a shadow page table from Xen's perspective? In VMware, the x86 MMU walks shadow page tables, not the guest's. This happens in Xen too, but only during migration. Don't confuse this behavior with normal operation.

Now let's go back to the second half of the quote taken from shadow.h:

”Machine frame numbers (mfns) are the entries that the hypervisor puts in the shadow page tables”.

In other words, mfns are the addresses of real data pages used by the guest OS. That means a shadow page-table is a Xen-owned page table which is used exclusively by Xen to keep track of the real data pages (machine frames) that the guest has dirtied.

Furthermore, it's important to realize that (and exactly how) PV guests really do manage their own memory allocations: Xen simply provides an array of free pages in the entire system that the PV guest can choose from at will (and be validated) so long as it doesn't over-step its reservation. It's also important to realize that this table is not another level of indirection on the fast path. What does that mean? It means that this array is used for allocation only - it's not used in some "strange shadow-ish" way by the MMU to do virtual/physical translations. It has nothing to do with the actual MMU translation path, which is what I mean by the fast path. Once a machine frame has been allocated from this array, its real address goes directly into the guest's PTs and is ultimately walked by the MMU.

Now, this is why they use the type-checking I described above: an unsigned long pointer (in guest space or Xen space) to ANY machine frame in the system could be an actual guest machine frame number (either a PT entry or a data page), or it could be a regular mfn, i.e. a page that belongs to Xen or some other guest. AND all pages in the system are "addressable" by all guests - just not necessarily with permission (except for domain 0) - but still addressable via the ioremap-xen.c facilities. This forces a more rigorous type-checking of these pointers by re-wrapping them as described before.

So, next: what does all this mean for the term "pseudo-physical"?

It means two things:

1. For Xen: As far as Xen is concerned **there is no such thing** as pseudo-physical, because guests manage their own PT pages and their own data pages, so whatever addresses they put in their page-tables are in fact machine addresses, not pseudo-physical ones.

2. For Guests: kernels like Windows and Linux are incapable of allocating memory in an *explicitly* discontiguous manner. They assume (and will always assume) that they have free rein over the physical memory space. In order to preserve the statements above regarding the guest's self-management of its page-tables, Xen provides two arrays:

B.3 Memory-related Data Structures

From Xen’s point of view there are two primary memory data structures:

• A P2M table (pseudo-to-machine table): every domain gets one of these (of reservation size).

• An M2P table (machine-to-pseudo table): globally visible (after xen-ioremap).

Other code acronyms:

• "max_pfn": the largest non-sparse allocatable pseudo frame number.

• "max_mfn": the largest sparse allocatable machine frame number in the entire system.

These two are used all over the xc shadow code, and it's very easy to mistake one for the other - very poor naming choices by the Xen folks.

The second table is used by Xen (and domain zero), whereas the first is used by domains.

The second table is only updatable via hypercalls, used when domains attempt to change their reservations. Xen uses the M2P table to verify that the guest is not trying to grab frames already allocated to other domains during PTE updates or reservation changes.

The first table allows the Linux memory allocator (nothing to do with the x86 MMU whatsoever) to allocate memory using the same buddy algorithm that it normally uses, by choosing pages from the contiguous, linear P2M array instead of being modified to choose from multiple discontiguous extents of memory. The guest then makes a hypercall to Xen to verify that its choices were correct and that it did not exceed its reservation or attempt to allocate pages already belonging to another guest (using the M2P array).
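The invariant the two tables maintain can be illustrated with a small consistency check (the array names below are placeholders; the real tables live in guest memory and Xen memory respectively and are not plain C arrays):

#include <assert.h>

/* Placeholder declarations: p2m[] stands for the per-domain pseudo-to-
 * machine table and m2p[] for the global machine-to-pseudo table. */
extern unsigned long p2m[];   /* indexed by pseudo frame number (pfn)  */
extern unsigned long m2p[];   /* indexed by machine frame number (mfn) */

/* For every pseudo frame the domain owns, translating forward through
 * the P2M and backward through the M2P must return the same pfn. */
static void check_p2m_m2p(unsigned long max_pfn)
{
    for (unsigned long pfn = 0; pfn < max_pfn; pfn++) {
        unsigned long mfn = p2m[pfn];
        assert(m2p[mfn] == pfn);
    }
}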

If you want, you can think of them as "pseudo-contiguous physical machine frames", meaning there is an array of free machine pages provided by Xen that the guest itself (not Xen) chooses from; these are not guaranteed to be contiguous, but they are by no means truly "pseudo-physical". In fact, here is a direct quote from that same region of the shadow.h hypervisor code:

"(Pseudo-)physical addresses are the abstraction of physical memory the guest uses for allocation and so forth. For the purposes of this code, we can largely ignore them."

That is, the key difference between Xen's idea of "pseudo-physical" and the traditional picture is that these pseudo-physical numbers are only for allocation - not for translation into the TLB.

Obvious conclusions:

• the M2P tables contain duplicate pseudo numbers across domains

• but mfn’s in each domain’s P2M table are unique.

Further conclusions:

1. These tables are pre-constructed (machine frames are pre-chosen) during the start-of-day by Xen, before the domain is started up, based on its reservation.

2. These tables are for the most part read-only after construction for each domain, and only change when (1) checkpoints happen (save/migrate) or (2) a request is made to change the reservation of the guest domain.

3. There are multiple Xen-visible M2P tables, one for each domain, that only domain zero can use (as we'll see later in a deeper poke into the shadow code).

The next few subtle points regarding these two tables:

1. The P2M list is stored inside the guest. The frames that store this list must themselves be migrated and re-mapped after migration.

2. This, of course, must be done for the M2P list as well. This is what a lot of the initial code inside xc_domain_save is preparing for.

3. Remember: the xc library is user-land code, so it must re-map the tables from their sources before sending them over.

B.4 Page-table Management

So all of this begs the question: what’s the difference between migration-time shadow PTs and the ”dirty bitmap” that we’ve heard about?

Shadow page tables in Xen do get walked by the MMU - but they are only used, and only walked, during migration. As soon as migration is over, those PTs are destroyed and the destination node switches its CR3 back to the address of the guest's own real page-table base pointer as usual - no more shadow walking.

So, then: how does this work? Well, it would be highly inefficient on the source node for Xen to duplicate the guest's entire page table before switching to shadow mode. Instead, the shadow PTEs are populated on demand (copy-on-write): all page-faults trap to Xen, and Xen does a translation by pulling the corresponding virtual-to-machine frame mapping from the guest's own page-table, copying that mapping into the shadow page-table, and then returning so that the MMU can resume walking the page table to populate the TLB. Similarly, if the guest updates its page-table (either by making a new PTE or by updating one when bringing an existing virtually-addressed page back from backend storage), then that change is also propagated to the corresponding Xen shadow PTE.

Along the same lines, this begs the question: what happens if a fault traps to Xen's shadow PT and the corresponding lookup in the Guest's PT is non-resident, i.e. what if the page has been swapped out? (This assumes paging is even turned on at all, which is unlikely for our purposes.) In that case Xen must invoke the Guest's page-fault handler, wait for the guest to bring the page back in, then finally copy the resulting virtual-to-mfn PTE from the guest and return operation to the MMU. (This is shadow-mode only; all of this only happens during save/restore/migrate.)

This should now make it clear what the "dirty logging bitmap" is used for. At the end of each round (which the xen-control (xc) software controls), Domain-Zero needs a way to do a "fast lookup" of the dirtied pages (the working set). The shadow page-table (in Xen space) is too big for this - so they supplement the shadow PT with a shorter, smaller, easily traversable bitmap that is passed back to a kernel-space bitmap in domain zero.
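In other words, at the end of a round the sender only has to scan a bitmap with one bit per pseudo frame. A sketch of that scan (with illustrative names, not the actual xc code) looks like this:

#include <stddef.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Illustrative only: walk a dirty-logging bitmap (one bit per pseudo frame)
 * and hand each dirtied pfn to the sender for the next pre-copy round. */
extern void queue_page_for_send(unsigned long pfn);

static void collect_dirty_pages(const unsigned long *bitmap,
                                unsigned long max_pfn)
{
    for (unsigned long pfn = 0; pfn < max_pfn; pfn++) {
        unsigned long word = bitmap[pfn / BITS_PER_LONG];
        if (word & (1UL << (pfn % BITS_PER_LONG)))
            queue_page_for_send(pfn);
    }
}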

Other acronyms you’ll see in the hypervisor code:

• ”L1mfn”: machine frame number of a xen 1st-level PTE (xen’s own PTs).

• ”L2mfn”: machine frame number of a xen 2nd-level PTE.

• ”gl1mfn”: mfn of a guest 1st-level PTE

B.5 Actually Performing the Migration

Got all that? Ok, moving along. We'll start taking apart xc_domain_save:

Now, for the most part, we already know that most of the migration work is done in the xc library. So, it's likely we won't have to make very many Xen modifications.

The overall low-level sequence to prepare for pre-copy:

1. checkpoint the domain in upper-python code

2. open a socket to the destination

3. send over the *entire* xen-store domain configuration

4. pass the file descriptor & domain ID to the xc_library

5. Bring in a local copy of the migrating guest’s P2M table.

6. Bring in a local copy of Xen’s M2P table for the migrating guest.

7. Canonicalize *the references to frames that store the P2M table*

Will explain this in a second....

8. Enable shadow-page-table mode (switch CR3 to copy-on-write).

9. Allocate bitmaps (which pages to send and which to ignore)

10. Initialize the bitmaps to all ones (assume all get sent first)

11. Send over the P2M frames themselves

12. Pin (make read-only) the guest’s page tables. (kernel and user)

BIG LOOP START

11. Get a list of the *types* of all the send-able pages

12. start sending them:

a. If the page is a PT page, canonize it:

Replace all MFNs with PFN: i.e. switch over guests PTEs

to point to pseudo numbers (and later switch them back)

b. If page is data page, send it over

12. If working set not small enough or iterations left, then repeat 11.

BIG LOOP STOP

13. Else do final save and state

14. canonize and send segment descriptor tables (LDT / GDT)

15. canonize and send page-table CR3

16. send final CPU state

17. turn off shadow mode (destroy tables)

18. un-map all of the locally allocated free memory

19. done

Regarding steps #7 and #11, which are perhaps the hardest part to understand: these tables, as we've learned, do not change. In order to bring in those tables, we first map the *page frames* of those tables and then copy them. The P2M list *also has references to itself*: i.e. if a machine frame X somewhere in the system is allocated to the P2M table, then it has a P2M entry in the table at pseudo number X-prime. The first thing live migration does is send over this table itself. Before the table is sent, all of the P2M's references to itself must be removed, because the table itself will get a new location on the destination. Once it's received, those references will be re-written. This is done by making P2M entry X-prime == X-prime (i.e. changing the table value to be equal to the index of itself - the pseudo frame number).

This can be observed *before* the big loop in this tiny loop: http://lxr.xensource.com/xen/source/tools/libxc/xc_domain_save.c#713

The references to "i/fpp" take the address in 4K page blocks. fpp is defined as "the number of mappings in a page", so the loop advances by one 4K page of mappings at a time. It then takes the array location and re-writes THAT MFN (by looking it up in the mfn-to-pfn table that Xen provides) to be equal to the PFN (yes, itself).
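Put as a sketch (with illustrative names - this is not the actual xc_domain_save code), the canonicalization rewrites each machine frame that backs the P2M table into the pseudo frame number it maps to, using Xen's M2P table:

/* Illustrative sketch of canonicalizing the frames that hold the P2M table.
 * p2m_frame_list[] holds one MFN per page of the P2M table; each entry is
 * rewritten to the corresponding PFN so the destination can relocate the
 * table and later restore the references. Names are illustrative only. */
extern unsigned long mfn_to_pfn(unsigned long mfn);   /* lookup via the M2P table */

static void canonicalize_p2m_frames(unsigned long *p2m_frame_list,
                                    unsigned long max_pfn, unsigned long fpp)
{
    for (unsigned long i = 0; i < max_pfn; i += fpp)
        p2m_frame_list[i / fpp] = mfn_to_pfn(p2m_frame_list[i / fpp]);
}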

Flag definition: ENABLE_DIRTY_LOGGING is equivalent to *both* logging and turning on shadow mode (no translation - i.e. no CR3 change).

Flag definition: ENABLE_TRANSLATE is equivalent to 3 things: (1) enable shadow mode (mark guest pages read-only and dynamically populate as described), (2) enable dirty logging, and (3) translate by changing CR3.

Thanks for reading.

Bibliography

[1] OpenVZ. (Virtuozzo): http://www.openvz.com/.

[2] User Mode Linux, http://user-mode-linux.sourceforge.net/.

[3] Intel corp., intel virtualization technology specification for the ia-32 intel architecture.

2005.

[4] A. Kivity, Y. Kamay, et al. kvm: the Linux Virtual Machine Monitor. In Ottawa Linux Symposium, 2007.

[5] A. Whitaker, R.S. Cox, et al. Constructing services with interposable virtual hardware.

In NSDI 2004, pages 13–13, 2004.

[6] A. Whitaker, M. Shaw, and S.D. Gribble. Scale and performance in the Denali isolation kernel. Pages 195–209, New York, NY, USA, 2002.

[7] Vadim Abrossimov, Marc Rozier, and Michel Gien. Virtual memory management in

chorus. In Proceedings of the European Workshop on Process in Distributed Operat-

ing Systems and Distributed Systems Management, pages 45–59, 1990.


[8] M. Accetta, R. Baron, W. Bolosky, D. Golub, R. Rashid, A. Tevanian, and M. Young.

Mach: A new kernel foundation for unix development. In USENIX ATC, 1986.

[9] A. Acharya and S. Setia. Availability and utility of idle memory in workstation clusters.

In Measurement and Modeling of Computer Systems, pages 35–46, 1999.

[10] R. Adair, R. Bayles, L. Comeau, and R. Creasy. A virtual machine system for the

360/40, technical report 320-2007, 1966.

[11] Vasudeva Akula. Workload characterization and business-oriented performance im-

provement techniques for online auction sites. PhD thesis, Fairfax, VA, USA, 2007.

Adviser-Menasce, Daniel.

[12] Alex Vasilevsky, David Lively, and Steve Ofsthun. Linux virtualization on Virtual Iron VFe. In Ottawa Linux Symposium, volume 2, pages 235–249, 2005.

[13] AMD. Amd64 virtualization codenamed “pacifica” technology: Secure virtual machine

architecture reference manual. 2005.

[14] C. Amza and A.L. Cox et. al. Treadmarks: Shared memory computing on networks of

workstations. IEEE Computer, 29(2):18–28, Feb. 1996.

[15] T. Anderson, D. Culler, and D. Patterson. A case for NOW (Networks of Workstations).

IEEE Micro, 15(1):54–64, 1995.

[16] T. Anderson, M. Dahlin, J. Neefe, D. Patterson, D. Roselli, and R. Wang. Serverless

network file systems. In Proc. of the 15th Symp. on Operating System Principles,

pages 109–126, Copper Mountain, Colorado, Dec. 1995.

[17] A.A. Awadallah and M. Rosenblum. The vMatrix: Server Switching. In Proc. of Intl.

Workshop on Future Trends in Distributed Computing Systems, Suzhou, China, May

2004.

[18] B. Cully, G. Lefebvre, et al. Remus: High availability via asynchronous virtual machine replication. In NSDI '08: Networked Systems Design and Implementation, 2008.

[19] A. Barak and O. Laadan. The MOSIX multicomputer operating system for high per-

formance cluster computing. Future Generation Computer Systems, 13(4–5), Mar.

1998.

[20] P. Barham, B. Dragovic, K. Fraser, and S. Hand et.al. Xen and the art of virtualization.

In SOSP, Oct. 2003.

[21] P. Bohannon, R. Rastogi, A. Silberschatz, and S. Sudarshan. The architecture of the

Dali main memory storage manager. Bell Labs Technical Journal, 2(1):36–47, 1997.

[22] Robert Bradford, Evangelos Kotsovinos, Anja Feldmann, and Harald Schioberg.¨ Live

wide-area migration of virtual machines including local persistent state. In VEE ’07:

Proceedings of the 3rd international conference on Virtual execution environments,

pages 169–179, 2007.

[23] E. Bugnion, S. Devine, K. Govil, and M. Rosenblum. Disco: Running commodity

operating systems on scalable multiprocessors. In Proc. of ACM SOSP 1997, vol.

31(5) of ACM Operating Systems Review, pages 143–156, Oct. 1997.

[24] C. Sapuntzakis, R. Chandra, et al. Optimizing the migration of virtual computers. In Proc. of OSDI, December 2002.

[25] Pei Cao, Swee Boon Lim, Shivakumar Venkataraman, and John Wilkes. The tickertaip

parallel raid architecture. SIGARCH Comput. Archit. News, 21(2):52–63, 1993.

[26] Peter M. Chen, Edward K. Lee, Garth A. Gibson, Randy H. Katz, and David A. Pat-

terson. Raid: high-performance, reliable secondary storage. ACM Comput. Surv.,

26(2):145–185, 1994.

[27] C. Clark, K. Fraser, S. Hand, J.G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield.

Live migration of virtual machines. In Network System Design and Implementation,

2005.

[28] D. Comer and J. Griffoen. A new design for distributed systems: the remote memory

model. In Proc. of the USENIX 1991 Summer Technical Conference, pages 127–135,

1991.

[29] M. Connor and P. Kumar. Parallel construction of k-nearest neighbor graphs for point

clouds. In Eurographics Symposium on Point-Based Graphics, 2008.

[30] D. Thain, T. Tannenbaum, and M. Livny. Distributed computing in practice: the Condor experience. Concurr. Comput.: Pract. Exper., 17:323–356, 2005.

[31] Michael D. Dahlin, Randolph Y. Wang, Thomas E. Anderson, and David A. Patter-

son. Cooperative caching: using remote client memory to improve file system per-

formance. In OSDI ’94: Proceedings of the 1st USENIX conference on Operating

Systems Design and Implementation, page 19, Berkeley, CA, USA, 1994. USENIX

Association.

[32] Peter J. Denning. The working set model for program behavior. Commun. ACM,

11(5):323–333, 1968.

[33] Diwaker Gupta, Sangmin Lee, Michael Vrable, et al. Difference Engine: Harnessing memory redundancy in virtual machines. In OSDI: Operating Systems Design and Implementation, 2008.

[34] Frederick Douglis. Transparent process migration in the sprite operating system.

Technical report, Berkeley, CA, USA, 1990.

[35] S. Dwarkadas, N. Hardavellas, L. Kontothanassis, R. Nikhil, and R. Stets. Cashmere-

VLM: Remote memory paging for software distributed shared memory. In Proc. of Intl.

Parallel Processing Symposium, San Juan, Puerto Rico, pages 153–159, April 1999.

[36] D. R. Engler, M. F. Kaashoek, and Jr. J. O’Toole. Exokernel: an operating system

architecture for application-level resource management. In SOSP ’95: Proceedings

of the fifteenth ACM symposium on Operating systems principles, pages 251–266,

New York, NY, USA, 1995. ACM.

[37] M. Feeley, W. Morgan, F. Pighin, A. Karlin, and H. Levy. Implementing global memory

management in a workstation cluster. Operating Systems Review, 15th ACM Sympo-

sium on Operating Systems Principles, 29(5):201–212, 1995.

[38] E. Felten and J. Zahorjan. Issues in the implementation of a remote paging system.

Tech. Report 91-03-09, Comp. Science Dept., University of Washington, 1991.

[39] M. Flouris and E.P. Markatos. The network RamDisk: Using remote memory on

heterogeneous NOWs. Cluster Computing, 2(4):281–293, 1999.

[40] H. Garcia-Molina, R. Lipton, and J. Valdes. A massive memory machine. IEEE Trans-

actions on Computers, C-33 (5):391–399, 1984.

[41] Tal Garfinkel and Mendel Rosenblum. When virtual is harder than real: security chal-

lenges in virtual machine based computing environments. In HOTOS 2005, pages

20–20, Berkeley, CA, USA.

[42] K. Gopalan and T. Chiueh. Delay budget partitioning to maximize network resource

usage efficiency. In Proc. IEEE INFOCOM’04, Hong Kong, China, March 2004.

[43] Grzegorz Milos, Derek G. Murray, and Michael A. Fetterman. Satori: Enlightened page sharing. In USENIX Annual Technical Conference, 2009.

[44] H. Andres Lagar-Cavilla et al. Impromptu clusters for near-interactive cloud-based

services. Technical report, 2008.

[45] Steven Hand, Andrew Warfield, Keir Fraser, Evangelos Kotsovinos, and Dan Magen-

heimer. Are virtual machine monitors microkernels done right? In HOTOS, pages

1–1. USENIX Association, 2005.

[46] Steven M. Hand. Self-paging in the nemesis operating system. In OSDI, pages 73–86,

1999.

[47] J. Hansen and A. Henriksen. Nomadic operating systems. In Master’s thesis, Dept.

of Computer Science, University of Copenhagen, Denmark, 2002.

[48] J.G. Hansen and E. Jul. Self-migration of operating systems. In Proc. of the 11th ACM SIGOPS, 2004.

[49] Michael Hines and Kartik Gopalan. MemX: Supporting large memory applications in

xen virtual machines. In Second International Workshop on Virtualization Technology

in Distributed Computing (VTDC07), Reno, Nevada, 2007.

[50] Michael Hines, Mark Lewandowski, Jian Wang, and Kartik Gopalan. Anemone: Trans-

parently harnessing cluster-wide memory. In Proc. of the International Symposium on

Performance Evaluation of Computer and Telecommunication Systems (SPECTS),

Calgary, Canada, Aug. 2006.

[51] Michael Hines, Jian Wang, and Kartik Gopalan. Distributed anemone: Transparent

low-latency access to remote memory. In Proc. of International Conference on High

Performance Computing, Bangalor, India, 2006.

[52] Michael R. Hines. http://www.cs.binghamton.edu/∼mhines/code/, simple use of the

linux aio system calls, Feb. 2007.

[53] Michael R. Hines and Kartik Gopalan. Post-copy based live virtual machine migration

using adaptive pre-paging and dynamic self-ballooning. In VEE, pages 51–60, 2009.

[54] Yifan Hu. JASPA sparse matrix multiplication benchmark, Advanced Research Computing Group, Science and Technology Facilities Council, UK. http://www.cse.scitech.ac.uk/arc/jaspa/.

[55] Kai Hwang, Hai Jin, and Roy S.C. Ho. Orthogonal striping and mirroring in distributed

raid for i/o-centric cluster computing. IEEE Trans. Parallel Distrib. Syst., 13(1):26–44,

2002.

[56] L. Ibarria, P. Lindstrom, J. Rossignac, and A. Szymczak. Out-of-core compression

and decompression of large N-dimensional scalar fields. In Proc. of Eurographics

2003, pages 343–348, September 2003.

[57] S. Ioannidis, E.P. Markatos, and J. Sevaslidou. On using network memory to improve

the performance of transaction-based systems. In In Proc. of Parallel and Distributed

Processing Techniques and Applications (PDPTA '98), 1998.

[58] Kerrighed. http://www.kerrighed.org.

[59] S. Koussih, A. Acharya, and S. Setia. Dodo: A user-level system for exploiting idle

memory in workstation clusters. In Proc. of the Eighth IEEE Intl. Symp. on High

Performance Distributed Computing (HPDC-8), 1999.

[60] Benjamin LaHaise and Alexander Viro. http://lwn.net/articles/216200/, problems with

the Linux asynchronous I/O subsystem, Jan. 2007.

[61] Edward K. Lee and Chandramohan A. Thekkath. Petal: distributed virtual disks. In

ASPLOS-VII, pages 84–92, 1996.

[62] I. M. Leslie, D. McAuley, R. Black, T. Roscoe, P. Barham, D. Evers, R. Fairbairns, ,

and E. Hyden. The design and implementation of an operating system to support

distributed multimedia applications. 1996.

[63] Joshua LeVasseur, Volkmar Uhlig, Matthew Chapman, Peter Chubb, Ben Leslie, and

Gernot Heiser. Pre-virtualization: Slashing the cost of virtualization. Technical Report

2005-30, Fakultat¨ fur¨ Informatik, Universitat¨ Karlsruhe (TH), November 2005.

[64] Jochen Liedtke. Improving ipc by kernel design. In SOSP ’93: Proceedings of the

fourteenth ACM symposium on Operating systems principles, pages 175–188, New

York, NY, USA, 1993. ACM.

[65] P.Lindstrom. Out-of-core construction and visualization of multiresolution surfaces. In

Proc. of ACM SIGGRAPH 2003 Symposium on Interactive 3D Graphics, April 2003.

[66] M. Satyanarayanan, B. Gilbert, et al. Pervasive personal computing in an internet

suspend/resume system. IEEE Internet Computing, 11(2):16–25, 2007.

[67] Dan Magenheimer. Memory Overcommit... without the commitment:

http://wiki.xensource.com/xenwiki/

Open Topics For Discussion?action=AttachFile&do=get&target=Memory+Overcommit.pdf.

Oracle Corp., 2008.

[68] E.P. Markatos and G. Dramitinos. Implementation of a reliable remote memory pager.

In USENIX Annual Technical Conference, pages 177–190, 1996.

[69] Matthew Chapman and Gernot Heiser. vNUMA: A virtual shared-memory multiprocessor.

In USENIX Annual Technical Conference, 2009.

[70] I. McDonald. Remote paging in a single address space operating system supporting

quality of service. Tech. Report, Dept. of Comp. Science, Univ. of Glasgow, 1999.

[71] D. Milojicic, F. Douglis, Y. Paindaveine, R. Wheeler, and S. Zhou. Process migration

survey. ACM Computing Surveys, 32(3):241–299, Sep. 2000.

[72] Sape J. Mullender, Guido van Rossum, Andrew S. Tanenbaum, Robbert van Re-

nesse, and Hans van Staveren. Amoeba: A distributed operating system for the

1990s. Computer, 23(5):44–53, 1990.

[73] Michael Nelson, Beng-Hong Lim, and Greg Hutchins. Fast transparent migration for

virtual machines. In Usenix 2005, pages 25–25.

[74] M. Noack. Comparative evaluation of process migration algorithms. Master’s thesis,

Dresden University of Technology - Operating Systems Group, 2003.

[75] NS2: Network Simulator. http://www.isi.edu/nsnam/ns/.

[76] G. Oppenheimer and N. Weizer. Resource management for a medium scale time-

sharing operating system. Commun. ACM, 11(5):313–322, 1968.

[77] Nir Oren. A survey of prefetching techniques. Technical report, University of Ab-

erdeen, Computing Science, 2000.

[78] S. Osman, D. Subhraveti, G. Su, and J. Nieh. The design and implementation of zap:

A system for migrating computing environments. In Proc. of OSDI, pages 361–376,

2002.

[79] Pin Lu and Kai Shen. Virtual machine memory access tracing with hypervisor exclusive cache. In 2007 USENIX Annual Technical Conference, 2007.

[80] J. S. Plank, M. Beck, G. Kingsley, and K. Li. Libckpt: Transparent checkpointing under

Unix. In Usenix Winter Technical Conference, pages 213–223, January 1995.

[81] POV-Ray. The persistence of vision raytracer, 2005.

[82] Mohammad Salimullah Raunak. A survey of cooperative caching. Technical re-

port, University of Massachusetts Amherst, Laboratory for Advanced System Software,

1999.

[83] Michael Richmond and Michael Hitchens. A new process migration algorithm.

SIGOPS Oper. Syst. Rev., 31(1):31–42, 1997.

[84] E. T. Roush. Fast dynamic process migration. In ICDCS 1996 Conference on Dis-

tributed Computing Systems (ICDCS ’96), page 637, Washington, DC, USA, 1996.

[85] Roy S.C. Ho, Cho-Li Wang, and Francis C.M. Lau. Lightweight process migration and memory prefetching in openMosix. In IPDPS 2008, 2008.

[86] Prasenjit Sarkar and John Hartman. Efficient cooperative caching using hints.

SIGOPS Oper. Syst. Rev., 30(SI):35–46, 1996.

[87] B. K. Schmidt. Supporting Ubiquitous Computing with Stateless Consoles and Com-

putation Caches. PhD thesis, Computer Science Dept., Stanford University, 2000.

[88] J.H. Schopp, K. Fraser, and M.J. Silbermann. Resizing memory with balloons and

hotplug. In Linux Symposium, pages 305–312, 2006.

[89] Silicon Graphics Inc. http://www.sgi.com/tech/stl/sort.html, Standard Template Library

Quicksort.

[90] E. Stark. SAMSON: A scalable active memory server on a network, Aug. 2003.

[91] Georg Stellner. Cocheck: Checkpointing and process migration for mpi. In IPPS

’1996, pages 526–531, Washington, DC, USA.

[92] T. Wood, P. Shenoy, and A. Venkataramani. Black-box and gray-box strategies for virtual machine migration. In Proc. of NSDI 2007, April 2007.

[93] The iSCSI Enterprise Target Project.

http://iscsitarget.sourceforge.net/.

[94] Prof. K. S. Trivedi. An analysis of prepaging. Technical report, Dept. of Computer

Science, Duke University, 1979.

[95] Irina Chihaia Tuduce and Thomas Gross. Adaptive main memory compression. In

ATEC ’05, pages 29–29. USENIX, 2005.

[96] C.A. Waldspurger. Memory resource management in vmware esx server. In Operating

System Design and Implementation (OSDI 02), Boston, MA, Dec 2002.

[97] A. Whitaker, M. Shaw, and S.D. Gribble. Denali: Lightweight Virtual Machines for

Distributed and Networked Applications. Tech Report 02-02-01, Univ of Washington,

2002.