ARCHITECTURAL PRINCIPLES AND EXPERIMENTATION OF

DISTRIBUTED HIGH PERFORMANCE VIRTUAL CLUSTERS

by

Andrew J. Younge

Submitted to the faculty of

Indiana University

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Department of Computer Science

Indiana University

October 2016

Copyright 2016 Andrew J. Younge

All Rights Reserved

INDIANA UNIVERSITY

GRADUATE COMMITTEE APPROVAL

of a dissertation submitted by

Andrew J. Younge

This dissertation has been read by each member of the following graduate committee and by majority vote has been found to be satisfactory.

Date Geoffrey C. Fox, Ph.D, Chair

Date Judy Qiu, Ph.D

Date Thomas Sterling, Ph.D

Date D. Martin Swany, Ph.D

ABSTRACT

ARCHITECTURAL PRINCIPLES AND EXPERIMENTATION OF

DISTRIBUTED HIGH PERFORMANCE VIRTUAL CLUSTERS

Andrew J. Younge

Department of Computer Science

Doctor of Philosophy

With the advent of cloud computing and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for their scientific computing needs. This is due to the relative ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for data-intensive applications. However, a notable performance gap exists between IaaS and typical high performance computing (HPC) resources. This has limited the applicability of IaaS for many potential users, not only for those who look to leverage the benefits of virtualization with traditional scientific computing applications, but also for the growing number of big data scientists whose platforms are unable to build on HPC's advanced hardware resources.

Concurrently, we are at the forefront of a convergence in infrastructure between Big Data and HPC, the implications of which suggest that a unified architecture could provide computing and storage capabilities for both differing distributed systems use cases. This dissertation proposes such an endeavor by leveraging the performance and advanced hardware from the HPC community and providing it in a virtualized infrastructure using High Performance Virtual Clusters. This will not only enable a more diverse user environment within supercomputing applications, but also bring increased performance and capabilities to big data platform services.

The project begins with an evaluation of current hypervisors and their viability to run HPC workloads within current cloud infrastructure, which helps define existing performance gaps. Next, mechanisms to enable the use of specialized hardware available in many HPC resources are uncovered, including advanced accelerators like Nvidia GPUs and high-speed, low-latency InfiniBand interconnects. The virtualized infrastructure that was developed, which leverages such specialized HPC hardware and utilizes best practices in virtualization using KVM, supports advanced scientific computations common in today's HPC systems. Specifically, we find that example Molecular Dynamics simulations can run at near-native performance, with only a 1-2% overhead in our virtual cluster. These advances are incorporated into a framework for constructing distributed virtual clusters using the OpenStack cloud infrastructure. With high performance virtual clusters, we look to support a broad range of scientific computing challenges, from HPC simulations to big data analytics, with a single, unified infrastructure.

ABSTRACT APPROVAL

As members of the candidate's graduate committee, we have read the dissertation abstract of Andrew J. Younge and found it in a form satisfactory to the graduate committee.

Date Geoffrey C. Fox, Ph.D, Chair

Date Judy Qiu, Ph.D

Date Thomas Sterling, Ph.D

Date D. Martin Swany, Ph.D

ACKNOWLEDGMENTS

I am grateful to many, many people for making this dissertation possible.

First, I want to express my sincere gratitude and admiration to my advisor, Geoffrey Fox. Geoffrey has provided invaluable advice, knowledge, insight, and direction in a way that made this dissertation possible. It has been a distinct honor and privilege working with him, something which I will carry with me throughout the rest of my career.

I would like to thank my committee members, Judy Qiu, Thomas Sterling, and Martin Swany. The guidance received in the formation of this dissertation has been of great help and a fruitful learning experience. Conversations with each member throughout the dissertation writing have proven exceptionally insightful, and I have learned a great deal working with each of them. I also want to explicitly thank Judy Qiu for giving me the opportunity to help organize and lecture in the Distributed Systems and related courses over the years. This teaching experience has taught me how to be both a better lecturer and mentor.

Over the years I have had the pleasure of working with a number of members of the Digital Science Center lab and others throughout Indiana University. I would like to thank Javier Diaz-Montes, Thilina Ekanayaka, Saliya Ekanayaka, Xiaoming Gao, Robert Henschel, Supun Kamburugamuve, Nate Keith, Gary Miksik, Jerome Mitchell, Julie Overfield, Greg Pike, Allan Streib, Koji Tanaka, Gregor von Laszewski, Fugang Wang, Stephen Wu, and Yudou Zhou for their support over the years. Many of these individuals have helped me directly throughout my tenure at Indiana University, and their efforts and encouragement are greatly appreciated.

I would like to express my gratitude to the APEX research group at the University of Southern California Information Sciences Institute, where I was a visiting researcher and intern in 2012 and 2013. In particular, I would like to give special thanks to John Paul Walters at USC/ISI, who worked directly with me on many of the research endeavors described within this dissertation. He has provided hands-on mentorship that aided with the basis of this dissertation, and his leadership is something I look to emulate in the future. I would also like to thank the Persistent Systems Fellowship through the School of Informatics and Computing for providing me with a fellowship to fund the last three years of my PhD.

I would also like to give thanks to the many close friends and roommates in Bloomington, Indiana over the years. Specifically, I'd like to thank Ahmed El-Hassany, Jason Hemann, Matthew Jaffee, Aubrey Kelly, Haley MacLeod, Brent McGill, Andrea Sorensen, Dane Sorensen, Larice Thoroughgood, and Miao Zhang. I'd like to make a special acknowledgement to my friends in the Bloomington cycling community, many of whom have become my Bloomington pseudo-family. This includes John Brooks, Michael Wallis Jr, Kasey Poracky, Brian Gessler, Niki Gessler, Davy McDonald, Jess Bare, Ian Yarbrough, Kelsey Thetonia, Rob Slover, Pete Stuttgen, Mike Fox, Tom Schwoegler, and the Chi Omega Little 500 cycling team. In this, I specifically need to acknowledge two wonderful people, Larice Thoroughgood and John Brooks, for their support, patience, and forbearance during the final stages of my Ph.D. Larice has provided the love and encouragement necessary to see this dissertation through to the end. John has donated countless hours of his time helping me edit and revise this manuscript, along with many after-hours philosophical discussions that proved crucial to the work at hand. Without all of these wonderful people in my life here in Bloomington, this dissertation would have likely been completed many months earlier.

Last but not least, I want to thank my parents Mary and Wayne Younge, my sisters Bethany Younge and Elizabeth Keller, and my extended family for their unwavering love and support over the years. Words cannot fully express the gratitude and love I have for you all.

Contents

Table of Contents
List of Figures

1 Introduction
  1.1 Overview
  1.2 Research Statement
  1.3 Research Challenges
  1.4 Outline

2 Related Research
  2.1 Virtualization
    2.1.1 Hypervisors and Containers
  2.2 Cloud Computing
    2.2.1 Infrastructure-as-a-Service
    2.2.2 Virtual Clusters
    2.2.3 The FutureGrid Project
  2.3 High Performance Computing
    2.3.1 Brief History of Supercomputing
    2.3.2 Computation
    2.3.3 Exascale

3 Analysis of Virtualization Technologies for High Performance Computing Environments
  3.1 Abstract
  3.2 Introduction
  3.3 Related Research
  3.4 Feature Comparison
    3.4.1 Usability
  3.5 Experimental Design
    3.5.1 The FutureGrid Project
    3.5.2 Experimental Environment
    3.5.3 Benchmarking Setup
  3.6 Performance Comparison
  3.7 Discussion

4 Evaluating GPU Passthrough in Xen for High Performance Cloud Computing
  4.1 Abstract
  4.2 Introduction
  4.3 Virtual GPU Directions
    4.3.1 Front-end Remote API invocation
    4.3.2 Back-end PCI passthrough
  4.4 Implementation
    4.4.1 Feature Comparison
  4.5 Experimental Setup
  4.6 Results
    4.6.1 Floating Point Performance
    4.6.2 Device Speed
    4.6.3 PCI Express Bus
  4.7 Discussion
  4.8 Chapter Summary and Future Work

5 GPU-Passthrough Performance: A Comparison of KVM, Xen, and LXC for CUDA and OpenCL Applications
  5.1 Abstract
  5.2 Introduction
  5.3 Related Work & Background
    5.3.1 GPU API Remoting
    5.3.2 PCI Passthrough
    5.3.3 GPU Passthrough, a Special Case of PCI Passthrough
  5.4 Experimental Methodology
    5.4.1 Host and Configuration
    5.4.2 Guest Configuration
    5.4.3 Microbenchmarks
    5.4.4 Application Benchmarks
  5.5 Performance Results
    5.5.1 SHOC Benchmark Performance
    5.5.2 GPU-LIBSVM Performance
    5.5.3 LAMMPS Performance
    5.5.4 LULESH Performance
  5.6 Lessons Learned
  5.7 Directions for Future Work

6 Supporting High Performance Molecular Dynamics in Virtualized Clusters using IOMMU, SR-IOV, and GPUDirect
  6.1 Abstract
  6.2 Introduction
  6.3 Background and Related Work
    6.3.1 GPU Passthrough
    6.3.2 SR-IOV and InfiniBand
    6.3.3 GPUDirect
  6.4 A Cloud for High Performance Computing
  6.5 Benchmarks
  6.6 Experimental Setup
    6.6.1 Node configuration
    6.6.2 Cluster Configuration
  6.7 Results
    6.7.1 LAMMPS
    6.7.2 HOOMD
  6.8 Discussion
  6.9 Chapter Summary

7 Virtualization advancements to support HPC applications
  7.1 PCI Passthrough
  7.2 Memory Page Table Optimizations
  7.3 Time in Virtual Machines
  7.4 Mechanisms
    7.4.1 RDMA-enabled VM Migration
    7.4.2 Moving the Compute to the Data
  7.5 Fast VM Cloning
  7.6 Virtual Cluster
  7.7 Chapter Summary

8 Conclusion
  8.1 Impact

Bibliography

List of Figures

1.1 Data analytics and computing ecosystem compared (from [25]), with virtualization included
1.2 High Performance Virtual Cluster Framework
1.3 Architectural diagram for High Performance Virtual Clusters

2.1 Virtual Machine Abstraction
2.2 Hypervisors and Containers
2.3 View of the Layers within a Cloud Infrastructure
2.4 Eucalyptus Architecture
2.5 OpenNebula Architecture
2.6 Virtual Clusters on Cloud Infrastructure
2.7 FutureGrid Participants, Network, and Resources
2.8 Top 500 Development over time from 2002 to 2016 [8]

3.1 A comparison chart between Xen, KVM, VirtualBox, and VMWare ESX
3.2 FutureGrid Participants and Resources
3.3 Linpack performance
3.4 Fast Fourier Transform performance
3.5 Ping Pong bandwidth performance
3.6 Ping Pong latency performance (lower is better)
3.7 Spec OpenMP performance
3.8 Benchmark rating summary (lower is better)

4.1 GPU PCI passthrough within the Xen Hypervisor
4.2 GPU Floating Point Operations per Second
4.3 GPU Fast Fourier Transform
4.4 GPU Matrix Multiplication
4.5 GPU Stencil and S3D
4.6 GPU Device Memory Bandwidth
4.7 GPU Molecular Dynamics and Reduction
4.8 GPU PCI Express Bus Speed

5.1 SHOC Levels 0 and 1 relative performance on Bespin system. Results show benchmarks over or under-performing by 0.5% or greater. Higher is better.
5.2 SHOC Levels 1 and 2 relative performance on Bespin system. Results show benchmarks over or under-performing by 0.5% or greater. Higher is better.
5.3 SHOC Levels 0 and 1 relative performance on Delta system. Benchmarks shown are the same as Bespin's. Higher is better.
5.4 SHOC Levels 1 and 2 relative performance. Benchmarks shown are the same as Bespin's. Higher is better.
5.5 GPU-LIBSVM relative performance on Bespin system. Higher is better.
5.6 GPU-LIBSVM relative performance on Delta showing improved performance within virtual environments. Higher is better.
5.7 LAMMPS Rhodopsin benchmark relative performance for Bespin system. Higher is better.
5.8 LAMMPS Rhodopsin benchmark relative performance for Delta system. Higher is better.
5.9 LULESH relative performance on Bespin. Higher is better.

6.1 Node PCI Passthrough of GPUs and InfiniBand
6.2 LAMMPS LJ Performance
6.3 LAMMPS RHODO Performance
6.4 HOOMD LJ Performance with 256k Simulation

7.1 Comparison of GPU Passthrough in Xen to rCUDA across various interconnects
7.2 Efficient VM networking comparison [208]
7.3 SR-IOV Architecture with Virtual Functions, PCI SIG [237]
7.4 Native TLB miss walk compared with 2D virtualized TLB miss walk from [239]. On the left illustrates a native page table walk. On the right illustrates the lengthy 2D nested page table walk for a VM.
7.5 Transparent Huge Pages with KVM
7.6 Adding extra specs to a VM flavor in OpenStack

Chapter 1

Introduction

1.1 Overview

For years, visionaries in computer science have predicted the advent of utility-based computing. This concept dates back to John McCarthy's vision stated at the MIT centennial celebrations in 1961:

"If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility... The computer utility could become the basis of a new and important industry."

Only recently have the hardware and software become available to support the concept of utility computing on a large scale.

The concepts inspired by the notion of utility computing have combined with the requirements and standards of Web 2.0 [1] to create Cloud computing [2–4].

Cloud computing is defined as "A large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet." This concept of cloud computing is important to Distributed Systems because it represents a true paradigm shift [5] within the entire IT infrastructure. Instead of adopting the in-house services, client-server model, and mainframes, clouds push resources out into abstracted services hosted en masse by larger organizations. This concept of distributing resources is similar to many of the visions of the Internet itself, which is where the "clouds" nomenclature originated, as many people depicted the internet as a big fluffy cloud one connects to.

At the core of most cloud infrastructure lies virtualization, a computer architecture technology by which one or more Virtual Machines (VMs) are run on the same physical host. In doing this, a layer of abstraction is inserted between and around the hardware and the operating system (OS). Specifically, hardware resources such as CPUs, memory, and I/O devices, and software resources analogous to OS functionality and low-level libraries, are abstracted and provided to VMs directly. Virtualization has existed for many years, but its availability on commodity Intel hardware, in conjunction with the rise of clouds, has brought it to the forefront of distributed systems.

While cloud computing is changing IT infrastructure, it also has had a drastic impact on distributed systems as a field, which has a different evolution. Gone are the IBM Mainframes of the 1980s, which dominated the enterprise landscape. While some mainframes still exist, today they are used only for batch-related processing tasks and not for scientific applications, as they are inefficient at Floating Point Operations. As such, Beowulf Clusters [6], Massively Parallel Processors (MPPs), and Supercomputers of the 90s and 00s replaced the mainframes that came before. A novelty of these distributed memory systems is that instead of just one large machine, many machines are connected together to achieve a common goal, thereby maximizing the overall speed of computation. Clusters represent a more commodity-based supercomputer with off-the-shelf CPUs instead of the highly customized and expensive processors and interconnects found in MPPs. Supercomputers and Clusters are best suited for large-scale applications, due to their innate ability to divide the computational efforts into smaller concurrent tasks. These tasks often run in parallel on many CPUs, whereby each task communicates results to other tasks in some way as per the application design. These HPC applications can even include "Grand Challenge" applications [7] and can represent a sizable amount of the scientific calculations done on large-scale Supercomputing resources today. However, there exists a gap of many orders of magnitude between leading-class high performance computing and what is available on the common laboratory workstation. This gap, described here as mid-tier scientific computation, is a fast growing field that struggles to harness distributed systems efficiently while hoping to minimize extensive development efforts.

These mid-tier scientific endeavors need to leverage distributed systems to complete the calculations at hand effectively; however, they may not require the extreme scale provided by the latest machines at the peak of the Top500 list [8]. Some small research teams may not have the resources available or the desire to handle the development complexity of high-end supercomputing systems, and can look towards other options instead. This can include scientific disciplines such as high energy physics [9], materials science [10], bioinformatics [11], and climate research [12], to name a few.

As more domain science turns to the aid of computational resources for conducting novel scientific endeavors, there is a continuing and growing need for national cyberinfrastructure initiatives to support an increasingly diverse set of scientific workloads.

Historically, there have been a number of national and international cyberinfrastructure efforts to support high-end computing. These range from traditional supercomputers deployed at Department of Energy computing facilities [13], to efforts led by the National Science Foundation such as the XSEDE environment [14], or domain-specific initiatives such as the NSF iPlant collaborative [15]. Substantial growth can be seen in the number of computational resource requests [14,16] from many of the larger computational facilities. Concurrently, there has also been an increase in accelerators and hybrid computing models capable of quickly providing additional resources [17] beyond commodity clusters.

Historically, application diversity was separated into High Performance Computing (HPC) and High Throughput Computing (HTC). With HTC, computational problems can be split into independent tasks that execute in a pleasingly parallel fashion, happily gaining any available resources and rarely requiring significant communication or synchronization between tasks. Examples of this can include independent tasks or parameter sweeps of an application to find the optimal application settings, or many similar tasks with only slightly different input datasets. HPC often represents computational problems that require significant communication and coordination to produce results effectively, usually with the use of a communication protocol such as MPI [18]. These applications are often tightly coupled sequential tasks operating concurrently and communicating often to complete the collective computational problem. While there are many examples of such HPC applications, common examples often involve significant matrix and/or vector multiplication mechanisms [19] for completing simulations.
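To make the distinction concrete, the following is a minimal, illustrative sketch of a tightly coupled step written against the mpi4py bindings (this example is not drawn from the applications studied in this dissertation): every rank must reach the collective operation before any rank can proceed, which is precisely the coordination an HTC workload avoids.

    from mpi4py import MPI  # assumes an MPI implementation and mpi4py are installed

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Stand-in for a local computation step on each rank.
    local_sum = float(sum(range(rank * 1000, (rank + 1) * 1000)))

    # Tightly coupled step: a collective reduction that synchronizes all ranks
    # (the HPC pattern). An HTC workload would simply run each task to
    # completion with no such synchronization.
    global_sum = comm.allreduce(local_sum, op=MPI.SUM)

    if rank == 0:
        print(f"Global sum across {size} ranks: {global_sum}")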

However, many big-data paradigms [20] in distributed systems have been introduced that represent new computational models for distributed computing, such as MapReduce [21] and the corresponding Apache Big Data stack [22, 23]. While the term big data can mean a number of things, it generally refers to data sets large or complex enough to require advanced, non-traditional data processing mechanisms in order to extract feature knowledge. While some HTC and HPC applications could potentially be considered big data tools, generally the nomenclature has grown out of new industry and academic problem sets created by a data deluge [24]. This has led to a number of novel platform services for handling big data problems in parallel, and supporting these different distributed computational paradigms requires a flexible infrastructure capable of providing computational resources for all possible models in a fast and efficient manner.
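As a point of reference, the MapReduce model mentioned above can be illustrated with a small, purely in-memory word-count sketch (the canonical example; real deployments distribute the map and reduce phases across a cluster and shuffle intermediate keys between nodes):

    from collections import defaultdict

    def map_phase(document):
        # Emit (word, 1) pairs for each word in a document.
        return [(word, 1) for word in document.split()]

    def reduce_phase(pairs):
        # Sum the counts for each distinct key.
        counts = defaultdict(int)
        for word, count in pairs:
            counts[word] += count
        return dict(counts)

    documents = ["virtual clusters for hpc", "big data and hpc convergence"]
    intermediate = [pair for doc in documents for pair in map_phase(doc)]
    print(reduce_phase(intermediate))  # e.g. {'hpc': 2, 'virtual': 1, ...}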

Currently, we are at the forefront of a convergence within scientific computing between HPC and big data computation [25]. This amalgamation of historically differing viewpoints of Distributed Systems looks to combine the performance characteristics of HPC and the pursuit towards Exascale with the data-oriented concurrency models found in Big Data and cloud services.

Much of the convergence effort has been focused on applications and platform services. Specifically, significant work towards convergent applications has been outlined with the Big Data Ogres [26] and the corresponding HPC-ABDS model [27].

This convergence can also be seen with efforts in bringing interconnect advances to classically big data platforms, such as with InfiniBand and MapReduce [28]. However, the underlying hardware and OS environments are still something to be reconciled, which is something that virtualization can potentially help provide. It is expected that new big data efforts will continue to move in this direction [29], especially if virtualization can make HPC hardware that has traditionally been prohibitive in such areas, such as accelerators and high-speed interconnects, readily available to cloud and big data platforms. As the deployment of big data applications and platform services on virtualized infrastructure is well defined and studied [30], this work instead focuses on the difficulty of running HPC applications on similar virtualized infrastructure.

However, it is possible, and indeed hoped, that research regarding virtualization can also play a part in bringing advanced hardware and performance-focused considerations to Big Data applications, effectively cross-cutting the convergence with HPC. In summary, success of the research in virtualization could be defined by the ability to support the convergence between HPC and Big Data.

Figure 1.1 Data analytics and computing ecosystem compared (from [25]), with virtualization included

To further illustrate where virtualization can play a part in HPC and Big Data convergence, we look at Figure 1.1 from Reed & Dongarra [25]. While the two ecosystems depicted are only representative and in no way exhaustive, they do show how drastically different the user environments are and how reliant they are on differing hardware. If we insert a performance-oriented virtualization mechanism within the system software capable of handling the advanced cluster hardware while performing at near-native speeds (at or under 5% overhead, as loosely defined in [31]), it could provide a single, comprehensive convergent ecosystem that supports both HPC and Big Data efforts at a critical level.
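For reference, the "near-native" threshold used above is simply the relative slowdown of a virtualized run against a bare-metal run of the same fixed workload; the few lines below show the calculation with hypothetical wall-clock times.

    def virtualization_overhead(t_native, t_virtual):
        """Relative overhead of a virtualized run versus a native run."""
        return (t_virtual - t_native) / t_native

    # Hypothetical wall-clock times in seconds for the same fixed workload.
    t_native, t_virtual = 1000.0, 1042.0
    print(f"Overhead: {virtualization_overhead(t_native, t_virtual):.1%}")
    # 4.2% here, which would fall under the 5% near-native threshold.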

This work proposes the use of virtual clusters [32] to provide distinct, separate environments on similar resources using virtual machines. These virtual clusters, similar in design to the Beowulf clusters and commodity HPC systems, provide a distributed memory environment, usually with a local resource management system [33]. However, past concerns with virtualization have limited the adoption of virtual clusters in many large-scale cyberinfrastructure deployments. This has in part been due to the overhead of virtualization, whereby many scientific computations have experienced a significant and notable degradation in performance. In an ecosystem familiar with HPC systems in which performance is paramount, this has been an obstructive hurdle for deploying many tightly coupled applications.

1.2 Research Statement

With the rise of cloud computing within the greater realm of distributed systems, there have been a number of scientific computing endeavors that map well to cloud infrastructure. This first includes the simple and most common practice of on-demand computing, whereby users can effectively rent computing resources [34]. Perhaps these resources are more powerful than a given researcher's own and are used to run their scientific applications or support greater laboratory collaborative efforts, such as a shared database or Web services. We have also seen virtualized cloud infrastructure support high throughput computing very well. Oftentimes pleasingly parallel applications, be it from high energy physics such as the LHC effort [9,35] or bioinformatics with BLAST alignment jobs [11], have proven to run with high efficiency in public and private cloud environments. Furthermore, the rise of public cloud infrastructure has also coincided with an increase in big data computation and analytics. Many of these big data platform services have evolved complementary to cloud infrastructure, and as such have a symbiotic relationship with virtualization technologies [36].

However, tightly coupled, high performance distributed memory applications, the same endeavors that support leading-class scientific efforts, run very poorly on virtualized cloud infrastructure [37]. This is due to a myriad of addressable reasons, including scheduling, abstraction overhead, and a lack of the advanced hardware support necessary for tightly coupled communication. This raises the question of whether virtualization can in fact support such tightly coupled large-scale applications without imposing a significant performance penalty. Simply put, the goal of this dissertation is to investigate the viability of mid-tier scientific applications supported in virtualized infrastructure.

Historically, mid-tier scientific applications are distributed memory HPC applications that require more complex communication mechanisms. These systems need far more performance than a single compute resource (such as a workstation) can provide. This could include hundreds or thousands of processes calculating and communicating concurrently on a cluster, perhaps using a messaging interface such as MPI. These applications can be distinct (and sometimes simpler), either in application composition or operating parameters, from extreme-scale HPC applications that run at the highest end of the supercomputing resources today operating on petascale machines and beyond.

Given the current outlook on virtualization for supporting HPC applications, this dissertation proposes a framework for High Performance Virtual Clusters that enables advanced computational workloads, including tightly coupled distributed memory applications, to run with a high degree of efficiency in virtualized environments. This framework, outlined in Figure 1.2, illustrates the topics to be addressed to provide a supportive virtual cluster environment for high performance mid-tier scientific applications. Areas marked in darker green indicate topics this dissertation may touch upon, whereas light green areas in Figure 1.2 identify outstanding considerations to be investigated. Specifically, mid-tier distributed memory parallel computation is identified as a focal point for the computational challenges at hand, as a way to separate from some of the latest efforts towards Exascale computing [38–40]. While virtualization may in fact be able to play a role towards usable Exascale computing, such efforts fall outside the immediate scope of this dissertation.

Figure 1.2 High Performance Virtual Cluster Framework

In order to provide high performance virtual clusters, there is a need to first look at a key area: the virtualized infrastructure itself. At the core, the hypervisor, or virtual machine monitor, has to be considered, along with the overhead and performance characteristics associated with it. This includes performance tuning considerations, non-uniform memory access (NUMA) effects, and advanced hardware device passthrough. Specifically, device passthrough in the context of this manuscript refers to two major device types: GPUs and InfiniBand interconnects (the latter using SR-IOV). The virtual infrastructure also must consider scheduling as a major factor in performing efficient placement of workloads on the underlying host infrastructure, and in particular a Proximity scheduler is of interest [41]. Storage solutions in a virtualized environment are an increasingly important aspect of this framework, as some HPC and big data solutions are prioritizing I/O performance more than computation. Storage is also likely to be heavily dependent on interconnect considerations as well, which could potentially be provided by device passthrough. However, such I/O considerations lie beyond this dissertation's immediate scope.
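Because NUMA effects and device passthrough interact (a passed-through GPU or InfiniBand adapter performs best when the VM's cores and memory sit on the same NUMA node as the device), it is useful to inspect device locality on the host. The short sketch below reads standard Linux sysfs entries for a hypothetical PCI address; it is an illustrative helper under that assumption, not part of any tool used in this work.

    from pathlib import Path

    def pci_numa_node(pci_addr: str) -> int:
        """Return the NUMA node a PCI device is attached to (-1 if unknown)."""
        return int(Path(f"/sys/bus/pci/devices/{pci_addr}/numa_node").read_text().strip())

    def online_numa_nodes() -> str:
        """Return the host's online NUMA node list, e.g. '0-1'."""
        return Path("/sys/devices/system/node/online").read_text().strip()

    # Hypothetical GPU address; a real address would come from lspci.
    gpu = "0000:84:00.0"
    print(f"Host NUMA nodes: {online_numa_nodes()}")
    print(f"GPU {gpu} is local to NUMA node {pci_numa_node(gpu)}")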

However, simply providing an enhanced virtualized infrastructure may not guarantee that all implementations of high performance virtual clusters are performant. Specifically, proposed infrastructures need to be properly evaluated in a systematic way through the use of a wide array of benchmarks, mini-applications, and full-scale scientific applications. This effort can further be separated into three major problem sets: base-level benchmarking tools, HPC applications, and big data applications. Evaluating the stringent performance requirements of all three sets, when compared with bare-metal (no virtualization) solutions, will illuminate not only successful designs but also the focus areas that require more attention. As such, we look to continually use these benchmarks and applications as a tool to measure the viability of virtualization in this context.

Fundamentally, this framework leverages these real-world applications and benchmarks predominantly for the evaluation of various solutions under strong scaling conditions. Specifically, strong scaling is used to apply a fixed problem size (or a predefined set of fixed problem sizes) that looks to match real-world running conditions, while varying the number of processors and execution units (including GPUs) across multiple virtualized and non-virtualized architectures. Generally, a successful solution will demonstrate strong scaling characteristics as close to a native, bare-metal runtime as possible, illustrating little overhead from the added methods. Furthermore, variability between runtimes is also important to consider, as high variability and noise within large-scale distributed systems can lead to tail latencies and significant efficiency losses. While weak scaling experiments that increase the problem size relative to the number of available processors could be used for future evaluation, such efforts are outside of the immediate scope of this dissertation.
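As a reminder of how such strong scaling runs are reported, speedup and parallel efficiency for a fixed problem size follow directly from measured wall-clock times; the values below are hypothetical and only illustrate the bookkeeping.

    def strong_scaling(t_base: float, t_n: float, n: int):
        """Speedup and parallel efficiency for a fixed problem size on n units."""
        speedup = t_base / t_n
        efficiency = speedup / n
        return speedup, efficiency

    t_base = 1200.0                          # single-unit reference time (s)
    runs = {2: 615.0, 4: 320.0, 8: 172.0}    # hypothetical times on n units
    for n, t in runs.items():
        s, e = strong_scaling(t_base, t, n)
        print(f"{n} units: speedup {s:.2f}x, efficiency {e:.0%}")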

Figure 1.3 Architectural diagram for High Performance Virtual Clusters

Starting from the historical perspective of virtual clusters, we see the new architectural model for high performance virtual clusters illustrated in Figure 1.3 that is implied by the efforts described in this dissertation. Here, we leverage commodity hardware, as well as some advanced HPC hardware, within the same system. Providing advanced hardware in this scenario has a twofold effect. First, we look to provide classic distributed memory HPC applications in a virtualized environment that still utilizes the same hardware that such applications have grown accustomed to. Second, it allows for novel big data analytics and cloud platform services to move towards hardware that may drastically speed up their computational efforts. Specifically, we allow for such platform services to leverage the high-speed, low-latency interconnects that HPC systems have relied upon. While such cloud systems may not be able to take full advantage of the latency improvements, it is estimated that the increased bandwidth may have a significant impact on the overall runtime of these systems, reducing the overall time-to-solution. While this notion could incorporate a wide array of differing technologies, we focus here on GPU-based accelerators and InfiniBand interconnects in conjunction with x86 CPUs.

From this diverse yet highly optimized hardware, we leverage KVM and QEMU to provide an advanced hypervisor for creating and hosting VMs with direct hardware involvement from the lower level. Throughout this dissertation, the utilization of this diverse hardware is wrangled and virtualized through the use of KVM and QEMU. This includes direct virtualization, para-virtualization with QEMU, and even direct passthrough mechanisms. Moving up, the Libvirt API is leveraged due to its hypervisor interoperability and popularity. Modifications are made throughout this work to enable Libvirt to support passthrough mechanisms in both Xen and KVM, changes which have since made their way into upstream Libvirt. Providing Libvirt as a virtualization API also insulates against higher-level changes, whereby other resource management solutions could be swapped in if such a design were warranted.
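To make the Libvirt layer concrete, the sketch below uses the libvirt Python bindings to hot-attach a PCI device (here a GPU at a hypothetical host address, 0000:84:00.0) to a running KVM guest named hpc-node-0; both the guest name and the PCI address are placeholders, and in practice a cloud layer such as OpenStack Nova would generate this device XML rather than an administrator writing it by hand.

    import libvirt  # libvirt Python bindings; assumes access to qemu:///system

    # Libvirt <hostdev> element describing VFIO-based PCI passthrough of a
    # device at the hypothetical host address 0000:84:00.0.
    HOSTDEV_XML = """
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x84' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    """

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("hpc-node-0")          # placeholder guest name
    dom.attachDeviceFlags(HOSTDEV_XML, libvirt.VIR_DOMAIN_AFFECT_LIVE)
    conn.close()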

Building from Libvirt comes the OpenStack private cloud infrastructure. In Figure 1.3 we illustrate some (but not all) of OpenStack's services, including the Horizon UI, the Cinder and Glance storage mechanisms, and the Neutron (previously Quantum) components. The Nova component of OpenStack is the point of focus for providing comprehensive VM management.¹ OpenStack provides a comprehensive cloud infrastructure service for managing up to thousands of nodes in a cluster using virtualization. Its integration with KVM through Libvirt is highly tuned for efficient VM deployment, and advanced plug-ins such as the ML2 Neutron service can manage an advanced interconnect such as the InfiniBand and 40GbE fabrics provided by Mellanox.
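As one example of how this stack is driven from above, PCI passthrough devices are typically requested through Nova flavor extra specs. The sketch below, written against the python-novaclient and keystoneauth1 bindings, creates a flavor that asks for one GPU and one SR-IOV InfiniBand virtual function via hypothetical PCI aliases ("gpu_k40", "mlx_ib") that would have to be defined on the compute hosts; the credentials, endpoint, and alias names are all placeholders rather than values from this work.

    from keystoneauth1.identity import v3
    from keystoneauth1 import session
    from novaclient import client

    # Placeholder credentials and endpoint for an OpenStack deployment.
    auth = v3.Password(auth_url="https://keystone.example.org:5000/v3",
                       username="admin", password="secret",
                       project_name="hpc",
                       user_domain_id="default", project_domain_id="default")
    nova = client.Client("2.1", session=session.Session(auth=auth))

    # Flavor requesting one passthrough GPU and one InfiniBand virtual function,
    # via PCI aliases assumed to be configured on the Nova compute hosts.
    flavor = nova.flavors.create(name="hpc.gpu.ib", ram=65536, vcpus=16, disk=40)
    flavor.set_keys({"pci_passthrough:alias": "gpu_k40:1,mlx_ib:1"})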

Atop OpenStack, we can create a wide array of virtual clusters and machines to support the diverse scientific computing ecosystems necessary. This includes application models ranging from tightly coupled MPI+CUDA HPC applications to emerging big data analytics toolkits [42]. Such virtual clusters can be built manually through users requesting highly tuned images from a repository and deploying them to scale. This also allows for the use of virtualized SMP systems such as ScaleMP [43].

One of the higher-level aspects of providing high performance virtual clusters is the orchestration of the virtual clusters themselves, which could be called experiment management. While this largely remains tangential to this immediate research, it is nonetheless a key aspect for a successful solution. Some effort has been put forth for virtual cluster experiment management [44], and many ongoing open source solutions also offer compelling options, such as OpenStack Heat [45]. An example of a project delivering advanced orchestration mechanisms and a toolkit to aid in configurable virtual clusters on heterogeneous IaaS deployments is the Cloudmesh effort [46].

¹While many of the features for this work's additions in GPU Passthrough and SR-IOV InfiniBand support have been put together at USC/ISI as an OpenStack Nova fork (https://libraries.io/github/usc-isi/nova), the features have since been modified and matured by the OpenStack community in later releases and made available in upstream Nova.

1.3 Research Challenges

The framework, architecture, and efforts described in this dissertation represent a movement forward in providing virtualized infrastructure to support a wide array of scientific applications. However, there still exist some challenges that will need to be addressed. These include a stigma of virtualization being inherently slow and unable to support tightly coupled computations, limitations associated with operating at scale, and even the possibility that containers may provide a better alternative. While this work hopes to move beyond these challenges, they nonetheless must be considered.

The notion that virtualization and Cloud infrastructure are not able to support parallel distributed memory applications has been characterized many times. One of the most prominent examples of this is the Department of Energy's Magellan Project [47], whereby the Magellan Final Report [48] states the following as a key finding:

"Finding 2. Scientific applications with minimal communication and I/O are best suited for clouds.

We have used a range of application benchmarks and micro-benchmarks to understand the performance of scientific applications. Performance of tightly coupled applications running on virtualized clouds using commodity networks can be significantly lower than on clusters optimized for these workloads. This can be true at even mid-range computing scales. For example, we observed slowdowns of about 50x for PARATEC on the Amazon EC2 instances compared to Magellan bare-metal (non-virtualized) at 64 cores and about 7x slower at 1024 cores on Amazon Cluster Compute instances, the specialized high performance computing offering. As a result, current cloud systems are best suited for high-throughput, loosely coupled applications with modest data requirements."

These findings underscore how classical usage of virtualization in cloud infrastructure has serious performance issues when running tightly coupled distributed memory applications. Many of these performance concerns are sound, given the number of virtualization overheads commonplace at the time, including shadow page tables, emulated drivers, experimental hypervisors, and a complete lack of the sophisticated hardware commonplace in supercomputers and clusters. As a result, the advantages of virtualization, including on-demand resource allocation, live migration and advanced hybrid migration, and user-defined environments, have not been able to effectively show their value in the context of the HPC community.

Other related efforts within the scientific community also found limitations with HPC applications in public cloud environments. This includes the study by Jackson et al. [49], which illustrates how the Amazon Elastic Compute Cloud (EC2) creates a 6x performance impact compared to a local cluster, due in large part to the limiting Gigabit Ethernet on which benchmarks relied heavily within the EC2 system. Other studies also found similar results; Ostermann [37], for instance, concludes that Amazon EC2 "is insufficient for scientific computing at large, though it still appeals to the scientists that need resources immediately and temporarily." However, these studies are now outdated and do not take into account the hardware and virtualization advancements detailed in this dissertation. Specifically, it is estimated that with the KVM hypervisor in a performance-tuned environment, using accelerators and most certainly a high-speed, low-latency interconnect as detailed in Chapter 6, the results would be drastically different.

One limitation of this research in high performance virtual clusters is the fact that the degree to which applications can scale remains relatively unknown. While initial results with SR-IOV InfiniBand are promising, scaling is naturally hard to predict. It is hypothetically possible that as the number of VMs increases, interconnect communication tail latency could also increase, causing notable slowdowns during distributed memory synchronization barriers. Only when infrastructure able to support high performance virtual clusters becomes available will scaling to thousands of cores and beyond be investigated.

Another potential challenge, and perhaps also a strength, is the rising use of containers within industry, such as we see in efforts like [50]. Recently, we have seen efforts at NERSC/LBNL adapt a container solution called Shifter with the SLURM resource manager on Cray XC based systems [51]. Shifter's goal is to provide user-defined images for NERSC's bioinformatics users, and it adapts remarkably well to the HPC environment. While further efforts are needed by Cray and NERSC to fully provide a container-based solution on a large scale for all applications, these efforts are in many ways related to virtualization. In Chapter 5, we specifically compare virtualization efforts with LXC [52], a popular container solution, and find performance to be comparable and largely near-native.

While the major concern with virtualization in the HPC community is performance, virtualization itself may not be fundamentally limited by the overhead that causes issues in running high performance computing applications. Recent improvements in performance, along with increased usability of accelerators and high-speed, low-latency interconnects in virtualized environments, as demonstrated in this dissertation, have made virtual clusters a more viable solution for mid-tier scientific applications. Furthermore, it is possible for virtualization technologies to bring enhanced usability and enable specialized runtime components on future HPC resources, adding significant value over today's supercomputing resources. This could potentially include infrastructure advances for higher-level cloud platform services for supporting big data applications [27].

1.4 Outline

The rest of this dissertation is organized into chapters, each signifying a step toward moving the notion of a high performance virtual cluster solution forward.

Chapter 2 investigates the related research surrounding both cloud computing and high performance computing. Within cloud computing, an introduction to cloud infrastructure, virtualization, and containers will all be discussed. This also includes details regarding virtual clusters as well as an overview of some national-scale cloud infrastructure efforts that exist. Furthermore, we investigate the state of high performance computing and supercomputing, as well as touch upon some of the current Exascale efforts.

Chapter 3 takes a look at the potential for virtualization, in a base case, to support high performance computing. This includes a feature comparison of hardware availability for a few common hypervisors, specifically Xen, KVM, VirtualBox, and VMWare. Then, a few common HPC benchmarks are evaluated in a single node configuration to determine what overhead exists and where. This identifies how in some scenarios virtualization adds only a minor overhead, whereas in other scenarios overheads can be up to 50% compared to native configurations.

Chapter 4 starts to overcome one of the main limitations of virtualization for use in advanced scientific computing, specifically the lack of hardware availability. In this chapter, the Xen hypervisor is used to demonstrate the effect of GPU passthrough, allowing for GPUs to be used in a guest VM. The efficiency of this method is briefly evaluated using two different hardware setups, finding that hardware can play a notable role in single node performance.

Chapter 5 continues where Chapter 4 leaves off by demonstrating that GPU passthrough is possible on many other hypervisors, specifically also KVM and VMWare, and compares with one of the main containerization solutions, LXC. Here, the GPUs are evaluated using not only the SHOC GPU benchmark suite developed at Oak Ridge National Laboratory, but also a diverse mix of real-world applications, in order to examine how and where overhead exists with GPUs in VMs for each virtualization setup. Specifically, we find that with properly tuned hardware and NUMA-balanced configurations, the KVM solution can perform at roughly near-native performance, with on average 1.5% overhead compared to no virtualization. This illustrates that with the correct hypervisor selection, careful tuning, and advanced hardware, scientific computations can be supported using virtualized hardware.

Chapter 6 takes the findings from the previous chapter to the next level. Specifically, the lessons learned from successful KVM virtualization with GPUs are expanded and combined with a missing key component for supporting advanced parallel computations: a high-speed, low-latency interconnect, specifically InfiniBand. Using SR-IOV and PCI passthrough of a QDR InfiniBand interconnect across a small cluster, it is demonstrated that two Molecular Dynamics simulations, both very commonly used in the HPC community, can be run at near-native performance in the designed virtual cluster.

Chapter 7 takes a look at the given situation of virtualization and puts forth an argument for enhancements forthcoming in high performance virtual cluster solutions. Specifically, we look at the given state of the art and how virtual clusters can be used to provide an infrastructure to support the convergence between HPC and big data. This chapter also outlines and investigates potential next steps for virtualization, including the potential for advanced live migration techniques and VM cloning, which can be made available with the inclusion of a high-performance RDMA-capable interconnect.

Finally, this work concludes with an overall view of the current state of high performance virtualization, as well as its potential to impact and support a wide array of disciplines.

Chapter 2

Related Research

In order to depict the research presented in this dissertation accurately, the topics of Virtualization, Cloud computing, and High Performance Computing are reviewed in detail.

2.1 Virtualization

Virtualization is a way to abstract the hardware and system resources from an operating system. This is typically performed within a Cloud environment across a large set of servers using a Hypervisor or Virtual Machine Monitor (VMM), which lies in between the hardware and the Operating System (OS). From here, one or more virtualized OSs can be started concurrently, as seen in Figure 2.1, leading to one of the key advantages of virtualization. This, along with the advent of multi-core processing capabilities, allows for a consolidation of resources within a data center.

It then becomes the cloud infrastructure’s job, as discussed in the next section, to exploit this capability to its maximum potential while still maintaining a given QoS.

It should be noted that virtualization is not specific to Cloud computing. IBM originally pioneered the concept in the 1960s with the M44/44X systems. It has only recently been reintroduced for general use on x86 platforms.

Figure 2.1 Virtual Machine Abstraction

However, virtualization, in the most general sense, is just another form of abstraction. As such, there are in fact many levels of virtualization that exist [53].

• ISA - Virtualization can start from the instruction set architecture (ISA) level, whereby an entire processor instruction set is emulated. This may be useful for running software or services developed for one instruction set (say MIPS) that has to run on modern Intel x86 hardware. ISA level virtualization is usually emulated through an interpreter, which translates source instructions to target instructions; however, this can often be extremely inefficient. Dynamic binary translation can help aid in efficiency by translating blocks of source instructions to target instructions, but this can still be limiting.

• Hardware - One of the most important technologies in cloud computing is hardware level virtualization [54,55]. Hardware virtualization, in its most pure form, refers to the process of creating a virtual abstraction of hardware platforms, operating systems, or software resources. This enables the creation of one or more virtual machines (VMs) that are run concurrently on the same operating environment, be it hardware or some higher software. Here, a virtual hardware environment is generated for a VM, providing virtual processors, memory, and I/O devices that allow for VM multiplexing, as depicted in Figure 2.1. This layer also manages the physical hardware on behalf of a host OS, as well as for guests. While the most popular implementation of hardware virtualization is the Xen Virtual Machine Monitor [54], this method has further separated into type 1 and type 2 hypervisors, as detailed further in Section 2.1.1.

• Operating System - Moving up the ladder of implementation with virtualization, we find OS level virtualization, where multiplexing happens at the level of the OS, rather than the hardware. Usually, this refers to isolating a filesystem and associated process and runtime effects in a single "jailed" or container environment at the user level, where all system level operations are still handled by a single OS kernel. This containerization allows for multiple user environments to exist concurrently without the complexity of hardware or ISA level virtualization. Containers are also described in more detail in the next section.

• Library (API) - Moving up a layer, we find library or Application Programming Interface (API) level virtualization, where a programming interface is separated and the communication between this API and the back-end library carrying out the computation is virtualized. This method removes the back-end services from within the operating environment, which can give the effect that resources are local when they are in fact remote. This is how GPUs are "virtualized" with tools such as rCUDA [56] and vCUDA [57]; a minimal sketch of this remote-invocation pattern is given after this list. This method's performance can often be very dependent on the underlying communication mechanisms, as well as on complete virtualization of the whole API (which may be difficult). API virtualization also exists for entire OS libraries to translate between differing OSs, without actual containerization.

• Process - Virtualization can also take form at the process or user level, which can then be used to deliver a specialized or high-level language environment that is the same across several different OSs. This is how the Java Virtual Machine (JVM) and Microsoft's .NET platform operate. This type of virtualization predominantly falls outside of the scope of this dissertation.
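The remote API invocation idea behind library-level virtualization (as used by rCUDA-style tools) can be sketched with Python's standard xmlrpc module: the client calls what looks like a local library function, while the computation actually runs in a separate back-end process. This is only an illustrative stand-in; real GPU API remoting intercepts CUDA or OpenCL calls and is far more involved.

    # Back-end "library" server (run in one process).
    from xmlrpc.server import SimpleXMLRPCServer

    def vector_add(a, b):
        # The actual computation lives behind the API boundary.
        return [x + y for x, y in zip(a, b)]

    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_function(vector_add)
    server.serve_forever()

    # Front-end client (run in another process): the call looks local,
    # but the work is performed remotely.
    #
    #   import xmlrpc.client
    #   lib = xmlrpc.client.ServerProxy("http://localhost:8000")
    #   print(lib.vector_add([1, 2, 3], [4, 5, 6]))   # -> [5, 7, 9]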

2.1.1 Hypervisors and Containers

While there are many types of virtualization, this dissertation predominantly focuses on hardware and OS level virtualization. With hardware virtualization, a Virtual Machine Monitor, or hypervisor, is used. Abstractly, a hypervisor is a piece of software that creates virtual machines or guests, usually ones that model the underlying physical host system.

Hardware virtualization can actually be dissected in more detail into two different types of hypervisors. These hypervisor types are illustrated in Figure 2.2, and they can be directly compared to containers. With a Type 1 virtualization system, the hypervisor or VMM sits directly on the bare-metal hardware, below the OS. These native hypervisors provide direct control of the underlying hardware, and are usually controlled and operated through the use of a privileged VM. One example of a type 1 hypervisor is the Xen Virtual Machine Monitor [54], which uses its VMM as well as a privileged Linux OS, called Dom0, to create and manage other user DomU instances. The VMM in this case provides the necessary hardware abstraction of CPU, memory, and some I/O aspects, leaving the control aspects of the other DomUs to Dom0. With a Type 1 hypervisor, all virtualization functions are kept separate from control and OS functionality, effectively making for a cleaner design. This design could lead to end application performance implications, as illustrated with the Palacios VMM [31].

Figure 2.2 Hypervisors and Containers

Type 2 hypervisors utilize a different - and sometimes more convoluted - design. With a Type 2 hypervisor, there is a "host" OS that, like a native OS, sits directly atop the hardware. This OS is just like any normal native OS. However, the OS itself can abstract its own hardware, and provide and manage a VM, effectively as an OS process. In this case, the hypervisor providing the abstraction is effectively built within, atop, or as a module within the kernel. There are many different Type 2 hypervisors, the most common of which is the Kernel-based Virtual Machine (KVM) [58]. KVM is often used in conjunction with QEMU, an ISA level emulator, to provide some basic ISA level virtualization and emulation capabilities. KVM is simply provided as a Linux kernel module within a given host, and guest VMs are run as a single process on the host OS.
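Because KVM is just a kernel module exposing a device node, whether a host can run hardware-accelerated guests can be checked directly from user space; the short sketch below looks for the hardware virtualization CPU flags (vmx for Intel VT-x, svm for AMD-V) and the /dev/kvm device on a Linux host.

    import os

    def cpu_has_virt_extensions() -> bool:
        """True if /proc/cpuinfo advertises Intel VT-x (vmx) or AMD-V (svm)."""
        with open("/proc/cpuinfo") as f:
            tokens = f.read().split()
        return "vmx" in tokens or "svm" in tokens

    def kvm_available() -> bool:
        """True if the KVM kernel module has exposed its device node."""
        return os.path.exists("/dev/kvm")

    print(f"CPU virtualization extensions: {cpu_has_virt_extensions()}")
    print(f"/dev/kvm present (KVM module loaded): {kvm_available()}")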

Type 1 and Type 2 hypervisors are very distinct from OS level virtualization, also known as containerization. With containers, there is a single OS; however, instead of direct hardware abstraction, a single kernel is used to simultaneously run multiple user-space instances in a jailed-root environment. These environments may look and feel like a separate machine, but in fact are not. Oftentimes the kernel itself provides resource management tools to help control resource utilization and allocations.

Linux Containers (LXC) are a great example of this, with their use of namespaces for filesystem control and cgroups for resource management. With the recent advent of Docker, which looks to control versioning and easy deployment of traceable containers, this aspect of OS level virtualization has grown in popularity; however, security and usability concerns still exist, as dedicated isolation mechanisms do not exist and the kernel space is the only point of security separation.
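The namespace and cgroup mechanisms that LXC (and Docker) build on are visible directly through procfs; the sketch below simply lists the namespaces the current process belongs to and its cgroup membership on a Linux host, which is the same per-process bookkeeping a container runtime manipulates.

    import os
    from pathlib import Path

    def process_namespaces(pid="self"):
        """Map namespace type (mnt, pid, net, ...) to its kernel identifier."""
        ns_dir = Path(f"/proc/{pid}/ns")
        return {entry.name: os.readlink(entry) for entry in ns_dir.iterdir()}

    def process_cgroups(pid="self"):
        """Return the raw cgroup membership lines for a process."""
        return Path(f"/proc/{pid}/cgroup").read_text().splitlines()

    print("Namespaces:", process_namespaces())
    print("Cgroups:", process_cgroups())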

2.2 Cloud Computing

Cloud computing [59] is one of the most explosively expanding technologies in the computing industry today. However, it is important to understand where it came from in order to figure out where it will be heading in the future. While there is no clear cut evolutionary path to Clouds, many believe the concepts originate from two specific areas: Grid Computing and Web 2.0.

Grid computing [60, 61], in its practical form, represents the concept of connecting two or more spatially and administratively diverse clusters or supercomputers together in a federating manner. The term "the Grid" was coined in the mid 1990s to represent a large distributed systems infrastructure for advanced scientific and engineering computing problems. Grids aim to enable applications to harness the full potential of resources through coordinated and controlled resource sharing by scalable virtual organizations. While not all of these concepts carry over to the Cloud, the control, federation, and dynamic sharing of resources is conceptually the same as in the Grid. This is outlined by Foster et al. [3], as Grids and clouds appear conceptually similar when compared abstractly. From a scientific perspective, the goals of clouds and Grids are also similar. Both systems attempt to provide large amounts of computing power by leveraging a multitude of sites running diverse applications concurrently in symphony.

The other major component, Web 2.0, is also a relatively new concept in the history of computer science. The term Web 2.0 was originally coined in 1999 in a futuristic prediction by Darcy DiNucci [62]: "The Web we know now, which loads into a browser window in essentially static screenfuls, is only an embryo of the Web to come. The first glimmerings of Web 2.0 are beginning to appear, and we are just starting to see how that embryo might develop. The Web will be understood not as screenfuls of text and graphics but as a transport mechanism, the ether through which interactivity happens. It will [...] appear on your computer screen, [...] on your TV set [...] your car dashboard [...] your cell phone [...] hand-held game machines [...] maybe even your microwave oven." Her vision began to take form, as illustrated in 2004 by the O'Reilly Web 2.0 conference, and since then the term has been a pivotal buzzword across the Internet. While many definitions have been provided, Web 2.0 really represents the transition from static HTML to harnessing the Internet and the Web as a platform in and of itself.

Web 2.0 provides multiple levels of application services to users across the Internet. In essence, the Web becomes an application suite for users. Data is outsourced to wherever it is wanted, and users retain control over what they interact with and how it is shared. This requires extensive, dynamic, and scalable hosting resources for these applications, and this demand provides the user base for much of the commercial Cloud computing industry today. Web 2.0 software requires abstracted resources to be allocated and relinquished on the fly, depending on the Web's traffic and service usage at each site. Furthermore, Web 2.0 brought into existence Web Services standards [63] and the Service Oriented Architecture (SOA) [64], which outlined the interaction between users and cyberinfrastructure. In summary, Web 2.0 defined the interaction standards and user base, and Grid computing defined the underlying infrastructure capabilities.

A Cloud computing implementation typically enables users to migrate their data and computation to a remote location with some varying impact on system perfor- mance [65]. This provides a number of benefits which could not otherwise be achieved:

• Scalable - Clouds are designed to deliver as much computing power as any user needs. While in practice the underlying infrastructure is not infinite, the cloud resources are projected to ease the developer's dependence on any specific hardware.

• Quality of Service (QoS) - Unlike standard data centers and advanced computing resources, a well-designed Cloud can project a much higher QoS than traditionally possible. This is due to the lack of dependence on specific hardware, so any physical machine failures can be mitigated without the prerequisite user awareness.

• Specialized Environment - Within a Cloud, the user can utilize customized tools and services to meet their needs. This can be to utilize the latest library or toolkit, or to support legacy code within new infrastructure.

• Cost Effective - Users leverage only the resources required for their given task, rather than putting forth a large capital investment in infrastructure. This reduces the risk for institutions potentially looking to build a scalable system, thus providing greater flexibility, since the user is only paying for needed infrastructure while maintaining the option to increase services as needed in the future.

• Simplified Interface - Whether using a specific application, a set of tools or Web services, Clouds provide access to a potentially vast amount of computing resources in an easy and user-centric way. We have investigated such an interface within Grid systems through the use of the Cyberaide project [66, 67].

Figure 2.3 View of the Layers within a Cloud Infrastructure

Many of the features noted above define what Cloud computing can be from a user perspective. However, Cloud computing in its physical form has many different meanings and forms. Since Clouds are defined by the services they provide and not by applications, an integrated as-a-service paradigm has been defined to illustrate the various levels within a typical Cloud, as in Figure 2.3.

• Clients - A client interacts with a Cloud through a predefined, thin layer of abstraction. This layer is responsible for communicating the user requests and displaying returned data in a way that is simple and intuitive for the user. Examples include a Web browser or a thin client application.

• Software-as-a-Service (SaaS) - A framework for providing applications or software deployed on the Internet and packaged as a unique service for users to consume. By doing so, the burden of running a local application directly on the client's machine is removed. Instead, all the application logic and data is managed centrally and displayed to the user through a browser or thin client. Examples include Google Docs, Facebook, or Pandora.

• Platform-as-a-Service (PaaS) - A framework for providing a unique computing platform or software stack on which applications and services can be developed. The goal of PaaS is to alleviate many of the burdens of developing complex, scalable software by providing a programming paradigm and tools that make service development and integration a tractable task for many. Examples include Microsoft Azure and Google App Engine.

• Infrastructure-as-a-Service (IaaS) - A framework for providing entire computing resources through a service. This typically represents virtualized Operating Systems, thereby masking the underlying complexity and details of the physical infrastructure. Users can rent or buy computing resources on demand for their own use without needing to operate or manage physical infrastructure. Examples include Amazon EC2 and OpenStack; IaaS is the major cloud focal point of this dissertation.

• Physical Hardware - The underlying set of physical machines and IT equipment that host the various levels of service. These are typically managed at a large scale using virtualization technologies, which provide the QoS users expect. This is the basis for all computing infrastructure.

When all of these layers are combined, a dynamic software stack is created to focus on large scale deployment of services to users.

2.2.1 Infrastructure-as-a-Service

Today, there are a number of Clouds that offer solutions for Infrastructure-as-a-Service (IaaS). There have been multiple comparison efforts between various IaaS services [4, 68–70], which provide insight into the similarities and differences among the wide array of cloud infrastructure deployment solutions. At a high level, however, IaaS can be split into three tiers based on their availability.

• Public - Public IaaS is where the services and virtualization of hardware resources are provided over the Internet. Usually this is in a centralized data center whereby many users concurrently access these resources from across the globe, often at a pre-negotiated price point and service level agreement (SLA) [71]. Public clouds, which sell hosting services en masse at competitive costs, are best suited to exploit the economies of scale. The Amazon Elastic Compute Cloud (EC2) is a primary example of a public cloud.

• Private - Private IaaS is where the cloud infrastructure is limited to a distinct group, set of users, business, or virtual organization. Usually such private cloud infrastructure is also on a private, dedicated network that can either be separate from or connected to other services. Data within private clouds is often more secure than on a public cloud, and as such private clouds can be the choice for many users who have sensitive data, or for large-scale users who find better cost, performance, or QoS than what is provided within public clouds.

• Hybrid - Hybrid cloud IaaS combines the computational power of both private and public IaaS, enabling users to keep data or costs within a private IaaS but then "burst" usage to a public cloud at peak computational demand. Virtual Private Networks (VPNs) can be useful for managing such a hybrid cloud, not only for management and network addressing but also for security.

Amazon EC2

The Amazon Elastic Compute Cloud (EC2) [72] is probably the most popular cloud infrastructure offering to date, and is used extensively in the IT industry. EC2 is the central component of Amazon's cloud platform, Amazon Web Services (AWS). EC2 allows users to effectively rent virtual machines (called instances), hosted within Amazon's data centers, at a certain price point. Through its UI or a RESTful API, users can start, stop, pause, migrate, and destroy instances exactly as needed to match the required computational tasks at hand.

Amazon EC2 predominantly relies on the Xen hypervisor to provide VMs on demand to users, with an equivalent compute unit equal to a 1.7 GHz Intel Xeon processor. However, recent advancements, instance types, and upgrades to EC2 have increased this compute unit's power. Instance reservations come in three types: On-demand, Reserved, and Spot. With On-demand instances, users pay by the hour for however long the desired instance is running. Users can instead rent Reserved instances, where they pay a one-time (discounted) cost based on a pre-determined allocation time. There is also Spot pricing, where VMs are provisioned only when a given spot price is met, which is determined simply based on supply and demand within the EC2 system itself.

EC2 supports a wide array of user environments and setups. From an OS perspective, this includes running Linux, Unix, and even Windows VM instances. EC2 also provides instances with persistent storage through the Elastic Block Storage (EBS) and Simple Storage Service (S3) object storage mechanisms. These tools are necessary for data persistence, as EC2 instances, as with most IaaS solutions, do not implicitly persist data beyond the lifetime of the instance. EBS-rooted instances use an EBS volume as the root disk device and, as such, keep data beyond the lifetime of a given instance. EC2 also offers elastic IPs, whereby public IP addresses are assigned to instances at boot (or in situ) without requiring DNS updates to propagate or an administrator to adjust the network.
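As an illustration of the instance lifecycle described above, the following is a minimal sketch using the classic boto library (referenced later in the Nimbus discussion as an S3-compatible client); the region, AMI ID, key pair, and instance type are placeholders, and valid AWS credentials are assumed to be configured in the environment.

# Minimal sketch: launch, wait for, and terminate an On-demand EC2 instance
# with the classic boto EC2 interface.
import time
import boto.ec2

conn = boto.ec2.connect_to_region("us-east-1")

reservation = conn.run_instances(
    "ami-00000000",             # placeholder image ID
    key_name="my-keypair",      # placeholder key pair
    instance_type="m3.medium",
    min_count=1, max_count=1)

instance = reservation.instances[0]
while instance.state != "running":   # poll until the instance boots
    time.sleep(5)
    instance.update()

print("Instance", instance.id, "running at", instance.public_dns_name)
conn.terminate_instances(instance_ids=[instance.id])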

These advanced features, coupled with first-to-market viability and continual updates, have made EC2 the largest cloud infrastructure today. While vendor lock-in is a concern (EC2 is not available for download or replication) and alternatives such as Google's Compute Engine [73] exist, the prevalence of and support for EC2 will likely mean its status as the public cloud of choice will continue for the foreseeable future.

Nimbus

Nimbus [74, 75] is a set of open source tools that provide a private IaaS cloud computing solution. Nimbus is based on the concept of virtual workspaces previously introduced for Globus [75]. A virtual workspace is an abstraction of an execution environment that can be made dynamically available to authorized clients by using well-defined protocols. In this way, it can create customized environments by deploying virtual machines (VMs) among remote resources. To such an end, Nimbus provides a web interface called Nimbus Web, which aims to provide administrative and user functions in a friendly interface.

Within Nimbus, a storage cloud implementation called Cumulus [74] has been tightly integrated with the other central services, although it can also be used standalone. Cumulus is compatible with the Amazon Web Services S3 REST API [76], but extends its capabilities by including features such as quota management. The Nimbus cloud client uses the Jets3t library [77] to interact with Cumulus. However, since Cumulus is compatible with the S3 REST API, other interfaces like boto [78] or s3cmd [79] can also be used to interact with Nimbus.

Nimbus supports two resource management strategies. The first is the default "resource pool" mode, in which the service has direct control of a pool of virtual machine manager (VMM) nodes and assumes it can start VMs on them. The other supported mode is called "pilot"; here, the service makes requests to a cluster's Local Resource Management System (LRMS) to obtain VMM nodes on which to deploy VMs.

Nimbus also provides an implementation of EC2’s interface that allows clients developed for the EC2 system to be used on Nimbus-based clouds.

Eucalyptus

Eucalyptus is a product from Eucalyptus Systems [80–82] that developed out of a research project at the University of California, Santa Barbara. Eucalyptus was initially aimed at bringing the cloud computing paradigm to academic supercomputers and clusters. Eucalyptus provides an Amazon Web Services (AWS) compliant, EC2-based web service interface for interacting with the Cloud service.

The architecture depicted in Figure 2.4 is based on a two-level hierarchy of the Cloud Controller and the Cluster Controller [83]. The Cluster Controller usually manages the nodes within a single cluster, and multiple Cluster Controllers can be connected to a single Cloud Controller. The Cloud Controller is responsible for the resource management, scheduling, and accounting aspects of the Cloud.

Figure 2.4 Eucalyptus Architecture

Being one of the first private cloud computing solutions, Eucalyptus has a focus on the user interface. Much of Eucalyptus's design is based on the functionality of Amazon's EC2 cloud solution, and the user interface is a prime example of that model. While EC2 is a proprietary public cloud, it uses an open interface through the use of well-designed Web Services that are open to all. Eucalyptus, looking to provide complete compatibility with EC2 in order to target the private cloud market, uses the same interface for all communication to the Cloud Controller. Because Eucalyptus's interface is AWS compliant, it provides the same forms of authentication that AWS supports, namely the shared key and PKI models.

While Eucalyptus can be controlled using the EC2 AMI tools, it also provides its own tool set: euca2ools. Euca2ools provides support for creating and managing keypairs, querying the cloud system, managing VMs, starting and terminating instances, configuring networks, and using block storage. The Eucalyptus system also provides a secure web front-end to allow new users to create and manage account information, view available VMs, and download their security credentials.
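As a rough illustration of the euca2ools workflow, the following sketch drives a few of the tools from Python; the image ID and key pair name are placeholders, and euca2ools is assumed to be installed and configured with the credentials downloaded from the web front-end.

# Minimal sketch: basic instance lifecycle on a Eucalyptus cloud via euca2ools.
import subprocess

def euca(*args):
    # Run a euca2ools command and return its textual output.
    return subprocess.run(list(args), check=True,
                          capture_output=True, text=True).stdout

print(euca("euca-describe-images"))
print(euca("euca-run-instances", "emi-00000000",     # placeholder image ID
           "-k", "my-keypair", "-t", "m1.small"))
print(euca("euca-describe-instances"))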

As seen with the user interface, Eucalyptus takes many design cues from Amazon's EC2, and the image management system is no different. Eucalyptus stores images in Walrus, the storage system analogous to the Amazon S3 service. As such, any user can bundle their own root filesystem, then upload and register this image and link it with a particular kernel and ramdisk image. The image is uploaded into a user-defined bucket within Walrus and can be retrieved at any time from any availability zone. This allows users to create specialty virtual appliances and deploy them within Eucalyptus with ease.

In 2014, Eucalyptus was acquired by Hewlett-Packard, which now maintains the HPE Helion Eucalyptus cloud with full compatibility with Amazon EC2. The most recent release of Helion Eucalyptus, version 4.2.2, came in early 2016.

OpenStack

OpenStack [84, 85], another private cloud infrastructure service, was introduced by Rackspace and NASA in July 2010. The project aims to build an open source community, spanning technologists, developers, researchers, and industry, that allows for the sharing of resources and technologies in order to create a massively scalable and secure cloud infrastructure. In the tradition of other open source projects, the entire software stack is open source.

Historically, OpenStack focused on the development of two aspects of cloud computing, compute and storage, with its OpenStack Compute and OpenStack Object Storage solutions. According to the documentation, "OpenStack Compute is the internal fabric of the cloud creating and managing large groups of virtual private servers" and "OpenStack Object Storage is software for creating redundant, scalable object storage using clusters of commodity servers to store Terabytes or even petabytes of data." However, OpenStack as a platform has evolved well beyond its original efforts and has created a wide array of new sub-projects.

As part of the computing support effort, OpenStack utilizes a cloud fabric controller known as Nova. The architecture of Nova is built on the concepts of shared-nothing design and messaging-based information exchange; hence most communications in Nova are facilitated by message queues. To prevent blocking components while waiting for a response from others, deferred objects are introduced. Nova supports multiple scheduling paradigms and includes plugins for a wide array of hypervisors, including Xen, KVM, and VMWare. The flexibility found within Nova is useful for supporting a wide array of cloud IaaS computational efforts. Recently, OpenStack has even looked to support containers and bare-metal provisioning to keep pace with the latest technologies.
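As an illustration of how Nova is typically driven programmatically, the following is a minimal sketch using the python-novaclient library; the credentials, authentication URL, and the image and flavor identifiers are placeholders for a hypothetical deployment, and the exact constructor arguments vary between novaclient releases (newer releases authenticate through keystoneauth sessions).

# Minimal sketch: boot a VM instance through OpenStack Nova with python-novaclient.
from novaclient import client as nova_client

nova = nova_client.Client("2",                        # compute API version
                          "demo-user", "demo-pass", "demo-project",
                          "http://controller.example.org:5000/v2.0")

server = nova.servers.create(name="demo-instance",
                             image="IMAGE-UUID-PLACEHOLDER",
                             flavor="FLAVOR-ID-PLACEHOLDER")

for s in nova.servers.list():                         # list instances in the project
    print(s.id, s.name, s.status)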

The OpenStack Swift storage solution is built around a number of interacting components and concepts, including a Proxy Server, a Ring, an Object Server, a Container Server, an Account Server, Replication, Updaters, and Auditors. This distributed architecture attempts to have no centralized components in order to enable scalability and resiliency for data. Swift provides long-term, object-based storage similar to Amazon S3, and attempts to maintain rough API compatibility with S3. As Swift uses simple data replication as its main form of resiliency and fast read/write is rarely a priority, Swift is often built using commodity disk drives instead of more costly flash solutions.
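To make the object storage model concrete, the following is a minimal sketch using python-swiftclient to store and retrieve an object; the credentials and authentication URL are placeholders for a hypothetical deployment.

# Minimal sketch: store and read back an object in OpenStack Swift.
from swiftclient import client as swift_client

conn = swift_client.Connection(
    authurl="http://controller.example.org:5000/v2.0",
    user="demo-user", key="demo-pass",
    tenant_name="demo-project", auth_version="2")

conn.put_container("results")
conn.put_object("results", "run-001.dat", contents=b"simulation output")

headers, body = conn.get_object("results", "run-001.dat")
print(len(body), "bytes, etag", headers.get("etag"))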

With OpenStack Nova's increased prevalence, the number of auxiliary OpenStack projects has also increased to support Nova. While there are many other recent OpenStack projects, the efforts listed below, along with Nova and Swift, represent the common core of a current OpenStack deployment.

• Cinder provides persistent block-level storage mechanisms to support VM instances and elastic block storage.

• Neutron provides advanced networking and SDN solutions, IP addressing, and VLAN configuration.

• Glance delivers comprehensive image management, including image discovery, registration, and delivery mechanisms.

• Keystone provides the identity and authentication service for all OpenStack services.

• Horizon is a dashboard web-based UI framework, complementary to the RESTful client APIs.

Currently, OpenStack exists as one of the largest ongoing private IaaS efforts, with over 500 companies contributing to the effort and thousands of deployments. While releases have been pushed out approximately every 6 months, the latest release at the time of writing is Mitaka, which now includes full support for GPUs and SR-IOV interconnects, as detailed later in this dissertation. It is expected that OpenStack's prevalence in the cloud computing community will only increase in the next few years.

OpenNebula

OpenNebula [86, 87] is an open-source toolkit which allows administrators to transform existing infrastructure into an Infrastructure-as-a-Service (IaaS) cloud with cloud-like interfaces. Figure 2.5 shows the OpenNebula architecture and its main components.

Figure 2.5 OpenNebula Architecture

The architecture of OpenNebula has been designed to be flexible and modular to allow its integration with different storage and network infrastructure configurations, as well as hypervisor technologies. Here, the core is a centralized component that manages a virtual machine's (VM) full life cycle, including setting up networks dynamically for groups of VMs and managing their storage requirements, such as VM disk image deployment or on-the-fly software environment creation. Another important component is the capacity manager, which governs the functionality provided by the core for scheduling. The default capacity scheduler is a requirement/rank matchmaker; however, it is also possible to develop more complex scheduling policies through a lease model and advance reservations, as in Haizea [88]. The last main components are the access drivers, which provide an abstraction of the underlying infrastructure to expose the basic functionality of the monitoring, storage, and virtualization services available in the cluster. Therefore, OpenNebula is not tied to any specific environment and can provide a uniform management layer regardless of the virtualization platform.

Additionally, OpenNebula offers management interfaces to integrate the core's functionality with other data center management tools, such as accounting or monitoring frameworks. To this end, OpenNebula implements the libvirt API [89], an open interface for VM management, as well as a command line interface (CLI). A subset of this functionality is exposed to external users through a cloud interface.
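To make these management interfaces concrete, the following is a minimal sketch that queries an OpenNebula front-end over its XML-RPC interface; the endpoint and the "user:password" session string are placeholders, and the argument convention for one.vmpool.info is taken from OpenNebula's public API documentation (-2 requests VMs for all users, -1/-1 spans the full ID range, -1 matches any state).

# Minimal sketch: list the VM pool of a hypothetical OpenNebula front-end.
import xmlrpc.client

endpoint = "http://opennebula.example.org:2633/RPC2"
session = "demo-user:demo-pass"

server = xmlrpc.client.ServerProxy(endpoint)
response = server.one.vmpool.info(session, -2, -1, -1, -1)

ok, payload = response[0], response[1]
if ok:
    print("VM pool XML document:", payload[:200], "...")
else:
    print("OpenNebula error:", payload)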

Due to its architecture, OpenNebula is able to adapt to organizations with changing resource needs, including the addition or failure of physical resources [70]. Some essential features that support changing environments are live migration and VM snapshotting [86]. Furthermore, when local resources are insufficient, OpenNebula can support a hybrid cloud model by using cloud drivers to interface with external clouds. This lets organizations supplement their local infrastructure with computing capacity from a public cloud to meet peak demands, or implement high availability strategies. OpenNebula includes an EC2 driver, which can submit requests to Amazon EC2 and Eucalyptus [80], as well as an ElasticHosts driver [90].

Regarding storage, the OpenNebula Image Repository allows users to easily specify disk images from a catalog without worrying about low-level disk configuration attributes or block device mapping. Access control is also applied to the images registered in the repository, simplifying multi-user environments and image sharing. Nevertheless, users can also set up their own images.

Others

Other cloud-specific projects exist, such as In-VIGO [91], Cluster-on-Demand [92], and VMWare's proprietary vCloud Air [93]. Each effort provides its own interpretation of private cloud services within a data center, often with the ability to interoperate with public cloud offerings such as Amazon's EC2. Docker [50] also looks to provide IaaS capabilities with specialized and easily configurable containers, based on LXC and libcontainer solutions. While it is still to be determined how Docker and containers will change the private IaaS landscape, they do provide similar functionality for Linux users without some of the complexities of traditional virtualized IaaS.

2.2.2 Virtual Clusters

While virtualization and cloud IaaS provide many key advancements, this technology alone is not sufficient. Rather, collective scheduling and management of virtual machines is required to piece together a working virtual cluster.

Let us consider a typical use case for a Cloud data center that is used in part to provide computational power for the Large Hadron Collider at CERN [94], a global collaboration of more than 2000 scientists from 182 institutes in 38 nations. Such a system would have a small number of experiments to run, and each experiment would require a very large number of jobs to complete the computation needed for the analysis. Examples of such experiments are the ATLAS [95] and CMS [96] projects, which (combined) require petaflops of computing power on a daily basis. Each job of an experiment is unique, but the application runs are often the same; therefore, virtual machines are deployed to execute incoming jobs. A file server provides virtual machine templates, and all typical jobs are preconfigured in these templates. When a job arrives at the head node of the cluster, a corresponding virtual machine is dynamically started on a certain compute node within the cluster to execute the job. While the LHC project and CERN's cloud effort is a formidable one, it only covers pleasingly parallel HTC workloads, and oftentimes HPC and big data workloads can be equally complex yet drastically different.

Figure 2.6 Virtual Clusters on Cloud Infrastructure

Cluster computing has become one of the core tools in distributed systems for parallel computation. Cluster computing revolves around the desire to obtain more computing power and better reliability by utilizing many computers together across a network to achieve larger computational tasks. Clusters have manifested themselves in many different ways, ranging from Beowulf clusters [6], which run on commodity PCs, to some of the TOP500 [8] supercomputing systems of today. Virtual clusters represent the growing need of users to organize computational resources in an environment specific to the tasks at hand, instead of sharing a common architecture across many users. With the advent of modern virtualization, virtual clusters are deployed across a set of VMs in order to gain relative isolation and flexibility between disjoint virtual clusters. Virtual clusters, or sets of multiple cluster computing deployments on a single, larger physical cluster infrastructure, often have the following properties and attributes [53]:

• Resource allocation based on a VM unit

• Clusters built of many VMs together, or by provisioning physical nodes

• Leverage local infrastructure management tools to provide a middleware solution for virtual clusters

  – Implementations could use a cloud IaaS such as OpenStack

  – Some instances use a queueing system such as PBS

• User experience based on virtual cluster management, not single VM management

• Consolidates functionality on a smaller resource platform using multiple VMs

• Can provide fault tolerance through VM migration and management

• Can utilize dynamic scaling through the addition or deletion of VMs from the virtual cluster

• Connection to a back-end storage solution to provide virtual persistent storage

Given these properties, virtual clusters can take on many forms; a very simplified representation of virtual clusters across cloud infrastructure is provided in Figure 2.6. This could be as simple as provisioning multiple disjoint OSs on a single physical resource. Virtual clusters generally have the ability to provide and manage their own user environment and tuned internal middleware. Virtual clusters may enable the separation of multiple tasks into separate VMs, which still in fact run on the same or similar underlying physical resources, effectively providing task isolation. Virtual clusters can be deployed to be persistent, stored, shared, or re-provisioned on demand. The size of a virtual cluster could potentially expand and contract relative to the necessary resource requirements, taking advantage of the elasticity found with virtualization. Furthermore, VM migration may enable fault tolerance in the event of physical machine errors if properly managed.

With virtual clusters, the capability to deploy custom environments quickly becomes critical. As such, efforts have been put forth to configure and create VM images on demand. This includes custom efforts with configuration engines such as CfEngine, Chef, Ansible, and others. Within FutureGrid, an image management system was defined to provide preconfigured VM images for cloud infrastructure using the BCFG2 engine [97].

Initially, virtual clusters were proposed for use by Grid communities [32]. Specifically, Foster et al. look to provide commodity clusters to various Virtual Organizations [98], whereby grid services can instantiate and deploy VMs. This design and implementation was further refined through the use of metadata and contextualization using appliances [99]. Some of these ideas have even taken shape in larger-scale supercomputing deployments, such as with SDSC Comet's virtual cluster availability [100].

Virtual clusters require orchestration services to be able to organize, deploy, manage, and replay the desired user environment, and there have been a number of efforts to bring this orchestration to utility. One effort within FutureGrid is the experiment management design [44], which attempts to define how resources are connected to and monitored, as well as how VMs are stored in a repository and provisioned across multiple heterogeneous resources. This effort moved forward with Cloudmesh [46], which provides a simple client interface for accessing multiple cloud resources with a command-line shell interface.

Another virtual cluster orchestration tool, OpenStack Heat [45], has developed within the OpenStack community. Heat provides a method by which an individual or group can deploy "stacks", which could essentially be virtual clusters, using OpenStack infrastructure. Specifically, Heat provides a human-readable and machine-accessible template for specifying environments and requirements, as well as a RESTful API. Upon submission of a Heat orchestration template to the API, Heat will interpret and build the described custom environment within a given OpenStack cloud deployment. A related effort [101] is an infrastructure orchestration framework for managing containerized applications within Docker.
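As a concrete illustration, the following is a minimal sketch that submits a small Heat Orchestration Template (HOT) describing a two-node stack through python-heatclient; the Heat endpoint, auth token, tenant ID, image name, and flavor are placeholders for a hypothetical OpenStack deployment.

# Minimal sketch: submit a two-node "virtual cluster" stack to Heat.
import yaml
from heatclient import client as heat_client

TEMPLATE = """
heat_template_version: 2015-10-15
description: Two-node virtual cluster sketch
resources:
  node_group:
    type: OS::Heat::ResourceGroup
    properties:
      count: 2
      resource_def:
        type: OS::Nova::Server
        properties:
          image: IMAGE-NAME-PLACEHOLDER
          flavor: m1.small
"""

heat = heat_client.Client(
    "1",
    endpoint="http://controller.example.org:8004/v1/TENANT_ID",
    token="AUTH-TOKEN-PLACEHOLDER")

stack = heat.stacks.create(stack_name="demo-cluster",
                           template=yaml.safe_load(TEMPLATE))
print("Submitted stack:", stack)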

2.2.3 The FutureGrid Project

FutureGrid was an NSF-funded national-scale Grid and Cloud test-bed facility that included a number of computational resources across many distributed locations. This FutureGrid test-bed allowed users to evaluate differing systems for applicability to their given research task or application. These areas include computer science research topics ranging from authentication, authorization, scheduling, virtualization, middleware design, interface design, and cybersecurity, to the optimization of grid-enabled and cloud-enabled computational schemes for Astronomy, Chemistry, Biology, Engineering, High Energy Physics, or Atmospheric Science. The project started at an opportune time, when cloud infrastructure was still in its experimental stages and its applicability to mid-tier scientific efforts was unknown.

FutureGrid features a unique WAN network structure that lent itself to a multitude of experiments specifically designed for evaluating middleware technologies and experiment management services. This network can be dedicated to conducting experiments in isolation, using a network impairment device to introduce a variety of predetermined network conditions. Figure 2.7 depicts the geographically distributed resources that are outlined in more detail in Table 2.1. All network links within FutureGrid are dedicated 10GbE links, with the exception of a shared 10GbE link to TACC over the TeraGrid [102, 103] network, enabling high-speed data management and transfer between each partner site within FutureGrid.

Figure 2.7 FutureGrid Participants, Network, and Resources

Although the total number of systems within FutureGrid is comparatively conservative, they provide some heterogeneity to the architecture and are connected by high-bandwidth network links. One important feature to note is that most systems can be dynamically provisioned; that is, these systems can be reconfigured when needed by special software that is part of FutureGrid, with proper access control for users and administrators. Therefore, it is believed that this hardware infrastructure can fully accommodate the needs of an experiment management system.

Table 2.1 FutureGrid hardware

System type Name CPUs Cores TFLOPS RAM(GB) Disk(TB) Site
IBM iDataPlex India 256 1024 11 3072 †335 IU
Dell PowerEdge Alamo 192 1152 12 1152 15 TACC
IBM iDataPlex Hotel 168 672 7 2016 120 UC
IBM iDataPlex Sierra 168 672 7 2688 72 UCSD
Cray XT5m Xray 168 672 6 1344 †335 IU
ScaleMP vSMP Echo 32 192 3 5872 192 IU
Dell PowerEdge Bravo 32 128 2 3072 192 IU
SuperMicro Delta 32 192 ‡20 3072 128 IU
IBM iDataPlex Foxtrot 64 256 2 768 5 UF
Total 1112 4960 70 23056 1394
† Indicates shared file system. ‡ Best current estimate.

As of Fall 2014, the FutureGrid project has ended. The computing resources and facilities at Indiana University have continued on as FutureSystems, continuing to provide a cloud, big data, and HPC testbed to approved researchers. Recently, with new projects as part of the Digital Science Center, the FutureSystems effort has added two new machines, Romeo and Juliet, providing clusters with Intel Haswell CPU architectures to be used for big data research.

2.3 High Performance Computing

2.3.1 Brief History of Supercomputing

Supercomputing dates back to the forefront of computing itself, especially if we consider the ENIAC [104], the first Turing-complete general-purpose digital computer, also to be the first supercomputer. ENIAC was first deployed to calculate artillery firing tables, was later used to help with the Manhattan Project's thermonuclear calculations, and was dedicated to the University of Pennsylvania after the war.

The first properly termed supercomputer was the Control Data Corporation's 6600 mainframe [105], first delivered to CERN in 1965. The CDC 6600 was notably faster than its IBM counterparts, and the first deployments were able to perform on the order of 1 MFLOPS. Interestingly, the CPU design that came from Seymour Cray's CDC 6600 took advantage of a simplified yet fast CPU design with silicon-based transistors, which formed the basis of the RISC processor architecture.

Cray's efforts eventually led him to start his own company, which released the Cray 1 system in 1975 [106]. The Cray 1 took the powerful aspects of vector processing and memory pipelining from the STAR architecture (developed later by CDC) and introduced scalar performance through vector splitting and instruction chaining. This resulted in an overall peak performance of around 250 MFLOPS, but realistically closer to 100 MFLOPS for general applications. The Cray 1 system also helped push forward integrated circuit design, which had finally become performant enough to be used. Interestingly enough, the Cray 1 also required an entirely new Freon-based cooling system.

The Cray 1 system gave way to the Cray X-MP and Y-MP in the mid 1980s. These machines were shared-memory vector processors, with two processors in the X-MP and up to 8 processors in the later Y-MP systems. These first shared-memory systems were aided by increased memory speeds. The X-MP machine was capable of 200 MFLOPS sustained and 400 MFLOPS peak performance, whereas the Y-MP variants were capable of over 2 GFLOPS.

Concurrently, during the 1980s, distributed memory architectures for supercomputing were also starting to emerge. Specifically, work on Caltech's Cosmic Cube, also known as a hypercube, by Seitz and Fox [19, 107] started the movement toward concurrent and parallel computing. The Cosmic Cube leveraged new VLSI techniques and assembled Intel 8086/87 processors together with a novel hypercube interconnect that required no switching, creating one of the first truly parallel computers. Programming and computation were carried out through a novel message-passing architecture instead of shared variables. This design was first commercialized with Intel's iPSC, and later contributed directly to designs in the Intel Paragon, ASCI Red, and Cray T3D/E systems.

While concurrent and parallel supercomputers continued into the 1990s with the aforementioned hypercube designs and Thinking Machines [108], a new commodity-based strategy emerged with Becker and Sterling's Beowulf clusters [6]. Effectively, a Beowulf cluster is simply a cluster of commodity x86 machines linked together with a simplified LAN network. Beowulf clusters often (but not always) run a Linux OS and leverage Ethernet solutions, and are programmed using a message passing construct such as MPI [18]. This allows for the building of massively parallel systems with relatively low cost and investment. Many specialized clusters today still utilize commodity x86 hardware and Linux OSs similar to the original Beowulf systems.

Concurrency and parallel computation on supercomputing resources have only flourished since. This has become even more pronounced as CPU clock frequencies stabilized and multi-core architectures took hold, driving the need for concurrency not only at an increased rate in supercomputing, but even in commodity systems. While commodity single-CPU, multi-core systems often look toward exploiting parallelism, distributed memory architectures have become a way of life for high performance computing.

2.3.2 Distributed Memory Computation

Abstractly, distributed memory architectures consist of multiple instances of a processor, memory, and an interconnect, which allow each instance to perform independent computations and communicate over the interconnect. These interconnects could be built using point-to-point links, or through more complex and advanced switching hardware building a larger topology. While a simple example could involve several commodity PCs connected through an Ethernet switch, this model scales to the latest supercomputing resources of today with millions of cores [109].

Programming such distributed memory systems is a nontrivial task that the greater HPC community has been wrangling with for years. In the early days, this took the form of either the Message Passing Interface (MPI) [18] or PVM [110]; however, MPI has been far more successful and dominates the market for distributed memory parallel computation. MPI is a standardized message passing system, which defines the semantics and syntax for writing parallel programs in C, C++, or Fortran. MPI has even been implemented in other languages such as Java [111]. As MPI is a standard, many implementations exist, including OpenMPI [112], MPICH [113], and MVAPICH [114], to name a few. Many MPI-enabled applications have been shown to be, with proper and careful design, the most efficient way to run a parallel application across a large subset of tightly coupled distributed resources, and can often represent the status quo for HPC applications today.
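As a brief illustration of the message passing model, the following sketch uses the mpi4py Python bindings (the MPI standard itself defines C, C++, and Fortran interfaces) to compute a global sum across ranks; the rank count and problem size are arbitrary.

# Minimal sketch: a distributed-memory reduction with MPI via mpi4py.
# Run with, e.g.:  mpirun -np 4 python allreduce.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Each rank computes a partial sum over its own slice of the problem...
local = np.arange(rank * 1000, (rank + 1) * 1000, dtype=np.float64).sum()

# ...and the partial results are combined by message passing.
total = comm.allreduce(local, op=MPI.SUM)

if rank == 0:
    print("Ranks:", size, "global sum:", total)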

Figure 2.8 Top 500 development over time from 2002 to 2016 [8]: (a) Interconnect Family, (b) Accelerator/Coprocessor Family

More recently, the MPI programming model has been modified or joined with other models. This change largely revolves around the hardware within HPC resources themselves, which has changed to meet the need for more computational power. This includes hybrid MPI+OpenMP models, which become useful as multi-core and many-core technologies become increasingly available [115], and the MPI+CUDA model, which interleaves message passing with GPU utilization [116]. As some of the latest supercomputing resources have been deployed specifically with Nvidia GPUs for GPGPU programming, such as ORNL's Titan system [13], the need for distributed memory computation with GPUs has increased.

To get an idea of some of the hardware advances over the past few years, it is useful to examine the Top 500 [8], a comprehensive list of the top 500 known supercomputers and their key characteristics. Looking at the past decade in HPC, we can see some trends emerge within hardware. Specifically, looking at interconnect and coprocessor architectures in Figure 2.8, we see a notable jump in the number of deployed systems using both InfiniBand and GPUs in the past decade. In particular, InfiniBand usage has increased to roughly 40% of the total number of deployed systems and has remained somewhat stable as an interconnect family. Concurrently, the use of accelerators has increased from only 1 in 500 systems a decade ago to almost 20% of the top 500 systems today. Within that factor, the majority of such accelerator-equipped systems have been using GPUs, with a concurrent increase in the Intel Xeon Phi coprocessor as well. While these do not represent all of supercomputing, nor exclusively the most high-end systems, they do represent advanced hardware that has become increasingly common in the last decade, yet remains relatively underutilized within cloud infrastructure. As such, much of the effort in this dissertation focuses on, but is not limited to, these two technology families.

2.3.3 Exascale

As the forefront of supercomputing moves beyond the latest petaflop machines of the past few years [13], the HPC community is setting its sights on the next significant milestone: exascale. Exascale computing refers broadly to performing roughly one exaFLOPS, or 10^18 floating point operations per second. However, exascale itself is far more than just a theoretical FLOPS goal; instead, it is the set of new computing advancements and challenges required to reach computational power of such magnitude. While FLOPS are often used as the ubiquitous yardstick for supercomputing via the LINPACK benchmark [117], other efforts have taken hold to classify systems under a different set of parameters [118, 119], with the loose understanding that these may incorporate a richer application set destined for exascale systems. This could include, for instance, integer calculations at a similar scale to satisfy defense and intelligence perspectives, or graph processing with billions of vertices.

A number of barriers exist with current technologies that must be overcome to reach exascale. The Exascale Computing Study [39] specifically lists four major focal areas:

1. Energy and Power Challenge

• Describes the physical difficulty of providing the amount of power needed to drive a sufficiently large exascale system. The US DOE estimates the maximum power envelope for the first deployed exascale system to be within 20-40 MW. Extrapolating current technology power utilization shows an order of magnitude more energy utilization than the specified power envelope. As such, new architectures and conservation techniques will need to be investigated.

2. Memory and Storage Challenge

• This challenge illustrates the problem that has grown in relation to the memory wall, defined by the exponential difference between processor and memory performance, as well as the storage capacity limits to support calculations at the necessary level of performance. This challenge incorporates not only main memory limitations, but also tertiary storage issues.

3. Concurrency and Locality Challenge

• This challenge is born from the recent limit on CPU clock rates as a way to gain performance. Instead, performance must be gained through parallelism. The depth of this challenge is especially profound when we consider parallelism on the order of millions, if not billions, of threads.

4. Resiliency Challenge

• The resiliency challenge defines the necessity of computation to recover and continue in the event of a fault or fluctuation. As parallelism and the number of individual components substantially increase on the path towards exascale, the mean time to failure for the system as a whole decreases.

Current exascale efforts are as wide as they are varied, not only in concepts, architectures, and runtime systems, but also in deployment plans and expectations for future deployments. Of particular interest in current exascale research are Operating System and runtime (OS/R) developments to support new extreme-scale applications in an efficient manner. Two examples of novel OS approaches are the Hobbes project [120] and the ARGO exascale operating system [121]. These OS efforts, along with novel programming models for exascale such as ParalleX [122], look to fundamentally change the relationship between HPC hardware architectures and the libraries and applications to be leveraged on future exascale deployments.

It is possible that virtualization itself may have an impact on OS and runtime services at exascale [120]. While some of the work herein may tangentially be of utility to such efforts, the immediate goal of this dissertation is not to investigate the applicability of virtualization for exascale systems, but rather to enable the diversification of HPC towards cloud infrastructure.

Chapter 3

Analysis of Virtualization Technologies for High Performance Computing Environments

3.1 Abstract

As Cloud computing emerges as a dominant paradigm in distributed systems, it is important to fully understand the underlying technologies that make Clouds possible. One technology, and perhaps the most important, is virtualization. Recently virtualization, through the use of hypervisors, has become widely used and well understood by many. However, there is a large spread of different hypervisors, each with its own advantages and disadvantages. This chapter provides an in-depth analysis of some of today's commonly accepted virtualization technologies, from feature comparison to performance analysis, focusing on their applicability to High Performance Computing environments using FutureGrid resources. The results indicate that virtualization sometimes introduces slight performance impacts depending on the hypervisor type; however, the benefits of such technologies are profound, and not all virtualization technologies are equal.

3.2 Introduction

Cloud computing [59] is one of the most explosively expanding technologies in the computing industry today. A Cloud computing implementation typically enables users to migrate their data and computation to a remote location with some varying impact on system performance [65]. This provides a number of benefits which could not otherwise be achieved.

Such benefits include:

• Scalability - Clouds are designed to deliver as much computing power as any user needs. While in practice the underlying infrastructure is not infinite, the cloud resources are projected to ease the developer's dependence on any specific hardware.

• Quality of Service (QoS) - Unlike standard data centers and advanced computing resources, a well-designed Cloud can project a much higher QoS than traditionally possible. This is due to the lack of dependence on specific hardware, so any physical machine failures can be mitigated without the prerequisite user awareness.

• Customization - Within a Cloud, the user can utilize customized tools and services to meet their needs. This can be to utilize the latest library or toolkit, or to support legacy code within new infrastructure.

• Cost Effectiveness - Users pay for only the hardware required for each project. This reduces the risk for institutions that potentially want to build a scalable system, thus providing greater flexibility, since the user is only paying for needed infrastructure while maintaining the option to increase services as needed in the future.

• Simplified Access Interfaces - Whether using a specific application, a set of tools or Web services, Clouds provide access to a potentially vast amount of computing resources in an easy and user-centric way.

While Cloud computing has been driven from the start predominantly by industry, through Amazon [72], Google [123], and Microsoft [124], a shift is also occurring within the academic setting. Due to its many benefits, Cloud computing is becoming immersed in the area of High Performance Computing (HPC), specifically with the deployment of scientific clouds [125] and virtualized clusters [32].

There are a number of underlying technologies, services, and infrastructure-level configurations that make Cloud computing possible. One of the most important technologies is virtualization. Virtualization, in its simplest form, is a mechanism to abstract the hardware and system resources from a given Operating System. This is typically performed within a Cloud environment across a large set of servers using a Hypervisor or Virtual Machine Monitor (VMM), which lies in between the hardware and the OS. From the hypervisor, one or more virtualized OSs can be started concurrently, leading to one of the key advantages of Cloud computing. This, along with the advent of multi-core processors, allows for a consolidation of resources within any data center. From the hypervisor level, Cloud computing middleware is deployed atop the virtualization technologies to exploit this capability to its maximum potential while still maintaining a given QoS and utility to users.

The rest of this chapter is organized as follows: First, we look at what virtualization is and what technologies currently exist within the mainstream market. Next, we discuss previous work related to virtualization and take an in-depth look at the features provided by each hypervisor. We follow this by outlining an experimental setup to evaluate a set of today's hypervisors on a novel Cloud test-bed architecture. Then, we look at performance benchmarks which help explain the utility of each hypervisor and the feasibility within an HPC environment. We conclude with our final thoughts and recommendations for using virtualization in Clouds for HPC.

3.3 Related Research

While the use of virtualization technologies has increased dramatically in the past few years, virtualization is not specific to the recent advent of Cloud computing. IBM originally pioneered the concept of virtualization in the 1960s with the M44/44X systems [126]. It has only recently been reintroduced for general use on x86 platforms. Today there are a number of public Clouds that offer IaaS through the use of virtualization technologies. The Amazon Elastic Compute Cloud (EC2) [127] is probably the most popular Cloud and is used extensively in the IT industry to this day. Nimbus [128] and Eucalyptus [80] are popular private IaaS platforms in both the scientific and industrial communities. Nimbus, originating from the concept of deploying virtual workspaces on top of existing Grid infrastructure using Globus, has pioneered scientific Clouds since its inception. Eucalyptus has historically focused on providing an exact EC2 environment as a private cloud, enabling users to build an EC2-like cloud using their own internal resources. Other scientific Cloud specific projects exist such as OpenNebula [129], In-VIGO [130], and Cluster-on-Demand [92], all of which leverage one or more hypervisors to provide computing infrastructure on demand. In recent history, OpenStack [131] has also come to light from a joint collaboration between NASA and Rackspace, which also provides compute and storage resources in the form of a Cloud.

While there are a number of virtualization technologies available today, the virtualization technique of choice for most open platforms over the past 5 years has typically been the Xen hypervisor [54]. However, more recently VMWare ESX [132]¹, Oracle VirtualBox [133], and the Kernel-based Virtual Machine (KVM) [58] have become more commonplace. As these look to be the most popular and feature-rich of all virtualization technologies, we look to evaluate all four to the fullest extent possible. There are, however, numerous other virtualization technologies also available, including Microsoft's Hyper-V [134], Parallels Virtuozzo [135], QEMU [136], OpenVZ [137], Oracle VM [138], and many others. However, these virtualization technologies have yet to see widespread deployment within the HPC community, at least in their current form, so they have been placed outside the scope of this work.

In recent history there have actually been a number of comparisons related to virtualization technologies and Clouds. The first performance analyses of various hypervisors came, unsurprisingly, from the hypervisor vendors themselves. VMWare was happy to put out its own take on performance in [139], as was the original Xen article [54], which compares Xen, XenoLinux, and VMWare across a number of SPEC and normalized benchmarks, resulting in a conflict between both works. From there, a number of more unbiased reports originated, concentrating on server consolidation and web application performance [132, 140, 141], with fruitful yet sometimes incompatible results. A feature-based survey of virtualization technologies [142] also illustrates the wide variety of hypervisors that currently exist. Furthermore, there has been some investigation into performance within HPC, specifically with the InfiniBand performance of Xen [143] and, rather recently, a detailed look at the feasibility of the Amazon Elastic Compute Cloud for HPC applications [49]; however, both works concentrate only on a single deployment rather than a true comparison of technologies.

¹Due to the restrictions in VMWare's licensing agreement, benchmark results are unavailable.

As these underlying hypervisor and virtualization implementations have evolved rapidly in recent years, along with virtualization support directly on standard x86 hardware, it is necessary to carefully and accurately evaluate the performance implications of each system. Hence, we conducted an investigation of several virtualization technologies, namely Xen, KVM, VirtualBox, and in part VMWare. Each hypervisor is compared alongside the others, with bare metal as a control, and (with the exception of VMWare) run through a number of high performance benchmarking tools.

3.4 Feature Comparison

With the wide array of potential choices of virtualization technologies available, it is often difficult for potential users to identify which platform is best suited for their needs. In order to simplify this task, we provide a detailed comparison chart between Xen 3.1, KVM from RHEL5, VirtualBox 3.2, and VMWare ESX in Figure 3.1.


Figure 3.1 A comparison chart between Xen, KVM, VirtualBox, and VMWare ESX

The first point of investigation is the virtualization method of each VM. Every hypervisor supports full virtualization, which is now common practice within most x86 virtualization deployments today. Xen, originating as a para-virtualized VMM, still supports both types; however, full virtualization is often preferred as it does not require the manipulation of the guest kernel in any way. From the Host and Guest CPU lists, we see that x86 and, more specifically, x86-64/amd64 guests are universally supported. Xen and KVM both support Itanium-64 architectures for full virtualization (due to both hypervisors' dependency on QEMU), and KVM also claims support for some recent PowerPC architectures. However, we concern ourselves only with x86-64 features and performance, as other architectures are outside the scope of this manuscript. Of the x86-64 platforms, KVM is the only hypervisor to require either the Intel VT-x or AMD-V instruction sets in order to operate. VirtualBox and VMWare have internal mechanisms to provide full virtualization even without these instruction sets, and Xen can fall back to para-virtualized guests.

Next, we consider the host environments for each system. As Linux is the primary OS of choice within HPC deployments, it is key that all hypervisors support Linux as a guest OS and also as a host OS. As VMWare ESX is meant to be a virtualization-only platform, it is built upon a specially configured, proprietary Linux/UNIX OS specific to its needs. All other hypervisors support Linux as a host OS, with VirtualBox also supporting Windows, as it was traditionally targeted for desktop-based virtualization. However, as each hypervisor uses VT-x or AMD-V instructions, each can support any modern OS targeted for x86 platforms, including all variants of Linux, Windows, and UNIX.

While most hypervisors have desirable host and guest OS support, hardware support within a guest environment varies drastically. Within the HPC environment, virtual CPU (vCPU) counts and maximum VM memory are critical aspects of choosing the right virtualization technology. In this case, Xen is the first choice as it supports up to 128 vCPUs and can address 4TB of main memory in 64-bit modes, more than any other. VirtualBox, on the other hand, supports only 32 vCPUs and 16GB of addressable RAM per guest OS, which may lead to problems when looking to deploy it on large multicore systems. KVM also faces an issue, with the number of vCPUs supported limited to 16; however, recent reports indicate this is only a soft limit [144], so deploying KVM in a large SMP environment may not be a significant hurdle. Furthermore, all hypervisors provide some 3D acceleration support (at least for OpenGL) and support live migration across homogeneous nodes, each with varying levels of success.

Another vital point of comparison among these virtualization technologies is the license agreements and their applicability within HPC deployments. Xen, KVM, and VirtualBox are provided for free under the GNU General Public License (GPL) version 2, so they are open to use and modification by anyone within the community, a key feature for many potential users. While VirtualBox is under the GPL, it has recently also been offered with additional features under a more proprietary license dictated by Oracle since its acquisition of Sun in 2010. VMWare, on the other hand, is completely proprietary with an extremely limited licensing scheme that even prevents the authors from willfully publishing any performance benchmark data without specific and prior approval.

As such, we have omitted VMWare from the remainder of this chapter. Whether going with a proprietary or open source hypervisor, support can be acquired with ease (usually for an additional cost) from each option.

3.4.1 Usability

While a side-by-side feature comparison may provide crucial information about a potential user's choice of hypervisor, users may also be interested in each hypervisor's ease of installation and use. We take a look at each hypervisor from two user perspectives: a systems administrator and a normal VM user.

One of the first things on any system administrator's mind when choosing a hypervisor is the installation. For all of these hypervisors, installation is relatively painless.

For the FutureGrid support group, KVM and VirtualBox are the easiest of all the tested hypervisors to install, as there are a number of supported packages available and installation only requires the addition of one or more kernel modules and the support software. Xen, while still supported in binary form by many Linux distributions, is actually much more complicated. This is because Xen requires a full modification to the kernel itself, not just a module. A new kernel must be loaded into the boot process, which may complicate patching and updating later in the system's maintenance cycle.

VMWare ESX, on the other hand, is entirely separate from most other installations.

As previously noted, ESX is actually a hypervisor and custom UNIX host OS combined, so installing ESX is akin to installing any other OS from scratch. This may be either desirable or adverse, depending on the system administrator's usage of the systems and VMWare's ability to provide a secure and patched environment.

While system administrators may be concerned with installation and maintenance,

VM users and Cloud developers are more concerned with daily usage. The first thing to note about all of these virtualization technologies is that they are supported (to some extent) by the libvirt API [145]. Libvirt is commonly used by many of today's IaaS

Cloud offerings, including Nimbus, Eucalyptus, OpenNebula, and OpenStack. As such, the choice of hypervisor is less of an issue for Cloud developers, so long as the hypervisor supports the features they desire. Individual command line usage of each tool varies quite a bit more. Xen provides its own set of tools for controlling and monitoring guests, which seem to work relatively well but incur a slight learning curve. KVM also provides its own CLI interface, and while it is often considered less cumbersome, it provides fewer advanced features directly to users, such as power management or quick memory adjustment (however, this is subject to personal opinion). One advantage of KVM is that each guest actually runs as a separate process within the host OS, making it easy for a user to manage and control the VM inside the host if KVM misbehaves. VirtualBox, on the other hand, provides the best command line and graphical user interfaces. The CLI is especially well featured when compared to Xen and KVM, as it provides clear, decisive, and well documented commands, something most HPC users and system administrators alike will appreciate. VMWare provides a significantly enhanced GUI as well as a Web-based ActiveX client interface that allows users to easily operate the VMWare host remotely. In summary, there is a wide variance of interfaces provided by each hypervisor; however, we recommend

Cloud developers to utilize the libvirt API whenever possible.
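To illustrate the kind of hypervisor-agnostic control libvirt exposes, the following minimal sketch uses the official libvirt Python bindings to list running guests; the connection URI and the printed fields are only examples and would differ per deployment.

    import libvirt  # libvirt Python bindings

    # Example connection URI; "xen:///system" or "vbox:///session" would target
    # other hypervisors with the same code, which is the portability argument above.
    conn = libvirt.open("qemu:///system")

    # Enumerate running guests and report basic state information.
    for dom_id in conn.listDomainsID():
        dom = conn.lookupByID(dom_id)
        state, max_mem, mem, vcpus, cpu_time = dom.info()
        print(f"{dom.name()}: state={state} vcpus={vcpus} mem={mem // 1024} MB")

    conn.close()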

3.5 Experimental Design

In order to provide an unaltered and unbiased review of these virtualization technologies for Clouds, we need to outline a neutral testing environment. To make this possible, we have chosen to use FutureGrid as our virtualization and cloud test-bed.

3.5.1 The FutureGrid Project

FutureGrid (FG) [146] provides computing capabilities that enable researchers to tackle complex research challenges related to the use and security of Grids and Clouds.

These include topics ranging from authentication, authorization, scheduling, virtualization, middleware design, interface design and cybersecurity, to the optimization of

Grid-enabled and cloud-enabled computational schemes for researchers in astronomy, chemistry, biology, engineering, atmospheric science and epidemiology.

The test-bed includes a geographically distributed set of heterogeneous computing systems, a data management system that will hold both metadata and a growing library of software images necessary for Cloud computing, and a dedicated network allowing isolated, secure experiments, as seen in Figure 3.2. The test-bed supports virtual machine-based environments, as well as operating systems on native hardware for experiments aimed at minimizing overhead and maximizing performance.

The project partners are integrating existing open-source software packages to create an easy-to-use software environment that supports the instantiation, execution and recording of grid and cloud computing experiments.

[Figure 3.2 shows the geographically distributed FutureGrid resources, connected by dedicated 10 Gbps links to Internet2 and TeraGrid, with partner sites in Germany and France: IU (11 TF IBM, 1024 cores; 6 TF Cray, 672 cores; 5 TF SGI, 512 cores), TACC (12 TF Dell, 1152 cores), UCSD (7 TF IBM, 672 cores), UC (7 TF IBM, 672 cores), PU (4 TF Dell, 384 cores), and UF (3 TF IBM, 256 cores).]

Figure 3.2 FutureGrid Participants and Resources

One of the goals of the project is to understand the behavior and utility of Cloud computing approaches. However, it is not clear at this time which of these toolkits will become the toolkit of choice for users. FG provides the ability to compare these frameworks with each other while considering real scientific applications [44]. Hence, researchers are able to measure the overhead of cloud technology by requesting linked experiments on both virtual and bare-metal systems, providing valuable information that helps them decide which infrastructure suits their needs and also helps users that want to transition from one environment to the other. These interests and research objectives make the FutureGrid project the perfect match for this work. Furthermore, we expect that the results gleaned from this chapter will have a direct impact on the FutureGrid deployment itself.

3.5.2 Experimental Environment

Currently, one of FutureGrid's latest resources is the India system, a 256 CPU IBM iDataPlex machine consisting of 1024 cores, 2048 GB of RAM, and 335 TB of storage within the Indiana University Data Center. Specifically, each compute node of India has two Intel Xeon 5570 quad-core CPUs running at 2.93 GHz, 24 GB of RAM, and a

QDR InfiniBand connection. A total of four nodes were allocated directly from India for these experiments. All were loaded with a fresh installation of Red Hat Enterprise

Linux server 5.5 x86_64 with the patched 2.6.18-194.8.1.el5 kernel. Three of the four nodes were installed with different hypervisors: Xen version 3.1, KVM (build 83), and VirtualBox 3.2.10; the fourth node was left as-is to act as a control for bare-metal native performance.

Each guest virtual machine was also built using Red Hat EL server 5.5 running an unmodified kernel, using full virtualization techniques. All tests were conducted giving the guest VM 8 cores and 16GB of RAM to properly span a compute node. Each benchmark was run a total of 20 times, with the results averaged, unless indicated otherwise.
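As a minimal sketch of the aggregation applied throughout the following sections (a mean for each reported value and the sample standard deviation for its error bar), assuming the per-run measurements have already been collected into a list:

    import statistics

    # Hypothetical per-run HPL results (Gflops) for one hypervisor; 20 values in practice.
    runs = [49.3, 48.7, 51.2, 47.9, 50.1]

    mean = statistics.mean(runs)    # reported/plotted value
    stdev = statistics.stdev(runs)  # error bar (sample standard deviation)
    print(f"{mean:.1f} +/- {stdev:.1f} Gflops")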

3.5.3 Benchmarking Setup

As this chapter aims to objectively evaluate each virtualization technology from a side-by-side comparison as well as from a performance standpoint, the selection of benchmarking applications is critical.

The performance comparison of each virtual machine is based on two well-known, industry-standard performance benchmark suites: HPCC and SPEC. These two benchmark environments are recognized for their standardized, reproducible results in the HPC community, and the National Science Foundation (NSF), Department of Energy (DOE), and DARPA are all sponsors of the HPCC benchmarks. The following benchmarks provide a means to stress and compare processor, memory, inter-process communication, network, and overall performance and throughput of a system. These benchmarks were selected due to their importance to the HPC community, since they are often directly correlated with overall application performance [147].

HPCC Benchmarks

The HPCC Benchmarks [148, 149] are an industry standard for benchmarking HPC systems. They are aimed at testing a system's performance on multiple levels. The suite consists of seven tests:

• HPL - The Linpack TPP benchmark measures the floating point rate of execution for solving a linear system of equations. This benchmark is perhaps the most important benchmark within HPC today, as it is the basis of evaluation for the Top 500 list [8].

• DGEMM - Measures the floating point rate of execution of double precision real matrix-matrix multiplication.

• STREAM - A simple synthetic benchmark program that measures sustainable memory bandwidth (in GB/s) and the corresponding computation rate for a simple vector kernel.

• PTRANS - Parallel matrix transpose exercises the communications where pairs of processors communicate with each other simultaneously. It is a useful test of the total communications capacity of the network.

• RandomAccess - Measures the rate of integer random updates of memory (GUPS).

• FFT - Measures the floating point rate of execution of double precision complex one-dimensional Discrete Fourier Transform (DFT).

• Communication bandwidth and latency - A set of tests to measure the latency and bandwidth of a number of simultaneous communication patterns; based on b_eff (the effective bandwidth benchmark).

This benchmark suite uses each test to stress multiple aspects of a system's performance. It also provides reproducible results which can be verified by other vendors. The HPL component is used to create the Top 500 list [8], the list of the current top supercomputers in the world. The results obtained from these benchmarks provide an unbiased performance analysis of the hypervisors. Our results provide insight on intra-node PingPong bandwidth, PingPong latency, and

FFT calculation performance.

SPEC Benchmarks

The Standard Performance Evaluation Corporation (SPEC) [150, 151] is the other major standard for benchmarking and evaluating systems. SPEC has several different testing components that can be utilized to benchmark a system. For our benchmarking comparison we use SPEC OMP2001, because it appears to represent a vast array of new and emerging parallel applications while simultaneously providing a comparison to other SPEC benchmarks. SPEC OMP continues the SPEC tradition of giving HPC users the most objective and representative benchmark suite for measuring the performance of SMP (shared memory multi-processor) systems.

• The benchmarks are adapted from SPEC CPU2000 and contributions to its search program.

• The focus is to deliver systems performance to real scientific and engineering applications.

• The size and runtime reflect the needs of engineers and researchers to model large complex tasks.

• Two levels of workload characterize the performance of medium and large sized systems.

• Tools based on the SPEC CPU2000 toolset make these the easiest ever HPC tests to run.

• These benchmarks place heavy demands on systems and memory.

3.6 Performance Comparison

The goal of this chapter is to effectively compare and contrast the various virtualization technologies, specifically for supporting HPC-based Clouds. The first set of results represents the performance of the HPCC benchmarks. Each benchmark was run a total of 20 times, and the mean values were taken, with error bars represented using the standard deviation over the 20 runs. The benchmarking suite was built using the Intel 11.1 compiler, uses the Intel MPI and MKL runtime libraries, and was set with defaults and no optimizations whatsoever.

We open first with High Performance Linpack (HPL), the de-facto standard for comparing resources. In Figure 3.3, we can see the comparison of Xen, KVM, and

VirtualBox compared to native bare-metal performance. First, we see that the native system is capable of around 73.5 Gflops which, with no optimizations, achieves 75% of the theoretical peak performance. Xen, KVM, and VirtualBox perform at 49.1, 51.8, and 51.3 Gflops, respectively, when averaged over 20 runs. However, Xen, unlike KVM and VirtualBox, has a high degree of variance between runs. This is an interesting phenomenon for two reasons. First, this may impact performance metrics for other HPC applications, causing errors and delays even between pleasingly-parallel applications and adding to reducer function delays. Second, this wide variance breaks a key component of Cloud computing: providing a specific and predefined quality of service. If performance can vary as widely as it did for Linpack, then this may have a negative impact on users.

Next, we turn to another key benchmark within the HPC community, Fast Fourier Transforms (FFT). Unlike the synthetic Linpack benchmark, FFT is a specific, purposeful benchmark whose results are often regarded as more representative of a user's real-world application than HPL. From Figure 3.4, we can see rather distinct results from what was previously provided by HPL. Looking at Star and Single FFT, it is clear that performance across all hypervisors is roughly equal to bare-metal performance, a good indication that HPC applications may be well suited for use on VMs. The results for MPI FFT are also similar, with the exception of Xen, which shows decreased performance and high variance, as seen in the HPL benchmark. Our current hypothesis is that there is an adverse effect of using Intel's MPI runtime on Xen; however, the investigation is still ongoing.

The ping-pong benchmarks are another useful illustration of the real-world performance differences between bare metal and the various hypervisors.

Figure 3.3 Linpack performance

These benchmarks measure the bandwidth and latency of passing packets between multiple CPUs. In this experiment, all ping-pong exchanges are kept within a given node, rather than going over the network. This is done to provide further insight into the CPU and memory overhead within each hypervisor. Figure 3.5 shows the intranode bandwidth performance, with some interesting distinctions between each hypervisor. First, Xen performs, on average, close to native speeds, which is promising for the hypervisor. KVM, on the other hand, shows consistent overhead proportional to native performance across minimum, average, and maximum bandwidth. VirtualBox performs well, in fact so well that it raises alarm. While the minimum and average bandwidths are within native performance, the maximum bandwidth reported by VirtualBox is significantly greater than native measurements, with a large variance.

Figure 3.4 Fast Fourier Transform performance

After careful examination, it appears this is due to how VirtualBox assigns its virtual CPUs. Instead of pinning a virtual CPU to a physical CPU, a switch may occur, which can be beneficial if the two CPUs communicating in a ping-pong test happen to map onto the same physical CPU. The result is that the ping-pong packet remains in cache, yielding a higher perceived bandwidth than normal. While this effect may be beneficial for this benchmark, it may only give an illusion of the real performance gleaned from the

VirtualBox hypervisor.

Bandwidth may be important within the ping-pong benchmark, but the latency of each ping-pong exchange is equally useful in understanding the performance impact of each virtualization technology. From Figure 3.6, we see that KVM and VirtualBox have near-native performance; another promising result for the utility of hypervisors within HPC systems. Xen, on the other hand, has extremely high latencies, especially for maximum latencies, which in turn creates a high variance in the average latency of the VM's performance.

Figure 3.5 Ping Pong bandwidth performance

While the HPCC benchmarks provide a comprehensive view for many HPC applications, including Linpack and FFT using MPI, the performance of intra-node SMP applications using OpenMP is also investigated. Figure 3.7 illustrates SPEC OpenMP performance across the VMs we concentrate on, as well as baseline native performance.

First, we see that the combined performance over all 11 applications, each executed 20 times, shows the native testbed with the best performance at a SPEC score of 34465. KVM comes close with a score of 34384, which is so similar to the native performance that most users will never notice the difference. Xen and VirtualBox both perform notably slower, with scores of 31824 and 31695, respectively; however, this is only an 8% performance drop compared to native speeds. Further results can be found on the SPEC website [152].

Figure 3.6 Ping Pong latency performance (lower is better)

3.7 Discussion

The primary goal of this chapter is to evaluate the viability of virtualization within

HPC. After our analysis, the answer seems to be a resounding "yes." However, we also hope to select the best virtualization technology for such an HPC environment.

In order to do this, we combine the feature comparison along with the performance results, and evaluate the potential impact within the FutureGrid testbed.

From a feature standpoint, most of today's virtualization technologies fit the bill for at least small scale deployment, including VMWare. In short, each supports Linux x86_64 platforms, uses VT-x technology for full virtualization, and supports live migration. Due to VMWare's limited and costly licensing, it is immediately out of contention for most HPC deployments. From a CPU and memory standpoint, Xen seems to provide the best expandability, supporting up to 128 vCPUs and 4TB of addressable RAM. So long as KVM's vCPU limit can be extended, it too shows promise as a feature-full virtualization technology.

Figure 3.7 SPEC OpenMP performance

One of VirtualBox's greatest limitations was the 16GB maximum memory allotment for individual guest VMs, which actually prevented us from giving VMs more memory for our performance benchmarks. If this can be fixed, and Oracle does not move the product into the proprietary market,

VirtualBox may also stand a chance for deployment in HPC environments.

From the benchmark results previously described, the use of hypervisors within HPC-based Cloud deployments is a mixed bag. Figure 3.8 summarizes the results based on a 1-3 rating, 1 being best and 3 being worst. While Linpack performance takes a significant hit across all hypervisors, the more practical FFT benchmarks show little impact, a notably good sign for virtualization as a whole. The ping-pong bandwidth and latency benchmarks also support this theory, with the exception of Xen, whose performance fluctuates widely throughout the majority of the benchmarks.

                Xen    KVM    VirtualBox
Linpack          3      1      2
FFT              3      1      2
Bandwidth        2      3      1
Latency          3      2      1
OpenMP           2      1      3
Total Rating    13      8      9

Figure 3.8 Benchmark rating summary (lower is better)

OpenMP performance through the SPEC OMP benchmarking suite also shows promising results for the use of hypervisors in general, with KVM taking a clear lead by almost matching native speeds.

While Xen is typically regarded as the most widely used hypervisor, especially within academic clouds and grids, its performance has been shown to lag considerably when compared to either KVM or VirtualBox. In particular, Xen's wide and unexplained fluctuations in performance throughout the series of benchmarks suggest that Xen may not be the best choice for building a lasting quality-of-service infrastructure upon.

From Figure 3.8, KVM rates the best across all performance benchmarks, making it the optimal choice for general deployment in an HPC environment. Furthermore, this work's illustration of the variance in performance among the benchmarks, and of each benchmark's applicability to new applications, may make it possible to preemptively classify applications in order to accurately predict their ideal virtualized Cloud environment. We hope to further investigate this concept through the use of the FutureGrid experiment management framework at a later date.

In summary, it is the authors' projection that KVM is the best overall choice for use within HPC Cloud environments. KVM's feature-rich experience and near-native performance make it a natural fit for deployment in an environment where usability and performance are paramount. Within the FutureGrid project specifically, we hope to deploy the KVM hypervisor across our Cloud platforms in the near future, as it offers clear benefits over the current Xen deployment. Furthermore, we expect these

findings to be of great importance to other public and private Cloud deployments, as system utilization, Quality of Service, operating cost, and computational efficiency could all be improved through the careful evaluation of underlying virtualization technologies.

Chapter 4

Evaluating GPU Passthrough in Xen for High Performance Cloud Computing

4.1 Abstract

With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for their technical computing needs. This is due to the relative scalability, ease of use, and advanced user environment customization abilities clouds provide, as well as the many novel computing paradigms available for data-intensive applications. However, there is concern about the performance gap that exists between IaaS and typical high performance computing (HPC) resources, which could limit the applicability of IaaS for many potential scientific users.

Most recently, general-purpose graphics processing units (GPGPUs or GPUs) have become commonplace within high performance computing. We look to bridge the gap between supercomputing and clouds by providing GPU-enabled virtual machines (VMs) and investigating their feasibility for advanced scientific computation. Specifically, the Xen hypervisor is utilized to leverage specialized hardware-assisted I/O virtualization and PCI passthrough in order to provide advanced HPC-centric Nvidia GPUs directly in guest VMs. This methodology is evaluated by measuring the performance of two Nvidia Tesla GPUs within Xen VMs and comparing to bare-metal hardware. Results show PCI passthrough of GPUs within virtual machines is a viable use case for many scientific computing workflows, and could help support high performance cloud infrastructure in the near future.

4.2 Introduction

Cloud computing [4] has established itself as a prominent paradigm within the realm of Distributed Systems [153] in a very short period of time. Clouds are an internet- based solution that provide computational and data models for utilizing resources, which can be accessed directly by users on demand in a uniquely scalable way. Cloud computing functions by providing a layer of abstraction on top of base hardware to enable a new set of features that are otherwise intangible or intractable. These benefits and features include Scalability, a guaranteed Quality of Service (QoS), cost effectiveness, and direct user customization via a simplified user interface [65].

While the origin of cloud computing is based in industry through solutions such as Amazon's EC2 [154], Google's MapReduce [155], and Microsoft's Azure [156], the paradigm has since become integrated in all areas of science and technology. Most notably, there is an increasing effort within the High Performance Computing (HPC) community to leverage the utility of clouds for advanced scientific computing to solve a number of challenges still standing in the field. This can be clearly seen in large-scale efforts such as the FutureGrid project [157], the Magellan project [48], and through various other Infrastructure-as-a-Service projects including OpenStack [158],

Nimbus [75], and Eucalyptus [159].

Within HPC, there has also been a notable movement toward dedicated accelerator cards such as general purpose graphical processing units (GPGPUs, or GPUs) to accelerate scientific computations by upwards of two orders of magnitude. This is accomplished through dedicated programming environments and libraries such as CUDA [160] from Nvidia, as well as the OpenCL effort [161]. When combining GPUs with an otherwise typical HPC environment or supercomputer, major gains in performance and computational ability have been reported in numerous fields [162, 163], ranging from Astrophysics to Bioinformatics. Furthermore, these gains in computational power have also reportedly come with increased performance-per-watt [164], a metric that is increasingly important to the HPC community as we move closer to exascale computing [165], where power consumption is quickly becoming the primary constraint.

With the advent of both clouds and GPUs within the field of scientific computing, there is an immediate and ever-growing need to provide heterogeneous resources, most immediately GPUs, within a cloud environment in the same scalable, on-demand, and user-centric way that many cloud users are already accustomed to [166]. While this task alone is nontrivial, it is further complicated by the high demand for performance within HPC. As such, it is performance that is paramount to the success of deploying

GPUs within cloud environments, and thus is the central focus of this work.

The rest of this chapter is organized as follows. First, in Section 4.3, we discuss the related research and the options currently available for providing GPUs within a virtualized cloud environment. In Section 4.4, we discuss the methodology for providing GPUs directly within virtual machines. In Section 4.5 we outline the evaluation of the given methodology using two different Nvidia Tesla GPUs, and we compare to the best-case native application in Section 4.6. Then, we discuss the implications of these results in Section 4.7 and consider the applicability of each method within a production cloud system. Finally, we conclude with our findings and suggest directions for future work.

4.3 Virtual GPU Directions

Recently, GPU programming has been a primary focus for numerous scientific com- puting applications. Significant progress has been accomplished in many different workloads, both in science and engineering, based on parallel abilities of GPUs for

floating point operations and very high on-GPU memory bandwidth. This hardware, coupled with CUDA and OpenCL programming frameworks, has led to an explosion of new GPU-specific applications. In some cases, GPUs outperform even the fastest multicore counterparts by an order of magnitude [167]. In addition, further research could leverage the per-node performance of GPU accelerators with the high speed, low latency interconnects commonly utilized in supercomputers and clusters to create a hybrid GPU + MPI class of applications. The number of distributed GPU appli- cations is increasing substantially in supercomputing, usually scaling many GPUs simultaneously [168].

Since the establishment of cloud computing in industry, research groups have been evaluating its applicability to science [3]. Historically, HPC and Grids have been on similar but distinct paths within distributed systems, and have concentrated on performance, scalability, and solving complex, tightly coupled problems within science. This has led to the development of supercomputers with many thousands of cores, high speed, low latency interconnects, and sometimes also coprocessors and

FPGAs [169, 170]. Only recently have these systems been evaluated from a cloud perspective [48]. An overarching goal exists to provide HPC Infrastructure as its own service (HPCaaS) [171], aiming to classify and limit the overhead of virtualization, and reducing the bottlenecks classically found in CPU, memory, and I/O operations within hypervisors [49, 172]. Furthermore, the transition from HPC to cloud computing becomes more complicated when we consider adding GPUs to the equation.

GPU availability within a cloud is a new concept that has sparked a large amount of interest within the community. The first successful deployment of GPUs within a cloud environment was the Amazon EC2 GPU offering. A collaboration between

Nvidia and Citrix also exists to provide cloud-based gaming solutions to users using the new Kepler GPU architecture [173]. However, this is currently not targeted towards HPC applications.

The task of providing a GPU accelerator for use in a virtualized cloud environment is one that presents a myriad of challenges. This is due to the complicated nature of virtualizing drivers, libraries, and the heterogeneous offerings of GPUs from multiple vendors. Currently, two possible techniques exist to fill the gap in providing GPUs in a cloud infrastructure: back-end I/O virtualization, which this chapter focuses on, and Front-end remote API invocation.

4.3.1 Front-end Remote API invocation

One method for using GPUs within a virtualized cloud environment is through front-end library abstractions, the most common of which is remote API invocation. Also known as API remoting or API interception, it represents a technique where API calls are intercepted and forwarded to a remote host where the actual computation occurs. The results are then returned to the front-end process that spawned the invocation, potentially within a virtual machine. The goal of this method is to provide an emulated device library where the actual computation is offloaded to another resource on a local network.
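As a purely illustrative sketch of the front-end/back-end split described above (not the actual vCUDA, rCUDA, or gVirtuS protocol), a front-end proxy can marshal an intercepted call name and its arguments over a socket to a hypothetical back-end service that owns the physical GPU and returns the result:

    import json
    import socket

    # Hypothetical back-end address; in practice this would be the host that owns
    # the physical GPU, reachable over the cloud's internal network.
    BACKEND = ("192.0.2.10", 9999)

    def remote_call(function, *args):
        """Forward an intercepted API call to the back-end and return its result."""
        request = json.dumps({"fn": function, "args": list(args)}).encode()
        with socket.create_connection(BACKEND) as sock:
            sock.sendall(request + b"\n")
            reply = sock.makefile().readline()  # the back-end executes the real call
        return json.loads(reply)["result"]

    # A front-end wrapper library would route each API entry point this way, e.g.:
    # status = remote_call("cudaMemcpy", dst_handle, src_handle, nbytes)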

Front-end remote APIs for GPUs have been implemented by a number of different technologies for different uses. To solve the problem of graphics processing in VMs, VMWare [174] has developed a device-emulation approach that emulates Direct3D and OpenGL calls to leverage the host OS graphics processing capabilities and provide a 3D environment within a VM. API interception through the use of wrapper binaries has also been implemented by technologies such as Chromium [175] and Blink. However, these graphics processing front-end solutions are not suitable for general purpose scientific computing, as they do not expose interfaces that CUDA or

OpenCL can use.

Currently, efforts are being made to provide front-end remote API invocation solutions for the CUDA programming architecture. vCUDA [57] was the first such technology to enable transparent access of GPUs within VMs by interception and redirection of CUDA API calls. vCUDA substitutes the CUDA runtime library and supports a transmission mode using XMLRPC, as well as a sharing mode that is built on VMRPC, a dedicated remote procedure call architecture for VMM platforms. This sharing mode can lead to better performance, especially as the volume of data increases, although there may be limitations in VMM interoperability.

Like vCUDA, gVirtuS uses API interception to enable transparent CUDA, OpenCL, and OpenGL support for Xen, KVM, and VMWare virtual machines [176]. gVirtuS uses a front-end/back-end model to provide a VMM-independent abstraction layer to GPUs. Data transport from gVirtuS' front-end to the back-end is accomplished through a combination of shared memory, sockets, or other hypervisor-specific APIs. gVirtuS' primary disadvantage is its decreased performance in host-to-device and device-to-host data movement due to the overhead of data copies to and from its shared memory buffers. Recent work has also enabled the dynamic sharing of GPUs by leveraging the gVirtuS back-end system with relatively good results [177]; however, process-level GPU resource sharing is outside the scope of this manuscript.

rCUDA [56], a recent popular remote CUDA framework, also provides remote API invocation to enable VMs to access remote GPU hardware by using a sockets based implementation for high-speed near-native performance of CUDA based applications. rCUDA recently added support for using InfiniBand’s high speed, low latency network to increase performance for CUDA applications with large data volume requirements. rCUDA version 4.1 also supports the CUDA runtime API version 5.0, which supports peer device memory access and unified addressing. One drawback of this method is that rCUDA cannot implement the undocumented and hidden functions within the runtime framework, and therefore does not support all CUDA C extensions.

While rCUDA provides some support tools, native execution of CUDA programs is not possible and programs need to be recompiled or rewritten to use rCUDA.

Furthermore, like gVirtuS and many other solutions, the performance of host-to-device data movement is only as fast as the underlying interconnect, and in the best case with native RDMA InfiniBand is roughly half as fast as native PCI Express usage when using standard QDR InfiniBand.

4.3.2 Back-end PCI passthrough

Another approach to using a GPU in a virtualized environment is to provide a VM with direct access to the GPU itself, instead of relying on a remote API. This chapter focuses on such an approach. Devices on a host’s PCI-express bus are virtualized using directed I/O virtualization technologies recently implemented by chip manu- facturers, and then direct access is relinquished upon request to a guest VM. This can be accomplished using the VT-d and IOMMU instruction sets from Intel and

AMD, respectively. This mechanism, typically called PCI passthrough, uses a memory management unit (MMU) to handle direct memory access (DMA) coordination and interrupt remapping directly to the guest VM, thus bypassing the host entirely.

With host involvement being nearly non-existent, near-native performance of the PCI device within the guest VM can be achieved, which is an important characteristic for using a GPU within a cloud infrastructure.

PCI passthrough itself has recently become a standard technique for many other

I/O systems such as storage or network controllers. However, GPUs (even from the same vendor) have additional legacy VGA compatibility issues and non-standard low-level interface DMA interactions that make direct PCI passthrough nontrivial.

VMWare has begun using a vDGA system for hardware GPU utilization; however, it remains in tech preview and only documentation for Windows VMs is present [174]. In our experimentation, we have found that the Xen hypervisor provides a good platform for performing PCI passthrough of GPU devices to VMs due to its open nature, extensive support, and high degree of reconfigurability. Work with Xen in [178] hints at good performance for PCI passthrough in Xen; however, further evaluation with independent benchmarks is needed when looking at scientific computing with

GPUs.

Today’s GPUs can provide a variety of frameworks for application to use. Two common solutions are CUDA and OpenCL. CUDA, or the Compute Unified

Device Architecture, is a framework for creating and running parallel applications on

Nvidia GPUs. OpenCL provides a more generic and open framework for parallel computation on CPUs and GPUs, and is available for a number of Nvidia, AMD, and

Intel GPUs and CPUs. While OpenCL provides a more robust and portable solution, many HPC applications utilize the CUDA framework. As such, we focus only on

Nvidia-based CUDA-capable GPUs as they offer the best support for a wide array of programs, although this work is not strictly limited to Nvidia GPUs.

4.4 Implementation

In this chapter we use a specific host environment to enable PCI passthrough. First, we start with the Xen 4.2.2 hypervisor on CentOS 6.4 and a custom 3.4.50-8 Linux kernel with Dom0 Xen support. Within the Xen hypervisor, GPU devices are seized upon boot and assigned to the xen-pciback kernel module. This process prevents the host from loading the Nvidia or nouveau drivers for these devices, keeping the GPUs uninitialized and therefore able to be assigned to DomU VMs.
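A minimal sketch of this seizure step, assuming the xen-pciback (pciback) module is already loaded and using a hypothetical PCI address; production hosts often achieve the same effect with kernel boot parameters or the Xen toolstack, and the sysfs paths can differ across kernel versions:

    # Bind a GPU at a hypothetical PCI address to xen-pciback via sysfs so it can
    # later be passed through to a DomU guest. Requires root privileges.
    PCI_ADDR = "0000:0f:00.0"  # hypothetical address; confirm with lspci

    def write(path, value):
        with open(path, "w") as f:
            f.write(value)

    # Unbind from whatever driver currently owns the device (e.g. nouveau), if any.
    try:
        write(f"/sys/bus/pci/devices/{PCI_ADDR}/driver/unbind", PCI_ADDR)
    except FileNotFoundError:
        pass  # no driver currently bound

    # Tell pciback to accept this slot, then bind the device to it.
    write("/sys/bus/pci/drivers/pciback/new_slot", PCI_ADDR)
    write("/sys/bus/pci/drivers/pciback/bind", PCI_ADDR)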

Xen, like other hypervisors, provides a standard method of passing through PCI devices to guest VMs upon creation. When assigning a GPU to a new VM, Xen loads a specific VGA BIOS to properly initialize the device, enabling DMA and interrupts to be assigned to the guest VM. Xen also relinquishes control of the GPU via the xen-pciback module. From there, the Linux Nvidia drivers are loaded and the device is able to be used as expected within the guest. Upon VM termination, the xen-pciback module re-seizes the GPU and the device can be re-assigned to new VMs in the future.
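As one way to express this assignment step, the same passthrough request can be issued through the libvirt API discussed in the previous chapter by attaching a hostdev element for a hypothetical PCI address to an existing guest definition; Xen's native toolstack achieves the equivalent with a pci = [...] entry in the guest configuration file.

    import libvirt

    GUEST = "gpu-vm01"  # hypothetical guest name
    HOSTDEV_XML = """
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0000' bus='0x0f' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    """

    conn = libvirt.open("xen:///system")   # connect to the local Xen host
    dom = conn.lookupByName(GUEST)         # the guest must already be defined
    # Persistently add the GPU to the guest definition; the guest's Nvidia driver
    # initializes the device the next time the domain boots.
    dom.attachDeviceFlags(HOSTDEV_XML, libvirt.VIR_DOMAIN_AFFECT_CONFIG)
    conn.close()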

This mechanism of PCI passthrough for GPUs can be implemented using multiple devices per host, as illustrated in Figure 4.1. Here, we see how the device's connection to the VM totally bypasses the Dom0 host as well as the Xen VMM, and is managed by VT-d or the IOMMU to access the PCI Express bus which the GPUs utilize. This is in contrast to other common virtual device uses, where hardware is emulated by the host and shared across all guests; that is the common usage for Ethernet controllers and input devices, enabling users to interact with VMs as they would with native hosts, unlike the bridged model shown in the figure. The potential downside of this method is that there can only be a 1:1 or 1:N mapping of VMs to GPUs. An M:1 mapping, where multiple VMs use a GPU, is not possible. However, almost all scientific application environments using GPUs generally do not share GPUs between processes or other nodes, as doing so would cause unpredictable and serious performance degradation.

[Figure 4.1 depicts Dom0 (running OpenStack Compute) and guest VMs Dom1 through DomN on the Xen VMM (hypervisor): each guest runs a CUDA task and is given a dedicated GPU (GPU0-GPU2) directly across the PCI Express bus via VT-d/IOMMU, while network traffic is bridged through Dom0 (veth devices, BR0, ETH0).]

Figure 4.1 GPU PCI passthrough within the Xen Hypervisor

As such, this GPU isolation within a VM can be considered an advantage in many contexts.

4.4.1 Feature Comparison

Using the GPU PCI passthrough technique described previously has a number of advantages compared to front-end API implementations. First, it allows for an operating environment that more closely relates to native bare-metal usage of GPUs.

Essentially, a VM provides a nearly identical infrastructure to clusters and supercomputers with integrated GPUs. This lowers the learning curve for many researchers, and even enables researchers to potentially use other tools within a VM that might not be supported within a supercomputer or cluster. Unlike with remote API implementations, users don't need to recompile or modify their code, as the GPUs are essentially local to the data. Further comparing to remote API implementations, using PCI passthrough within a VM allows users to leverage any GPU framework available, OpenCL or CUDA, and any higher level programming frameworks, such as those within Matlab or Python.

Through the use of advanced scheduling techniques within cloud infrastructure, we can also take advantage of PCI passthrough implementation for efficiency purposes.

Users could request VMs with GPUs which get scheduled for creation on machines that provide such resources, but can also request normal VMs as well. The scheduler can correctly map VM requirement requests to heterogeneous hardware. This enables large scale resources to support a wide array of scientific computing applications without the added cost of putting GPUs in all compute nodes.

4.5 Experimental Setup

In this chapter, back-end GPU PCI passthrough to virtual machines using the Xen hypervisor is detailed; however, the performance of this method also needs to be properly evaluated. As such, we ran an array of benchmarks that evaluate the performance of this method compared to the same hardware running native bare-metal GPU code without any virtualization. We focus our tests on single-node performance to best understand low-level overhead.

To evaluate the effectiveness of GPU-enabled VMs within Xen, two different machines were used to represent two generations of Nvidia GPUs. The first system, at Indiana University, consists of dual-socket Intel Xeon X5660 6-core CPUs at 2.8 GHz with 192GB DDR3 RAM, an 8TB RAID5 array, and two Nvidia Tesla "Fermi" C2075 GPUs. The system at USC/ISI uses dual-socket Intel Xeon E5-2670 8-core CPUs at 2.6 GHz with 48GB DDR3 RAM and 3 600GB SAS disk drives, but with the latest Nvidia Tesla "Kepler" K20m GPU supplied by Nvidia. These machines represent the present Fermi series GPUs along with the recently released Kepler series GPUs, providing a well-rounded experimental environment. Native systems were installed with standard CentOS 6.4 with a stock 2.6.32-279 Linux kernel. Xen host systems were also loaded with CentOS 6.4, but with a custom 3.4.53-8 Linux kernel and Xen 4.2.2. Guest VMs were also CentOS 6.4, with N-1 processors, 24GB of memory, and 1 GPU passed through in HVM full virtualization mode. Using both the IU and USC/ISI machine configurations in native and VM modes represents the 4 test cases for our work.

In order to evaluate the performance, the SHOC benchmark suite [179] was used to extensively evaluate performance across each test platform. The SHOC benchmarks were chosen because they provide a higher level of evaluation regarding GPU performance than the sample applications provided in the Nvidia SDK, and can also evaluate OpenCL performance in similar detail. The benchmarks were compiled using the NVCC compiler with the CUDA 5 driver and library, along with OpenMPI 1.4.2 and GCC 4.4. Each benchmark was run a total of 20 times, with the results given as an average of all runs.

4.6 Results

Results of all benchmarks are grouped into three subsections: floating point operations, device bandwidth, and PCI bus performance. Each represents a different level of evaluation for GPU-enabled VMs compared to bare-metal native GPU usage.

4.6.1 Floating Point Performance

Flops, or floating point operations per second, are a common measure of computa- tional performance, especially within scientific computing. The Top 500 list [8] has been the relative gold standard for evaluating supercomputing performance for more than two decades. With the advent of GPUs in some of the fastest supercomputers today, including GPUs identical to those used in our experimentation, understanding the performance relative to this metric is imperative.

Figure 4.2 shows the raw peak FLOPs available using each GPU in both native and virtualized modes. First, we observe the advantage of using the Kepler series GPUs over Fermi, with peak single precision speeds tripling in each case. Even for double precision FLOPs, there is roughly a doubling of performance with the new GPUs. However, it is most interesting that there is less than a 1% overhead when using GPU VMs compared to the native case for pure floating point operations. The K20m-enabled VM was able to provide over 3 Teraflops, a notable computational feat for any single node.

Figures 4.3 and 4.4 show the Fast Fourier Transform (FFT) and Matrix Multiplication implementations across both test platforms. For all benchmarks that do not take into account the PCI-Express (pcie) bus transfer time, we again see near-native performance using the Kepler and Fermi GPUs when compared to the bare metal cases. Interestingly, overhead within Xen VMs is consistently less than 1%, confirming the synthetic MaxFlops benchmark above. However, we do see some performance impact when calculating the total FLOPs with the pcie bus in the equation. This performance decrease is significant for the C2075-series GPU, roughly a 15% impact for FFT and a 5% impact for Matrix Multiplication. This overhead in pcie runs is not as pronounced for the Kepler K20m test environment, with near-native performance in all cases (less than 1%).

Figure 4.2 GPU Floating Point Operations per Second

Other FLOP-based benchmarks are used to emulate higher level applications.

Stencil represents a 9-point stencil operation applied to a 2D data set, and the S3D benchmark is a computationally-intensive kernel from the S3D turbulent combustion simulation program [180]. In Figure 4.5, we see that both the Fermi C2075 and Kepler K20m GPUs perform well compared to the native base case, showing that the overhead of virtualization is low. The C2075-enabled VMs again experience slightly more overhead compared to native performance for pcie runs, but the overhead is at most 7% for the S3D benchmark.

4.6.2 Device Speed

While floating point operations allow the proposed solution to relate to many traditional HPC applications, they are just one facet of GPU performance within scientific computing.

Figure 4.3 GPU Fast Fourier Transform

Device speed, measured in both raw bandwidth and additional benchmarks, provides a different perspective towards evaluating GPU PCI passthrough in Xen. Figure 4.6 illustrates device-level memory access of various GPU device memory structures. With both Nvidia GPUs, virtualization has little to no impact on the performance of inter-device memory bandwidth. As expected, the Kepler K20m outperformed the C2075 VMs, and there was a higher variance between runs in both the native and VM cases. The Molecular Dynamics and Reduction benchmarks in Figure 4.7 again perform at near-native speeds when the pcie bus is not taken into account. However, the overhead observed increases to 10-15% when the PCI-Express bus is considered for the Fermi C2075 VMs.

Figure 4.4 GPU Matrix Multiplication

4.6.3 PCI Express Bus

Upon evaluating PCI passthrough of GPUs, it is apparent that the PCI express bus is subject to the greatest potential for overhead, as was observed in the Fermi C2075 benchmarks. The VT-d and IOMMU chip instruction sets interface directly with the

PCI bus to provide operational and security related mechanisms for each PCI device, thereby ensuring proper function in a multi-guest environment but potentially intro- ducing some overhead. As such, it is imperative to investigate any and all overhead at the PCI Express bus.

Figure 4.8 looks at maximum PCI bus speeds for each experimental implementation. First, we see a consistent overhead in the Fermi C2075 VMs, with a 14.6% performance impact for download (to-device) and a 26.7% impact in readback (from-device).

Figure 4.5 GPU Stencil and S3D

However, the Kepler K20 VMs do not experience the same overhead in the

PCI-Express bus, and instead perform at near-native performance. These results in- dicate that the overhead is minimal in the actual VT-d mechanisms, and instead due to other factors.

Upon further examination, we identified that the performance degradation within the Fermi-based VM experimental setup is due to the testing environment's NUMA node configuration with the Westmere-based CPUs. With the Fermi nodes, the GPUs are connected through the PCI-Express bus to CPU Socket #1, yet the experimental GPU VM was run using CPU Socket #0. This meant that the VM's PCI-Express communication involved more host involvement with Socket #1, leading to a notable decrease in read and write speeds across the PCI-Express bus. Such architectural limitations seem to be relieved in the next generation, with a Sandy Bridge CPU architecture and the Kepler K20m experimental setup.

Figure 4.6 GPU Device Memory Bandwidth

In summary, GPU PCI-Express performance in a VM is sensitive to the host CPU's NUMA architecture, and care is needed to mitigate the impact, either by leveraging new architectures or by proper usage of Xen's VM core assignment features. Furthermore, the overhead in this system diminishes significantly when using the new Kepler GPUs by Nvidia.

4.7 Discussion

This chapter evaluates the use of general purpose GPUs within cloud computing infrastructure, primarily targeted towards advanced scientific computing. The method of PCI passthrough of GPUs directly to a guest virtual machine running on a tuned Xen hypervisor shows initial promise for a ubiquitous solution in cloud infrastructure.

Figure 4.7 GPU Molecular Dynamics and Reduction

In evaluating the results in the previous section, a number of points become clear.

First, we can see that there is a small overhead in using PCI passthrough of GPUs within VMs compared to native bare-metal usage, which represents the best possible use case. As with all abstraction layers, some overhead is usually inevitable as a necessary trade-off for added feature sets and improved usability. The same is true for GPUs within Xen VMs. The Kepler K20m GPU-enabled VMs operated at near-native performance for all runs, with at worst a 1.2% reduction in performance. The Fermi-based C2075 VMs experience more overhead due to the NUMA configuration that impacted PCI-Express performance. However, when the overhead of the PCI-

Express bus is not considered, the C2075 VMs perform at near-native speeds.

Figure 4.8 GPU PCI Express Bus Speed

GPU PCI passthrough also has the potential to perform better than other front-end API solutions that are applicable within VMs, such as gVirtuS or rCUDA. This is because such solutions are designed to communicate via a network interconnect such as 10Gb Ethernet or QDR InfiniBand [181], which introduces an inherent bottleneck.

Even with the theoretically optimal configuration of rCUDA using QDR InfiniBand, the maximum theoretical bus speed is 40 Gbps, which is considerably less than the 54.4 Gbps real-world performance measured for host-to-device transfers with GPU-enabled Kepler VMs.

Overall, it is our hypothesis that the overhead in using GPUs within Xen VMs as described in this chapter will largely go unnoticed by most mid-level scientific computing applications. This is especially true when using the latest Sandy-Bridge

CPUs with the Kepler series GPUs. We expect many mid-tier scientific computing groups to benefit the most from the ability to use GPUs in a scientific cloud infrastructure. Already this has been confirmed in [182], where a similar methodology has been leveraged specifically for Bioinformatics applications in the cloud.

4.8 Chapter Summary and Future Work

The ability to use GPUs within virtual machines represents a leap forward for supporting advanced scientific computing within cloud infrastructure. The method of direct PCI passthrough of Nvidia GPUs using the Xen hypervisor offers a clean, reproducible solution that can be implemented within many Infrastructure-as-a-Service (IaaS) deployments. Performance measurements indicate that the overhead of providing a GPU within Xen is minimal compared to the best-case native use; however, NUMA inconsistencies can impact performance. The new Kepler-based GPUs operate with a much lower overhead, making those GPUs an ideal choice when designing a new GPU IaaS system.

Next steps for this work could involve providing GPU-based PCI passthrough within the OpenStack Nova IaaS framework. This will enable research laboratories and institutions to create new private or national-scale cloud infrastructure that has the ability to support new scientific computing challenges. Other hypervisors could also leverage GPU PCI passthrough techniques and warrant in-depth evaluation in the future. Furthermore, we hope to integrate this work with advanced interconnects and other heterogeneous hardware to provide a parallel high performance cloud infrastructure that enables mid-tier scientific computing.

Chapter 5

GPU-Passthrough Performance: A Comparison of KVM, Xen, and LXC for CUDA and OpenCL Applications

5.1 Abstract

As more scientific workloads are moved into the cloud, the need for high performance accelerators increases. Accelerators such as GPUs offer improvements in both performance and power efficiency over traditional multi-core processors; however, their use in the cloud has been limited. Today, several common hypervisors support GPU-passthrough, but their performance has not been systematically characterized.

In this chapter we show that low-overhead PCI passthrough is achievable across three major hypervisors and two processor microarchitectures. We compare the performance of two generations of NVIDIA GPUs within the Xen and KVM hypervisors, and we also compare the performance to that of Linux Containers (LXC). We show that GPU passthrough to KVM achieves 98–100% of the base system's performance across two architectures, while Xen achieves 96% of the base system's performance. In addition, we describe several valuable lessons learned through our analysis and share the advantages and disadvantages of each hypervisor/PCI passthrough solution.

5.2 Introduction

As scientific workloads continue to demand increasing performance at greater power efficiency, high performance architectures have been driven towards heterogeneity and specialization. Intel’s Xeon Phi, and GPUs from both NVIDIA and AMD represent some of the most common accelerators, with each capable of delivering improved performance and power efficiency over commodity multi-core CPUs.

Infrastructure-as-a-Service (IaaS) clouds have the potential to democratize access to the latest, fastest, and most powerful computational accelerators. This is true of both public and private clouds. Yet today’s clouds are typically homogeneous without access to even the most commonly used accelerators. Historically, enabling virtual machine access to GPUs and other PCIe devices has proven complex and error-prone, with only a small subset of GPUs being certified for use within a few commercial hypervisors. This is especially true for NVIDIA GPUs, likely the most popular for scientific computing, but whose drivers have always been closed source.

Given the complexity surrounding the choice of GPUs, host systems, and hypervisors, it is perhaps no surprise that Amazon is the only major cloud provider offering customers access to GPU-enabled instances. All of this is starting to change, however, as open source and other freely available hypervisors now provide sufficiently robust PCI passthrough functionality to enable GPU and other accelerator access whether in the public or private cloud.

Today, it is possible to access GPUs at high performance within all of the major hypervisors, merging many of the advantages of cloud computing (e.g. custom images, software defined networking, etc.) with the accessibility of on-demand accelerator hardware. Yet, no study to date has systematically compared the performance of PCI passthrough across all major cloud hypervisors. Instead, alternative solutions have been proposed that attempt to virtualize the GPU [183], but sacrifice performance.

In this chapter, we characterize the performance of both NVIDIA Fermi and Kepler GPUs operating in PCI passthrough mode in Linux KVM, Xen, and Linux Containers (LXC). Through a series of microbenchmarks as well as scientific and Big Data applications, we make two contributions:

1. We demonstrate that PCI passthrough at high performance is possible for GPUs across three major hypervisors.

2. We describe the lessons learned through our performance analysis, as well as the relative advantages and disadvantages of each hypervisor for GPU support.

5.3 Related Work & Background

GPU virtualization and GPU-passthrough are used within a variety of contexts, from high performance computing to virtual desktop infrastructure. Accessing one or more GPUs within a virtual machine is typically accomplished by one of two strategies: 1) via API remoting with device emulation; or 2) using PCI passthrough.

5.3.1 GPU API Remoting

rCUDA, vCUDA, GViM, and gVirtuS are well-known API remoting solutions. Fundamentally, these approaches operate similarly by splitting the driver into a front-end/back-end model, where calls into the interposed CUDA library (front-end) are sent via shared memory or a network interface to the back-end service that executes the CUDA call on behalf of the virtual machine. Notably, this technique is not limited to CUDA, but can be used to decouple OpenCL, OpenGL, and other APIs from their local GPU or accelerator.
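To make the front-end/back-end split concrete, the following minimal Python sketch forwards a named API call from an interposed stub to a remote back-end service. The function names, port, and JSON wire format are purely illustrative; production remoting frameworks such as rCUDA use optimized transports (shared memory or InfiniBand) and marshal the actual CUDA or OpenCL driver calls.

    # Illustrative sketch of the API-remoting front-end/back-end split.
    # Names, port, and wire format are hypothetical; real systems (e.g. rCUDA)
    # use optimized transports and interpose the full CUDA/OpenCL API.
    import json
    import socket

    def frontend_call(api_name, *args, host="gpu-backend", port=9999):
        """Front-end stub: marshal an API call and ship it to the back-end node."""
        with socket.create_connection((host, port)) as sock:
            sock.sendall(json.dumps({"api": api_name, "args": args}).encode())
            sock.shutdown(socket.SHUT_WR)          # signal end of request
            return json.loads(sock.makefile().read())

    def backend_serve(handlers, port=9999):
        """Back-end service: execute each forwarded call on the node that owns the GPU."""
        with socket.create_server(("", port)) as srv:
            while True:
                conn, _ = srv.accept()
                with conn:
                    request = json.loads(conn.makefile().read())
                    result = handlers[request["api"]](*request["args"])
                    conn.sendall(json.dumps(result).encode())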

The performance of API remoting depends largely on the application and the remoting solution's implementation. Bandwidth- and latency-sensitive benchmarks and applications will tend to expose performance bottlenecks more than compute-intensive applications. Moreover, solutions that rely on high speed networks, such as InfiniBand, will compete with application-level networking for bandwidth.

5.3.2 PCI Passthrough

Input/Output Memory Management Units, or IOMMUs, play a fundamental role in the PCI-passthrough virtualization mechanism. Like traditional MMUs that provide a virtual memory address space to CPUs [184], an IOMMU serves the fundamental purpose of connecting a direct memory access (DMA) capable I/O bus to main memory. The IOMMU, typically within the chipset, maps device virtual addresses to physical memory addresses. This process also has the added benefit of guaranteeing device isolation by blocking rogue DMA and interrupt requests [185], with a slight overhead, especially in early implementations [186].

Currently, two major IOMMU implementations exist: VT-d and AMD-Vi, by Intel and AMD, respectively. Both specifications provide DMA remapping to enable PCI passthrough as well as other features such as interrupt remapping, hypervisor snooping, and security control mechanisms to ensure proper and efficient hardware utilization. PCI passthrough has been studied within the context of networking [187], storage [188], and other PCI-attached devices; however, GPUs have historically lagged behind other devices in their support for virtual machine passthrough.

5.3.3 GPU Passthrough, a Special Case of PCI Passthrough

While generic PCI passthrough can be used with IOMMU technologies to pass through many PCI-Express devices, GPUs represent a special case of PCI devices, and a special case of PCI passthrough. In traditional usage, GPUs usually serve as VGA devices primarily to render screen output, and while the GPUs used in this study do not render screen output, the legacy VGA function still exists. In GPU passthrough, another VGA device (such as onboard graphics built into the motherboard, or a baseboard management controller) is necessary to serve as the primary display for the host, as well as providing emulated VGA devices for each guest VM. Most GPUs also have a video BIOS that requires full initialization and reset functions, which is often difficult due to the proprietary nature of the cards and their drivers.

Nevertheless, for applications that require native or near-native GPU performance across the full spectrum of workloads, with immediate access to the latest GPU drivers and compilers, GPU passthrough solutions are preferable to API remoting.

Today, Citrix XenServer, open source Xen [189], VMWare ESXi [190], and, most recently, KVM all support GPU passthrough. To our knowledge, no one has systematically characterized the performance of GPU passthrough across a range of hypervisors, across such a breadth of benchmarks, and across multiple GPU generations as we do.

Table 5.1 Host hardware configurations

                 Bespin              Delta
CPU (cores)      2x E5-2670 (16)     2x X5660 (12)
Clock Speed      2.6 GHz             2.6 GHz
RAM              48 GB               192 GB
NUMA Nodes       2                   2
GPU              1x K20m             2x C2075

5.4 Experimental Methodology

5.4.1 Host and Hypervisor Configuration

We used two hardware systems, named Bespin and Delta, to evaluate hypervisors. The Bespin system at USC/ISI represents Intel's Sandy Bridge microarchitecture with a Kepler-class K20 GPU. The Delta system, provided by the FutureGrid project [191], represents the Westmere microarchitecture with a Fermi-class C2075 GPU. Table 5.1 provides the major hardware characteristics of both systems. Note that, in addition, both systems include 10 gigabit Ethernet, gigabit Ethernet, and either FDR or QDR InfiniBand. Our experiments do not emphasize networking, and we use the gigabit Ethernet network for management only.

A major design goal of these experiments was to reduce or eliminate NUMA (non-uniform memory access) effects on the PCI passthrough results in order to facilitate fair comparisons across hypervisors and to reduce experimental noise. To this end, we configured our virtual machines and containers to execute only on the NUMA node containing the GPU under test. We acknowledge that the NUMA effects on virtualization may be interesting in their own right, but they are not the subject of this set of experiments.

We use Bespin and Delta to evaluate two hypervisors and one container-based approach to GPU passthrough. The hypervisors and container system, Xen, KVM, and LXC, are summarized in Table 5.2. Note that each virtualization solution imposes its own unique requirements on the base operating system. Hence, Xen's Linux kernel refers to the Domain-0 kernel, whereas KVM's Linux kernel represents the actual running kernel hosting the KVM hypervisor. Linux Containers share a single kernel between the host and guests.

It is worth noting that these experiments were originally set up, configured, and tested with the VMWare ESXi hypervisor as well. However, we found that fundamental limitations with the reliability and accuracy of the VMWare virtual Time Stamp Counter (TSC) [192] can lead to significant deviations in performance reporting. Because we were not confident in VMWare's ability to accurately report fair timing measurements, we decided to omit the VMWare results from this chapter. For more information on time keeping in virtual machines, please refer to Section 7.3.

Similarly, hypervisor requirements prevented us from standardizing on a single host operating system. For Xen and KVM, we relied on the Arch Linux 2013.10.01 distribution because it provides easy access to the mainline Linux kernel. For our LXC tests, we use CentOS 6.4 because its shared kernel was identical to the base CentOS 6.4 kernel used in our testing. All of this makes comparison challenging, but as we describe in Section 5.4.2, we are running a common virtual machine across all experiments.

Table 5.2 Host Hypervisor/Container Configuration

Hypervisor       Linux Kernel        Linux Version
KVM              3.12                Arch 2013.10.01
Xen 4.3.0-7      3.12                Arch 2013.10.01
LXC              2.6.32-358.23.2     CentOS 6.4

5.4.2 Guest Configuration

We treat each hypervisor as its own system, and compare virtual machine guests to a base CentOS 6.4 system. The base system and the guests all run a CentOS 6.4 installation with a 2.6.32-358.23.2 stock kernel and CUDA version 5.5.

Each guest is allocated 20 GB of RAM and a full CPU socket (either 6 or 8 CPU cores). Bespin experiments received 8 cores and Delta experiments received 6 cores.

VMs were restricted to a single NUMA node. On the Bespin system, the K20m GPU was attached to NUMA node 0. On the Delta system, the C2075 GPU was attached to NUMA node 1. Hence VMs ran on NUMA node 0 for the Bespin experiments, and node 1 for the Delta experiments.
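On Linux, the NUMA placement used above can be derived directly from sysfs before the guests are defined; the short sketch below (with a hypothetical PCI address) looks up a GPU's NUMA node and the CPU list local to it, which can then be passed to the hypervisor's vCPU-pinning options.

    # Sketch: discover which NUMA node a GPU is attached to and which host CPUs
    # are local to that node, so the VM or container can be pinned accordingly.
    # The PCI address is hypothetical; substitute the GPU under test.
    from pathlib import Path

    def gpu_numa_node(pci_addr="0000:03:00.0"):
        """NUMA node the kernel reports for the PCI device (-1 if unknown)."""
        return int(Path(f"/sys/bus/pci/devices/{pci_addr}/numa_node").read_text())

    def node_cpulist(node):
        """CPU list string for a NUMA node, e.g. '0-7' or '8-15'."""
        return Path(f"/sys/devices/system/node/node{node}/cpulist").read_text().strip()

    if __name__ == "__main__":
        node = gpu_numa_node()
        print(f"GPU on NUMA node {node}; local CPUs: {node_cpulist(node)}")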

5.4.3 Microbenchmarks

Our experiments are composed of a mix of microbenchmarks and application-level benchmarks, as well as a combination of CUDA and OpenCL benchmarks. The SHOC benchmark suite provides a series of microbenchmarks in both OpenCL and CUDA [193]. For this analysis, we focus on the OpenCL benchmarks in order to exercise multiple programming models. Benchmarks range from low-level peak Flops and bandwidth measurements to kernels and mini-applications.

5.4.4 Application Benchmarks

For our application benchmarks, we have chosen the LAMMPS molecular dynamics simulator [194], GPU-LIBSVM [195], and the LULESH shock hydrodynamics simulator [196]. These represent a range of computational characteristics, from computational physics to big data analytics, and are representative of GPU-accelerated applications in common use.

LAMMPS The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a parallel molecular dynamics simulator [194, 197] used for production MD simulation on both CPUs and GPUs [198]. LAMMPS has two packages for GPU support, the USER-CUDA and GPU packages. With the USER-CUDA package, each GPU is used by a single CPU, whereas the GPU package allows multiple CPUs to take advantage of a single GPU. There are performance trade-offs with both approaches, but we chose to use the GPU package in order to stress the virtual machine by exercising multiple CPUs. Consistent with existing GPU benchmarking approaches, our results are based on the Rhodopsin protein.

GPU-LIBSVM LIBSVM is a popular implementation [199] of the support vector machine (SVM) machine learning classification algorithm. GPU-accelerated LIBSVM [195] enhances LIBSVM by providing GPU implementations of the kernel matrix computation portion of the SVM algorithm for radial basis kernels. For benchmarking purposes we use the NIPS 2003 feature extraction gisette data set. This data set has a high-dimensional feature space and a large number of training instances, qualities that are known to make SVM model generation computationally intensive. The GPU-accelerated SVM implementation shows dramatic improvement over the CPU-only implementation.
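Because GPU-LIBSVM is designed as a drop-in replacement for the standard LIBSVM command-line tools, a benchmark run reduces to timing an RBF-kernel training pass over the gisette data. The wrapper below is a sketch only: the binary path, data file name, and cost/gamma values are hypothetical placeholders rather than the settings used in this study.

    # Sketch: time an RBF-kernel training run with a LIBSVM-style svm-train tool.
    # Binary path, data file, and the -c/-g values are illustrative placeholders.
    import subprocess
    import time

    def time_svm_train(binary="./svm-train", data="gisette_scale", c=16, gamma=0.001):
        start = time.perf_counter()
        # -t 2 selects the radial basis function (RBF) kernel in LIBSVM's CLI.
        subprocess.run([binary, "-t", "2", "-c", str(c), "-g", str(gamma), data],
                       check=True)
        return time.perf_counter() - start

    if __name__ == "__main__":
        print(f"training time: {time_svm_train():.1f} s")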

Table 5.3 SHOC overheads for Bespin (K20) expressed as geometric means of scaled values within a level, while maximum overheads are expressed as a percentage.

Bespin (K20)
         KVM              Xen              LXC
     Mean     Max     Mean     Max     Mean     Max
L0   0.999    1.57    0.997    3.34    1.00     1.77
L1   0.998    1.23    0.998    1.39    1.00     1.47
L2   0.998    0.48    0.995    0.846   0.999    1.90

Bespin PCIe-only
         KVM              Xen              LXC
     Mean     Max     Mean     Max     Mean     Max
L0   0.997    0.317   0.999    0.143   0.995    0.981
L1   0.998    0.683   0.997    1.39    1.00     0.928
L2   0.998    0.478   0.996    0.846   1.00     0.247

LULESH Hydrodynamics is widely used to model continuum properties and interactions in materials when there is an applied force [200]. Hydrodynamics applications consume approximately one third of the runtime of data center resources throughout the U.S. DoD (Department of Defense). The Livermore Unstructured Lagrange Explicit Shock Hydro (LULESH) application was developed by Lawrence Livermore National Laboratory as one of five challenge problems in the DARPA UHPC program. LULESH is widely used as a proxy application in the U.S. DOE (Department of Energy) co-design effort for exascale applications [196].

5.5 Performance Results

We characterize GPGPU performance within virtual machines across two hardware systems, three virtualization platforms, and three application sets. We begin with the SHOC benchmark suite before describing the GPU-LIBSVM, LAMMPS, and LULESH results.

Table 5.4 SHOC overheads for Delta (C2075) expressed as geometric means of scaled values within a level, while maximum overheads are expressed as a percentage.

Delta (C2075)
         KVM              Xen              LXC
     Mean     Max     Mean     Max     Mean     Max
L0   1.01     0.031   0.969    12.7    1.00     0.073
L1   1.00     1.45    0.959    24.0    1.00     0.663
L2   1.00     0.101   0.982    4.60    1.00     0.016

Delta PCIe-only
         KVM              Xen              LXC
     Mean     Max     Mean     Max     Mean     Max
L0   1.04     0.029   0.889    12.7    1.00     0.01
L1   1.00     1.45    0.914    20.5    0.999    0.380
L2   1.00     0.075   0.918    4.60    1.00     N/A

All benchmarks are run 20 times and averaged. Results are scaled with respect to a base CentOS 6.4 system on each machine. That is, we compare virtualized Bespin performance to non-virtualized Bespin performance, and virtualized Delta performance to non-virtualized Delta performance. Values less than 1 indicate that the base system outperformed the virtual machine, while values greater than 1 indicate that the virtual machine outperformed the base system. In cases where we present geometric means across multiple benchmarks, the means are taken over these scaled values, and the semantics are the same: less than 1 indicates overhead in the hypervisor, while greater than 1 indicates a performance increase over the base system.
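The scaling and aggregation just described can be reproduced in a few lines; the sample numbers below are placeholders rather than measured values.

    # Sketch: scale virtualized results against the base system and summarize a
    # SHOC level with a geometric mean, mirroring Tables 5.3 and 5.4.
    # The sample (base, virtualized) pairs are placeholders, not measurements.
    import math

    def scaled(base, virt, higher_is_better=True):
        """>1 means the VM beat the base system, <1 means it trailed it."""
        return virt / base if higher_is_better else base / virt

    def geometric_mean(values):
        return math.exp(sum(math.log(v) for v in values) / len(values))

    level1 = [scaled(b, v) for b, v in [(151.0, 150.7), (89.2, 89.0), (42.5, 42.6)]]
    max_overhead_pct = (1 - min(level1)) * 100      # negative if the VM was faster
    print(f"Level 1 mean: {geometric_mean(level1):.3f}, max overhead: {max_overhead_pct:.2f}%")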

5.5.1 SHOC Benchmark Performance

SHOC splits its benchmarks into three levels, named Levels 0 through 2. Level 0 represents device-level characteristics: peak Flop/s, bandwidth, etc. Level 1 characterizes computational kernels: FFT and matrix multiplication, among others. Finally, Level 2 includes "mini-applications," in this case an implementation of S3D, a computational chemistry application.

Because the SHOC OpenCL benchmarks report more than 70 individual microbenchmarks, space does not allow us to show each benchmark individually. Instead, we start with a broad overview of SHOC's performance across all benchmarks, hypervisors, and systems. We then discuss in more detail those benchmarks that either outperformed or underperformed the Bespin (K20) system by 0.5% or more. We call these benchmarks outliers. As we will show, those outlier benchmarks identified on the Bespin system also tend to exhibit comparable characteristics on the Delta system, but the overhead is typically higher.

In Tables 5.3 and 5.4, we provide geometric means for each SHOC level across each hypervisor and system. We also include the maximum overhead for each hypervisor at each level to facilitate comparison across hypervisors and systems. Finally, we provide a comparable breakdown of only the PCIe SHOC benchmarks, based on the intuition that PCIe-specific benchmarks will likely result in higher overhead.

At a high level, we immediately notice that both KVM and LXC perform very near native across both the Bespin and Delta platforms. On average, these systems are almost indistinguishable from their non-virtualized base systems, so much so that experimental noise occasionally boosts performance slightly above their base systems.

This is in sharp contrast to the Xen hypervisor, which performs well on the Bespin system, but poorly on the Delta system in some cases. This is particularly evident when looking at the maximum overheads for Xen across both systems. In this case, we see that on the Bespin system, Xen's maximum overhead of 3.34% is dwarfed by Delta's maximum Xen overhead of 24.0%. We provide a more in-depth discussion of these overheads below.

Figure 5.1 SHOC Levels 0 and 1 relative performance on the Bespin system. Results show benchmarks over- or under-performing by 0.5% or greater. Higher is better.

Of the Level 0 benchmarks, only four exhibited notable overhead in the case of the Bespin system: bspeed_download, lmem_readbw, tex_readbw, and ocl_queue. These are shown in Figure 5.1. These benchmarks represent device-level characteristics, including host-to-device bandwidth, onboard memory reading, and OpenCL kernel queue delay. Of the four, only bspeed_download incurs a statistically significant overhead. The remainder perform within one standard deviation of the base, despite an overhead of greater than 0.5%.

Figure 5.2 SHOC Levels 1 and 2 relative performance on the Bespin system. Results show benchmarks over- or under-performing by 0.5% or greater. Higher is better.

bspeed_download is representative of the most likely source of virtualization overhead: data movement across the PCI-Express bus. The PCIe bus lies at the boundary of the virtual machine and the physical GPU, and requires interrupt remapping, IOMMU interaction, etc. in order to enable GPU passthrough into the virtual machine. Despite this, in Figure 5.1 we see a maximum of less than 0.5% overhead for bspeed_download for both KVM and Xen.
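A minimal host-to-device bandwidth test in the spirit of bspeed_download can be run inside a guest with PyCUDA, assuming the CUDA driver and PyCUDA are installed there; the transfer size and iteration count below are arbitrary, and a production benchmark would use page-locked host memory to reach peak rates.

    # Sketch: measure host-to-device PCIe transfer bandwidth, in the spirit of
    # SHOC's bspeed_download. Assumes PyCUDA and a visible GPU in the guest;
    # buffer size and iteration count are arbitrary choices.
    import numpy as np
    import pycuda.autoinit              # creates a context on the default GPU
    import pycuda.driver as drv

    def htod_bandwidth_gbps(nbytes=256 * 1024 * 1024, iterations=10):
        host_buf = np.zeros(nbytes, dtype=np.uint8)   # pageable; pagelocked_zeros() for peak
        dev_buf = drv.mem_alloc(nbytes)
        start, stop = drv.Event(), drv.Event()
        start.record()
        for _ in range(iterations):
            drv.memcpy_htod(dev_buf, host_buf)
        stop.record()
        stop.synchronize()
        seconds = start.time_till(stop) / 1000.0      # time_till() returns milliseconds
        return nbytes * iterations / seconds / 1e9

    if __name__ == "__main__":
        print(f"host-to-device bandwidth: {htod_bandwidth_gbps():.2f} GB/s")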

The remainder of Figure 5.1 includes a series of SHOC's Level 1 benchmarks, representing computational kernels. This includes BFS, FFT, molecular dynamics, and reduction kernels. Notably, nearly all of the benchmarks exhibiting overhead belong to the PCIe portion of SHOC's benchmark suite. This is unsurprising, since the Level 0 benchmarks suggest PCIe bandwidth as the major source of overhead. Still, the results remain consistent with the bspeed_download overhead observed in the Level 0 benchmarks, further suggesting that host/device data movement is the major source of overhead.

In Figure 5.2 we present the remaining SHOC Level 1 results as well as the SHOC Level 2 results (S3D).

Figure 5.3 SHOC Levels 0 and 1 relative performance on the Delta system. Benchmarks shown are the same as Bespin's. Higher is better.

Turning to the Delta system, Figures 5.3 and 5.4 show the same benchmarks as were shown for Bespin in Figures 5.1 and 5.2. Again, we find that the same benchmarks are responsible for most of the overhead on the Delta system. This is unsurprising, since PCIe was shown to be the source of the bulk of the overhead. A major difference in the case of the Delta system, however, is the amount of overhead. While the Bespin system saw overheads of approximately 1%, Delta's overhead routinely jumps higher with Xen.

Figure 5.4 SHOC Levels 1 and 2 relative performance on the Delta system. Benchmarks shown are the same as Bespin's. Higher is better.

On further examination, we determined that Xen was unable to activate IOMMU large page tables on the Delta system. KVM successfully allocated 4k, 2M, and 1G page table sizes, while Xen was limited to 4k pages. The Bespin system was able to take advantage of 4k, 2M, and 1G page sizes on both KVM and Xen. It appears that this issue is correctable and does not represent a fundamental limitation of the Xen hypervisor on the Nehalem/Westmere microarchitecture.

In light of this, we broadly find that virtualization overhead across hypervisors and architectures is minimal. Questions remain as to the source of the exceptionally high overhead in the case of Xen on the Delta system, but because KVM shows no evidence of this overhead, we believe the Westmere/Fermi architecture to be suitable for VGA passthrough in a cloud environment. In the case of the Bespin system, it is clear that VGA passthrough can be achieved across hypervisors with virtually no overhead.

A surprising finding is that LXC showed little performance advantage over KVM, Xen, and VMWare. While we expected near-native performance from LXC, we did not expect the hardware-assisted hypervisors to achieve such high performance. Still, LXC carries some advantages. In general, its maximum overheads are comparable to or lower than those of KVM or Xen, especially in the case of the Delta system.

5.5.2 GPU-LIBSVM Performance

Figure 5.5 GPU-LIBSVM relative performance on Bespin system. Higher is better.

In Figures 5.5 and 5.6, we display our GPU-LIBSVM performance results for the Bespin (K20m) and Delta (C2075) systems. Using the NIPS 2003 gisette data set, a standard dataset used as part of the NIPS 2003 feature selection challenge, we show the performance across four problem sizes, ranging from 1800 to 6000 training instances.

Figure 5.6 GPU-LIBSVM relative performance on the Delta system, showing improved performance within virtual environments. Higher is better.

KVM again performs well across both the Delta and Bespin systems. In the case of the Delta system, in fact, KVM significantly outperforms the base system. We determined this to be caused by KVM's support for transparent hugepages. When sufficient memory is available, transparent hugepages may be used to back the entirety of the guest VM's memory. Hugepages have previously been shown to improve TLB performance for KVM guests, and have been shown to occasionally boost the performance of a KVM guest beyond that of its host [201]. After disabling hugepages on the KVM host for the Delta system, performance dropped to 80–87% of the base system, suggesting that memory optimizations such as transparent hugepages can substantially improve the performance of virtualized guests under some circumstances. LXC performs close to the base system, while Xen achieves between 72–90% of the base system's performance. We speculate that this may be related to Xen's inability to enable page sizes larger than 4k.
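Whether transparent hugepages are active on a KVM host can be checked, and toggled, through the standard sysfs control file; the sketch below is illustrative, and writing the policy requires root.

    # Sketch: inspect (and optionally change) the transparent hugepage policy on
    # a KVM host via sysfs. Reading is unprivileged; writing requires root.
    from pathlib import Path

    THP_CTRL = Path("/sys/kernel/mm/transparent_hugepage/enabled")

    def thp_status():
        """The active policy appears in brackets, e.g. '[always] madvise never'."""
        return THP_CTRL.read_text().strip()

    def set_thp(policy):
        """Set the policy to 'always', 'madvise', or 'never' (root only)."""
        THP_CTRL.write_text(policy)

    if __name__ == "__main__":
        print(f"transparent hugepages: {thp_status()}")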

5.5.3 LAMMPS Performance

Figure 5.7 LAMMPS Rhodopsin benchmark relative performance for Bespin system. Higher is better.

In Figures 5.7 and 5.8, we show the LAMMPS Rhodopsin protein simulation results. LAMMPS is unique among our benchmarks in that it exercises both the GPU and multiple CPU cores. In keeping with the LAMMPS benchmarking methodology, we execute the benchmark using 1 GPU and 1–8 CPU cores on the Bespin system, selecting the highest performing configuration. In the case of the Delta system, we execute the benchmark on 1–6 cores and 1 GPU, selecting the highest performing configuration.

Figure 5.8 LAMMPS Rhodopsin benchmark relative performance for Delta system. Higher is better.

Overall, LAMMPS performs well across both hypervisors and systems. Surprisingly, LAMMPS showed better efficiency on the Delta system than the Bespin system, achieving greater than 98% efficiency across the board, while Xen on the Bespin system occasionally drops as low as 96.5% efficiency.

This performance is encouraging because it suggests that even heterogeneous CPU + GPU code is capable of performing well in a virtualized environment. Unlike SHOC, GPU-LIBSVM, and LULESH, LAMMPS fully exercises multiple host CPUs, splitting work between one or more GPUs and one or more CPU cores. This has the potential to introduce additional performance overhead, but the results do not bear this out in the case of LAMMPS.

5.5.4 LULESH Performance

In Figure 5.9 we present our LULESH results for problem sizes ranging from mesh resolutions of N = 30 to N = 150. LULESH is a highly compute-intensive simulation, with limited data movement between the host/virtual machine and the GPU, making it ideal for GPU acceleration. Consequently, we would expect little overhead due to virtualization. We show LULESH results only on the Bespin system, because differences in the code bases between the Kepler and Fermi implementations led to unsound comparisons.

While, overall, we see very little overhead, there is a slight scaling effect that is most apparent in the case of the Xen hypervisor. As the mesh resolution (and thus the problem size, N^3) increases from N = 30 to N = 150, we see that the Xen overhead decreases until Xen performs on par with KVM and LXC.

Figure 5.9 LULESH relative performance on Bespin. Higher is better.

5.6 Lessons Learned

Virtualizing performance-critical workloads has always proven controversial, whether the workload is CPU-only [172] or CPU with GPU. From our Westmere results, we can see that this criticism is in part legitimate, at times resulting in a performance penalty of greater than 10%, especially in the case of the Xen hypervisor. We believe much of this to be fixable with further software refinement.

At the same time, however, we have shown that the Sandy Bridge processor generation has nearly erased those performance overheads, suggesting that old arguments against virtualization for performance-critical tasks should be reconsidered. In light of this, the primary lesson from this study is that VGA passthrough to virtual machines is achievable at low overhead, and across a variety of hypervisors and virtualization platforms. Virtualization performance remains inconsistent across hypervisors for the Westmere generation of processors, but starting with the Sandy Bridge architecture, performance and consistency increase dramatically. In the case of the Sandy Bridge architecture, even the lowest performing hypervisor, open source Xen, typically performs within 95% of the base case.

This study has also yielded valuable insight into the merits of each hypervisor. KVM consistently yielded near-native performance across the full range of benchmarks. Its support for transparent hugepages resulted in slight performance boosts over and above even the base CentOS system in the case of the Delta system.

The Xen hypervisor was consistently average across both architectures, performing neither poorly nor extraordinarily well in any individual benchmark. Xen is the only hypervisor from this study that officially supports VGA passthrough. As a result, PCI passthrough support in Xen is more robust than in KVM. We expect that this advantage will not last long, as commercial solutions targeting PCI passthrough in KVM are becoming common, particularly with regard to SR-IOV and networking adapters.

Linux Containers (LXC) consistently performed closest to the native case. This, of course, is not surprising given that LXC guests share a single kernel with their hosts. This performance comes at the cost of both flexibility and security, however. LXC is less flexible than its full virtualization counterparts, offering support for only Linux guests. More importantly, LXC device passthrough has security implications for multi-GPU systems. In the case of a multi-GPU-enabled host with NVIDIA hardware, both GPUs must be passed to the LXC guest in order to initialize the driver. This limitation may be addressable in future revisions to the NVIDIA driver.

5.7 Directions for Future Work

In this chapter we have characterized the performance of common hypervisors across two generations of GPUs and two host microarchitectures, and across three sets of benchmarks. We showed the dramatic improvement in virtualization performance between the Fermi/Westmere and the Kepler/Sandy Bridge systems, with the Sandy Bridge system typically performing within 1% of the base system. In addition, this study sought to characterize GPU and CPU+GPU performance with carefully tuned hypervisor and guest configurations, especially with respect to NUMA. Improvements must be made to today's hypervisors in order to improve virtual NUMA support. Finally, cloud infrastructure, such as OpenStack, must be capable of automatically allocating virtual machines in a NUMA-friendly manner in order to achieve acceptable results at cloud scale.

The next step in this work is to move beyond the single node to show that clusters of accelerators can be efficiently used with minimal overhead. This will require studies in high speed networking, particularly SR-IOV-enabled Ethernet and InfiniBand. Special attention is needed to ensure that latencies remain tolerable within virtual environments. Some studies have begun to examine these issues [202], but open questions remain.

Chapter 6

Supporting High Performance Molecular Dynamics in Virtualized Clusters using IOMMU, SR-IOV, and GPUDirect

6.1 Abstract

Cloud Infrastructure-as-a-Service paradigms have recently shown their utility for a vast array of computational problems, ranging from advanced web service architectures to high throughput computing. However, many scientific computing applications have been slow to adapt to virtualized cloud frameworks. This is due to the performance impacts of virtualization technologies, coupled with the lack of advanced hardware support necessary for running many high performance scientific applications at scale.

By using KVM virtual machines that leverage both Nvidia GPUs and InfiniBand, we show that molecular dynamics simulations with LAMMPS and HOOMD run at near-native speeds. This experiment also illustrates how virtualized environments can support the latest parallel computing paradigms, including both MPI+CUDA and new GPUDirect RDMA functionality. Specific findings show initial promise in scaling such applications to larger production deployments targeting large scale computational workloads.

6.2 Introduction

At present we stand at the inevitable intersection between High Performance Computing (HPC) and clouds. Various platform tools such as Hadoop and MapReduce, among others, have already percolated into data-intensive computing within HPC [26].

In addition, there are efforts to support traditional HPC-centric scientific computing applications in virtualized cloud infrastructure. There are a multitude of reasons for supporting parallel computation in the cloud [59], including features such as dynamic scalability, specialized operating environments, simple management interfaces, fault tolerance, and enhanced quality of service, to name a few. The growing importance of supporting advanced scientific computing using virtualized infrastructure can be seen in a variety of new efforts, including the NSF-funded Comet resource, part of XSEDE, at the San Diego Supercomputer Center [203].

Nevertheless, there exists a longstanding notion that the virtualization used in today's cloud infrastructure is inherently inefficient. Historically, cloud infrastructure has also done little to provide the advanced hardware capabilities that have become almost mandatory in supercomputers today, most notably advanced GPUs and high-speed, low-latency interconnects. These notions have hindered the use of virtualized environments for parallel computation, where performance must be paramount.

A growing effort is currently underway that looks to systematically identify and reduce any overhead in virtualization technologies. This effort has, thus far, proven to be a qualified success [172, 204], though further research is needed to address issues of scalability and I/O. Thus, we see a constantly diminishing overhead with virtualization, not only with traditional cloud workloads [205] but also with HPC workloads. While virtualization will almost always include some additional overhead in relation to its dynamic features, the eventual goal for supporting HPC in virtualized environments is to minimize what overhead exists whenever possible. To advance the placement of HPC applications on virtual machines, new efforts are emerging which focus specifically on key hardware now commonplace in supercomputers. By leveraging new virtualization tools such as IOMMU device passthrough and SR-IOV, we can now support such advanced hardware as the latest Nvidia Tesla GPUs [206] as well as InfiniBand fabrics for high performance networking and I/O [207, 208].

With the advances in hypervisor performance coupled with the newfound availability of HPC hardware in virtual machines analogous to the most powerful supercomputers used today, we can see the possibility of a high performance cloud infrastructure using virtualization. While our previous efforts in this area have focused on single-node advancements, it is now imperative to ensure real-world applications can also operate in distributed environments as found in today's cluster and cloud infrastructures.

Efforts to improve power efficiency and performance in data centers have led to more heterogeneous architectures. That move toward heterogeneity has, in turn, led to support for heterogeneity in the cloud. For example, Amazon supports GPU accelerators in EC2 [209], and OpenStack supports heterogeneity using flavors [210].

These advancements in cloud-level support for heterogeneity, combined with better support for high-performance virtualization, make the use of the cloud for HPC much more feasible for a wider range of applications and platforms.

In this chapter we first describe background and related work. Then, we describe a heterogeneous cloud platform based on OpenStack. This effort has been under development at USC/ISI since 2011 [166]. We describe our work towards integrating GPU and InfiniBand support into OpenStack, and we describe the heterogeneous scheduling additions that are necessary to support not only attached accelerators, but any cloud composed of heterogeneous elements.

We then demonstrate running two molecular dynamics simulations, LAMMPS and HOOMD, in a virtual infrastructure complete with both Kepler GPUs and QDR InfiniBand. Both HOOMD and LAMMPS are used extensively on some of the world's fastest supercomputers and represent example simulations that HPC supports today.

We show that these applications are able to run at near-native speeds within a completely virtualized environment, demonstrating only small performance impacts that are generally acceptable to users. Furthermore, we demonstrate the ability of such a virtualized environment to support cutting-edge software tools such as GPUDirect RDMA, illustrating how cutting-edge HPC technologies are also possible in a virtualized environment.

Following these efforts, we hope to ensure upstream infrastructure projects such as OpenStack [131, 211] are able to make effective and quick use of these features, allowing users to build private cloud infrastructure to support high performance distributed computational workloads.

6.3 Background and Related Work

Virtualization technologies and hypervisors have seen widespread deployment in support of a vast array of applications. This ranges from public commercial cloud deployments such as Amazon EC2 [212, 213], Microsoft Azure [214], and Google's Cloud Platform [215] to private deployments within colocation facilities, corporate data centers, and even national-scale cyberinfrastructure initiatives. All of these deployments look to support various use cases and applications such as web servers, ACID and BASE databases, online object storage, and even distributed systems, to name a few.

The use of virtualization and hypervisors specifically to support various HPC solutions has been studied with mixed results. In [172], it is found that there is a great deal of variance between hypervisors when running various distributed memory and MPI applications, with KVM overall performing well across an array of HPC benchmarks. Furthermore, some applications may not fit well into default virtualized environments, such as High Performance Linpack [204]. Other studies have specifically looked at interconnect performance in virtualization and found the best-case scenario to be lacking [216], with up to 60% performance penalties using conventional techniques.

Recently, various CPU architectures have added support for I/O virtualization mechanisms in the CPU ISA through the use of an I/O memory management unit (IOMMU). Often, this is referred to as PCI Passthrough, as it enables devices on the PCI-Express bus to be passed directly to a specific virtual machine (VM). Specific hardware implementations include Intel's VT-d [217] and AMD's IOMMU [218] for x86_64 architectures, and, even more recently, the ARM System MMU [219]. All of these implementations effectively look to aid in the usage of DMA-capable hardware within a specific virtual machine. Using these features, a wide array of hardware can be utilized directly within VMs, enabling fast and efficient computation and I/O capabilities.

With PCI Passthrough, a PCI device is handed directly to a running (or booting) VM, thereby relinquishing control of the device within the host entirely. This is different from typical VM usage where hardware is emulated in the host and used in a guest VM, such as with bridged Ethernet adapters or emulated VGA devices.

Performing PCI Passthrough requires the host to seize the device upon boot using a specialized driver to effectively block normal driver initialization. In the instance of the KVM hypervisor, this is done using the vfio and pci-stub drivers. This driver then relinquishes control to the VM, whereby normal device drivers initialize the hardware and enable the device for use by the guest OS.
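The host-side seizure of a device can be performed entirely through sysfs on recent kernels. The sketch below outlines the usual unbind-and-rebind sequence for the vfio-pci driver; the PCI address is hypothetical, root privileges are required, every device in the same IOMMU group must be treated the same way, and production deployments typically let libvirt or kernel boot parameters perform these steps.

    # Sketch: detach a PCI device from its host driver and hand it to vfio-pci so
    # it can be passed through to a VM. The address is hypothetical; root is
    # required, and all devices in the same IOMMU group need the same treatment.
    from pathlib import Path

    PCI_ROOT = Path("/sys/bus/pci")

    def bind_to_vfio(pci_addr="0000:03:00.0"):
        dev = PCI_ROOT / "devices" / pci_addr
        driver = dev / "driver"
        if driver.exists():
            # Release the device from whichever driver claimed it at boot.
            (driver / "unbind").write_text(pci_addr)
        # Restrict the device to vfio-pci, then ask the PCI core to re-probe it.
        (dev / "driver_override").write_text("vfio-pci")
        (PCI_ROOT / "drivers_probe").write_text(pci_addr)

    if __name__ == "__main__":
        bind_to_vfio()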

6.3.1 GPU Passthrough

Nvidia GPUs comprise the single most common accelerator in the November 2014 Top 500 list [8] and represent an increasing shift towards accelerators for HPC applications. Historically, GPU usage in a virtualized environment has been difficult, especially for scientific computation. Various front-end remote API implementations have been developed to provide CUDA and OpenCL libraries in VMs, which translate library calls to a back-end or remote GPU. One common example is rCUDA [56], which provides a front-end CUDA API within a VM or any compute node, and then sends the calls via Ethernet or InfiniBand to a separate node with one or more GPUs. While this method is valid, it has the drawback of relying on the interconnect itself and the bandwidth available, which can be especially problematic on Ethernet. Furthermore, as this method consumes bandwidth, it can leave little remaining for MPI or RDMA routines, thereby creating a bottleneck for some MPI+CUDA applications that depend on inter-process communication.

Recently, efforts have emerged to support such GPU accelerators within VMs using IOMMU technologies, with implementations now available with KVM [206], Xen [220], and VMWare [221]. These efforts have shown that GPUs can achieve up to 99% of their bare metal performance when passed to a virtual machine using PCI Passthrough. VMWare specifically shows how such PCI Passthrough solutions perform well and are likely to outperform front-end remote API solutions such as rCUDA within a VM [221]. While these works demonstrate PCI Passthrough performance across a range of hypervisors and GPUs, they had been limited to investigating single-node performance until now.

6.3.2 SR-IOV and InfiniBand

With almost all parallel HPC applications, the interconnect fabric which enables fast and efficient communication between processors becomes a central requirement for achieving good performance. Specifically, a high bandwidth link is needed for distributed processors to share large amounts of data across the system. Furthermore, low latency becomes equally important for ensuring quick delivery of small message communications and resolving large collective barriers within many parallelized codes. One such interconnect, InfiniBand, has become the most common implementation used within the Top500 list. Previously, however, InfiniBand was inaccessible to virtualized environments.

Supporting I/O interconnects in VMs has been aided by Single Root I/O Virtualization (SR-IOV), whereby multiple virtual PCI functions are created in hardware to represent a single PCI device. These virtual functions (VFs) can then be passed to a VM and used by the guest as if it had direct access to that PCI device. SR-IOV allows the virtualization and multiplexing to be done within the hardware, effectively providing higher performance and greater control than software solutions.

SR-IOV has been used in conjunction with Ethernet devices to provide high performance 10Gb TCP/IP connectivity within VMs [222], offering near-native bandwidth and advanced QoS features not easily obtained through emulated Ethernet offerings. Currently, Amazon EC2 offers a high performance VM solution utilizing SR-IOV-enabled 10Gb Ethernet adapters. While SR-IOV-enabled 10Gb Ethernet solutions offer a big step forward in performance, Ethernet still does not offer the high bandwidth or low latency typically found with InfiniBand solutions.

Recently, SR-IOV support for InfiniBand has been added by Mellanox in the ConnectX series adapters. Initial evaluation of SR-IOV InfiniBand within KVM VMs has found point-to-point bandwidth to be near-native, but with up to 30% latency overhead for very small messages [207, 223]. However, even with the noted overhead, this still signifies up to an order of magnitude difference in latency between InfiniBand and Ethernet within VMs. Furthermore, advanced configuration of SR-IOV-enabled InfiniBand fabrics is taking shape, with recent research showing up to a 30% reduction in the latency overhead [208]. However, real application performance had not been well understood until now.

6.3.3 GPUDirect

NVIDIA’s GPUDirect technology was introduced to reduce the overhead of data movement across GPUs [224, 225]. GPUDirect supports both networking as well as peer-to-peer interfaces for single node multi-GPU systems. The most recent imple- mentation of GPUDirect, version 3, adds support for RDMA over InfiniBand for

Kepler-class GPUs.

The networking component of GPUDirect relies on three key technologies: CUDA 5 (and up), a CUDA-enabled MPI implementation, and a Kepler-class GPU (RDMA only). Both MVAPICH and OpenMPI support GPUDirect. Support for RDMA over GPUDirect is enabled by the MPI library, given supported hardware, and does not depend on application-level changes to a user's code.
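From the application's point of view, a CUDA-aware MPI simply accepts GPU buffers wherever host buffers would normally be passed. The sketch below illustrates this with mpi4py and CuPy, assuming mpi4py is built on top of a CUDA-aware MPI such as MVAPICH2-GDR; it is illustrative only and is not taken from HOOMD or LAMMPS.

    # Sketch: exchange a GPU-resident buffer directly between two MPI ranks.
    # Assumes mpi4py built against a CUDA-aware MPI (e.g. MVAPICH2-GDR) and CuPy
    # for device arrays; illustrative only, not taken from HOOMD or LAMMPS.
    from mpi4py import MPI
    import cupy as cp

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    n = 1 << 20

    if rank == 0:
        sendbuf = cp.arange(n, dtype=cp.float32)        # lives in GPU memory
        comm.Send(sendbuf, dest=1, tag=0)               # no explicit staging to host
    elif rank == 1:
        recvbuf = cp.empty(n, dtype=cp.float32)
        comm.Recv(recvbuf, source=0, tag=0)
        print("received checksum:", float(recvbuf.sum()))

Launched with, for example, mpirun -np 2 python exchange.py, the MPI library moves the data over the interconnect (with RDMA when GPUDirect is available) without application-level changes, which is precisely the property exploited by HOOMD.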

In this chapter, our GPUDirect work focuses on GPUDirect v3 for multi-node RDMA support. We demonstrate scaling for up to 4 nodes connected via QDR InfiniBand and show that GPUDirect RDMA improves both scalability and overall performance by approximately 9% at no cost to the end user.

6.4 A Cloud for High Performance Computing

With support for GPU Passthrough, SR-IOV, and GPUDirect, we have the building blocks for a high performance, heterogeneous cloud. In addition, other common accelerators (e.g. Xeon Phi [226]) have similarly been demonstrated in virtualized environments. Our vision is of a heterogeneous cloud, supporting both high speed networking and accelerators for tightly coupled applications.

To this end we have developed a heterogeneous cloud based on OpenStack [131]. In our previous work, we demonstrated the ability to rapidly provision GPU, bare metal, and other heterogeneous resources within a single cloud [166]. Building on this effort, we have added support for GPU passthrough to OpenStack as well as SR-IOV support for both ConnectX-2 and ConnectX-3 InfiniBand devices. Mellanox separately supports an InfiniBand networking plugin for OpenStack's Neutron service [227]; however, the Mellanox plugin depends on the ConnectX-3 adapter. Our institutional requirements depend on ConnectX-2 SR-IOV support, requiring an independent implementation.

OpenStack supports services for networking (Neutron), compute (Nova), identity (Keystone), storage (Cinder, Swift), and others. Our work focuses entirely on the compute service.

Scheduling is implemented at two levels: the cloud level and the node level. In our earlier work, we developed a cloud-level heterogeneous scheduler for OpenStack, allowing scheduling based on architectures and resources [166]. In this model, the cloud-level scheduler dispatches jobs to nodes based on resource requirements (e.g. a Kepler GPU) and node-level resource availability.

At the node, a second level of scheduling occurs to ensure that resources are tracked and not over-committed. Unlike traditional cloud paradigms, devices passed into VMs cannot be over-committed. We treat devices, whether GPUs or InfiniBand virtual functions, as schedulable resources. Thus, it is the responsibility of the individual node to track committed resources and report availability to the cloud-level scheduler. For reporting, we piggyback on top of OpenStack's existing reporting mechanism to provide a low-overhead solution.
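The node-level bookkeeping reduces to a simple claim-and-release model for devices that cannot be over-committed. The class below is an illustration of that idea only, not the actual OpenStack code.

    # Sketch: node-level tracking of passthrough devices (GPUs, InfiniBand VFs).
    # Unlike vCPUs or memory, these cannot be over-committed, so each device is
    # claimed exclusively. Illustrative only; not the actual OpenStack code.
    class PassthroughDeviceTracker:
        def __init__(self, devices):
            # e.g. {"gpu": ["0000:03:00.0"], "ib_vf": ["0000:06:00.1", "0000:06:00.2"]}
            self.free = {kind: list(addrs) for kind, addrs in devices.items()}
            self.claimed = {}                  # instance_id -> [(kind, pci_addr), ...]

        def claim(self, instance_id, kind):
            """Reserve one device of the requested kind, or fail if none remain."""
            if not self.free.get(kind):
                raise RuntimeError(f"no free {kind} devices on this node")
            addr = self.free[kind].pop()
            self.claimed.setdefault(instance_id, []).append((kind, addr))
            return addr

        def release(self, instance_id):
            """Return an instance's devices to the free pool when it terminates."""
            for kind, addr in self.claimed.pop(instance_id, []):
                self.free[kind].append(addr)

        def report(self):
            """Availability summary piggybacked onto the node's periodic report."""
            return {kind: len(addrs) for kind, addrs in self.free.items()}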

6.5 Benchmarks

We selected two molecular dynamics (MD) applications for evaluation in this study: LAMMPS and HOOMD [228, 229]. These MD simulations are chosen to represent a subset of advanced parallel computation for a number of fundamental reasons:

• MD simulations provide a practical representation of N-Body simulations, which is one of the major computational Dwarfs [230] in parallel and distributed computing.

• MD simulations are one of the most widely deployed applications on large-scale supercomputers today.

• Many MD simulations have a hybrid MPI+CUDA programming model, which has become commonplace in HPC as the use of accelerators increases.

As such, we look to LAMMPS and HOOMD to provide a real-world example of running cutting-edge parallel programs on virtualized infrastructure. While these applications by no means represent all parallel scientific computing efforts (as illustrated by the 13 Dwarfs defined in [230]), we hope these MD simulators offer a more pragmatic viewpoint than traditional synthetic HPC benchmarks such as High Performance Linpack.

LAMMPS The Large-scale Atomic/Molecular Massively Parallel Simulator is a well-understood, highly parallel molecular dynamics simulator. It supports both CPU- and GPU-based workloads. Unlike many simulators, both MD and otherwise, LAMMPS is heterogeneous: it will use both GPUs and multicore CPUs concurrently. For this study, this heterogeneous functionality introduces additional load on the host, allowing LAMMPS to utilize all available cores on a given system. Networking in LAMMPS is accomplished using a typical MPI model. That is, data is copied from the GPU back to the host and sent over the InfiniBand fabric. No RDMA is used for these experiments.

HOOMD-blue The Highly Optimized Object-oriented Many-particle Dynamics – Blue Edition is a particle dynamics simulator capable of scaling to thousands of GPUs. HOOMD supports executing on both CPUs and GPUs. Unlike LAMMPS, HOOMD is homogeneous and does not support mixing of GPUs and CPUs. HOOMD supports GPUDirect using a CUDA-enabled MPI. In this chapter we focus on HOOMD's support for GPUDirect and show its benefits for increasing cluster sizes.
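For reference, a HOOMD-blue Lennard-Jones run of the kind benchmarked here is driven by a short Python script. The sketch below follows the HOOMD 1.x script API that was current for this work; the particle count, packing fraction, and run length are illustrative rather than the exact benchmark settings.

    # Sketch: a Lennard-Jones liquid run in the style of the HOOMD-blue benchmark,
    # written against the HOOMD 1.x script API used at the time of this work.
    # Particle count, packing fraction, and run length are illustrative only.
    from hoomd_script import *

    init.create_random(N=256000, phi_p=0.2)             # random LJ liquid

    lj = pair.lj(r_cut=3.0)
    lj.pair_coeff.set('A', 'A', epsilon=1.0, sigma=1.0)

    integrate.mode_standard(dt=0.005)
    integrate.nvt(group=group.all(), T=1.2, tau=0.5)

    run(10000)    # domain decomposition across GPUs/ranks is handled by HOOMD at launch

Launched under MPI (e.g. mpirun -np 4 python lj_bench.py), HOOMD decomposes the domain across ranks; whether GPUDirect RDMA is used is determined by the MPI build, not by the script.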

6.6 Experimental Setup

Using two molecular dynamics tools, LAMMPS [228] and HOOMD [229], we demonstrate a high performance system. That is, we combine PCI passthrough for Nvidia Kepler-class GPUs with QDR InfiniBand SR-IOV and show that high performance molecular dynamics simulations are achievable within a virtualized environment.

For the first time, we also demonstrate Nvidia GPUDirect technology within such a virtual environment. Thus, we look to illustrate not only that virtual machines provide a flexible high performance infrastructure for scaling scientific workloads including MD simulations, but also that the latest HPC features and programming environments are available in this same model.

6.6.1 Node configuration

To support the use of Nvidia GPUs and InfiniBand within a VM, a specific and exact host configuration is needed. This node configuration is illustrated in Figure 6.1. While our implementation is specific to the KVM hypervisor, this setup represents a design that can be hypervisor-agnostic.

Each node in the testbed uses CentOS 6.4 with a 3.13 upstream Linux kernel for the host OS, along with the latest KVM hypervisor, QEMU 2.1, and the vfio driver. Each guest VM runs CentOS 6.4 with a stock 2.6.32-358.23.2 kernel. A Kepler GPU is passed through using PCI Passthrough and directly initialized within the VM via the Nvidia 331.20 driver and CUDA release 5.5. While this specific implementation used only a single GPU, it is also possible to include as many GPUs as one can fit within the PCI-Express bus if desired. As the GPU is used by the VM, an on-board VGA device was used by the host and a standard Cirrus VGA device was emulated in the guest OS.

Figure 6.1 Node PCI Passthrough of GPUs and InfiniBand

For SR-IOV, the OFED drivers version 2.1-1.0.0 are used with a Mellanox ConnectX-3 VPI adapter with firmware 2.31.5050. The host driver initializes 4 VFs, one of which is passed through to the VM, where the default OFED mlnx_ib drivers are loaded.

6.6.2 Cluster Configuration

Our test environment is composed of four servers, each with a single Nvidia Kepler-class GPU. Two servers are equipped with K20 GPUs, while the other two servers are equipped with K40 GPUs, demonstrating the potential for a more heterogeneous deployment. Each server is composed of 2 Intel Xeon E5-2670 CPUs, 48 GB of DDR3 memory, and Mellanox ConnectX-3 QDR InfiniBand. CPU sockets and memory are split evenly between the two NUMA nodes on each system. All InfiniBand adapters use a single Voltaire 4036 QDR switch with a software subnet manager for IPoIB functionality.

For these experiments, both the GPUs and InfiniBand adapters are attached to NUMA node 1, and both the guest VMs and the base system utilized identical software stacks. Each guest was allocated 20 GB of RAM and a full socket of 8 cores, and was pinned to NUMA node 1 to ensure optimal hardware usage. While all VMs are capable of login via the InfiniBand IPoIB setup, a 1Gb Ethernet network was used for all management and login tasks.

For a fair and effective comparison, we also use a native environment without any virtualization. This native environment employs the same hardware configuration and, like the guest OS, runs CentOS 6.4 with the stock 2.6.32-358.23.2 kernel.

6.7 Results

In this section, we discuss the performance of both the LAMMPS and HOOMD molecular dynamics simulation tools when running within a virtualized environment. Specifically, we scale each application to 32 cores and 4 GPUs, in both native bare-metal and virtualized environments. Each application set was run 10 times, with the results averaged accordingly.

6.7.1 LAMMPS

Figure 6.2 shows one of the most common LAMMPS algorithms used: the Lennard-Jones potential (LJ). This algorithm is deployed in two main configurations: a 1:1 core-to-GPU mapping and an 8:1 core-to-GPU mapping. With the LAMMPS GPU implementation, a delicate balance between GPUs and CPUs is required to find the optimal ratio for the fastest computation; however, here we just look at the two most obvious choices. With small problem sizes, the 1:1 mapping outperforms the more complex core deployment, as the problem does not require the additional complexity provided by the multi-core solution. As expected, the multi-core configuration quickly offers better performance for larger problem sizes, achieving roughly twice the performance with all 8 available cores. This is largely due to the availability of all 8 cores to keep the GPU running at 100% with continual computation.

Figure 6.2 LAMMPS LJ Performance

The important factor for this chapter is the relative performance of the virtualized environment. From the results, it is clear the VM solution performs very well compared to the best-case native deployment. For the multi-core configuration across all problem sizes, the virtualized deployment averaged 98.5% efficiency compared to native. The single core per GPU deployment reported better-than-native performance (100% of native). This is likely due to caching effects, but further investigation is needed to fully identify this occurrence.

Figure 6.3 LAMMPS RHODO Performance

Again, the overhead of the virtualized configuration remains low across all con-

figurations and problem sizes, with an average 96.4% efficiency compared to native.

Interestingly enough, we also see the performance gap decrease as the problem size 6.8 Discussion 138 increases, with the 512k problem size in yielding 99.3% of native performance. This

finding leads us to extrapolate that a virtualized MPI+CUDA implementation would scale to a larger computational resource with similar success.

6.7.2 HOOMD

In Figure 6.4 we show the performance of a Lennard-Jones liquid simulation with 256K particles running under HOOMD. HOOMD includes support for CUDA-aware MPI implementations via GPUDirect. The MVAPICH 2.0 GDR implementation enables a further optimization by supporting RDMA for GPUDirect. From Figure 6.4 we can see that HOOMD simulations, both with and without GPUDirect, perform very near-native. The GPUDirect results at 4 nodes achieve 98.5% of the base system’s performance. The non-GPUDirect results achieve 98.4% efficiency at 4 nodes. These results indicate the virtualized HPC environment is able to support such complex workloads. While the effective testbed size is relatively small, it indicates that such workloads may scale equally well to hundreds or thousands of nodes.

6.8 Discussion

From the results, we see the potential for running HPC applications in a virtualized environment using GPUs and an InfiniBand interconnect fabric. Across all LAMMPS runs with varying core configurations, we found only a 1.9% overhead between the KVM virtualized environment and native. For HOOMD, we found a similar 1.5% overhead, both with and without GPUDirect. These results go against the conventional wisdom that HPC workloads do not work in VMs. In fact, we show that two N-Body-type simulations programmed in an MPI+CUDA implementation perform at roughly near-native performance in tuned KVM virtual machines.

Figure 6.4 HOOMD LJ Performance with 256k Simulation

With HOOMD, we see how GPUDirect RDMA shows a clear advantage over the non-GPUDirect implementation, achieving a 9% performance boost in both the native and virtualized experiments. While GPUDirect's performance impact has been well evaluated previously [224], it is the author's belief that this work represents the first time GPUDirect has been utilized in a virtualized environment.

Another interesting finding of running LAMMPS and HOOMD in a virtualized environment is that, as the workload scales from a single node to 32 cores, the overhead does not increase. These results lend credence to the notion that this solution would also work for a much larger deployment. Specifically, it would be possible to expand such computational problems to a larger deployment in FutureGrid [231], Chameleon Cloud [232], or even the planned NSF Comet machine at SDSC, scheduled to provide up to 2 Petaflops of computational power. Effectively, these results help support the theory that a majority of HPC computations can be supported in a virtualized environment with minimal overhead.

6.9 Chapter Summary

With the advent of cloud infrastructure, the ability to run large-scale parallel sci- entific applications has become possible but limited due to both performance and hardware availability concerns. In this work we show that advanced HPC-oriented hardware such as the latest Nvidia GPUs and InfiniBand fabric are now available within a virtualized infrastructure. Our results find MPI + CUDA applications such as molecular dynamics simulations run at near-native performance compared to tra- ditional non-virtualized HPC infrastructure, with just an averaged 1.9% and 1.5% overhead for LAMMPS and HOOMD, respectively. Moving forward, we show the utility of GPUDirect RDMA for the first time in a cloud environment with HOOMD.

Effectively, we look to pave the way for large-scale virtualized cloud infrastructure to support a wide array of advanced scientific computation commonly found running on many supercomputers today. Our efforts leverage these technologies and provide them in an open source Infrastructure-as-a-Service framework using OpenStack.

Chapter 7

Virtualization advancements to support HPC applications

Throughout this dissertation, the question of whether cloud infrastructure, using virtualization, can support mid-tier scientific computation has been investigated. These scientific problems are tightly coupled, distributed memory computations, and in the past have failed to perform well on traditional public and private cloud systems. It is our estimate that this is due to a number of reasons, including hypervisor design and implementation, VM placement inefficiencies, and the lack of advanced hardware availability within virtualized environments. In Chapter 3, we discussed the base case with off-the-shelf single node configurations. This showed that hypervisor selection matters a great deal for performance and reliability, and that some applications can perform well on a single node. This study expressly avoided multi-node configurations due to the previously documented issues with Ethernet interconnects in virtualized environments [48].

With the rise of GPUs in HPC, Chapter 4 offers a first solution in Xen to GPU passthrough as an alternative to either no GPU availability or providing GPU access across a network by virtualizing the front-end API libraries. It was found that our method of GPU passthrough incurred some overhead with Xen on older x86 hardware when transferring data across the PCI-Express bus, but provided the best option at the time. As methods were derived for other hypervisors such as KVM and VMWare, we looked to evaluate all options for a multitude of computational problems, including both CUDA and OpenCL codes in Chapter 5. This research found that newer hardware without the QPI interconnect between the socket and the PCI-Express bus (Sandy Bridge and newer CPUs) can achieve near-native performance for a range of applications. Specifically, it was found that the KVM hypervisor was the most performant and stable hypervisor, and that LXC, a container option, also performed very well (although with security limitations).

Chapter 6 combines the lessons learned from KVM tuning and GPU passthrough, and intersects with our other related efforts in introducing a high performance interconnect, in this particular case InfiniBand [208], into a virtualized cluster environment. Specifically, a test-bed was created across 4 nodes, each with Kepler GPUs and FDR InfiniBand passed through to VMs spanning the entire CPU set. InfiniBand was specifically set up to utilize SR-IOV, which allows for the multiplexing of the ConnectX-3 VPI card to the guest instances. From here, two Molecular Dynamics simulations were run both in this tuned KVM configuration and on bare metal. The results indicate that the overhead of virtualization is on the order of 1-2% for these HPC applications given a few different configurations and problem sizes.

From this work, a number of observations start to emerge. First, virtualization may indeed be able to support HPC workloads given these advances. While the scale of the virtual cluster experiments is small at only a few dozen cores, the performance trends from scaling up HOOMD-blue in Figure 6.4 look to be very closely correlated between the virtualized and bare metal experiments. While this application, as well as other HPC applications using similar distributed memory models, all need to be replicated at a much higher scale, we have shown there are currently no limitations in doing so, aside from the availability of the infrastructure itself. It is our hope that with this knowledge, future test-beds can be assembled at a larger scale to move this effort further forward.

Another observation is that the KVM hypervisor continually proves to be the most performance-oriented hypervisor studied in this dissertation. While this is a bit of a surprise given it is a Type 2 hypervisor, which introduces additional potential for host noise and overhead, it nonetheless showed the smallest degree of variation between results, both in the Chapter 3 experiments and again in Chapter 5. KVM also offers the best performance of the hypervisors overall across many workloads; as seen in Figure 5.2, KVM often performs within 0.5% of native in the SHOC benchmarks. Furthermore, KVM has support for CPU pinning, NUMA socket binding, PCI passthrough and SR-IOV, as well as transparent huge pages and advanced migration mechanisms (described in more detail later in this chapter). While it is possible that other specialized hypervisors, such as Palacios [31] (not studied), could also perform similarly, the KVM hypervisor has large community support, industry backing, and production-level integration into the latest private cloud infrastructure, such as OpenStack.

A third observation regarding high performance virtual clusters is that the SR-IOV InfiniBand integration described in Chapter 6 provides a drastic shift in the outlook for distributed memory applications. While SR-IOV InfiniBand does have a 15-30% overhead in latency compared to native implementations for small messages [208], this is still an order of magnitude better than current Ethernet options available from cloud providers such as Amazon. Furthermore, the bandwidth of InfiniBand solutions in virtual clusters looks to be near-native, also surpassing current Ethernet deployments. It is also possible for other interconnects, such as Intel's emerging Omni-Path interconnect, to demonstrate similar or better results in a virtualized ecosystem, and future experimentation should try to leverage other interconnect options if the hardware supports it.

While these now smaller and better defined performance differences may not be satisfactory for extreme-scale distributed memory applications, we expect a large number of users with mid-tier scientific computational problems to accept these small overheads when considering the value added by working in a virtualized environment. Armed with this knowledge, we find the outlook for high performance virtual clusters to be promising.

However, next steps are needed to demonstrate the value of virtualization, providing a richer user experience while simultaneously further enhancing performance for many users. These value-added techniques, such as efficient tuning, scheduling, VM cloning, and compute-migration, may in fact help enable new classes of scientific computations, not only within HPC applications, but also with big data platform services. We specifically focus on leveraging the KVM hypervisor in conjunction with a high speed, low latency interconnect to provide new features that otherwise have yet to be made possible. While much of this work is still in progress, it nonetheless gives a glimpse at some future directions in high performance virtualization.

In the rest of this chapter, we look to review the methods utilized in this dissertation regarding virtualization, then identify research, design, and future implementation of advanced virtualization techniques to enable a new class of infrastructure with added performance and features that have yet to be realized. We focus on guest memory optimizations, live migration deployments, and integration with an RDMA-enabled interconnect for novel VM migration and cloning. It may be possible for these advancements, once implemented, to have an impact not only on cloud infrastructure, but also on dedicated environments which support big data applications or other distributed memory applications that focus on both usability and performance.

7.1 PCI Passthrough

As cloud computing's reach into distributed systems increases, so does the requirement for virtualization to perform at near-native speeds and to take full advantage of the underlying hardware. While this does equate to fast and efficient hypervisors, it also alludes to the effective utilization of devices beyond CPU and memory. Such devices can vary widely and include hardware such as graphics processing units, networking adapters, I/O hubs, web cameras, and secondary and tertiary storage, to name a few. Within virtualization, oftentimes it is necessary to emulate these devices, where doing so provides a virtual hardware set that the guest VM can interact with using a specialized driver, whereby the hypervisor translates requests to the underlying physical hardware.

Emulated devices are often the most common method for device interaction with VMs, and can be a critical component in virtualization. Emulated drivers create a feature set in software, where all I/O requests are intercepted by the hypervisor and emulated on the real underlying hardware. Oftentimes, the emulated hardware provided to the guest OS within a VM is older or more generalized than the given architecture set. This is because the effort required to construct emulated devices is large, and the emulation of older hardware often helps with overall compatibility. However, this emulation process can be slow and lead to significant overhead when utilizing the underlying hardware, not to mention the lack of newer features provided by more recent hardware.

Para-virtualized devices are specially tuned software implementations of hardware, where para-virtualized device drivers installed in a guest VM operate on a particular I/O API. While this leads to improvements in performance over emulation methods, it requires the guest to be modified in order to communicate effectively with the hypervisor. With Xen, the entire guest OS can be para-virtualized, whereas in other solutions such as KVM using virtio, individual device drivers can be para-virtualized. The performance enhancement of para-virtualization is derived from the removal of the hardware compatibility layer that must be in place for emulated drivers. At the cost of compatibility, para-virtualization instead uses a tuned and customized API specific to the hardware at hand.

A recent method for guest device interaction arises out of the use of direct I/O device passthrough, whereby the hypervisor (and controlling host OS) relinquishes the entire control of a given device to the guest VM. This allows the guest to have direct interaction with the physical hardware, removing the need for complicated para-virtualized methods or slow emulated hardware. As the Peripheral Component Interconnect (PCI) bus and the updated PCI-Express bus are the most common hardware interfaces for devices on modern CPU architectures, this method is often referred to as PCI passthrough.

An I/O Memory Management Unit is responsible for managing the connection of direct memory access (DMA) capable hardware along an I/O bus to main memory. Just like a CPU MMU, the I/O MMU maps device-visible memory addresses to physical memory addresses. However, utilizing DMA-capable drivers within a guest directly, without I/O MMU virtualization technology, would result in the guest attempting to perform DMA operations on incorrect guest physical addresses that would not map to proper machine addresses. In order to safely and securely enable PCI passthrough in virtualized settings, such CPU architecture mechanisms are needed.

As the popularity of virtualization increased, CPU architectures have met this demand with I/O MMU virtualization extensions. Intel and AMD have introduced VT-d and AMD-Vi (also named IOMMU) processor support for such operations. These extensions provide the necessary mechanisms to isolate and restrict I/O devices to their owned partition space [233] through the use of I/O device assignment, DMA remapping, interrupt remapping, interrupt posting, and error reporting. DMA remapping ensures device DMAs not only find correct virtual memory addresses, but also act on only those memory addresses that are allowed, which provides the necessary VM isolation. Thus, hardware DMA remapping enables direct device assignment to VMs without device-specific knowledge in the hypervisor.

Using these IOMMU virtualization techniques for direct I/O passthrough depends greatly on the hypervisor at hand. First, the BIOS must have IOMMU virtualization enabled and the OS kernel must initialize such hardware extensions at boot. This initialization can be done with the intel_iommu=on Linux kernel parameter for all Intel VT-d enabled architectures. With Xen, a specialized kernel module called xen-pciback is used to "grab" the PCIE device on boot to keep the device state uninitialized and ready to be passed through to a VM. This is described in more detail in Section 4. With KVM, a similar method is utilized, whereby either pci-stub or vfio-pci (the former for pre-3.9 Linux kernels, the latter for current Linux systems) is utilized to bind to the PCI device on boot before the kernel or other add-on modules attempt to initialize the device. Either method is built into the kernel, and any native device drivers must be blacklisted to ensure proper ordering. vfio-pci, similar to xen-pciback, takes the PCI device ID as a parameter to determine which device to bind to (this can be found from the output of lspci), and holds all device initialization until the device is handed to a booting guest.
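To make the vfio-pci binding step concrete, the following Python sketch rebinds a PCI device to vfio-pci through sysfs at runtime. It assumes intel_iommu=on is already set on the kernel command line, the vfio-pci module is loaded, and it is run as root; the PCI address and vendor:device ID are placeholders that would normally be read from the output of lspci -nn, not values tied to any system used in this dissertation.

import os

def bind_to_vfio(pci_addr="0000:08:00.0", vendor_device="10de 102d"):
    # Hypothetical example values: pci_addr and vendor_device would normally
    # come from lspci -nn for the device in question.
    dev_path = f"/sys/bus/pci/devices/{pci_addr}"

    # Release the device from whatever driver currently owns it (if any).
    unbind_path = os.path.join(dev_path, "driver", "unbind")
    if os.path.exists(unbind_path):
        with open(unbind_path, "w") as f:
            f.write(pci_addr)

    # Registering the vendor:device ID with vfio-pci usually triggers an
    # automatic bind; the explicit bind below is a fallback and may fail
    # harmlessly if the device is already claimed.
    with open("/sys/bus/pci/drivers/vfio-pci/new_id", "w") as f:
        f.write(vendor_device)
    try:
        with open("/sys/bus/pci/drivers/vfio-pci/bind", "w") as f:
            f.write(pci_addr)
    except OSError:
        pass

if __name__ == "__main__":
    bind_to_vfio()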

Generally, PCI passthrough can be utilized for most PCI devices; however, GPU devices can require some added configuration and considerations. This is largely due to the fact that GPUs are VGA devices, which represent a special case. Specifically, VGA devices to be used for PCI passthrough must not be the primary VGA displays. Second, the VGA BIOS has to be loaded by the guest VM before the actual BIOS boot. This can be done using a specific emulated BIOS, which with KVM can be either a modified SeaBIOS or a UEFI-enabled Open Virtual Machine Firmware (OVMF) configuration. For Nvidia devices, only approved devices (often Tesla and Quadro adapters) can be used for GPU passthrough due to proprietary VGA BIOS configurations. Of further note, some GPU devices come with attached PCI Bridges due to packaging and compatibility reasons. This includes add-on GPU servers, such as the Nvidia S2050, or dual-GPU units, such as the Nvidia Tesla K80 GPU. Other devices may be packaged with such PCI Bridges to include onboard sound controllers as well. These PCI Bridges handle PCIE connections to the devices and yet are often in separate IOMMU domains, which causes GPU passthrough to fail due to Access Control Services (ACS) errors in the IOMMU hardware. While there are new kernel patches becoming available to override the ACS mechanisms with KVM, this obviously introduces a significant security vulnerability that may be problematic for deployments outside of the academic realm.
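Production deployments typically declare the passthrough device in the guest's libvirt domain XML, but the same hostdev element can also be hot-attached through the libvirt API. The sketch below is a hypothetical illustration using the libvirt Python bindings; the guest name and PCI address are placeholders, and a GPU normally requires the guest to be prepared (e.g. with OVMF, as noted above) before attachment will succeed.

import libvirt

# Example <hostdev> element describing a PCI function to hand to the guest.
# The address values are placeholders for a device found via lspci.
HOSTDEV_XML = """
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
  </source>
</hostdev>
"""

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("hpc-guest-0")          # hypothetical guest name

# Attach for the running guest and persist the change in its configuration.
dom.attachDeviceFlags(HOSTDEV_XML,
                      libvirt.VIR_DOMAIN_AFFECT_LIVE |
                      libvirt.VIR_DOMAIN_AFFECT_CONFIG)
conn.close()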

In Chapters 5 and 6, KVM's PCI passthrough is detailed for use with Nvidia GPUs, where we find that this method can, with NUMA placement of VMs and newer hardware on Sandy Bridge architectures, perform at near-native speeds. Without GPU passthrough, GPU usage within a virtualized domain was either not possible, or required the use of front-end API solutions. As discussed in Section 4.3 of this dissertation, such remote API methods are suboptimal due to performance considerations and the lack of full-feature support. Some of the best methods developed thus far for utilizing remote GPUs in a virtualized architecture come from the rCUDA project [56], which sends CUDA commands across an interconnect to a remote device.

Figure 7.1 Comparison of GPU Passthrough in Xen to rCUDA across various interconnects

However, this approach is fundamentally limited by the interconnect itself. Using data from the rCUDA project [234], we compare GPU passthrough performance of both C2075 and K20 cards in Xen to various interconnects reliant on rCUDA. In Figure 7.1, it is illustrated that even in the best case with high speed InfiniBand, rCUDA is still limited by the interconnect fabric, which can only operate at most as fast as the PCIE bus to which the GPU is attached. In reality, even top-end IB adapters are not capable of saturating full 16x PCIE lanes. Furthermore, all GPU data transfers saturate the interconnect, leaving no available bandwidth for the communication operations found in many distributed memory HPC applications. Sockets and shared memory approaches found in API-remoting methods, such as gVirtus [176], suffer even worse performance impacts due to the necessity to buffer memory, as seen in related research [235]. In summary, these front-end API methods are suboptimal in comparison to GPU passthrough, as the transfer time between CPU and GPU memory often can have a drastic impact on overall application performance.

While PCI passthrough works well for providing dedicated accelerator resources such as GPUs, a sharing model of PCI passthrough does not hold. This is because PCI passthrough is a simple 1-to-1 relationship between a guest and a PCI device. While it is possible to have a 1-to-many relationship (e.g. a single VM with multiple GPUs connected), there is no way to share a single device across multiple guests, or between the guest and the host. While this is not an issue and in fact a desirable effect with GPUs (GPUs are not designed for multi-application sharing and such a solution would largely lead to significant inefficiencies), high speed networking adapters attached on a PCIE bus do have a necessity to be shared in virtualized environments.

Figure 7.2 Efficient VM networking comparison [208]

With this requirement for sharing PCI networking adapters, a few options exist and are illustrated in Figure 7.2. The first option is to use software multiplexing that sorts inbound and outbound traffic and forwards packets to queues set up for each VM as necessary, as implemented with Virtual Machine Device Queues (VMDq) [236]. While this software-queue approach provides a workable Ethernet solution and allows VMs to share a PCI based NIC, there are performance limitations. This is due to the large amount of hypervisor involvement necessary in sorting the queues, as well as the fact that all data is buffered between the hypervisor and guest memory. The next and simplest option is simply to add more PCI adapters to a given server. While multiple PCI passthrough does work, there is a fundamental limitation to this method, as the number of cores and NUMA sockets is expanding faster than the number of PCI lanes available per socket.

With these limitations in mind, Single Root I/O Virtualization (SR-IOV) has been developed. This standardization effectively enables multiplexing of a PCI based communications adapter (typically Ethernet but also InfiniBand adapters) within the hardware. With SR-IOV, multiple Virtual Functions (VFs) are created and configured in hardware. Each VF has a dedicated resource pool with specific Tx and Rx queues, along with other lightweight PCI resources such as device registers, base address registers, and hardware descriptors. Adapters also maintain a Physical Function (PF), which is a fully-implemented PCI device that acts not only as a standard controller card, but also as a control mechanism for the VFs (given firmware adjustments).

These mechanisms are illustrated in Figure 7.3 provided by Intel [237].

To use SR-IOV within a virtualized setting, a VF is given to a guest VM on boot. This happens using the PCI passthrough mechanisms, except the specific VF PCI identifier is used instead of the actual adapter card. Within the guest, the VM loads a specific driver (usually supplied by the hardware vendor) that is able to probe and detect VF functionality. This driver fills in the descriptors and sets up memory allocations for DMA directly within the guest OS, effectively creating dedicated data queues within the VM.

Figure 7.3 SR-IOV Architecture with Virtual Functions, PCI SIG [237]

When a datum (which could be a packet or a message, depending on the adapter type) arrives at the physical card, it is sent through a hardware switch, which places the data into a pool specific to the target VF. The data is then immediately DMAed to the guest based on the preallocated memory buffers. This is possible through the use of VT-d mechanisms, which enable the translation between device addresses and the guest virtual addresses, bypassing the need for hypervisor memory translation. This entire process happens without any CPU interaction, other than occasional interrupts to provide notification of completed data work queues. This essentially enables direct hardware DMA transfers to a guest without any hypervisor involvement.

Using modern 10GbE and InfiniBand adapters, SR-IOV can easily be configured to provide up to 64 VFs, enabling a great deal of scalability within a given host. Without any hypervisor interaction or context switching, bandwidth and latency can start to approach near-native levels, which was previously not possible with other methods.
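As a hedged illustration of how VFs are created on the host, the sketch below uses the standard Linux sysfs interface (sriov_numvfs and sriov_totalvfs) exposed by recent kernels and drivers; older Mellanox OFED stacks instead set the VF count through module parameters. The PCI address and VF count are placeholder values, not the configuration used in this work.

# Hypothetical sketch: create SR-IOV Virtual Functions on a network adapter's
# Physical Function through sysfs. Run as root; values are examples only.
PF_ADDR = "0000:21:00.0"        # placeholder PCI address of the adapter's PF
NUM_VFS = 8

numvfs = f"/sys/bus/pci/devices/{PF_ADDR}/sriov_numvfs"

with open(f"/sys/bus/pci/devices/{PF_ADDR}/sriov_totalvfs") as f:
    print("VF limit reported by the adapter:", f.read().strip())

# The kernel requires the VF count to be reset to 0 before it can be changed.
with open(numvfs, "w") as f:
    f.write("0")
with open(numvfs, "w") as f:
    f.write(str(NUM_VFS))

# Each new VF now appears as its own PCI function and can be passed through
# to a guest exactly like any other device, as described earlier in this section.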

Looking at InfiniBand, research has found that Mellanox adapters with SR-IOV in KVM can achieve near-native bandwidth while incurring only a slight 15-30% overhead on small (1 byte) messages [207,208]. This latency overhead is due to the extra hardware switching that has to happen between the VFs in the adapter itself, as well as the IOMMU involvement.

7.2 Memory Page Table Optimizations

As we have seen both in Chapter 3, as well as in other supporting literature [48], virtualization of memory structures is a point of contention and potential overhead. This is often due to the extensive effort a hypervisor has to perform in order to translate memory addresses from guest-virtual addresses to host-virtual addresses, and then again to machine-physical addresses. While this is a necessary function of virtualization and a hypervisor, there are various methods developed both in hardware and software to provide such address translation functionality, each with their own advantages and disadvantages.

Modern computing systems provide a mapping of virtual memory address space to physical machine memory. This memory management technique allows processes to have independent virtual address spaces, which provides security and process isolation. Today's x86 CPUs, as well as many other CPU architectures, have a hardware memory management unit (MMU) that provides translation abilities within hardware, using page table entries (PTEs) for storing virtual to machine memory mappings and a translation lookaside buffer (TLB) that provides an effective cache for the most commonly used virtual addresses. Specifically, x86 page tables are walked iteratively in hardware, with their layout specified by the x86 hardware specification, where the CR3 register holds the page table base and a 4-level radix tree structure represents the page table hierarchy for 4KB pages.

With virtual machines, a 2-level address translation is needed where guest virtual memory is translated to guest physical memory and then again to physical machine memory. This two-level memory mapping is classically handled by the hypervisor, as the guest cannot access machine memory directly. The hypervisor functions in software without hardware support, so it can be expensive for it to update and maintain guest memory translation. With x86 virtualization, memory virtualization is handled using shadow page tables [238]. Shadow page tables eliminate the need for emulation of physical memory inside the VM by creating a page table mapping from guest virtual to machine memory. However, these page tables are not walkable by hardware like a TLB and, as such, guest OS page table updates require updating of the shadow page table by the hypervisor. This can be costly not only in the additional management, but also in the cost of VMexit and VMentry calls, which are known to add thousands of CPU cycles of overhead for each call. For a memory-bound application, continual shadow page table management by the hypervisor can add a notable overhead, impacting overall application performance.

Recently, Intel and AMD have implemented Extended Page Tables (EPT) and Nested Page Tables (NPT), respectively, to cope with the issues of shadow page tables. With nested paging, a guest page table converts guest virtual addresses to guest physical addresses, and a second level table converts guest physical addresses to machine addresses. Each address translation in guest mode requires a 2D page walk, where the guest page table is traversed and each guest physical address requires a second level page table walk to obtain the machine address. This support means the TLB hardware is able to keep track of both guest pages and hypervisor pages concurrently, effectively removing the need for shadow page tables and the resulting hypervisor intervention entirely. Nested paging provides a simple design with no necessary hypervisor traps, leading to less overall overhead and swapping, fewer TLB flushes, and a reduced memory footprint.

The downside of nested paging for virtual machines occurs when there is a TLB miss. A TLB miss occurs when a requested page translation is not in the TLB, and in such cases the cost of the miss is substantially higher in a guest VM. Natively, a TLB miss requires a walk of 4 address entries for 4KB pages. However, in a guest, each of those 4 address entries requires another second level walk to find the system physical address for each guest page entry, plus a final nested page walk to translate the final guest physical address to a usable machine address.

The TLB miss process is illustrated in Figure 7.4 from Bhargava et al. [239], whereby we can see the steps for handling a TLB miss on an AMD CPU in both native (left) and virtualized (right) modes. With the 2D nested or extended page tables given in Figure 7.4, each gLn entry cannot be directly read using a guest physical address, and, as such, a nested page table walk is necessary to translate the guest physical address before the entry can be read. This has to happen recursively for each level, and in this example with 4KB pages, that includes 4 levels. Lastly, a final nested page walk is required to translate the guest physical address of the data to a physical machine address.

The cost of a 2D page walk can be evaluated in terms of the number of memory references, and is easily calculated. If a guest page walk has n levels and a nested page walk has m levels, the virtualized 2D walk has a cost of:

nm + n + m

Figure 7.4 Native TLB miss walk compared with the 2D virtualized TLB miss walk, from [239]. The left illustrates a native page table walk; the right illustrates the lengthy 2D nested page table walk for a VM.

For 4KB pages, this requires 24 references for a virtualized TLB miss (4 guest page-table levels, each requiring a 4-level nested walk, plus a final 4-level nested walk), compared to a cost of just 4 references for a single TLB miss walk natively, as indicated in Figure 7.4. While for many applications this TLB miss cost is much less than that of managing shadow page tables, it can still lead to a significant gap in performance between virtualized and non-virtualized applications, especially as VM count or an application's memory footprint increases.

One potential way to decrease the chance of a TLB miss (and to reduce the cost of a miss) is to use a larger page size. By default, x86 hardware uses a 4KB page size, but newer hardware can support 2MB and 1GB page sizes as well; in Linux this support is exposed as transparent huge pages, or THP. Using the KVM hypervisor with transparent huge pages enabled, we can create guest VMs backed entirely by 2MB huge pages [201]. We can also enable transparent huge page support within the guest as well, to have the entire guest OS (including kernel and modules) backed by 2MB pages.
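The sketch below is a minimal, hedged example of checking and enabling THP through the standard sysfs knobs; it would be run as root on the host (and optionally repeated inside the guest), and the AnonHugePages counter gives only a rough indication of how much guest RAM is actually backed by 2MB pages.

# Hypothetical sketch: enable transparent huge pages and inspect current usage.
THP_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"

with open(THP_ENABLED) as f:
    print("current THP policy:", f.read().strip())   # e.g. "[always] madvise never"

with open(THP_ENABLED, "w") as f:
    f.write("always")

# AnonHugePages in /proc/meminfo reflects anonymous memory (including
# qemu-kvm guest RAM) currently backed by huge pages.
with open("/proc/meminfo") as f:
    for line in f:
        if line.startswith("AnonHugePages"):
            print(line.strip())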

The benefit of THP-enabled guest VMs, along with associated THP in the host to back the guest huge pages, can be significant. With huge pages on Intel x86 CPUs with EPT, there exists an entirely separate TLB for huge pages. This naturally alleviates TLB pressure and therefore reduces TLB contention between guest and host operating systems (whereby the host is still utilizing 4KB paging for kernel space). Most importantly, the TLB reach is increased, because 2MB pages cover a larger addressable memory space. The size of the page tables themselves also decreases with the use of huge pages.

If huge pages are used in the host to back guest memory, the nested page walk is reduced to 3 levels, and if huge pages are used within the guest, the guest page walk is likewise only 3 levels. Using the formula, if THP is enabled in the host, there is a TLB miss cost of 19 references. If huge page support is enabled in both the guest and the host, this drops to only 15 references. While this is still more than the native miss cost of 4 or 3 references (for 4KB and 2MB pages, respectively), huge pages provide a 37.5% reduction in the TLB miss cost compared to 4KB pages, still significantly better than the VMexit/entry costs associated with shadow page tables. While it would be possible to enable huge pages only within the guest, this would represent a suboptimal configuration, as the 2MB guest pages would be splintered into 4KB host pages, negating most performance benefits [240]. Most importantly, the overall reduction in the number of TLB misses due to the increased TLB reach with huge pages can have a major impact on application performance.
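Substituting the page walk depths discussed above into the cost formula makes these reference counts explicit:

\begin{align*}
n = 4,\; m = 4 &:\quad nm + n + m = 16 + 4 + 4 = 24 \quad \text{(4KB pages in guest and host)}\\
n = 4,\; m = 3 &:\quad nm + n + m = 12 + 4 + 3 = 19 \quad \text{(2MB pages in the host only)}\\
n = 3,\; m = 3 &:\quad nm + n + m = 9 + 3 + 3 = 15 \quad \text{(2MB pages in guest and host)}
\end{align*}

The drop from 24 to 15 references is the 37.5% reduction cited above.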

To evaluate the effect of 2MB transparent huge pages on guest performance, we leverage the same KVM setup from Chapter 5 on the Bespin hardware. Specifically, THP is enabled in both the host and the guest OS kernels, and the same LibSVM application using GPUs is re-run. The LibSVM GPU application can have significant memory requirements, as large chunks of the SVM datasets are stored in CPU memory and transferred in a pseudo-random order to and from GPU memory, making it an ideal application for evaluating THP.

Figure 7.5 Transparent Huge Pages with KVM: LibSVM runtime in seconds (lower is better) for Native, KVM without THP, and KVM with THP across Gisette training data problem sizes (1800-6000).

The results of running LibSVM in a THP-enabled VM, in a VM without THP, and natively without virtualization are all outlined in Figure 7.5. Comparing first just the KVM results without THP to the native solution, we can see a virtualization overhead in the overall application runtime, especially at larger problem sizes (6000 training sets). However, when THP is enabled in both the guest and the host, we actually see the KVM VM solution outperform the native solution. This is because guest privileged OS memory used to buffer data to and from GPU memory is backed by 2MB pages in kernel space, instead of the normal 4KB pages used in the native solution. This means data transfers can take advantage of a larger TLB reach, resulting in improved performance. While this is likely a special case for THP usage with the LibSVM application due to its large linear memory access patterns, the fact that a VM can even occasionally outperform a native runtime is a noteworthy accomplishment.

This also underscores the need for careful tuning and best practices when hypervisors support advanced scientific tasks, with huge page support in both the guest and host environments as a key aspect, especially as the use of big data scientific applications increases.

7.3 Time in Virtual Machines

The measurement of time has been a longstanding consideration within distributed systems. Time synchronization has often been at the forefront of that effort, with Lamport clocks [241] often used to illustrate a classic example of how critically important time is within such large scale systems. However, even simply keeping track of time and obtaining fine-grained resolution can often be difficult.

Within modern CPUs, time is measured in a few different ways, each with its own advantages and disadvantages. Classically, time was measured in clock cycles for small operations, typically within an OS kernel, where an operation's duration is measured by simply reading the number of clock cycles before and after the operation. This worked well for timing tasks where actual wall clock time was not needed, but simply a metric to compare operation length between implementations. However, with CPU improvements came dynamic voltage and frequency scaling as well as CPU sleep states, which could and often would confound clock cycle measurements. If a CPU entered a sleep state, it was not guaranteed that the cycle counter would continue to be updated.

Recent x86 CPUs use a Time Stamp Counter (TSC), first introduced with the Pentium processor. The TSC is incremented at a fixed clock rate, independent of the actual CPU frequency, and is not halted. The TSC essentially provides time sensitive information within hardware, rather than software, leading to fine granularity. Within Linux, if constant_tsc and nonstop_tsc are specified, the TSC can generally be relied upon for accurate measurements independent of frequency or sleep states. In such a case, the TSC can essentially be utilized as a clock source. Reading of the TSC is done through the rdtsc instruction, which itself generally takes about 24 clock cycles to complete, making it a relatively fine-grained operation for time measurement.
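The following hedged Python sketch checks for the CPU flags named above and then estimates the per-read cost of the fastest monotonic clock available from user space; time.perf_counter_ns() is used as a convenient stand-in and is not a raw rdtsc read, so the reported cost includes interpreter overhead.

import time

# Confirm the invariant-TSC flags discussed above are advertised by the CPU.
with open("/proc/cpuinfo") as f:
    flags = next(line for line in f if line.startswith("flags")).split()

for wanted in ("constant_tsc", "nonstop_tsc"):
    print(f"{wanted}: {'present' if wanted in flags else 'MISSING'}")

# Rough per-read cost of a userspace clock read, averaged over many calls.
N = 1_000_000
start = time.perf_counter_ns()
for _ in range(N):
    time.perf_counter_ns()
end = time.perf_counter_ns()
print(f"~{(end - start) / N:.0f} ns per clock read (includes Python overhead)")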

There is also a High Precision Event Timer (HPET) that has been used in some x86 systems over the past decade. This provides nanosecond granularity and utilizes multiple hardware sources to provide an accurate time measurement. As such, HPET was a desirable tool for time measurement before an invariant TSC was implemented. Furthermore, HPET is synchronous across multiple cores, whereas the TSC was not until very recently. Disappointingly, recent many-core architectures such as the Intel Knights Landing standalone CPUs still do not provide a core-synchronized TSC, and as such the HPET may be more desirable to use for multi-core measurements. However, the overhead of HPET can be much larger than that of the TSC, and implementations vary significantly in performance, leading to some problems for very fine-grained measurements.

While clock timing and measurement on a bare-metal OS and its services is now relatively straightforward, virtualized time is not. As the hypervisor virtualizes a given CPU, it also has to provide some virtualized notion of these time counters and clocks to the virtual machines. Until recently, hypervisors virtualized such timing systems in many different ways.

For instance, VMWare provides a virtual TSC by default. This virtual TSC was particularly useful when multi-core TSCs were not reliable and there was a need for timing within VMWare VMs. However, there are a number of problems with VMWare's virtual TSC. The VMWare hypervisor has to perform work to virtualize the TSC for each read. For fine-grained measurements that are in close proximity, this can cause a backlog in the hypervisor, and the TSC can fall behind real time. In effect, for sufficiently small measurements VMWare can report better than native performance when benchmarking. However, this is actually an artifact of VMWare's virtual TSC, not something that is consistent with real time, and as such VMWare cannot be relied upon for fine-grained timing experiments. This problem has led to the removal of VMWare results throughout this manuscript. Very recently, the use of the x86 invariant TSC by other hypervisors has led VMWare to try to address the issues with its virtual TSC by providing a mechanism within ESXi to forcibly disable it. However, the results reported within this dissertation have not been reevaluated to confirm that this method is in fact reliable.

While VMWare's virtual TSC is problematic, other hypervisors such as KVM also have to wrangle the same situation that led to VMWare's design of the virtual TSC. However, KVM takes a much simpler approach of providing the TSC directly to the guest VM. Intel's VMX instruction set provides conditional trapping of the RDTSC, RDMSR, WRMSR and RDTSCP instructions, which is enough for full virtualization of the TSC in any manner. VMX also allows for an offset field, to allow the guest to synchronize the TSC of multiple vCPUs. These mechanisms allow KVM to create a simple mapping between guest and host TSC, even though the TSC can be written by either.

However, as noted earlier with real-time clocks and TSCs on bare metal, the TSCs can get out of sync when power-saving modes are applied, such as with frequency states. As such, KVM best practices include disabling all power-saving features to ensure the TSC remains constant and reliable. With constant_tsc available, this situation is no longer a problem on newer CPU architectures. For all experiments in this dissertation, power saving modes were disabled in the host and guest to ensure no invalid results.

7.4 Live Migration Mechanisms

Migration of VMs represents one of the fundamental advantages of virtualization, and also one of the greatest challenges to efficiency. With VM migration, the complete VM state is copied from a source to an unallocated destination host, where disk, memory, and network connections are kept intact. For disk continuity, a distributed and/or shared filesystem is utilized, most commonly but not exclusively NFS, where both the source and destination hosts have access to the VM disk. Network continuity is preserved as long as the destination guest is within the same LAN and generates an unsolicited ARP reply to maintain the original IP after migration. VM vCPU states and machine states are recorded from the source and are quickly sent to the destination host when the VM is paused. For live migration, the source VM is paused only after all state and memory contents are copied to the destination. The last of the dirtied memory pages are copied over, and the newly formed destination VM is then un-paused. This pause and transfer time represents the entirety of the VM downtime during live migration, and is often at or under 100 milliseconds across commodity Ethernet networks (given a VM with low memory utilization).

The memory transfer stages are often the main performance consideration for overhead during live migration. This is due not only to the potentially large amount of memory to be sent across the network, but also the rate at which that memory is changed. The total is defined directly by the amount of main memory allocated (or in use) by the source VM. However, as a VM's memory is sent, the VM is still running and therefore memory pages can be dirtied, creating the need for any written page to be retransmitted. Given a slow network and memory-bound processes running within a VM, this can become an unbounded process of page dirtying. Many live migration strategies provide an iterative timeout mechanism to avoid this infinite state, but this will lead to increased downtime during migration.

The copying of memory pages for live migration can take multiple implementations. Three common options are summarized below, and a rough cost model of the pre-copy rounds follows the list:

• Pre-copy Migration - All memory pages are transmitted to the destination before the VM is paused. The hypervisor will note and track all dirtied memory pages, and retransmit those pages in iterative rounds. The rounds end when either no dirtied pages exist or a maximum iteration count has been reached. The VM state is then transmitted and the VM resumed on the destination. This method was the first live migration technique, used by Clark et al [242], and is by far the most common.

• Post-copy Migration - The VM state is paused and sent to the destination hypervisor, and immediately resumed. If the new destination VM generates a page fault, the VM is paused, the faulted pages are transmitted across the network on demand from the source, and the VM is resumed. This methodology is proposed for use in the Xen hypervisor by Hines et al [243].

• Hybrid-copy Migration - Provides a compromise solution to memory paging. First, a single copy of the VM memory pages, or a subset of known-necessary memory pages, is sent to the destination. Then the source VM is paused, its VM state sent to the destination, and the VM resumed on the destination. Known dirtied source pages, or missing pages, are then copied to the destination upon a triggered page fault utilizing the same mechanism as post-copy migration. An example of hybrid migration can be found via Lu et al [244].
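As referenced above, the following Python sketch is a rough, hypothetical cost model of the iterative pre-copy rounds; the guest memory size, page dirty rate, link bandwidth, and stopping threshold are invented example inputs, not measurements from this dissertation.

def precopy_rounds(mem_gb, dirty_rate_gbps, link_gbps,
                   max_rounds=30, stop_threshold_mb=64):
    """Return (rounds used, MB left to copy during the final pause)."""
    to_send_mb = mem_gb * 1024.0
    link_mb_s = link_gbps * 1024 / 8
    dirty_mb_s = dirty_rate_gbps * 1024 / 8
    for rnd in range(1, max_rounds + 1):
        copy_seconds = to_send_mb / link_mb_s
        # Pages dirtied while this round was being copied must be resent.
        to_send_mb = copy_seconds * dirty_mb_s
        if to_send_mb <= stop_threshold_mb:
            return rnd, to_send_mb
    return max_rounds, to_send_mb          # hit the iteration cap

# A 16 GB guest dirtying 1 Gbit/s of pages over a 10 Gbit/s link converges in
# a few rounds; the same guest dirtying 8 Gbit/s takes far more rounds and
# leaves more data for the final pause.
print(precopy_rounds(16, dirty_rate_gbps=1, link_gbps=10))
print(precopy_rounds(16, dirty_rate_gbps=8, link_gbps=10))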

While pre-copy migration is the traditional and most widely used live migration technique, there will soon be opportunities to implement other migration techniques to advance the mobility of distributed computing in high performance virtual clusters.

7.4.1 RDMA-enabled VM Migration

Currently, most live migration in production environments occurs over TCP/IP connections due to the prevalence of commodity Ethernet connections within cloud infrastructure. However, even if RDMA-capable interconnects are available in such infrastructure, as described with InfiniBand in Chapter 6, live migration still usually occurs over TCP/IP. For the case of InfiniBand, this is via IP over InfiniBand (IPoIB) [245], which can be an inefficient use of the interconnect bandwidth and can add extra latency [246].

The time it takes to migrate the memory contents from a source to a destination VM can be significantly decreased by using an RDMA-based mechanism. Huang et al. first proposed a pre-copy method for RDMA-based migration using the Xen hypervisor [247]. Specifically, they found an 80% decrease in migration time with RDMA write operations. This is largely due to the removal of the overhead necessary for processing TCP/IP communications, principally the CPU utilization spent copying buffers and processing packets, and the associated context switch overhead when competing for resources with a CPU-bound application (which is common in HPC environments).

The live migration algorithm proposed in [247] uses the standard pre-copy mechanism that sends the entire memory contents across the network as RDMA operations, then iteratively copies dirtied pages before switching the running states. As discussed in the previous section, a post-copy migration strategy may have benefits for quick VM migration in high performance virtual clusters, or even for VM cloning as described later in Section 7.5. Furthermore, efforts in Chapter 3 have found that the Xen hypervisor is not best suited for HPC workloads. As such, there is a need to redefine the use of RDMA for VM migration using a hybrid post-copy mechanism in a high performance hypervisor. This post-copy migration mechanism is defined as follows:

• Transfer the initial CPU state and registers.

• Start the page table translation process (MFN to PFN), using a copy-based approach.

• Concurrently, allocate remote memory on the destination VM.

• Set up other machine state settings on the destination.

• Start the destination VM and pause the source VM.

• Initiate an RDMA write of the entire memory contents from source to destination.

• As page faults occur in the destination VM, catch the faults and perform RDMA reads requesting the missing pages.

Efforts are currently underway to provide post-copy live migration in KVM/QEMU, using the migrate_set_capability x-postcopy-ram on mode within KVM [248]. This method uses the Linux userfaultfd kernel mechanism available in kernel 4.3 or newer. At the start, all memory blocks are registered with userfaultfd, so any fault causes the running thread to pause. In kernel space, the missing page is requested from the sender, where it is prioritized over other pages being sent; it is then returned, mapped into the destination guest memory space, and the thread or process is un-paused. This mechanism operates asynchronously, so that multiple outstanding page faults will not stop other executables within the VM. The x-postcopy-ram extension has been identified as an opportune place to implement an RDMA-based mechanism within KVM.
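As a hedged illustration of how this capability is exposed to management tooling, the sketch below drives a post-copy migration through the libvirt Python bindings, which wrap the QEMU capability described above (libvirt exposes it as VIR_MIGRATE_POSTCOPY and migrateStartPostCopy). The domain name, destination URI, and the fixed five-second wait are placeholders standing in for real progress monitoring.

import threading
import time
import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("hpc-guest-0")          # hypothetical guest name

flags = (libvirt.VIR_MIGRATE_LIVE |
         libvirt.VIR_MIGRATE_PEER2PEER |
         libvirt.VIR_MIGRATE_POSTCOPY)          # permit switching to post-copy

# Migration starts as a normal pre-copy transfer in a background thread.
mig = threading.Thread(
    target=dom.migrateToURI3,
    args=("qemu+ssh://destination-host/system", {}, flags))
mig.start()

# After the first pass (a placeholder wait here), force the post-copy switch:
# the guest resumes on the destination and pulls remaining pages on demand.
time.sleep(5)
dom.migrateStartPostCopy()
mig.join()
conn.close()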

To provide RDMA functionality, the InfiniBand interconnect could first be used as a proof-of-concept. This choice is due to InfiniBand's rise in popularity, its increased prevalence in virtualized systems with SR-IOV, as noted in Chapter 6, and its RDMA functionality. However, other interconnect options exist that may be better suited for enhanced functionality and performance, such as Intel's new Omni-Path [249] or an Ethernet solution such as RoCE [250], to name a few. While RDMA can be managed at the kernel level, it is best used through user-level APIs, such as MPI or, in the case of InfiniBand, ibverbs. However, it may be ideal to select an interconnect-agnostic middleware for RDMA implementations to add future support for other interconnects, such as Photon [251].

RDMA semantics can be used to either read or write the contents of remote memory, in this case VM guest memory pages. However, before such operations can take place, the target side of the operation must register the remote memory buffers and send the remote key to the initiator, effectively providing the DMA addressing to be used. Beyond the increased bandwidth and decreased latency benefits provided by an advanced interconnect such as InfiniBand, RDMA also allows VM memory to be sent without involving the OS. This is due to InfiniBand's zero-copy, kernel bypass mechanisms and asynchronous operations. This keeps the CPU load down, allowing the CPU to spend time on the computation at hand rather than the I/O transfer, as it often has to when utilizing a TCP/IP stack.

Selecting the proper RDMA-based communication operations can make a notable difference in the overall performance of the migration. Some RDMA operations involve largely one-sided participation, which can have a performance impact and should be carefully considered for use in post-copy live migration. The RDMA read operation requires more effort at the destination host, while the RDMA write operation puts the burden on the source host. One way to determine which method is best is by evaluating both source and sink CPU loads. However, this determination likely will not be as obvious when we consider that the VMs will not be running in an over-subscribed virtualized environment, but rather a highly optimized one.

When the destination VM page faults, it is necessary to retrieve the page with as little latency as possible, as the running thread is paused. This is one of the advantages of using RDMA, but implementation details can also affect performance. Using the method developed in [248], the userfaultfd handler would trigger an RDMA read to retrieve the missing page. The RDMA read requires no interaction from the source VM, as it is one-sided and the memory buffers have been exchanged in the setup phase. To further enable a quick return time for the missing page, enabling polling on the source host may further reduce latency; however, verification of this will be necessary. When the RDMA read operation is finished, the running thread will resume and execution will continue.

7.4.2 Moving the Compute to the Data

While pre-copy migration is dominant in the live migration techniques of nearly every mainstream hypervisor today, the proposed post-copy migration could provide some key new advances for HPC and big data applications. One particular use case would be to send a lightweight VM to act directly on a large dataset (or set of large datasets) and return a slimmed down result set. This would remove the requirement of transmitting the data across a network entirely, and could potentially reduce data access latency drastically, as the returning information is transmitted in the form of memory pages and VM state. Essentially, the proposal is to send the compute to the data, instead of vice versa.

With post-copy migration, one could move the computation at hand close to a data source in significantly less time than a full pre-copy live migration. This data source, and lightweight VM sink, could potentially be something similar to a Burst Buffer system [252,253] or a classic HPC I/O node with a distributed filesystem such as Lustre or GPFS [254]. This data source could even potentially be a remote scientific instrument completely separate from the HPC infrastructure itself, especially if the network at hand is capable of RoCE [250] or iWARP [255]. A VM would initiate post-copy live migration, transmitting only the necessary CPU state, registers, and non-paged memory, rather than the full VM memory state. Once migrated, the VM could connect to a (now local) I/O or storage device, accessing data quickly and performing the necessary calculations. The VM could potentially even forgo the copy of the majority of its memory. During this time, only the memory pages required to complete the immediate calculation would generate a fault and trigger their transmission from the source. The VM could even return the result (rather than a very large dataset) to the original source VM.

This post-copy live migration technique for remote data computation avoids the cost of spawning a whole new job and/or process with associated running parameters, as well as the extremely high cost of a full VM live migration using the pre-copy method. However, one potential downside of post-copy live migration would be the non-deterministic runtime, as it would be unknown how much remote memory paging would be required. This could lead to more time spent with the destination VM in a paused state awaiting remote memory pages than if the entire VM memory contents were transmitted completely. Careful analysis of memory usage, or a hybrid copy method based on predetermined memory sections, could help overcome this issue, but may require a more advanced migration architecture.

7.5 Fast VM Cloning

In many distributed system environments, concurrency is achieved through the use of homogeneous compute nodes that handle the bulk of the computational load in parallel. This can take many forms, including master/slave configurations, or even a traditional HPC cluster with identical compute nodes, as are often used to support Single Program Multiple Data (SPMD) computational models [256]. With high performance virtual clusters, there is a need to efficiently deploy and manage nearly identical virtual machines for distributed computation.

In SnowFlock [257,258], the notion of VM cloning is introduced. Specifically, Lagar-Cavilla et al. define the notion of VM Fork, where VMs are treated similarly to a fork system call for processes. This process is conceptually similar to VM migration, with the exception that the source VM is not destroyed after the migration. Starting with a master VM, an impromptu cluster can be created across a network using the Xen hypervisor. SnowFlock specifically uses multicast to linearly scale out VM creation to many hosts, copying only a minimal state and then remotely copying memory pages when requested. This on-demand memory fetching is similar in principle to the post-copy live migration technique described in the previous section. This is further augmented with blktap-based virtual disks with Copy-on-Write (CoW) functionality, delivering CoW slices for each child VM.

While SnowFlock provides an excellent framework for VM cloning, it is not suitable for the current implementation. First, it uses Xen, which in previous research described in Chapter 3 has been found to have limited performance for HPC workloads [172]. Second, SnowFlock is designed for Ethernet and IP based networks, which have significantly higher latency and lower bandwidth when compared to InfiniBand solutions. Developing analogous mechanisms, like those listed below, with a high performance hypervisor such as KVM or Palacios [31], using RDMA for VM memory paging, could have an effect on the way in which virtual cluster environments are deployed.

• Prepare the parent VM state, including registers and machine info.

• Prepare CoW disk images.

• Create a large buffer for all VM memory on the child hosts.

• Send the VM state via RDMA write or IB Send to N child nodes, where N is the clone size.

• Resume/start the child VMs in tandem.

• The parent VM sets up RDMA multicast (over an unreliable transport) and initiates sending of memory pages to all child VMs.

• Each child clone VM joins the RDMA multicast group.

• If a child page faults, perform an RDMA read operation for the specific page.

This method allows for efficient cloning of VMs based on a running parent VM. First, the VM state and images are copied to all child nodes that will receive the VM, and blank memory is allocated. Then, each child VM is started, and page faults are handled via RDMA reads. This will result in an initial slowdown, but as pages are received the child VMs will start to run. The rest of the memory is eventually sent via multicast to all VMs simultaneously. As the multicast is unreliable, a delivery failure will simply trigger a page fault and a subsequent page retransmission via RDMA. This approach allows for direct page fault handling, while still allowing child VMs to start and run immediately. As RDMA multicast mechanisms are relatively immature, more investigation is needed to evaluate the feasibility of this approach.

It is expected that post-copy VM cloning will work most efficiently if used in conjunction with guest VMs backed by huge pages. Transferring memory in 2MB chunks will more effectively utilize network bandwidth by eliminating per-page send/receive overhead. This will also hide the latency found in SR-IOV enabled interconnects for small messages. Furthermore, it will reduce the overhead of the page fault handling mechanisms, as fewer pages overall will fault and be transferred. While huge pages are expected to improve VM cloning efficiency, empirical testing will still be necessary to properly evaluate their viability.

While a VM fork mechanism leveraging post-copy live migration in KVM will quickly spool up cloned VMs, the memory transfer will eventually fail to scale past the network's capacity. This could happen if hundreds or thousands of child clone VMs are started simultaneously, as is likely in large scale deployments. As such, a hierarchical distribution may be necessary. One possible method for this would be a two-stage cloning mechanism, where one child VM is cloned to each cabinet and the entire memory contents are copied using the pre-copy migration mechanism. From there, further cloning occurs to deploy many cloned VMs to individual nodes within the cabinet. The organization of mid-tier cloned VMs would likely be determined based on RDMA fabric configurations within cabinets, as this is a network-bound process.

7.6 Virtual Cluster Scheduling

Historically, cloud infrastructure has taken a simplistic approach when it comes to VM and workload scheduling. Often, round-robin or greedy [259] scheduling algorithms are naively applied. With round robin scheduling, a simple list of host machines is used and iterated over as VM requests are made. This essentially scatters the VMs without regard to their locality, and over-subscription becomes commonplace regardless of the number of requested VMs or their interconnection. A greedy algorithm can help keep spatial locality, but focuses on over-subscription as well to help consolidate VM allocations. While this over-subscription aspect is advantageous for public cloud providers such as Amazon EC2, it becomes counterproductive for high performance virtual clusters.

With high performance virtual clusters, VM instances that can attain near-native performance are needed. Furthermore, these VMs will be running tightly coupled applications and, as such, must be allocated in a way in which communication latency is minimized and bandwidth is maximized for all VMs. This will help ensure the entire virtual cluster can perform optimally. However, given off-the-shelf private cloud providers or, worse still, public cloud infrastructure, these requirements are at best opaque to the user, and at worst handled in an extremely suboptimal manner.

One way in which high performance virtual clusters can operate efficiently is by defining specific instance types, or flavors within OpenStack, that specify what resources a VM is allocated. Given previous research on the NUMA effects of VMs [260], these specialized flavors should be defined to fit within a NUMA socket. Furthermore, CPU pinning should be used to keep the VM within that NUMA socket. Using Libvirt, a common API utilized in many cloud infrastructure deployments (including OpenStack), we can specify CPU pinning directly in the domain XML configuration.
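As a minimal example, a four-vCPU guest could be pinned to the first NUMA socket of a host (cores 0-3, memory node 0) with a libvirt domain XML fragment along these lines; the core ids and vCPU count are illustrative:

<vcpu placement='static'>4</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
  <vcpupin vcpu='2' cpuset='2'/>
  <vcpupin vcpu='3' cpuset='3'/>
</cputune>
<numatune>
  <memory mode='strict' nodeset='0'/>
</numatune>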


Figure 7.6 Adding extra specs to a VM flavor in OpenStack

With OpenStack, this NUMA configuration can be applied through the KVM Nova plugin and a scheduling filter, which defines how to place VMs effectively. Furthermore, the instance flavor can also specify instance_type_extra_specs within Nova, whereby specialized hardware such as GPUs and InfiniBand interconnects (as described in Chapters 5 and 6) can be passed through to the VMs directly. Once defined, these specialized high performance flavors in OpenStack simply need the appropriate labels (such as 'gpu' or 'infiniband') added to gain the requested hardware, as illustrated in Figure 7.6 with OpenStack Horizon's UI. The implementation put forth in the OpenStack Havana release has since been extended by Intel with SR-IOV support and additional scheduling filters, and is available in the latest OpenStack releases [261].

pci_passthrough_devices=[{"label":"gpu","address":"0000:08:00.0"},
                         {"label":"infiniband","address":"0000:21:00.0"}]
instance_type_extra_specs={'pci_passthrough:labels':'["gpu"]'}
instance_type_extra_specs={'pci_passthrough:labels':'["infiniband"]'}
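For illustration, and assuming the Havana-era extra-spec key shown above, a flavor for GPU-enabled virtual cluster nodes might be created and labeled from the command line roughly as follows; the flavor name and sizes are hypothetical:

nova flavor-create cluster.gpu auto 16384 40 8
nova flavor-key cluster.gpu set "pci_passthrough:labels"='["gpu"]'

An equivalent 'infiniband' label would be added to flavors that require an SR-IOV virtual function.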

To provide a space for high performance virtual clusters, we need a scheduling mechanism within the cloud infrastructure that supports effective and proper placement beyond controlling for NUMA characteristics. While there are many effective scheduling algorithms for workload placement, a Proximity Scheduler [41], as defined in OpenStack, may work well. Specifically, one could either define or compute the underlying cloud infrastructure network topology. This could be as simple as a YAML file that describes an underlying InfiniBand 2-1 Fat Tree interconnect topology, or a more complex solution that uses network performance metrics to measure bandwidth and latency between disjoint nodes and build a weighted proximity graph. As such network parameters could change due to other usage or to reconfiguration, periodically monitoring and updating this proximity information with measurement tools such as PerfSonar [262] may also help maintain an effective proximity metric between hosts. With such a metric, a proximity scheduler can then handle high performance virtual cluster allocation requests far more effectively than a simple round-robin scheduling mechanism. The OpenStack private cloud IaaS framework is proposed for this effort; however, further development is needed to bring such a scheduling mechanism to fruition.
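A hypothetical topology file of the simple kind described above might look like the following; the fabric, switch, and host names are illustrative, and the weights merely encode relative hop costs for the proximity metric:

# proximity.yaml (hypothetical)
fabric: infiniband
topology: fat-tree-2-1
switches:
  leaf-sw1: { hosts: [node001, node002, node003, node004] }
  leaf-sw2: { hosts: [node005, node006, node007, node008] }
proximity_weights:
  same_host: 0      # co-resident on one hypervisor
  same_leaf: 1      # one leaf-switch hop
  cross_spine: 4    # traverses the oversubscribed spine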

The use of service level agreements (SLAs) within cloud infrastructure allocation is a well-studied topic [263]. It is also possible that certain SLAs could be incorporated into a private cloud infrastructure such as OpenStack to guarantee performance for virtual clusters using the methods defined above, while concurrently offering "classical" VM workload scheduling for HTC, big data, or other cloud usage models. This would provide the same user experience yet support diverse workloads and performance expectations as defined by a given SLA. As the performance-tuned cloud infrastructure deployments proposed in this dissertation will likely still be used for traditional cloud workloads, it is advantageous to leverage SLAs in this way.
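One simple way to realize such SLA-aware segregation is to tag hosts with an SLA tier and filter on that tag during scheduling, as in the following Python sketch; the 'sla' tag and tier names are assumptions for illustration rather than an existing Nova mechanism.

def filter_hosts_by_sla(hosts, requested_sla):
    # Keep only hosts whose operator-assigned SLA tier matches the request.
    return [h for h in hosts if h.get("sla") == requested_sla]

hosts = [
    {"name": "node001", "sla": "hpc"},       # pinned, non-oversubscribed node
    {"name": "node101", "sla": "standard"},  # consolidated, over-subscribed node
]
hpc_candidates = filter_hosts_by_sla(hosts, "hpc")   # -> only node001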

7.7 Chapter Summary

In summary, we expect the combination of transparent huge pages, an RDMA-capable interconnect multiplexed in hardware for use by both guests and the host, a high-performance hypervisor, and post-copy migration and cloning mechanisms to enable a novel architecture for high performance virtual clusters. These mechanisms, if implemented and properly managed within a performance-oriented scheduling system, could help change how cloud infrastructure supports HPC and big data applications, making performance, usability, and reconfigurability available together. These migration and specialized environment support capabilities may also help enable new runtime systems for large scale scientific applications.

Chapter 8

Conclusion

With the advent of virtualization and the availability of virtual machines through cloud infrastructure, a paradigm shift in distributed systems has occurred. Many services and applications once deployed on private servers, personal computers, and even some supercomputers have migrated to cloud infrastructure. The reasons for this trend are vast, with advantages including increased application flexibility and scalability, the ability for providers to leverage economies of scale, and customized, on-demand user environments. However, these advantages may not be enough to support all computational challenges within such a virtualized infrastructure.

One example where cloud infrastructure has historically been insufficient is the support of distributed memory applications. The tightly coupled, parallel tasks common in the High Performance Computing community have encountered a number of problems and complications when deployed in virtualized infrastructure. While the reasons for this lack of integration are numerous, many challenges stem from two aspects: the performance impact and overhead associated with virtualization, and the lack of the hardware necessary to support tightly coupled concurrent tasks. While virtualized cloud infrastructure may not be able to aid in all HPC related activities, it is possible that, if these limitations are either overcome or mitigated, virtualization can offer benefits to many in the HPC community. These benefits could include dynamic resource allocation, enhanced migration and data management capabilities, or even bursting capabilities for rare event simulations, to name a few.

This dissertation looks to evaluate virtualization's ability to support mid-tier scientific HPC applications. From the outset, it proposes high performance virtual clusters to support advanced scientific computation, including tightly coupled HPC applications. The dissertation's framework for building such an environment aims to identify virtualization overhead and to find solutions and best practices with performant hypervisors. This effort also includes defining the methodology needed to support the advanced accelerators and interconnects common in HPC environments, in order to enable a new class of applications in virtualized infrastructure. The framework incorporates cases for evaluating potential deployments, as discussed throughout, using benchmarks and real-world applications. Furthermore, the OpenStack IaaS project is proposed to bring these components together in a unified private cloud architecture.

Chapter 2 surveyed the related research necessary for defining not only the context for virtualization and cloud computing, but also virtual clusters and the history of supercomputing. Chapter 3 studied the applicability of various hypervisors for supporting common HPC workloads through single-node benchmarks. This identified challenges, some solutions for these workloads, and the gaps that remain.

Chapter 4 began the investigation of the utility of GPUs to support mid-tier scientific applications using the Xen hypervisor. This chapter provided a proof-of-concept that, with proper configuration utilizing the latest hardware support, GPU passthrough is a viable model for supporting CUDA-enabled applications, a fast-growing application set. Chapter 5 provides an in-depth comparison of multiple hypervisors using the SHOC GPU benchmark suite, as well as a few GPU-enabled HPC applications. Here we find that our KVM implementation performs at near-native speeds and allows for effective GPU utilization, even outperforming our previous work with the Xen hypervisor.

Chapter 6 takes the lessons learned with KVM and GPU passthrough and adds SR-IOV InfiniBand support. This high-speed, low-latency interconnect represents a critical tool for supporting tightly coupled distributed memory applications. With this, a high performance virtual cluster is created. This environment supports two class-leading Molecular Dynamics simulations, LAMMPS and HOOMD-blue, and shows how both applications not only perform at near-native speeds, but also leverage the latest HPC technologies, such as GPUDirect, for efficient GPU-to-GPU communication across distributed memory resources.

Chapter 7 is an introspective look at further advancements that can be made in virtualization to support high performance virtual clusters. This chapter details the mechanisms whereby virtual clusters leverage PCI passthrough for added hardware and I/O support, along with the reduction of overhead through the use of huge pages. Building on this, methods are outlined for advanced virtualization techniques, such as live migration leveraging high-speed RDMA-capable interconnects, possibilities for moving VMs closer to data sources, VM cloning for fast deployment of virtual clusters themselves, and scheduling considerations for the integration of high performance virtual clusters. These tools are detailed with consideration of their integration within the OpenStack IaaS cloud for private cyberinfrastructure deployments.

Given the efforts of this dissertation, we must evaluate the original hypothesis: can virtualized infrastructure, leveraging the practices of high performance virtual clusters as defined herein, support mid-tier scientific computing endeavors? At the beginning of Chapter 3, looking at the base case when this research started, the answer was a murky no. When considering the advances made in hypervisors, the ability to leverage accelerators such as Nvidia GPUs in Chapters 4 and 5, the addition of a high-performance, low-latency interconnect in Chapter 6, and the advanced tuning and configuration of KVM, the answer to the hypothesis changes to an optimistic yet cautious yes. This conclusion is drawn by evaluating the support for a common and complex HPC application set of molecular dynamics applications and observing virtualization performance overhead under 2% when compared to native, bare-metal configurations.

While more work is needed to evaluate the potential for high performance virtual clusters to scale out, the current body of research does not reveal any architectural barriers at this time. As such, we expect this work on high performance virtualization with KVM to extend significantly beyond the efforts described herein, perhaps up to supporting high-end HPC applications at Petascale. If Petaflop computing is possible within an on-demand cloud infrastructure, it may drastically change the way the community views and utilizes mid-range scientific computing.

Perhaps more importantly, the availability of such high-end resources to current and future big data analytics software toolkits and services may also have a drastic impact on overall time-to-solution for big data problems. This convergence between HPC and big data analytics could yield significant cost savings for large-scale supercomputing facilities, as it makes it possible to support such diverse workloads with a single, massively parallel, advanced-capability hardware platform. Furthermore, allowing simulation and analytics codes to run on the same advanced hardware platform can enable new research in in-situ workflow composition and orchestration, potentially decreasing overall experiment wall-clock time and leading to scientific discoveries more quickly and with greater frequency.

8.1 Impact

This dissertation has illustrated how virtualization can be used to support HPC applications using virtual clusters. While the example applications used herein relate to molecular dynamics simulations, it is anticipated that this work is equally applicable to other fields, including astronomy, high energy physics, bioinformatics, and computational chemistry, to name a few. It is also desirable for these high performance virtual clusters to provide a feature-rich yet performant infrastructure to new and emerging big data science applications and platform services. While further work will likely be necessary in storage and I/O efforts related to virtualization, the potential exists today to meet the demands of a convergent infrastructure with virtualization.

It is also possible and likely that such applications can scale with future infrastructure deployments. Next steps in this direction could see virtual cluster sizes increase to thousands of nodes. Virtualization efforts using the Palacios VMM were able to scale a Cray XT4 system to 4096 nodes while reporting only a 5% overhead [264]. While we have so far seen slightly better results with KVM at smaller scales, it is hoped that the same scaling may be possible with KVM on newer hardware supporting a more diverse set of user environments.

The model for PCI passthrough illustrated throughout the dissertation may also apply to other hardware. This could include the Intel Xeon Phi (the coprocessor models, not the Knights Landing CPU), or emerging FPGA implementations such as the Stillwater Knowledge Processing Unit (KPU), a distributed data flow processor. The PCI-Express SIG specification has been upgraded in version 4.0 to support larger I/O devices and accelerators with on-bus power delivery, leading to the assumption that, at least for commodity x86 systems, PCIe device utilization could increase. The caveat is whether the PCIe bus is abandoned or superseded by other approaches, such as System-on-Chip designs or Nvidia's NVLink effort [265], where such new designs do not take IOMMU virtualization into account.

While HPC accelerator usage could very well wane in the wake of novel many-core architectures, such as the Knights Landing CPU [266], these movements have yet to take hold within the HPC community.

Some of the advances described in this dissertation have already made their way into the OpenStack cloud platform, and it is expected that OpenStack's usage will only grow in both academic research and industry over time. Considering this, it may be possible to build such a cloud infrastructure to run high performance virtual clusters at a larger scale. Applying this infrastructure, along with high-level experiment management and support services, could lead to a new national-scale cyberinfrastructure deployment. In time and with further development, this could be deployed within the NSF-funded XSEDE project, for example. Perhaps, instead of having separate hardware for HPC systems like the XSEDE Stampede resource at UT Austin and a separate cloud infrastructure deployment for HTC computing and science gateways like IU's Jetstream, a single, unified high performance cloud infrastructure could serve all such needs concurrently. Finally, because the advantages provided by virtualization are only now being integrated with HPC, new distributed computing paradigms and runtime systems may not yet have been realized.

Bibliography

[1] B. Alexander, “Web 2.0: A New Wave of Innovation for Teaching and Learn-

ing,” Learning, vol. 41, no. 2, pp. 32–44, 2006.

[2] R. Buyya, C. S. Yeo, and S. Venugopal, “Market-oriented cloud computing:

Vision, hype, and reality for delivering it services as computing utilities,” in

Proceedings of the 10th IEEE International Conference on High Performance

Computing and Communications (HPCC-08, IEEE CS Press, Los Alamitos,

CA, USA), 2008, pp. 5–13.

[3] I. Foster, Y. Zhao, I. Raicu, and S. Lu, “Cloud Computing and Grid Comput-

ing 360-Degree Compared,” in Grid Computing Environments Workshop, 2008.

GCE’08, 2008, pp. 1–10.

[4] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski,

G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “Above

the clouds: A berkeley view of cloud computing,” University of

California at Berkeley, Tech. Rep., February 2009. [Online]. Available:

http://berkeleyclouds.blogspot.com/2009/02/above-clouds-released.html

[5] T. Kuhn, The structure of scientific revolutions. University of Chicago press

Chicago, 1970.


[6] T. Sterling, Beowulf cluster computing with Linux. The MIT Press, 2001.

[7] T. Hoare and R. Milner, “Grand challenges for computing research,” The Com-

puter Journal, vol. 48, no. 1, pp. 49–52, 2005.

[8] J. Dongarra, H. Meuer, and E. Strohmaier, “Top 500 supercomputers,” website,

November 2013.

[9] P. Buncic, C. A. Sanchez, J. Blomer, L. Franco, A. Harutyunian, P. Mato, and

Y. Yao, “Cernvm–a virtual software appliance for lhc applications,” in Journal

of Physics: Conference Series, vol. 219, no. 4. IOP Publishing, 2010, p. 042003.

[10] L.-W. Wang, “A survey of codes and algorithms used in nersc material science

allocations,” Lawrence Berkeley National Laboratory, 2006.

[11] K. Menon, K. Anala, S. G. Trupti, and N. Sood, “Cloud computing: Ap-

plications in biological research and future prospects,” in Cloud Computing

Technologies, Applications and Management (ICCCTAM), 2012 International

Conference on. IEEE, 2012, pp. 102–107.

[12] Q. He, S. Zhou, B. Kobler, D. Duffy, and T. McGlynn, “Case study

for running hpc applications in public clouds,” in Proceedings of the 19th

ACM International Symposium on High Performance Distributed Computing,

ser. HPDC ’10. New York, NY, USA: ACM, 2010, pp. 395–401. [Online].

Available: http://doi.acm.org/10.1145/1851476.1851535

[13] A. S. Bland, J. Wells, O. E. Messer, O. Hernandez, and J. Rogers, “Titan:

Early experience with the cray xk6 at oak ridge national laboratory,” Cray

User Group, 2012.

[14] J. Towns, T. Cockerill, M. Dahan, I. Foster, K. Gaither, A. Grimshaw, V. Hazle-

wood, S. Lathrop, D. Lifka, G. D. Peterson et al., “Xsede: accelerating scientific

discovery,” Computing in Science & Engineering, vol. 16, no. 5, pp. 62–74, 2014.

[15] S. A. Goff, M. Vaughn, S. McKay, E. Lyons, A. E. Stapleton, D. Gessler,

N. Matasci, L. Wang, M. Hanlon, A. Lenards et al., “The iplant collaborative:

cyberinfrastructure for plant biology,” Frontiers in plant science, vol. 2, p. 34,

2011.

[16] K. Antypas, “Nersc-6 workload analysis and benchmark selection process,”

Lawrence Berkeley National Laboratory, 2008.

[17] J. S. Vetter, R. Glassbrook, J. Dongarra, K. Schwan, B. Loftis, S. McNally,

J. Meredith, J. Rogers, P. Roth, K. Spafford et al., “Keeneland: Bringing het-

erogeneous gpu computing to the computational science community,” Comput-

ing in Science and Engineering, vol. 13, no. 5, pp. 90–95, 2011.

[18] P. Pacheco, Parallel Programming with MPI, ser. ISBN. Morgan Kaufmann,

October 1996, no. 978-1-55860-339-4.

[19] G. C. Fox, S. W. Otto, and A. J. Hey, “Matrix algorithms on a hypercube i:

Matrix multiplication,” Parallel computing, vol. 4, no. 1, pp. 17–31, 1987.

[20] D. Agrawal, S. Das, and A. El Abbadi, “Big data and cloud computing: cur-

rent state and future opportunities,” in Proceedings of the 14th International

Conference on Extending Database Technology. ACM, 2011, pp. 530–533.

[21] J. Dean and S. Ghemawat, “Mapreduce: simplified data processing on large

clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[22] S. Kamburugamuve, G. Fox, D. Leake, and J. Qiu, “Survey of apache big data

stack,” Ph.D. dissertation, Ph. D. Qualifying Exam, Dept. Inf. Comput., Indi-

ana Univ., Bloomington, IN, 2013.

[23] M. Chen, S. Mao, and Y. Liu, “Big data: a survey,” Mobile Networks and

Applications, vol. 19, no. 2, pp. 171–209, 2014.

[24] G. Bell, T. Hey, and A. Szalay, “Beyond the data deluge,” Science, vol. 323,

no. 5919, pp. 1297–1298, 2009.

[25] D. A. Reed and J. Dongarra, “Exascale computing and big data,” Communi-

cations of the ACM, vol. 58, no. 7, pp. 56–68, 2015.

[26] S. Jha, J. Qiu, A. Luckow, P. K. Mantha, and G. C. Fox, “A tale of two

data-intensive paradigms: Applications, abstractions, and architectures,” in

Proceedings of the 3rd International Congress on Big Data, 2014.

[27] J. Qiu, S. Jha, A. Luckow, and G. C. Fox, “Towards hpc-abds: An initial high-

performance big data stack,” Building Robust Big Data Ecosystem ISO/IEC

JTC 1 Study Group on Big Data, pp. 18–21, 2014.

[28] M. W. ur Rahman, N. S. Islam, X. Lu, J. Jose, H. Subramoni, H. Wang, and

D. K. D. Panda, “High-performance rdma-based design of hadoop mapreduce

over infiniband,” in Parallel and Distributed Processing Symposium Workshops

PhD Forum (IPDPSW), 2013 IEEE 27th International, May 2013, pp. 1908–

1917.

[29] S. Ekanayake, S. Kamburugamuve, and G. Fox, “Spidal: High performance data

analytics with java and mpi on large multicore hpc clusters,” in Proceedings of

24th High Performance Computing Symposium (HPC 2016), 2016.

[30] F. Tian and K. Chen, “Towards optimal resource provisioning for running

mapreduce programs in public clouds,” in Cloud Computing (CLOUD), 2011

IEEE International Conference on. IEEE, 2011, pp. 155–162.

[31] J. Lange, K. Pedretti, T. Hudson, P. Dinda, Z. Cui, L. Xia, P. Bridges, A. Gocke,

S. Jaconette, M. Levenhagen et al., “Palacios and kitten: New high performance

operating systems for scalable virtualized and native supercomputing,” in Par-

allel & Distributed Processing (IPDPS), 2010 IEEE International Symposium

on. IEEE, 2010, pp. 1–12.

[32] I. Foster, T. Freeman, K. Keahy, D. Scheftner, B. Sotomayer, and X. Zhang,

“Virtual clusters for grid communities,” Cluster Computing and the Grid, IEEE

International Symposium on, vol. 0, pp. 513–520, 2006.

[33] K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith, and

S. Tuecke, “A resource management architecture for metacomputing systems,”

in Workshop on Job Scheduling Strategies for Parallel Processing. Springer,

1998, pp. 62–82.

[34] D. Kondo, B. Javadi, P. Malecot, F. Cappello, and D. P. Anderson, “Cost-

benefit analysis of cloud computing versus desktop grids,” in Parallel & Dis-

tributed Processing, 2009. IPDPS 2009. IEEE International Symposium on.

IEEE, 2009, pp. 1–12.

[35] T. Bell, B. Bompastor, S. Bukowiec, J. C. Leon, M. Denis, J. van Eldik, M. F.

Lobo, L. F. Alvarez, D. F. Rodriguez, A. Marino et al., “Scaling the cern

openstack cloud,” in Journal of Physics: Conference Series, vol. 664, no. 2.

IOP Publishing, 2015, p. 022003.

[36] T. Gunarathne, T.-L. Wu, J. Qiu, and G. Fox, “Mapreduce in the clouds for

science,” in Cloud Computing Technology and Science (CloudCom), 2010 IEEE

Second International Conference on. IEEE, 2010, pp. 565–572.

[37] S. Ostermann, A. Iosup, N. Yigitbasi, R. Prodan, T. Fahringer, and D. Epema,

“A performance analysis of ec2 cloud computing services for scientific comput-

ing,” in International Conference on Cloud Computing. Springer, 2009, pp.

115–131.

[38] J. Dongarra et al., “The international exascale software project roadmap,”

International Journal of High Performance Computing Applications, p.

1094342010391989, 2011.

[39] K. Bergman, S. Borkar, D. Campbell, W. Carlson, W. Dally, M. Denneau,

P. Franzon, W. Harrod, K. Hill, J. Hiller, S. Karp, S. K. and Dean Klein,

R. Lucas, M. Richards, A. Scarpelli, S. Scott, A. Snavely, T. Sterling, R. S.

Williams, and K. Yelick, “Exascale computing study: Technology challenges

in achieving exascale systems,” Defense Advanced Research Projects Agency

Information Processing Techniques Office (DARPA IPTO), Tech. Rep, vol. 15,

2008.

[40] J. Shalf, S. Dosanjh, and J. Morrison, “Exascale computing technology chal-

lenges,” in International Conference on High Performance Computing for Com-

putational Science. Springer, 2010, pp. 1–25.

[41] J. Suh, “Proximity scheduler in openstack,” Webpage. [Online]. Available:

https://wiki.openstack.org/wiki/ProximityScheduler

[42] S. Kamburugamuve, S. Ekanayake, M. Pathirage, and G. Fox, “Towards high

performance processing of streaming data in large data centers,” in HPBDC

2016 IEEE International Workshop on High-Performance Big Data Comput-

ing in conjunction with The 30th IEEE International Parallel and Distributed

Processing Symposium (IPDPS 2016), Chicago, Illinois USA, Friday, 2016.

[43] A. J. Younge, C. Reidy, R. Henschel, and G. C. Fox, “Evaluation of smp shared

memory machines for use with in-memory and big data applications,”

in 2016 IEEE International Parallel and Distributed Processing Symposium

Workshops (IPDPSW), May 2016, pp. 1597–1606.

[44] G. von Laszewski, G. C. Fox, F. Wang, A. J. Younge, A. Kulshrestha, and

G. Pike, “Design of the FutureGrid Experiment Management Framework,”

in Proceedings of Gateway Computing Environments 2010 at Supercomputing

2010. New Orleans, LA: IEEE, Nov 2010. [Online]. Available: http://grids.

ucs.indiana.edu/ptliupages/publications/vonLaszewski-10-FG-exp-GCE10.pdf

[45] S. Drake and O. development team, “Heat: OpenStack Orchestration,”

Webpage. [Online]. Available: https://wiki.openstack.org/wiki/Heat

[46] G. Von Laszewski, F. Wang, H. Lee, H. Chen, and G. C. Fox, “Accessing

multiple clouds with cloudmesh,” in Proceedings of the 2014 ACM international

workshop on Software-defined ecosystems. ACM, 2014, pp. 21–28.

[47] “The magellan project,” Webpage. [Online]. Available: http://magellan.alcf.

anl.gov/

[48] K. Yelick, S. Coghlan, B. Draney, and R. S. Canon, “The Magellan Report on

Cloud Computing for Science,” U.S. Department of Energy Office of Science

Office of Advanced Scientific Computing Research (ASCR), Tech. Rep., Dec.

2011.

[49] K. Jackson, L. Ramakrishnan, K. Muriki, S. Canon, S. Cholia, J. Shalf,

H. Wasserman, and N. Wright, “Performance Analysis of High Performance

Computing Applications on the Amazon Web Services Cloud,” in 2nd IEEE

International Conference on Cloud Computing Technology and Science. IEEE,

2010, pp. 159–168.

[50] D. Merkel, “Docker: lightweight linux containers for consistent development

and deployment,” Linux Journal, vol. 2014, no. 239, p. 2, 2014.

[51] D. M. Jacobsen and R. S. Canon, “Contain this, unleashing docker for hpc,”

Proceedings of the Cray User Group, 2015.

[52] M. G. Xavier, M. V. Neves, F. D. Rossi, T. C. Ferreto, T. Lange, and C. A.

De Rose, “Performance evaluation of container-based virtualization for high

performance computing environments,” in 2013 21st Euromicro International

Conference on Parallel, Distributed, and Network-Based Processing. IEEE,

2013, pp. 233–240.

[53] K. Hwang, J. Dongarra, and G. C. Fox, Distributed and cloud computing: from

parallel processing to the internet of things. Morgan Kaufmann, 2013.

[54] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. L. Harris, A. Ho, R. Neuge-

bauer, I. Pratt, and A. Warfield, “Xen and the art of virtualization,” in Pro-

ceedings of the 19th ACM Symposium on Operating Systems Principles, New

York, U. S. A., Oct. 2003, pp. 164–177.

[55] “Vmware esx server,” [Online], http://www.vmware.com/de/products/vi/esx/.

[56] J. Duato, A. J. Pena, F. Silla, J. C. Fernández, R. Mayo, and E. S. Quintana-

Orti, “Enabling CUDA acceleration within virtual machines using rCUDA,” in

High Performance Computing (HiPC), 2011 18th International Conference on.

IEEE, 2011, pp. 1–10.

[57] L. Shi, H. Chen, J. Sun, and K. Li, “vCUDA: GPU-accelerated high-

performance computing in virtual machines,” Computers, IEEE Transactions

on, vol. 61, no. 6, pp. 804–816, 2012.

[58] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori, “kvm: the Linux

virtual machine monitor,” in Proceedings of the Linux Symposium, vol. 1, 2007,

pp. 225–230.

[59] M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee,

D. Patterson, A. Rabkin, I. Stoica et al., “A view of cloud computing,” Com-

munications of the ACM, vol. 53, no. 4, pp. 50–58, 2010.

[60] I. Foster, C. Kesselman, and S. Tuecke, “The Anatomy of the Grid: Enabling

Scalable Virtual Organizations,” Intl. J. Supercomputer Applications, vol. 15,

no. 3, 2001.

[61] I. Foster, C. Kesselman et al., “The Physiology of the Grid:An Open Grid

Services Architecture for Distributed Systems Integration,” Argonne National

Laboratory, Chicago, Tech. Rep., Jan. 2002.

[62] D. DiNucci, “Fragmented future,” AllBusiness–Champions of Small Business,

1999. [Online]. Available: http://www.cdinucci.com/Darcy2/articles/Print/

Printarticle7.html

[63] A. Arkin, S. Askary, S. Fordin, W. Jekeli, K. Kawaguchi, D. Orchard,

S. Pogliani, K. Riemer, S. Struble, P. Takacsi-Nagy, I. Trickovic, and S. Zimek,

“Web Services Choreography Interface,” Jun. 2002, version 1.0. [Online].

Available: http://wwws.sun.com/software/xml/developers/wsci/index.html

[64] D. Krafzig, K. Banke, and D. Slama, Enterprise SOA: Service-Oriented Archi-

tecture Best Practices (The Coad Series). Prentice Hall PTR Upper Saddle

River, NJ, USA, 2004.

[65] L. Wang, G. von Laszewski, A. J. Younge, X. He, M. Kunze, and J. Tao,

“Cloud Computing: a Perspective Study,” New Generation Computing, vol. 28,

pp. 63–69, Mar 2010. [Online]. Available: http://cyberaide.googlecode.com/

svn/trunk/papers/10-cloudcomputing-NGC/vonLaszewski-10-NGC.pdf

[66] G. von Laszewski, A. J. Younge, X. He, K. Mahinthakumar, and

L. Wang, “Experiment and Workflow Management Using Cyberaide

Shell,” in Proceedings of the 4th International Workshop on Workflow

Systems in e-Science (WSES 09) with 9th IEEE/ACM International

Symposium on Cluster Computing and the Grid (CCGrid 09). IEEE,

May 2009. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/

papers/09-gridshell-ccgrid/vonLaszewski-ccgrid09-final.pdf

[67] G. von Laszewski, F. Wang, A. J. Younge, X. He, Z. Guo, and M. Pierce,

“Cyberaide JavaScript: A JavaScript Commodity Grid Kit,” in Proceedings

of the Grid Computing Environments 2007 at Supercomputing 2008. Austin,

TX: IEEE, Nov 2008. [Online]. Available: http://cyberaide.googlecode.com/

svn/trunk/papers/08-javascript/vonLaszewski-08-javascript.pdf

[68] G. von Laszewski, J. Diaz, F. Wang, and G. C. Fox, “Comparison of multiple

cloud frameworks,” in Cloud Computing (CLOUD), 2012 IEEE 5th Interna-

tional Conference on, June 2012, pp. 734–741.

[69] B. P. Rimal, E. Choi, and I. Lumb, “A taxonomy and survey of cloud computing

systems,” INC, IMS and IDC, pp. 44–51, 2009.

[70] B. Sotomayor, R. S. Montero, I. M. Llorente, and I. Foster, “Virtual infrastruc-

ture management in private and hybrid clouds,” IEEE Internet Computing,

vol. 13, no. 5, pp. 14–22, 2009.

[71] S. A. Baset, “Cloud slas: present and future,” ACM SIGOPS Operating Systems

Review, vol. 46, no. 2, pp. 57–66, 2012.

[72] “Amazon Elastic Compute Cloud.” [Online]. Available: http://aws.amazon.

com/ec2/

[73] S. Krishnan and J. L. U. Gonzalez, “Google compute engine,” in Building Your

Next Big Thing with Google Cloud Platform. Springer, 2015, pp. 53–81.

[74] “Nimbus Project,” http://www.nimbusproject.org. Last access Mar. 2011.

[75] K. Keahey, I. Foster, T. Freeman, and X. Zhang, “Virtual workspaces: Achiev-

ing quality of service and quality of life in the grid,” Scientific Programming

Journal, vol. 13, no. 4, pp. 265–276, 2005.

[76] “Amazon web services s3 rest ,” http://awsdocs.s3.amazonaws.com/S3/latest/s3-

api.pdf. Last Access Mar. 2011.

[77] “Jets3t project,” http://bitbucket.org/jmurty/jets3t/wiki/Home. Last Access

Mar. 2011.

[78] “Boto project,” http://code.google.com/p/boto. Last Access Mar. 2011.

[79] “S3tools project,” http://s3tools.org/s3cmd. Last Access Mar. 2011.

[80] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and

D. Zagorodnov, “The Eucalyptus Open-source Cloud-computing System,” Pro-

ceedings of Cloud Computing and Its Applications, 2008.

[81] “Eucalyptus open-source cloud computing infrastructure,” an Overview, Euca-

lyptus Systems, Inc. 2009.

[82] “Eucalyptus,” http://www.eucalyptus.com, Last Access Mar. 2011.

[83] I. H. Shaon, “Eucalyptus and its components,” Webpage.

[Online]. Available: https://mdshaonimran.wordpress.com/2011/11/26/

eucalyptus-and-its-components

[84] “Openstack. cloud software,” http://openstack.org, Last Access Mar. 2011.

[85] O. Sefraoui, M. Aissaoui, and M. Eleuldj, “Openstack: toward an open-source

solution for cloud computing,” International Journal of Computer Applications,

vol. 55, no. 3, 2012.

[86] “Opennebula project,” http://www.opennebula.org. Last access Mar. 2011.

[87] I. M. Llorente, R. Moreno-Vozmediano, and R. S. Montero, “Cloud computing

for on-demand grid resource provisioning,” Advances in Parallel Computing,

vol. 18, no. 5, pp. 177–191, 2009.

[88] B. Sotomayor, R. S. Montero, I. M. Llorente, and I. Foster, “Capacity leas-

ing in cloud systems using the opennebula engine,” Cloud Computing and its

Applications, 2008.

[89] “Libvirt API webpage,” http:// libvirt.org. Last access Mar. 2011.

[90] “Elastichosts webpage,” http://www.elastichosts.com. Last Access Mar. 2011.

[91] S. Adabala, V. Chadha, P. Chawla, R. Figueiredo, J. Fortes, I. Krsul, A. Mat-

sunaga, M. Tsugawa, J. Zhang, M. Zhao et al., “From virtualized resources

to virtual computing grids: the in-vigo system,” Future Generation Computer

Systems, vol. 21, no. 6, pp. 896–909, 2005.

[92] J. Chase, D. Irwin, L. Grit, J. Moore, and S. Sprenkle, “Dynamic virtual clus-

ters in a grid site manager,” in 12th IEEE International Symposium on High

Performance Distributed Computing, 2003. Proceedings, 2003, pp. 90–100.

[93] I. VMware, “VMware vCloud Air,” Webpage. [Online]. Available: http:

//vcloud.vmware.com

[94] CERN, “LHC Computing Grid Project,” Web Page, Dec. 2003. [Online].

Available: http://lcg.web.cern.ch/LCG/

[95] J. Luo, A. Song, Y. Zhu, X. Wang, T. Ma, Z. Wu, Y. Xu, and L. Ge, “Grid sup-

porting platform for AMS data processing,” Lecture notes in computer science,

vol. 3759, p. 276, 2005.

[96] “CMS,” Web Page. [Online]. Available: http://cms.cern.ch/

[97] J. Diaz, A. J. Younge, G. von Laszewski, F. Wang, and G. C. Fox, “Grappling

Cloud Infrastructure Services with a Generic Image Repository,” in Proceedings

of Cloud Computing and Its Applications (CCA 2011), Argonne, IL, Mar 2011.

[98] I. Foster, “The anatomy of the grid: Enabling scalable virtual organizations,”

in Euro-Par 2001 Parallel Processing: 7th International Euro-Par Conference.

Springer, 2001, pp. 1–5. [Online]. Available: www.globus.org/alliance/

publications/papers/anatomy.pdf

[99] K. Keahey and T. Freeman, “Contextualization: Providing one-click virtual

clusters,” in eScience, 2008. eScience’08. IEEE Fourth International Confer-

ence on. IEEE, 2008, pp. 301–308.

[100] R. L. Moore, C. Baru, D. Baxter, G. C. Fox, A. Majumdar, P. Papadopoulos,

W. Pfeiffer, R. S. Sinkovits, S. Strande, M. Tatineni, R. P. Wagner, N. Wilkins-

Diehr, and M. L. Norman, “Gateways to discovery: Cyberinfrastructure

for the long tail of science,” in Proceedings of the 2014 Annual Conference

on Extreme Science and Engineering Discovery Environment, ser. XSEDE

’14. New York, NY, USA: ACM, 2014, pp. 39:1–39:8. [Online]. Available:

http://doi.acm.org/10.1145/2616498.2616540

[101] D. Bernstein, “Containers and cloud: From lxc to docker to kubernetes,” IEEE

Cloud Computing, vol. 1, no. 3, pp. 81–84, 2014.

[102] F. Berman, “From TeraGrid to knowledge grid,” Communications of the ACM,

vol. 44, no. 11, pp. 27–28, 2001.

[103] C. Catlett, “The philosophy of TeraGrid: building an open, extensible, dis-

tributed TeraScale facility,” in 2nd IEEE/ACM International Symposium on

Cluster Computing and the Grid, 2002, 2002, pp. 8–8.

[104] H. H. Goldstine and A. Goldstine, “The electronic numerical integrator and

computer (eniac),” Mathematical Tables and Other Aids to Computation, vol. 2,

no. 15, pp. 97–110, 1946.

[105] J. E. Thornton, “Design of a computer - the control data 6600,” 1970.

[106] R. M. Russell, “The cray-1 computer system,” Communications of the ACM,

vol. 21, no. 1, pp. 63–72, 1978.

[107] C. L. Seitz, “The cosmic cube,” Communications of the ACM, vol. 28, no. 1,

pp. 22–33, 1985.

[108] W. D. Hillis, The connection machine. MIT press, 1989.

[109] X. Zhang, C. Yang, F. Liu, Y. Liu, and Y. Lu, “Optimizing and scaling hpcg

on tianhe-2: early experience,” in International Conference on Algorithms and

Architectures for Parallel Processing. Springer, 2014, pp. 28–41.

[110] A. Geist, PVM: Parallel virtual machine: a users’ guide and tutorial for net-

worked parallel computing. MIT press, 1994.

[111] B. Carpenter, V. Getov, G. Judd, A. Skjellum, and G. C. Fox, “Mpj: Mpi-like

message passing for java,” 2000.

[112] E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. M. Squyres,

V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine et al., “Open mpi: Goals,

concept, and design of a next generation mpi implementation,” in European

Parallel Virtual Machine/Message Passing Interface Users Group Meeting.

Springer, 2004, pp. 97–104.

[113] W. Gropp, E. Lusk, and A. Skjellum, Using MPI: portable parallel programming

with the message-passing interface. MIT press, 1999, vol. 1.

[114] M. J. Koop, T. Jones, and D. K. Panda, “Mvapich-aptus: Scalable high-

performance multi-transport mpi over infiniband,” in IEEE International Sym-

posium on Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE,

2008, pp. 1–12.

[115] R. Rabenseifner, G. Hager, and G. Jost, “Hybrid mpi/openmp parallel program-

ming on clusters of multi-core smp nodes,” in 2009 17th Euromicro international

conference on parallel, distributed and network-based processing. IEEE, 2009,

pp. 427–436.

[116] D. A. Jacobsen, J. C. Thibault, and I. Senocak, “An mpi-cuda implementation

for massively parallel incompressible flow computations on multi-gpu clusters,”

in 48th AIAA aerospace sciences meeting and exhibit, vol. 16, 2010, p. 2.

[117] J. J. Dongarra, P. Luszczek, and A. Petitet, “The linpack benchmark: past,

present and future,” Concurrency and Computation: practice and experience,

vol. 15, no. 9, pp. 803–820, 2003.

[118] R. C. Murphy, K. B. Wheeler, B. W. Barrett, and J. A. Ang, “Introducing the

graph 500,” Cray Users Group (CUG), 2010.

[119] M. A. Heroux and J. Dongarra, “Toward a new metric for ranking high perfor-

mance computing systems,” Sandia Report, SAND2013-4744, vol. 312, 2013.

[120] R. Brightwell, R. Oldfield, A. B. Maccabe, and D. E. Bernholdt, “Hobbes:

Composition and virtualization as the foundations of an extreme-scale os/r,”

in Proceedings of the 3rd International Workshop on Runtime and Operating

Systems for Supercomputers. ACM, 2013, p. 2.

[121] S. Perarnau, R. Gupta, and P. Beckman, “Argo: An exascale operating system

and runtime,” in Poster Proceedings from IEEE/ACM Supercomputing 2015,

2015.

[122] T. Sterling, D. Kogler, M. Anderson, and M. Brodowicz, “SLOWER: A perfor-

mance model for Exascale computing,” Supercomputing Frontiers and Innova-

tions, vol. 1, pp. 42–57, Sep 2014.

[123] E. Ciurana, Developing with Google App Engine. Springer, 2009.

[124] D. Chappell, “Introducing windows azure,” Microsoft, Inc, Tech. Rep., 2009.

[125] K. Keahey, R. Figueiredo, J. Fortes, T. Freeman, and M. Tsugawa, “Science

clouds: Early experiences in cloud computing for scientific applications,” Cloud

Computing and Applications, vol. 2008, 2008.

[126] R. Creasy, “The origin of the VM/370 time-sharing system,” IBM Journal of

Research and Development, vol. 25, no. 5, pp. 483–490, 1981.

[127] “Amazon elastic compute cloud,” [Online], http://aws.amazon.com/ec2/.

[128] K. Keahey, I. Foster, T. Freeman, X. Zhang, and D. Galron, “Virtual

Workspaces in the Grid,” Lecture Notes in Computer Science, vol. 3648,

pp. 421–431, 2005. [Online]. Available: http://workspace.globus.org/papers/

VW EuroPar05.pdf

[129] J. Fontan, T. Vazquez, L. Gonzalez, R. S. Montero, and I. M. Llorente, “Open-

NEbula: The Open Source for Cluster Computing,”

in Open Source Grid and Cluster Software Conference, San Francisco, CA, USA,

May 2008.

[130] S. Adabala, V. Chadha, P. Chawla, R. Figueiredo, J. Fortes, I. Krsul, A. Mat-

sunaga, M. Tsugawa, J. Zhang, M. Zhao, L. Zhu, and X. Zhu, “From virtualized

resources to virtual computing Grids: the In-VIGO system,” Future Generation

Comp. Syst., vol. 21, no. 6, pp. 896–909, 2005.

[131] Rackspace, “Openstack,” WebPage, Jan 2011. [Online]. Available: http:

//www.openstack.org/

[132] P. Padala, X. Zhu, Z. Wang, S. Singhal, and K. Shin, “Performance evaluation

of virtualization technologies for server consolidation,” HP Laboratories, Tech.

Rep., 2007.

[133] J. Watson, “Virtualbox: bits and bytes masquerading as machines,” Linux

Journal, vol. 2008, no. 166, p. 1, 2008.

[134] D. Leinenbach and T. Santen, “Verifying the Microsoft Hyper-V Hypervisor

with VCC,” FM 2009: Formal Methods, pp. 806–809, 2009.

[135] I. Parallels, “An introduction to os virtualization and paral-

lels virtuozzo containers,” Parallels, Inc, Tech. Rep., 2010. [On-

line]. Available: http://www.parallels.com/r/pdf/wp/pvc/Parallels Virtuozzo

Containers WP an introduction to os EN.pdf

[136] D. Bartholomew, “Qemu: a multihost, multitarget emulator,” Linux Journal,

vol. 2006, no. 145, p. 3, 2006.

[137] J. N. Matthews, W. Hu, M. Hapuarachchi, T. Deshane, D. Dimatos, G. Hamil-

ton, M. McCabe, and J. Owens, “Quantifying the performance isolation prop-

erties of virtualization systems,” in Proceedings of the 2007 workshop on Ex-

perimental computer science, ser. ExpCS ’07. New York, NY, USA: ACM,

2007.

[138] Oracle, “Performance evaluation of oracle vm server virtualization software,”

Oracle, Whitepaper, 2008. [Online]. Available: http://www.oracle.com/us/

technologies/virtualization/oraclevm/026997.pdf

[139] K. Adams and O. Agesen, “A comparison of software and hardware techniques

for x86 virtualization,” in Proceedings of the 12th international conference on

Architectural support for programming languages and operating systems. ACM,

2006, pp. 2–13, vMware.

[140] Y. Koh, R. Knauerhase, P. Brett, M. Bowman, Z. Wen, and C. Pu, “An analy-

sis of performance interference effects in virtual environments,” in Performance

Analysis of Systems & Software, 2007. ISPASS 2007. IEEE International Sym-

posium on. IEEE, 2007, pp. 200–209.

[141] S. Rixner, “Network virtualization: Breaking the performance barrier,” Queue,

vol. 6, no. 1, p. 36, 2008.

[142] S. Nanda and T. Chiueh, “A survey of virtualization technologies,” Tech. Rep.,

2005.

[143] A. Ranadive, M. Kesavan, A. Gavrilovska, and K. Schwan, “Performance im-

plications of virtualizing multicore cluster machines,” in Proceedings of the 2nd

workshop on System-level virtualization for high performance computing. ACM,

2008, pp. 1–8.

[144] R. Harper and K. Rister, “Kvm limits arbitrary or architectural?” IBM Linux

Technology Center, Jun 2009.

[145] M. Bolte, M. Sievers, G. Birkenheuer, O. Niehorster, and A. Brinkmann, “Non-

intrusive virtualization management using libvirt,” in Design, Automation Test

in Europe Conference Exhibition (DATE), 2010, 2010, pp. 574 –579.

[146] “FutureGrid,” Web Page, 2009. [Online]. Available: http://www.futuregrid.org

[147] J. J. Dujmovic and I. Dujmovic, “Evolution and evaluation of spec bench-

marks,” SIGMETRICS Perform. Eval. Rev., vol. 26, no. 3, pp. 2–9, 1998.

[148] P. Luszczek, D. Bailey, J. Dongarra, J. Kepner, R. Lucas, R. Rabenseifner,

and D. Takahashi, “The HPC Challenge (HPCC) benchmark suite,” in SC06

Conference Tutorial. Citeseer, 2006.

[149] J. Dongarra and P. Luszczek, “Reducing the time to tune parallel dense linear

algebra routines with partial execution and performance modelling,” University

of Tennessee Computer Science Technical Report, Tech. Rep., 2010.

[150] K. Dixit, “The SPEC benchmarks,” Parallel Computing, vol. 17, no. 10-11, pp.

1195–1209, 1991.

[151] SPEC, “Standard performance evaluation corporation,” Webpage, Jan 2011.

[Online]. Available: http://www.spec.org/

[152] R. Henschel and A. J. Younge, “First quarter 2011 spec omp results,” Webpage,

Mar 2011. [Online]. Available: http://www.spec.org/omp/results/res2011q1/

[153] A. S. Tanenbaum and M. Van Steen, Distributed systems. Prentice Hall, 2002,

vol. 2.

[154] Amazon, “Elastic Compute Cloud.” [Online]. Available: http://aws.amazon.

com/ec2/

[155] J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large

clusters,” Commun. ACM, vol. 51, no. 1, pp. 107–113, 2008.

[156] “Windows azure platform,” [Online], http://www.microsoft.com/azure/.

[157] G. von Laszewski, G. C. Fox, F. Wang, A. J. Younge, A. Kulshrestha, G. G.

Pike, W. Smith, J. Vockler, R. J. Figueiredo, J. Fortes et al., “Design of the

futuregrid experiment management framework,” in Gateway Computing Envi-

ronments Workshop (GCE), 2010. IEEE, 2010, pp. 1–10.

[158] OpenStack, “Openstack compute administration manual,” 2013.

[159] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and

D. Zagorodnov, “The Eucalyptus open-source cloud-computing system,” in

Proceedings of Cloud Computing and Its Applications, October 2008. [Online].

Available: http://eucalyptus.cs.ucsb.edu/wiki/Presentations

[160] C. Nvidia, “Programming guide,” 2008.

[161] J. E. Stone, D. Gohara, and G. Shi, “OpenCL: A parallel programming standard

for heterogeneous computing systems,” Computing in science & engineering,

vol. 12, no. 3, p. 66, 2010.

[162] S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, and K. Skadron,

“A performance study of general-purpose applications on graphics processors

using CUDA,” Journal of Parallel and Distributed Computing, vol. 68,

no. 10, pp. 1370 – 1380, 2008, k-means implementation on CUDA

with 72x speedup. Compares to 4-threaded CPU version with 30x

speedup. [Online]. Available: http://www.sciencedirect.com/science/article/

B6WKJ-4SVV8GS-2/2/f7a1dccceb63cbbfd25774c6628d8412

[163] Z. Liu and W. Ma, “Exploiting computing power on graphics processing unit,”

vol. 2, Dec. 2008, pp. 1062–1065.

[164] S. Hong and H. Kim, “An integrated GPU power and performance model,”

in ACM SIGARCH Computer Architecture News, vol. 38. ACM, 2010, pp.

280–289.

[165] P. Kogge, K. Bergman, S. Borkar, D. Campbell, W. Carson, W. Dally, M. Den-

neau, P. Franzon, W. Harrod, K. Hill et al., “Exascale computing study: Tech-

nology challenges in achieving exascale systems,” 2008.

[166] S. Crago, K. Dunn, P. Eads, L. Hochstein, D.-I. Kang, M. Kang, D. Mod-

ium, K. Singh, J. Suh, and J. P. Walters, “Heterogeneous cloud computing,” in

Proceedings of the Workshop on Parallel Programming on Accelerator Clusters

(PPAC). Cluster Computing (CLUSTER), 2011 IEEE International Confer-

ence on. IEEE, 2011, pp. 378–385.

[167] J. Sanders and E. Kandrot, CUDA by example: an introduction to general-

purpose GPU programming. Addison-Wesley Professional, 2010.

[168] V. V. Kindratenko, J. J. Enos, G. Shi, M. T. Showerman, G. W. Arnold, J. E.

Stone, J. C. Phillips, and W.-m. Hwu, “GPU clusters for high-performance

computing,” in Cluster Computing and Workshops, 2009. CLUSTER’09. IEEE

International Conference on. IEEE, 2009, pp. 1–8.

[169] K. J. Barker, K. Davis, A. Hoisie, D. J. Kerbyson, M. Lang, S. Pakin, and J. C.

Sancho, “Entering the petaflop era: the architecture and performance of Road-

runner,” in Proceedings of the 2008 ACM/IEEE conference on Supercomputing.

IEEE Press, 2008, p. 1.

[170] S. Craven and P. Athanas, “Examining the viability of FPGA supercomputing,”

EURASIP Journal on Embedded systems, vol. 2007, no. 1, pp. 13–13, 2007.

[171] G. Shainer, T. Liu, J. Layton, and J. Mora, “Scheduling strategies for HPC as a

service (HPCaaS),” in Cluster Computing and Workshops, 2009. CLUSTER’09.

IEEE International Conference on. IEEE, 2009, pp. 1–6.

[172] A. J. Younge, R. Henschel, J. T. Brown, G. von Laszewski, J. Qiu, and G. C.

Fox, “Analysis of Virtualization Technologies for High Performance Computing

Environments,” in Proceedings of the 4th International Conference on Cloud

Computing (CLOUD 2011). Washington, DC: IEEE, July 2011.

[173] W. Wade, “How NVIDIA and Citrix are driving the future of virtualized visual

computing,” [Online], http://blogs.nvidia.com/blog/2013/05/22/synergy/.

[174] S. Long, “Virtual machine graphics acceleration deployment guide,” VMWare,

Tech. Rep., 2013.

[175] G. Humphreys, M. Houston, R. Ng, R. Frank, S. Ahern, P. D. Kirchner, and

J. T. Klosowski, “Chromium: a stream-processing framework for interactive

rendering on clusters,” in ACM Transactions on Graphics (TOG), vol. 21.

ACM, 2002, pp. 693–702.

[176] G. Giunta, R. Montella, G. Agrillo, and G. Coviello, “A GPGPU

transparent virtualization component for high performance computing clouds,”

in Euro-Par 2010 - Parallel Processing, ser. Lecture Notes in Computer

Science, P. DAmbra, M. Guarracino, and D. Talia, Eds. Springer

Berlin Heidelberg, 2010, vol. 6271, pp. 379–391. [Online]. Available:

http://dx.doi.org/10.1007/978-3-642-15277-1 37

[177] K. Diab, M. Rafique, and M. Hefeeda, “Dynamic sharing of GPUs in cloud

systems,” in Parallel and Distributed Processing Symposium Workshops PhD

Forum (IPDPSW), 2013 IEEE 27th International, 2013, pp. 947–954.

[178] C.-T. Yang, H.-Y. Wang, and Y.-T. Liu, “Using pci pass-through for gpu vir-

tualization with cuda,” in Network and Parallel Computing. Springer, 2012,

pp. 445–452.

[179] A. Danalis, G. Marin, C. McCurdy, J. S. Meredith, P. C. Roth, K. Spafford,

V. Tipparaju, and J. S. Vetter, “The scalable heterogeneous computing (SHOC)

benchmark suite,” in Proceedings of the 3rd Workshop on General-Purpose

Computation on Graphics Processing Units. ACM, 2010, pp. 63–74.

[180] E. R. Hawkes, R. Sankaran, J. C. Sutherland, and J. H. Chen, “Direct numerical

simulation of turbulent combustion: fundamental insights towards predictive

models,” in Journal of Physics: Conference Series, vol. 16. IOP Publishing,

2005, p. 65.

[181] F. Silla, “rCUDA: share and aggregate GPUs in your cluster,” Nov. 2012, mel-

lanox Booth Presaentation.

[182] H. Jo, J. Jeong, M. Lee, and D. H. Choi, “Exploiting GPUs in virtual machine

for biocloud,” BioMed research international, vol. 2013, 2013.

[183] J. Duato, A. Pena, F. Silla, R. Mayo, and E. Quintana-Orti, “rCUDA: Re-

ducing the number of gpu-based accelerators in high performance clusters,”

in High Performance Computing and Simulation (HPCS), 2010 International

Conference on, June 2010, pp. 224–231.

[184] B. L. Jacob and T. N. Mudge, “A look at several memory management units,

TLB-refill mechanisms, and page table organizations,” in Proceedings of the

Eighth International Conference on Architectural Support for Programming

Languages and Operating Systems. ACM, 1998, pp. 295–306.

[185] B.-A. Yassour, M. Ben-Yehuda, and O. Wasserman, “Direct device assignment

for untrusted fully-virtualized virtual machines,” IBM Research, Tech. Rep.,

2008.

[186] M. Ben-Yehuda, J. Xenidis, M. Ostrowski, K. Rister, A. Bruemmer, and

L. Van Doorn, “The price of safety: Evaluating IOMMU performance,” in

Ottawa Linux Symposium, 2007.

[187] J. Liu, “Evaluating standard-based self-virtualizing devices: A performance

study on 10 GbE NICs with SR-IOV support,” in Parallel Distributed Processing

(IPDPS), 2010 IEEE International Symposium on, April 2010, pp. 1–12.

[188] V. Jujjuri, E. Van Hensbergen, A. Liguori, and B. Pulavarty, “VirtFS-a virtu-

alization aware file system passthrough,” in Ottawa Linux Symposium, 2010.

[189] C. Yang, H. Wang, W. Ou, Y. Liu, and C. Hsu, “On implementation of GPU

virtualization using PCI pass-through,” in 4th IEEE International Conference

on Cloud Computing Technology and Science Proceedings (CloudCom), 2012.

[190] M. Dowty and J. Sugerman, “GPU virtualization on VMware’s hosted I/O

architecture,” ACM SIGOPS Operating Systems Review, vol. 43, no. 3, pp. 73

– 82, 2009.

[191] “FutureGrid,” Web page. [Online]. Available: http://portal.futuregrid.org

[192] VMWare, “Timekeeping in vmware virtual machines,” VMWare Corporation,

Tech. Rep., 2011.

[193] A. Danalis, G. Marin, C. McCurdy, J. S. Meredith, P. C. Roth, K. Spaf-

ford, V. Tipparaju, and J. S. Vetter, “The Scalable Heterogeneous Computing

(SHOC) benchmark suite,” in Proceedings of the 3rd Workshop on General-

Purpose Computation on Graphics Processing Units, ser. GPGPU ’10. ACM,

2010, pp. 63–74.

[194] “LAMMPS molecular dynamics simulator,” http://lammps.sandia.gov/, [On-

line; accessed Jan. 2, 2014].

[195] A. Athanasopoulos, A. Dimou, V. Mezaris, and I. Kompatsiaris, “GPU acce-

leration for support vector machines,” in Proc 12th International Workshop

on Image Analysis for Multimedia Interactive Services (WIAMIS 2001), April

2011.

[196] “LULESH shock hydrodynamics application,” https://codesign.llnl.gov/lulesh.

php, [Online; accessed Jan. 2, 2014].

[197] S. Plimpton, “Fast parallel algorithms for short-range molecular dynamics,”

Journal of Computational Physics, vol. 117, no. 1, pp. 1 – 19, 1995.

[198] W. M. Brown, “GPU acceleration in LAMMPS,” http://lammps.sandia.gov/

workshops/Aug11/Brown/brown11.pdf, [Online; accessed Jan. 2, 2014].

[199] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,”

ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, pp. 27:1–27:27, May 2011.

[200] “Hydrodynamics Challenge Problem,” https://codesign.llnl.gov/pdfs/spec-7.

pdf, [Online; accessed Jan. 2, 2014].

[201] A. Arcangeli, “Transparent hugepage support,” http://www.linux-kvm.org/

wiki/images/9/9e/2010-forum-thp.pdf, 2010, [Online; accessed Jan. 29, 2014].

[202] J. Jose, M. Li, X. Lu, K. C. Kandalla, M. D. Arnold, and D. K. Panda, “SR-IOV

support for virtualization on infiniband clusters: Early experience,” in Cluster

Computing and the Grid, IEEE International Symposium on. IEEE Computer

Society, 2013, pp. 385–392.

[203] R. L. Moore, C. Baru, D. Baxter, G. C. Fox, A. Majumdar, P. Papadopou-

los, W. Pfeiffer, R. S. Sinkovits, S. Strande, M. Tatineni et al., “Gateways to

discovery: Cyberinfrastructure for the long tail of science,” in Proceedings of

the 2014 Annual Conference on Extreme Science and Engineering Discovery

Environment. ACM, 2014, p. 39.

[204] P. Luszczek, E. Meek, S. Moore, D. Terpstra, V. M. Weaver, and J. Dongarra,

“Evaluation of the hpc challenge benchmarks in virtualized environments,” in

Proceedings of the 2011 International Conference on Parallel Processing - Vol-

ume 2, ser. Euro-Par’11. Berlin, Heidelberg: Springer-Verlag, 2012, pp. 436–

445.

[205] N. Huber, M. von Quast, M. Hauck, and S. Kounev, “Evaluating and model-

ing virtualization performance overhead for cloud environments.” in CLOSER,

2011, pp. 563–573.

[206] J. P. Walters, A. J. Younge, D.-I. Kang, K.-T. Yao, M. Kang, S. P. Crago,

and G. C. Fox, “GPU-Passthrough Performance: A Comparison of KVM, Xen,

VMWare ESXi, and LXC for CUDA and OpenCL Applications,” in Proceedings

of the 7th IEEE International Conference on Cloud Computing (CLOUD 2014).

Anchorage, AK: IEEE, 2014.

[207] J. Jose, M. Li, X. Lu, K. C. Kandalla, M. D. Arnold, and D. K. Panda, “Sr-iov

support for virtualization on infiniband clusters: Early experience,” in Cluster,

Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International

Symposium on. IEEE, 2013, pp. 385–392.

[208] M. Musleh, V. Pai, J. P. Walters, A. J. Younge, and S. P. Crago, “Bridging

the Virtualization Performance Gap for HPC using SR-IOV for InfiniBand,”

in Proceedings of the 7th IEEE International Conference on Cloud Computing

(CLOUD 2014). Anchorage, AK: IEEE, 2014.

[209] “Aws high performance computing,” http://aws.amazon.com/hpc/, last Access

Nov. 2014.

[210] “Openstack flavors,” http://docs.openstack.org/openstack-ops/content/

flavors.html, last Access Nov. 2014.

[211] K. Pepple, Deploying OpenStack. O’Reilly Media, Inc., 2011.

[212] S. Hazelhurst, “Scientific computing using virtual high-performance computing: a case study using the Amazon elastic computing cloud,” in Proceedings of the 2008 annual research conference of the South African Institute of Computer Scientists and Information Technologists on IT research in developing countries: riding the wave of technology. ACM, 2008, pp. 94–103.

[213] E. Amazon, “Amazon Elastic Compute Cloud (Amazon EC2),” 2010.

[214] R. Jennings, Cloud Computing with the Windows Azure Platform. John Wiley & Sons, 2010.

[215] “Google Cloud Platform,” https://cloud.google.com/, [Online; accessed Nov. 2014].

[216] L. Ramakrishnan, R. S. Canon, K. Muriki, I. Sakrejda, and N. J. Wright, “Evaluating interconnect and virtualization performance for high performance computing,” SIGMETRICS Perform. Eval. Rev., vol. 40, no. 2, pp. 55–60, Oct. 2012. [Online]. Available: http://doi.acm.org/10.1145/2381056.2381071

[217] M. Righini, “Enabling Intel virtualization technology features and benefits,” Intel Corporation, Tech. Rep., 2010.

[218] AMD, “AMD I/O virtualization technology (IOMMU) specification,” AMD Corporation, Tech. Rep., 2009.

[219] ARM Limited, “ARM system memory management unit architecture specification,” Tech. Rep., 2013.

[220] A. J. Younge, J. P. Walters, S. Crago, and G. C. Fox, “Evaluating GPU Passthrough in Xen for High Performance Cloud Computing,” in High-Performance Grid and Cloud Computing Workshop at the 28th IEEE International Parallel and Distributed Processing Symposium. Phoenix, AZ: IEEE, May 2014.

[221] L. Vu, H. Sivaraman, and R. Bidarkar, “GPU virtualization for high performance general purpose computing on the ESX hypervisor,” in Proceedings of the High Performance Computing Symposium, ser. HPC ’14. San Diego, CA, USA: Society for Computer Simulation International, 2014, pp. 2:1–2:8. [Online]. Available: http://dl.acm.org/citation.cfm?id=2663510.2663512

[222] J. Liu, “Evaluating standard-based self-virtualizing devices: A performance study on 10 GbE NICs with SR-IOV support,” in Parallel Distributed Processing (IPDPS), 2010 IEEE International Symposium on, April 2010, pp. 1–12.

[223] T. P. P. D. L. Ruivo, G. B. Altayo, G. Garzoglio, S. Timm, H. Kim, S.-Y. Noh, and I. Raicu, “Exploring InfiniBand hardware virtualization in OpenNebula towards efficient high-performance computing,” in CCGRID, 2014, pp. 943–948.

[224] “NVIDIA GPUDirect,” https://developer.nvidia.com/gpudirect, [Online; accessed Nov. 24, 2014].

[225] G. Shainer, A. Ayoub, P. Lui, T. Liu, M. Kagan, C. R. Trott, G. Scantlen, and P. S. Crozier, “The development of Mellanox/NVIDIA GPUDirect over InfiniBand: a new model for GPU to GPU communications,” Computer Science - Research and Development, vol. 26, no. 3-4, pp. 267–273, 2011.

[226] “Getting Xen working for Intel(R) Xeon Phi(tm) Coprocessor,” https://software.intel.com/en-us/articles/getting-xen-working-for-intelr-xeon-phitm-coprocessor, [Online; accessed Nov. 24, 2014].

[227] “Mellanox Neutron Plugin,” https://wiki.openstack.org/wiki/Mellanox-Neutron, [Online; accessed Nov. 24, 2014].

[228] S. Plimpton, P. Crozier, and A. Thompson, “LAMMPS: large-scale atomic/molecular massively parallel simulator,” Sandia National Laboratories, 2007.

[229] J. Anderson, A. Keys, C. Phillips, T. Dac Nguyen, and S. Glotzer, “HOOMD-blue, general-purpose many-body dynamics on the GPU,” in APS Meeting Abstracts, vol. 1, 2010, p. 18008.

[230] K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, W. L. Plishker, J. Shalf, S. W. Williams et al., “The landscape of parallel computing research: A view from Berkeley,” EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2006-183, 2006.

[231] G. Fox, G. von Laszewski, J. Diaz, K. Keahey, J. Fortes, R. Figueiredo, S. Smallen, W. Smith, and A. Grimshaw, “FutureGrid: a reconfigurable testbed for cloud, HPC and grid computing,” in Contemporary High Performance Computing: From Petascale toward Exascale, Computational Science. Chapman and Hall/CRC, 2013.

[232] K. Keahey, J. Mambretti, D. K. Panda, P. Rad, W. Smith, and D. Stanzione, “NSF Chameleon Cloud,” website, November 2014. [Online]. Available: http://www.chameleoncloud.org/

[233] D. Abramson, J. Jackson, S. Muthrasanallur, G. Neiger, G. Regnier, R. Sankaran, I. Schoinas, R. Uhlig, B. Vembu, and J. Wiegert, “Intel virtualization technology for directed I/O,” Intel Technology Journal, vol. 10, no. 3, 2006.

[234] F. Silla, “rCUDA: towards energy-efficiency in GPU computing by leveraging low-power processors and InfiniBand interconnects,” in HPC Advisory Council Spain Conference, 2013.

[235] J. P. Walters, “Achieving Near-Native GPU Performance in the Cloud,” in Nvidia GPU Technology Conference, 2015.

[236] Y. Luo, “Network I/O virtualization for cloud computing,” IT Professional Magazine, vol. 12, no. 5, p. 36, 2010.

[237] P. Kutch, “PCI-SIG SR-IOV primer: An introduction to SR-IOV technology,” Intel Application Note 321211-002, 2011.

[238] M. Rosenblum and T. Garfinkel, “Virtual machine monitors: Current technology and future trends,” Computer, vol. 38, no. 5, pp. 39–47, 2005.

[239] R. Bhargava, B. Serebrin, F. Spadini, and S. Manne, “Accelerating two-dimensional page walks for virtualized systems,” in ACM SIGARCH Computer Architecture News, vol. 36, no. 1. ACM, 2008, pp. 26–35.

[240] B. Pham, J. Vesely, G. H. Loh, and A. Bhattacharjee, “Using TLB speculation to overcome page splintering in virtual machines,” Rutgers University, Tech. Rep. DCS-TR-713, 2015.

[241] L. Lamport, “Time, clocks, and the ordering of events in a distributed system,” Communications of the ACM, vol. 21, no. 7, pp. 558–565, 1978.

[242] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield, “Live migration of virtual machines,” in Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation - Volume 2. USENIX Association, 2005, pp. 273–286.

[243] M. R. Hines and K. Gopalan, “Post-copy based live virtual machine migration using adaptive pre-paging and dynamic self-ballooning,” in Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments. ACM, 2009, pp. 51–60.

[244] P. Lu, A. Barbalace, and B. Ravindran, “HSG-LM: Hybrid-copy speculative guest OS live migration without hypervisor,” in Proceedings of the 6th International Systems and Storage Conference, ser. SYSTOR ’13. New York, NY, USA: ACM, 2013, pp. 2:1–2:11. [Online]. Available: http://doi.acm.org/10.1145/2485732.2485736

[245] J. Chu and V. Kashyap, “Transmission of IP over InfiniBand (IPoIB),” IETF, Tech. Rep., 2006.

[246] W. Yu, N. S. Rao, P. Wyckoff, and J. S. Vetter, “Performance of RDMA-capable storage protocols on wide-area network,” in 2008 3rd Petascale Data Storage Workshop. IEEE, 2008, pp. 1–5.

[247] W. Huang, Q. Gao, J. Liu, and D. K. Panda, “High performance virtual machine migration with RDMA over modern interconnects,” in 2007 IEEE International Conference on Cluster Computing. IEEE, 2007, pp. 11–20.

[248] D. Gilbert, “Post copy live migration,” Webpage, 2015. [Online]. Available: http://wiki.qemu.org/Features/PostCopyLiveMigration

[249] M. S. Birrittella, M. Debbage, R. Huggahalli, J. Kunz, T. Lovett, T. Rimmer, K. D. Underwood, and R. C. Zak, “Intel Omni-Path Architecture: Enabling scalable, high performance fabrics,” in 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects, Aug 2015, pp. 1–9.

[250] M. Beck and M. Kagan, “Performance evaluation of the RDMA over Ethernet (RoCE) standard in enterprise data centers infrastructure,” in Proceedings of the 3rd Workshop on Data Center-Converged and Virtual Ethernet Switching. International Teletraffic Congress, 2011, pp. 9–15.

[251] E. Kissel and M. Swany, “Photon: Remote memory access middleware for high-performance runtime systems,” in 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2016, pp. 1736–1743.

[252] J. Lofstead, I. Jimenez, and C. Maltzahn, “Consistency and fault tolerance considerations for the next iteration of the DOE Fast Forward storage and IO project,” in 2014 43rd International Conference on Parallel Processing Workshops, Sept 2014, pp. 61–69.

[253] C. G. Wright Jr, “Trinity burst buffer: architecture and design,” Los Alamos National Laboratory (LANL), Tech. Rep., 2015.

[254] F. B. Schmuck and R. L. Haskin, “GPFS: A shared-disk file system for large computing clusters,” in FAST, vol. 2, 2002, pp. 231–244.

[255] M. J. Rashti and A. Afsahi, “10-Gigabit iWARP Ethernet: comparative performance analysis with InfiniBand and Myrinet-10G,” in 2007 IEEE International Parallel and Distributed Processing Symposium. IEEE, 2007, pp. 1–8.

[256] P. Duclos, F. Boeri, M. Auguin, and G. Giraudon, “Image processing on a SIMD/SPMD architecture: OPSILA,” in Pattern Recognition, 1988, 9th International Conference on, Nov 1988, pp. 430–433 vol. 1.

[257] H. A. Lagar-Cavilla, J. A. Whitney, A. M. Scannell, P. Patchin, S. M. Rumble, E. De Lara, M. Brudno, and M. Satyanarayanan, “SnowFlock: rapid virtual machine cloning for cloud computing,” in Proceedings of the 4th ACM European conference on Computer systems. ACM, 2009, pp. 1–12.

[258] H. A. Lagar-Cavilla, J. A. Whitney, R. Bryant, P. Patchin, M. Brudno, E. de Lara, S. M. Rumble, M. Satyanarayanan, and A. Scannell, “SnowFlock: Virtual machine cloning as a first-class cloud primitive,” ACM Transactions on Computer Systems (TOCS), vol. 29, no. 1, p. 2, 2011.

[259] A. J. Younge, G. von Laszewski, L. Wang, and G. C. Fox, “Providing a Green Framework for Cloud Based Data Centers,” in The Handbook of Energy-Aware Green Computing, I. Ahmad and S. Ranka, Eds. Chapman and Hall/CRC Press, 2011, ch. 17, in press.

[260] D. P. Berrange, “OpenStack performance optimization,” in KVM Forum 2014, 2014.

[261] Y. Jiang and Y. He, “OpenStack PCI passthrough,” Webpage, Intel, Inc., 2016. [Online]. Available: https://wiki.openstack.org/wiki/Pci_passthrough

[262] A. Hanemann, J. W. Boote, E. L. Boyd, J. Durand, L. Kudarimoti, R. Lapacz, D. M. Swany, S. Trocha, and J. Zurawski, “perfSONAR: A service oriented architecture for multi-domain network monitoring,” in International Conference on Service-Oriented Computing. Springer, 2005, pp. 241–254.

[263] D. Serrano, S. Bouchenak, Y. Kouki, T. Ledoux, J. Lejeune, J. Sopena, L. Arantes, and P. Sens, “Towards QoS-oriented SLA guarantees for online cloud services,” in Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on. IEEE, 2013, pp. 50–57.

[264] J. R. Lange, K. Pedretti, P. Dinda, P. G. Bridges, C. Bae, P. Soltero, and A. Merritt, “Minimal-overhead virtualization of a large scale supercomputer,” in ACM SIGPLAN Notices, vol. 46, no. 7. ACM, 2011, pp. 169–180.

[265] N. Agarwal, D. Nellans, M. O’Connor, S. W. Keckler, and T. F. Wenisch, “Unlocking bandwidth for GPUs in CC-NUMA systems,” in 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA). IEEE, 2015, pp. 354–365.

[266] K. S. Hemmert, M. W. Glass, S. D. Hammond, R. Hoekstra, M. Rajan, S. Dawson, M. Vigil, D. Grunau, J. Lujan, D. Morton et al., “Trinity: Architecture and early experience,” in Cray Users Group, 2016.