Virtualization Techniques for Memory Resource Exploitation

Luis Angel Garrido Platero



WARNING On having consulted this thesis you are accepting the following use conditions: Spreading this thesis by the institutional repository UPCommons (http://upcommons.upc.edu/tesis) and the cooperative repository TDX (http://www.tdx.cat/?locale-attribute=en) has been authorized by the titular of the intellectual property rights only for private uses placed in investigation and teaching activities. Reproduction with lucrative aims is not authorized, nor is its spreading and availability from a site foreign to the UPCommons service. Introducing its content in a window or frame foreign to the UPCommons service is not authorized (framing). These rights affect both the presentation summary of the thesis and its contents. When using or citing parts of the thesis, the name of the author must be indicated.

Luis Angel Garrido Platero

Advisor: Paul Carpenter
Co-advisor: Rosa María Badía

Computer Architecture Department
Universitat Politècnica de Catalunya

A thesis submitted for the degree of Doctor of Philosophy

Abstract

Cloud infrastructures have become indispensable in our daily lives with the rise of cloud-based services offered by companies like Facebook, Google, Amazon and many others. The e-mails we receive every day, the files and pictures we upload/view to/from social media and all the different marketplaces through which we make purchases daily rely in some way on services offered by a cloud infrastructure somewhere.

These cloud infrastructures use a large number of servers provisioned with their own computing resources. Each of these servers uses a piece of software, called the hypervisor ("HV", for short), that allows them to create multiple virtual instances of a subset of their physical computing resources and abstract them into "Virtual Machines" (VMs). The hypervisor multiplexes among the VMs some of the physical resources of the server, like CPU and I/O interfaces. But other resources, like memory, are allocated, because they cannot be multiplexed over time in the same way. A VM for cloud-based services runs an Operating System (OS), which in turn runs the applications used by the service consumers. The administrator of the cloud infrastructure is responsible for managing the VMs. The management tasks involve determining the resources that the VMs can use, in which physical node they can execute and any other constraint the administrator deems reasonable to enforce, such as Service Level Agreements (SLA) [50, 22].

The VMs within the computing nodes generate varying memory demand behavior. When memory utilization reaches its limits due to this demand, costly operations such as (virtual) disk accesses and/or VM migrations can occur. As a result, it is necessary to optimize the utilization of the local memory resources within a single computing node. However, pressure on the memory resources can still increase, making it necessary to move (migrate) the VM to a different node with larger memory or add more memory to the same node. At this point, it is important to consider that some of the nodes in the cloud infrastructure might have memory resources that they are not using. This opens up the possibility of making this memory available to the nodes that actually need it, avoiding costly migrations or hardware memory enhancements.

Considering the possibility of having a node's memory made available to the rest of the nodes in the infrastructure, new architectures have been introduced that provide hardware support for a shared global address space across multiple computing nodes. Together with fast interconnects, these architectures enable the sharing of memory resources across computing nodes while enforcing coherence locally within nodes. In this context, every node is a coherence island and the memory is a global resource.

This thesis presents multiple contributions to this problem. First, it addresses the problem of optimizing memory resources in a virtualized node through different types of memory abstractions. Two full contributions are presented for managing memory within a single node, called SmarTmem and CARLEMM (Continuous-Action Reinforcement Learning for Memory Management). In this respect, a third contribution is also presented, called CAVMem (Continuous-Action Algorithm for Virtualized Memory Management), that works as the foundation for CARLEMM. SmarTmem is designed to optimize the memory made available to the VMs through the Transcendent Memory (Tmem) abstraction, while CARLEMM targets the optimization of memory made available through ballooning. The initiative behind CARLEMM is initially based on a proof-of-concept mechanism called CAVMem that provided insight on the use of Continuous-Action Reinforcement Learning (RL) algorithms for memory management. To the best of our knowledge, CARLEMM and CAVMem are the first mechanisms to employ Continuous-Action RL for the problem of memory management within a virtualized node.

Second, this thesis presents two contributions for memory capacity aggregation across multiple nodes, offering two mechanisms called GV-Tmem (Globally Visible Transcendent Memory) and vMCA (Virtualized Memory Capacity Aggregation), the latter being based on GV-Tmem but with significant enhancements. These mechanisms focus primarily on distributing the memory pool of the infrastructure across the computing nodes using a user-space process with high-level memory management policies.

To summarize, the main contributions of this thesis work are:

• A mechanism called SmarTmem for the optimal allocation of Tmem in a single virtualized node.

• A mechanism called CAVMem (Continuous-Action Algorithm for Virtualized Memory Management), developed as a proof-of-concept for the use of Continuous-Action Reinforcement Learning for RAM allocation in a simulation of a virtualized node.

• A mechanism called CARLEMM (Continuous-Action Reinforcement Learning for Memory Management), based on the proof-of-concept provided by CAVMem, but enhanced to solve the memory allocation problem in a real computing node.

• A mechanism called GV-Tmem (Globally Visible Tmem) for remote memory capacity sharing across multiple computing nodes in a cloud infrastructure.

• A mechanism called vMCA (virtualized Memory Capacity Aggregation) for the sharing of memory capacity across nodes, enhanced for resiliency and efficiency in comparison to GV-Tmem.

Acknowledgements

I would like to thank my advisor, Paul Carpenter, who throughout this time has transmitted to me a lot of knowledge about our research field and about research methodology. I would like to thank him for his time and dedication to our work. I would also like to thank Rosa Badia, my co-advisor, for being as accessible as Paul during all this time. I would also like to thank Rajiv Nishtala for his contributions during the development of this thesis, and for his thoughtful advice.

This research has received funding from the European Union's 7th Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver) and the European Union's Horizon 2020 research and innovation programme under grant agreement no 754337 (EuroEXA). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contract TIN2012-34557, the European Union HiPEAC-3 Network of Excellence (ICT-287759), and the FI-DGR Grant Program (file number 2016FI B 00947) of the Government of Catalonia.

Contents

1 Introduction
  1.1 Cloud Computing in Context
  1.2 Introduction to Virtualization
    1.2.1 Antecedents
    1.2.2 Fundamentals of Virtualization
  1.3 Memory Management using Virtualization
    1.3.1 Memory Hotplug
    1.3.2 Memory Ballooning
    1.3.3 Transcendent Memory
  1.4 The Memory Management Problem with Virtualization and State-of-the-Art Solutions
    1.4.1 Memory Management in a Virtualized Node
    1.4.2 Memory Management across Virtualized Nodes
  1.5 Contributions and publications
    1.5.1 Thesis outline and Timeline

2 SmarTmem: Tmem Management in a Virtualized Node
  2.1 Introduction
  2.2 SmarTmem: memory management policies and architecture
    2.2.1 SmarTmem Architecture
    2.2.2 HV support for SmarTmem
    2.2.3 Tmem Kernel Module (TKM)
    2.2.4 Memory Manager Process for Tmem Allocation in a Virtualized Computing Node
    2.2.5 High-Level Tmem Management Policies
  2.3 Benchmarking and Experimental Framework
  2.4 Results and Discussion for SmarTmem
    2.4.1 Results for Scenario 1
    2.4.2 Results for Scenario 2
    2.4.3 Results for the Usemem Scenario
    2.4.4 Results for Scenario 3
  2.5 Related Work
  2.6 Conclusions and Future Work


3 Using continuous action Reinforcement Learning for Dynamic RAM allocation
  3.1 Introducing CAVMem
  3.2 The Context for CAVMem: Memory Management and Reinforcement Learning
    3.2.1 Memory Management in Virtualized Nodes
    3.2.2 Reinforcement Learning Concepts
    3.2.3 Employing DDPG for Memory Capacity Management in Virtualized Environments
  3.3 CAVMem: Algorithm for Virtualized Memory Management
    3.3.1 Decentralized Strategy for Memory Management
    3.3.2 Formulating the problem as an MDP
  3.4 Experimental Framework
  3.5 Results for Evaluation
    3.5.1 Results for Scenario 1
    3.5.2 Results for Scenario 2
    3.5.3 Results for Scenario 3
    3.5.4 Results for Scenario 4
    3.5.5 Results for Scenario 5
    3.5.6 Results for Scenario 6
    3.5.7 Results for Scenario 7
    3.5.8 Results for Scenario 8
    3.5.9 Results for Scenario 9
    3.5.10 Results for Scenario 10
    3.5.11 Results for Scenario 11
    3.5.12 Discussion
  3.6 Related Work
  3.7 Conclusions for this Chapter and Future Work

4 Deploying Reinforcement Learning in a Virtualized Node for Dynamic Memory Management
  4.1 Introduction to CARLEMM
  4.2 Understanding the Context for CARLEMM
  4.3 CARLEMM: Continuous Action Reinforcement Learning for Memory Management
    4.3.1 Formulating the Memory Management Problem in a Virtualized Node as an MDP
  4.4 Environment Constraints for Decentralized Control
    4.4.1 Pre-processing Bids for Memory
    4.4.2 Heuristic for Assisted Learning
  4.5 The Control Flow for CARLEMM
  4.6 Software Stack for CARLEMM
  4.7 Experimental Framework for CARLEMM
  4.8 Experimental Results for CARLEMM
    4.8.1 Experiments without Using HAL
    4.8.2 Experiments Using HAL
  4.9 Related Work
  4.10 Conclusions for this Chapter and Future Work

5 Aggregating and Managing Memory Across Computing Nodes in Cloud Environments
  5.1 Introduction to GV-Tmem
  5.2 Background for Remote Memory Aggregation
    5.2.1 Virtualization In Cloud Computing Infrastructures
    5.2.2 Hardware Support in Virtualized Nodes for Remote Memory Aggregation
  5.3 GV-Tmem Design
    5.3.1 Hypervisor with Extensions for GV-Tmem
    5.3.2 Tmem Kernel Module (TKM) for GV-Tmem
    5.3.3 Dom0 User-space Memory Manager for GV-Tmem
    5.3.4 Hardware Support for Memory Aggregation
    5.3.5 Putting Everything Together
  5.4 Experimental Methodology
  5.5 Results
    5.5.1 Results for Scenario 1
    5.5.2 Results for Scenario 2: the Usemem Scenario
    5.5.3 Results for Scenario 3
  5.6 Related Work
  5.7 Conclusions for this Chapter and Future Work

6 vMCA: Memory Capacity Aggregation and Management in Cloud Environments
  6.1 Introduction
  6.2 Design of vMCA
    6.2.1 Hypervisor Support for vMCA
    6.2.2 Dom0 Tmem Kernel Module (TKM)
    6.2.3 Dom0 User-space Memory Manager
    6.2.4 Memory Management Policies
    6.2.5 Hardware Support for Memory Aggregation
  6.3 Experimental Methodology
  6.4 Results
    6.4.1 Results for Scenario 1
    6.4.2 Results for Scenario 2
    6.4.3 Results for Scenario 3
    6.4.4 Results for Scenario 4
  6.5 Related Work
  6.6 Conclusions and Future Work

7 Conclusions

Glossary

Bibliography

Index


List of Figures

1.1 Yearly Cloud Market Growth from 2009 until 2018, and estimations of growth from 2019 until 2022. Source of data: [1].
1.2 Server stack for the cases in which a) there's no virtualization layer, and b) with a virtualization layer.
1.3 Address translation process in a virtualized node.
1.4 Graphic illustration for Memory Ballooning, when (a) the free list of the OS gets populated, and (b) when the balloon driver inflates.
1.5 Graphic illustration of Tmem.
1.6 Thesis structure

2.1 SmarTmem architecture
2.2 SmarTmem: Running times for Scenario 1.
2.3 SmarTmem: Utilization of the Tmem capacity (nod-tmem) by every VM in number of pages for Scenario 1.
2.4 SmarTmem: Running times for Scenario 2.
2.5 SmarTmem: Tmem use of all VMs in Scenario 2.
2.6 SmarTmem: Running times for the usemem scenario.
2.7 SmarTmem: Tmem use of all VMs in usemem.
2.8 SmarTmem: Running times for Scenario 3.
2.9 SmarTmem: Tmem use of all VMs in Scenario 3.

3.1 Diagram of the actor–critic architecture for the DDPG.
3.2 Diagram of the DDPG learning agents, one per VM.
3.3 CAVMem: Average Miss Rate for Scenario 1
3.4 CAVMem: Average Miss Rate Deviation for Scenario 1
3.5 CAVMem: Overhead for each agent in Scenario 1
3.6 CAVMem: Average Miss Rate for Scenario 2
3.7 CAVMem: Average Miss Rate Deviation for Scenario 2. Less is better
3.8 CAVMem: Average Overhead (in seconds) for each agent in Scenario 2.
3.9 CAVMem: Average Miss Rate for Scenario 3
3.10 CAVMem: Average Miss Rate Deviation for Scenario 3
3.11 CAVMem: Overhead for each agent in Scenario 3
3.12 CAVMem: Average Miss Rate for Scenario 4
3.13 CAVMem: Average Miss Rate Deviation for Scenario 4
3.14 CAVMem: Overhead for each agent in Scenario 4


3.15 CAVMem: Average Miss Rate for Scenario 5.
3.16 CAVMem: Average Miss Rate Deviation for Scenario 5.
3.17 CAVMem: Overhead for each agent in Scenario 5
3.18 CAVMem: Average Miss Rate for Scenario 6.
3.19 CAVMem: Average Miss Rate Deviation for Scenario 6.
3.20 CAVMem: Overhead for each agent in Scenario 6
3.21 CAVMem: Average Miss Rate for Scenario 7.
3.22 CAVMem: Average Miss Rate Deviation for Scenario 7.
3.23 CAVMem: Overhead for each agent in Scenario 7
3.24 CAVMem: Average Miss Rate for Scenario 8.
3.25 CAVMem: Average Miss Rate Deviation for Scenario 8.
3.26 CAVMem: Overhead for each agent in Scenario 8
3.27 CAVMem: Average Miss Rate for Scenario 9.
3.28 CAVMem: Average Miss Rate Deviation for Scenario 9.
3.29 CAVMem: Overhead for each agent in Scenario 9
3.30 CAVMem: Average Miss Rate for Scenario 10.
3.31 CAVMem: Average Miss Rate Deviation for Scenario 10.
3.32 CAVMem: Overhead for each agent in Scenario 10
3.33 CAVMem: Average Miss Rate for Scenario 11.
3.34 CAVMem: Average Miss Rate Deviation for Scenario 11.
3.35 CAVMem: Overhead for each agent in Scenario 11

4.1 Visualizing the noise for the swap out rate measured from a VM with 1GB of RAM running an application that needs 1.2GB of memory (above), and the Percentage of Time for the same VM and the same application (below).
4.2 Plotting the relationship between memory allocations in MB and the Percentage of Time spent by the processor of a VM executing user space processes.
4.3 Plot of Equation 4.1 showing the reward as a function of the user space time percentage and memory bid (both values normalized) of VMj for β = 1.0, k = 1.0.
4.4 Diagram of the DDPG learning agents in the Memory Management Problem, with one learning agent per VM.
4.5 Diagram of the DDPG learning agents in a virtualized node deployed to manage memory, where each VMj is bound to one learning agent.
4.6 Diagram of the software stack of CARLEMM in a virtualized computing node.
4.7 Visualizing the behavior of the Percentage of User Space Time, memory allocation and the Reward at each Timestep according to Equation 4.1, for a VM running CMS over the last 250 episodes.
4.8 Histograms for the Percentage of User Space Time without HAL, sampled at 500-episode intervals from the beginning until the end of the execution.
4.9 Analyzing the temporal response of the Percentage of User Space Time to variations in Memory Allocations.


4.10 Visualizing the behavior of the Percentage of User Space Time, memory allocation and the Reward at each Timestep according to Equation 4.1, for a VM running CMS over the last 250 episodes with HAL enabled.
4.11 Histograms for the Percentage of User Space Time with HAL enabled, sampled at 50-episode intervals from the beginning until the end of the execution.

5.1 UNIMEM architecture with two nodes, i.e. two coherence islands
5.2 A diagram of the architecture of a GV-Tmem system fully deployed.
5.3 GV-Tmem: Running time for Scenario 1 in nodes 1 and 3
5.4 GV-Tmem: Tmem capacity (nod-tmem) obtained by every VM in node 3 for Scenario 1.
5.5 GV-Tmem: Running time for Scenario 2 for nodes 1 and 3.
5.6 GV-Tmem: Tmem capacity (nod-tmem) obtained by every VM in node 3 for Scenario 2.
5.7 GV-Tmem: Running time for Scenario 3 for nodes 1 and 3

6.1 vMCA: Node state transition diagram (at MM–M)
6.2 vMCA: Running time and Tmem capacity for Scenario 1
6.3 vMCA: Running time and Tmem capacity for Scenario 2
6.4 vMCA: Running time and Tmem capacity for Scenario 3
6.5 vMCA: Running time and Tmem capacity for Scenario 4


List of Tables

1.1 Thesis timeline

2.1 Memory statistics used in SmarTmem
2.2 SmarTmem: List of scenarios used for benchmarking

3.1 State inputs to the actor and critic networks of the DDPG learning agents.
3.2 CAVMem: List of scenarios used for evaluation

4.1 State inputs to the actor and critic networks of the DDPG learning agents.
4.2 Parameters for the neural networks of the actor and critic in every learning agent.

5.1 GV-Tmem: MM message types
5.2 Hardware characteristics of the nodes used to evaluate GV-Tmem
5.3 List of scenarios used to evaluate GV-Tmem

6.1 Summary of memory statistics used in the MM for vMCA
6.2 vMCA: MM message types
6.3 Hardware characteristics of the nodes used to evaluate vMCA.
6.4 List of scenarios used to evaluate vMCA.


Chapter 1

Introduction

This thesis starts off by explaining what Cloud Computing is, how it came to be and its current state of the art. The purpose of this chapter is to give the reader a clear vision of the context in which this research was developed, and the background to comprehend the scope of our contributions, which mainly focus on the problem of memory management.

Cloud Computing is becoming increasingly relevant, despite its ups and downs on the market and the over-hype that always comes when new technologies reach mass adoption. It is a market that changes rapidly, as new services come and go and as new capabilities get exploited. However, the fundamental technologies that gave rise to Cloud Computing have been around for decades, and they have steadily evolved to produce the computing landscape we see today.

In this research, we try to look beneath the surface without losing sight of it: we try to look past the hype of the markets and dive deeper into the fundamentals of the technology. This is why we use terms like "Virtualization Techniques", "Coherence Islands" and "Memory Resource", terminology that is very unlikely to appear in a presentation from a motivational speaker or a marketing specialist. But using this terminology already gives the reader an idea of the issues we are addressing and how they relate to the underlying technology that enables Cloud Computing.


1.1 Cloud Computing in Context

Cloud Computing infrastructures have become indispensable in our daily lives with the rise of cloud-based services offered by companies like Facebook, Google, Amazon and many others. The e-mails and text messages we receive every day, the files and pictures we upload/view to/from social media and all the different marketplaces through which we make purchases daily rely in some way on services offered by a cloud infrastructure. Figure 1.1 shows how the cloud computing services market has grown from 2009 until 2018, together with growth estimations from 2019 until 2022.

Figure 1.1: Yearly Cloud Market Growth from 2009 until 2018, and estimations of growth from 2019 until 2022. Source of data: [1].

The Cloud Computing market has been growing steadily for the last 10 years, and it is expected to continue growing in the years to come. Cloud Computing services provide many advantages, which basically translate into a cost reduction for online service providers and a better user experience for their consumers. It also makes the management of these services and their data much easier, since it allows for a higher level of abstraction and automation.

But the questions remain: what is a Cloud? What is a Cloud Infrastructure? Basically, a cloud infrastructure is a group of computer servers (nodes) that pool together their hardware resources (CPU, memory, storage, communication interfaces, etc.) to execute applications on behalf of the users in a transparent way. The applications are executed as if they were running on a single computer with monolithic computing resources.

In the cloud infrastructure, every node has a software stack that provides the necessary software support, not just for the pooling of resources, but also to execute the applications required by the customers/users. This software stack (also called server stack) has some similarities with the software stacks used for nodes in other types of distributed systems, with one fundamental difference: the presence of the Virtualization Layer. Figure 1.2 shows a common software stack for a computing node in both cases.



Figure 1.2: Server stack for the cases in which a) there’s no virtualization layer, and b) with a virtualization layer.

When a node is not virtualized (Figure 1.2(a)), every application executed could potentially access all of the resources of the node. There are ways to manage the resource allocation among the applications from within the Operating System (OS), but such an environment does not provide the same runtime flexibility and security that a virtualized node (Figure 1.2(b)) can. As will be explained in Section 1.2.2, when the node is virtualized, its resources can be packed together into Virtual Machines or VMs, which are able to execute their own OSs and different runtime environments, all within the same physical node. Notice how every "Virtual Machine" in Figure 1.2(b) has the same internal software stack as the non-virtualized node, starting from the "Operating System".

In a virtualized node, the component tagged as "Middleware*" is singled out. This is because in many server stack implementations, one of the VMs may have a middleware with capabilities to interact with the virtualization layer in order to manage its utilities. This middleware is also able to interact with other nodes to pool and share resources across the cloud infrastructure, but this also depends on the type of server stack implemented.

The sharing of resources is achieved through the deployment of virtualization technology [113], which consists of a set of specialized hardware and software features present on each computing node. The computing node needs to have a processor with certain characteristics [85] in order to execute virtualization software, the most important piece of which is commonly known as a Virtual Machine Manager or Hypervisor (HV), also shown in Figure 1.2(b). Computing resource sharing thus occurs among the VMs within a single node of the infrastructure, as well as across the nodes.

There are many mechanisms through which nodes can share their resources, depending on the hardware capabilities of the node and the details of the runtime and middleware. There are state-of-the-art server architectures that have been designed with a share-something design approach, enabling nodes to share computing resources through their memory address spaces. Such architectures, like Euroserver [26] and ExaNoDe [89], rely on fast interconnects and fast communication protocols among the nodes to achieve this. By leveraging the characteristics of these new hardware architectures, it is possible to disaggregate the memory-mapped resources of every computing node in a cloud, and make those resources accessible to every other computing node, allowing the flexible aggregation of memory to the nodes.

This thesis presents a series of contributions to the memory management problem. First, this thesis addresses the problem of memory resource allocation in a single virtualized node through different types of memory abstractions. In this context, two different mechanisms for managing memory, SmarTmem and CARLEMM, are introduced, the latter using a reinforcement learning approach based on the foundation provided by CAVMem, which is another contribution developed in this thesis. Second, this thesis goes on to provide contributions for the memory capacity aggregation problem across multiple virtualized nodes (coherence islands) by providing two different related mechanisms, called GV-Tmem and vMCA, for cloud environments based on Tmem. They both distribute and manage the system's total memory globally across computing nodes using a user-space process with high-level memory management policies. The difference between them is the enhancements of vMCA with respect to GV-Tmem for resiliency and more efficient use of the available memory resources.

The contributions to the memory management problem in this thesis center around the hardware virtualization layer. Targeting the memory management problem in this way makes our solutions relevant for cloud computing environments.

The rest of this introductory chapter will explain the background on virtualization technology up to its current state of the art. It will also explain in more detail the problem of memory management at the virtualization layer and the problem of aggregating memory across the nodes of a cloud infrastructure.

1.2 Introduction to Virtualization

1.2.1 Antecedents

The term "Virtualization Technology" (VT) refers to a set of system features, both in hardware and in software, that allow subsets of the physical resources of a node to be virtualized into multiple instances, effectively creating what are known as VMs. The HV is the component that creates multiple instances of subsets of the computing resources of the node, and abstracts them into one or more VMs. Each VM is able to execute its own OS isolated from other VMs, and this OS uses and manages the subset of the resources made visible to the VM by the HV.

The first implementations of VT were introduced by IBM in the 1970s, when IBM developed the IBM System 360/67, which introduced the concepts of virtual memory and time-sharing [38], based on the architectures and concepts outlined by Arden et al. [7]. Subsequently, IBM introduced the CP-40 [73] and the CP-67/CMS [9], the latter being an OS that exploited the time-sharing and virtual memory capabilities of the System 360/67 to provide all users with a simulated System/360 computer of their own. Around this same time (1970s), Popek et al. [85] formally defined the features that a computer system should have in order to support virtualization.

Simultaneously, the Multics [20] OS was developed, which became the blueprint for multi-user OSs like Unix [90]. These OSs were highly successful in the consumer and mainframe markets for a long time. Even though some virtualization technology concepts were introduced into Unix-like OSs as time went by, the advances in virtualization technology largely stagnated during the 1980s and 1990s.

As the computer market grew, the number of available computer architectures increased as well, together with the different OSs available. This presented a problem for the portability of programs, which usually were written and compiled for specific hardware and software environments. The software environments are commonly understood to be the software stack within the OS, including the OS itself, used to support a set of user-space applications.

Towards the end of the 1980s, computer hardware emulators appeared, such as SoftPC. These emulators enabled applications developed for other OSs (such as DOS/Windows) to execute on Unix-based workstations. During the 1990s, computer emulators started to gain more traction, which resulted in the comeback of virtualization technology concepts and their subsequent refinement.

By the end of the 1990s, different types of OSs and hardware architectures were well established, and virtualization solutions started to appear in order to solve the portability problem and allow users to run applications across different computer hardware and software stacks in a simplified and straightforward way. During this time, the concepts of VT, first introduced by IBM, became relevant once again. Coupled with this, the widespread use of the Internet and the creation of online services established VT as one of the cornerstones of the computing landscape as we know it today. The following section will review the state-of-the-art virtualization concepts.

1.2.2 Fundamentals of Virtualization

In the context of the Cloud Computing industry, it is common to talk about Software Virtualization, which is the ability of a system to execute an OS within another host OS, or the ability to execute an application abstracted from the rest of the software stack of the node [62]. Another common term is "Data Virtualization", which refers to a service that allows data manipulation by applications without having technical details on how data is stored or represented [29], and there is also a similar concept called Storage Virtualization [17], which refers to the pooling of persistent storage devices external to the node. However, the types of virtualization that are more clearly defined are [86]:

• Application Virtualization: an application is virtualized when it is able to execute in a node abstracted from the rest of the software stack in that node.


• Desktop Virtualization: a type of virtualization in which a logical OS instance is isolated from the client that is used to access it.

• Network Virtualization: the ability to present logical networking devices and services (logical ports, switches, routers, firewalls, etc.) to connected workloads.

• Server or Hardware Virtualization: virtualization in the classical sense, in which the computing resources of a node are abstracted into one or more virtual instances called VMs.

Hardware Virtualization is one of the building blocks of Cloud Computing and associated services, and it is part of what we refer to in this thesis as "Virtualization Technology". This is where the HV comes into play, of which there are two known types:

• Type 1 Hypervisor: the type that runs directly on the computing hardware and exploits its virtualization capabilities first-hand. This is the most important and critical type for Cloud Computing support.

• Type 2 Hypervisor: runs inside a host OS, and it is used for a specific type of Software Virtualization.

When it comes to the Type 1 HV for Hardware Virtualization, there are currently three known sub-types:

• Full Virtualization: consists of a full binary interface between the VM and the underlying host computer to trap sensitive instructions into the HV and emulate them in software.

• Para-virtualization (PV): a technique in which the HV provides an API and the OS of each guest VM undergoes certain modifications to make use of the API.

• Hardware-assisted Virtualization or Hardware Virtual Machine (HVM): exploits specific virtualization features of the processor and requires (ideally) no modifications to the OS.

The HV (assume a Type 1 HV from now on) creates one or more logical (virtual) instances of a subset of the physical computing resources of the node, effectively creating VMs. Depending on how many VMs there are and the amount/type of the specific resource, it is possible to either time-share (multiplex in time) the computing resources or have resources assigned to VMs. This is how CPU resources are managed, as well as I/O interfaces. The mechanisms of time-multiplexing and assignment of resources are available in most HVs on the market (Xen [10], KVM [41, 23], ESXi [74]). Time-multiplexing of resources refers to a mechanism in which, in very blunt terms, every VM takes turns on a temporal basis to access a specific resource. It is also possible to give a VM exclusive access to resources, by doing an explicit resource-to-VM assignment [65].

However, when it comes to managing memory among the VMs, it is not possible to time-multiplex memory in the same way as CPU or I/O interfaces. This is because it is in memory where the state of the VM is stored, even if the VM is idle or inactive. As VMs take turns executing their instructions on the CPUs and accessing driver interfaces, all the input and output data used for the operations of the VMs need to be stored somewhere in a persistent way. Moving in-memory VM data into long-term storage like a disk or a storage network, and rotating that data between long-term storage and memory as a way to multiplex memory, is not viable, due to the performance costs of accesses to persistent storage [116]. Thus, it is necessary for the data to be in memory, which makes memory the main resource bottleneck in a virtualized computing node.

At this point, the necessity to manage memory becomes evident, and the management problem for memory is different when compared to the management of CPUs and other resources. The following section will explain in more detail the memory management in the hypervisor and the core of the management problem.

1.3 Memory Management using Virtualization

The HV distributes the memory capacity among the VMs using a set of mechanisms that explicitly map memory to the VMs by adding an extra level of address translation. In a non-virtualized node, when a processor needs to access data from memory, it issues a virtual memory address that is translated into a physical address by the translation mechanism of the OS. The addressing is done at a page granularity, where each page is 4 KB in size. Sometimes, the physical page that results from the translation of a virtual address may not be found in memory, a situation that generates a page fault. In this case, the page needs to be fetched from disk and allocated in memory. This procedure, as costly as it is, may result in another page being deallocated from memory to release space for the new page, increasing the cost even further.

When virtualization is enabled, the HV adds another level of indirection to the address translation process. Now, the addresses issued by a processor are translated from a Virtual Address (VA) to a Physical Address (PA), just as before, but then the PA is translated into a Machine Physical Address, or simply a Machine Address (MA). This process is illustrated in Figure 1.3. The PAs and the MAs are mapped together when the translation process occurs, binding the memory to the VM until the memory addressed with the MA becomes deallocated. When deallocation occurs, it can be remapped to a different PA or to a different VM altogether. Depending on the type of virtualization used and the memory abstraction used by the HV, the MAs are exposed to the VM in order for the VM to access memory directly. Then, the OS will manage the memory among its processes according to its own constraints.

The HV is able to abstract memory in different ways and add/reclaim it to/from the VM through different mechanisms. Two of the most common mechanisms to add/reclaim memory to/from the VM are Memory Ballooning [67, 112] and Memory Hotplug [97, 63]. Memory Hotplug works using a SPARSEMEM memory model, which is a memory abstraction for discontinuous memory mapping.

Figure 1.3: Address translation process in a virtualized node.
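To make the two-level translation concrete, the following minimal Python sketch (our illustration, not code from the thesis or from any hypervisor) models the guest page table (VA to PA) and the hypervisor's physical-to-machine table (PA to MA) as dictionaries keyed by 4 KB page numbers; all the mappings are invented for the example.

```python
PAGE_SIZE = 4096  # 4 KB pages, as in the text

class TranslationTable:
    """One level of translation: page number -> page number."""
    def __init__(self, mapping):
        self.mapping = mapping

    def translate(self, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        if page not in self.mapping:
            # In a real system this is a page fault: the page must be
            # fetched from disk, possibly evicting another page.
            raise KeyError(f"page fault on page {page}")
        return self.mapping[page] * PAGE_SIZE + offset

# Guest OS page table: virtual page -> guest-physical page (VA -> PA)
guest_page_table = TranslationTable({0: 7, 1: 3})
# Hypervisor's physical-to-machine mapping: PA page -> machine page (PA -> MA)
physical_to_machine = TranslationTable({7: 42, 3: 15})

va = 1 * PAGE_SIZE + 123                 # virtual address: page 1, offset 123
pa = guest_page_table.translate(va)      # first level, done by the guest OS
ma = physical_to_machine.translate(pa)   # second level, added by the HV
print(hex(va), hex(pa), hex(ma))         # the page offset is preserved at each level
```

Remapping a machine page to another VM then amounts to changing one entry in the hypervisor's table, without the guest's own page table being aware of it.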

1.3.1 Memory Hotplug

The process of adding memory (hot-add) with Memory Hotplug consists of two phases: 1) notifying the hardware/firmware that new memory will be added, and 2) making the memory available to different users. The kernel then adds the physical pages into its address translation mechanism and into the memory allocator. Removing pages using Memory Hotplug (hot-remove) is known to be a complicated process [63], and it usually relies on Memory Ballooning to carry out memory removal.
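The two hot-add phases can be summarized with a toy model; the `NodeMemory` class below and its section granularity are illustrative assumptions, not the actual kernel hotplug interface (which in Linux operates on SPARSEMEM memory sections).

```python
class NodeMemory:
    """Toy model of Memory Hotplug's two hot-add phases (illustrative only)."""
    def __init__(self):
        self.registered = set()   # sections announced to hardware/firmware (phase 1)
        self.allocatable = set()  # sections visible to the memory allocator (phase 2)

    def hot_add(self, section):
        self.registered.add(section)    # phase 1: notify the hardware/firmware
        self.allocatable.add(section)   # phase 2: hand the pages to the allocator

    def hot_remove(self, section):
        # Hot-remove must first ensure that no live pages remain in the
        # section; in practice this is why it leans on ballooning to vacate
        # the memory before the section can be withdrawn.
        self.allocatable.discard(section)
        self.registered.discard(section)

mem = NodeMemory()
mem.hot_add("section-0")
print(mem.allocatable)  # {'section-0'}
```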

1.3.2 Memory Ballooning

Memory Ballooning, on the other hand, relies on a special driver that resides in each guest OS and cooperates with the HV to dynamically change a VM's memory capacity, effectively moving memory back and forth between the HV and the VM [63, 97]. Figure 1.4 illustrates the way memory ballooning works. When applications running inside a VM (VM1 in Figure 1.4) free up pages they previously allocated, the OS includes them in a list of free pages, as shown in Figure 1.4(a). However, the pages still remain allocated by the HV, which is unable to reclaim them since it has no access to the free list in the OS.

When another VM in the node starts to experience memory pressure, the HV communicates with the balloon driver in VM1 and tells it to give some pages back to the HV, preferably from the list of free pages. The balloon driver pins these pages inside the VM in order to prevent them from being evicted by the VM. This is when the balloon driver is said to inflate, as shown in Figure 1.4(b). The balloon driver then communicates with the HV, telling it which pages have been pinned, allowing the HV to unmap them from VM1 and to re-map them to a VM that needs more memory.

These mechanisms to manage and manipulate memory, in particular Memory Ballooning, make it possible for the HV to overcommit memory, that is, to give the VMs more memory than the amount available on the node [112]. This allows for a higher degree of server consolidation (the number of VMs a single node can host) than static partitioning. However, Memory Ballooning presents some limitations of its own, some of which are: 1) a maximum limit is set when configuring the balloon driver at boot time (rebooting is required if the limit needs to be modified), and 2) the allocation/deallocation process can be time-consuming when large amounts of memory need to be allocated, especially when the memory is fragmented. Because of the latter, Memory Ballooning can present some limitations when memory demand changes in irregular ways: either in spikes or too rapidly.


Figure 1.4: Graphic illustration for Memory Ballooning, when (a) the free list of the OS gets populated, and (b) when the balloon driver inflates.
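The inflate-and-remap cycle of Figure 1.4 can be sketched as follows; this is a simplified model written for illustration (a real balloon driver works on guest-physical pages and negotiates allocation targets with the HV), not code from Xen or from this thesis.

```python
class VM:
    def __init__(self, name, pages):
        self.name = name
        self.pages = set(pages)   # machine pages currently mapped to this VM
        self.free_list = set()    # pages freed by the guest OS, invisible to the HV
        self.ballooned = set()    # pages pinned by the balloon driver

    def free_page(self, page):
        self.free_list.add(page)  # guest frees a page; the HV still has it mapped

class Hypervisor:
    def balloon_transfer(self, donor, receiver, n):
        """Inflate donor's balloon by n pages and remap them to receiver."""
        victims = set(sorted(donor.free_list)[:n])
        donor.free_list -= victims
        donor.ballooned |= victims   # balloon driver pins the pages in the guest
        donor.pages -= victims       # HV unmaps the machine pages from the donor...
        receiver.pages |= victims    # ...and re-maps them to the pressured VM

vm1, vm2 = VM("VM1", {0, 1, 2, 3}), VM("VM2", {4, 5})
vm1.free_page(1); vm1.free_page(3)
Hypervisor().balloon_transfer(vm1, vm2, n=2)
print(sorted(vm1.pages), sorted(vm2.pages))  # [0, 2] [1, 3, 4, 5]
```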

1.3.3 Transcendent Memory

In order to overcome these two limitations of Memory Ballooning, another mechanism called Transcendent Memory (Tmem) [69] was introduced to add/reclaim memory to/from a VM. Figure 1.5 illustrates the way Tmem works.


Figure 1.5: Graphic illustration of Tmem.

Tmem works by pooling together the fallow and idle memory in the node, and uses this memory pool as a cache between the main memory and the external/persistent storage (disk or other). This pool of memory is shown as the "Hypervisor's Tmem Pool" in Figure 1.5.

When the OS attempts to evict a page from memory, it generates a page fault; a kernel module traps this fault and generates a hypercall for the HV to handle the eviction attempt. The hypercall generated corresponds to a PUT operation, as shown in Figure 1.5. The HV will allocate a new page or set of pages from the Tmem pool, copying the data from the evicted page(s) into the newly allocated Tmem page(s), and storing it in a key-value store. In this scheme, the OS "believes" that the memory it evicted is on a disk device, but it is in fact allocated in a memory section only visible to the HV.

A GET operation on the Tmem pool is triggered when the VM attempts to read a page from disk that it previously evicted from memory. The read attempt will generate a page fault as well, but this time the kernel module generates a hypercall that corresponds to a GET operation, instead of a PUT. The GET operation will copy the data from the Tmem page into a page allocated by the OS inside the VM. Depending on the way Tmem was initially configured, the GET operation will free up the Tmem page after copying its content back to the VM, making it available once again.

Tmem also supports a FLUSH operation that the OS in the VM uses to maintain coherence between the data in Tmem, the internal data structures of the OS in the VM, and the data stored on disk. A FLUSH operation frees up a Tmem page from the pool, and allows it to be reallocated when another PUT is issued.

Figure 1.5 shows that Tmem can be used in two modes of operation: Cleancache, which is used to store clean (unmodified) pages that are backed by the filesystem; and Frontswap, which is used to store dirty pages that would otherwise be written to the swap space. However, in both cases, the idea is for Tmem to act as a cache between memory and persistent storage, in order to prevent costly accesses to the latter.

One advantage that Tmem possesses over Memory Ballooning is that it can allocate and reclaim pages at a faster rate, resulting in Tmem being more responsive to fast changes in memory demand [69]. Another advantage of Tmem is that, given that it works as a cache between memory and persistent storage, the OS inside the VM does not really need to know the capacity of the underlying Tmem nor its location. This feature can be exploited to allow memory to be disaggregated from the node. Even though the cost of allocating/deallocating a page is smaller with Tmem than with Ballooning, one important limitation of Tmem is that the cost of accessing a page through Tmem is higher than the cost of accessing pages that have been previously allocated through Ballooning. Thus, it is not viable to allocate memory to a VM exclusively through Tmem.

Given all these mechanisms that allow memory to be moved among the VMs in the node, the problem is then to leverage these mechanisms to dynamically reallocate the memory in order to make efficient use of the available memory resources, first within one single node and then across multiple nodes in the cloud infrastructure.
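Since the text describes Tmem as a key-value store driven by PUT/GET/FLUSH hypercalls, a minimal sketch of its semantics might look as follows; this is our simplification (frontswap-style exclusive mode, with an invented key format), not the actual Xen implementation.

```python
class TmemPool:
    """Toy key-value model of the PUT/GET/FLUSH operations described above."""
    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.store = {}  # key -> page data; a key identifies (VM, object, page index)

    def put(self, key, data):
        if key not in self.store and len(self.store) >= self.capacity:
            return False     # pool exhausted: the guest falls back to real swap/disk
        self.store[key] = data
        return True

    def get(self, key):
        # Exclusive (frontswap/private) mode: the Tmem page is freed once its
        # content is copied back to the VM, mirroring the behaviour above.
        return self.store.pop(key, None)

    def flush(self, key):
        self.store.pop(key, None)  # guest keeps Tmem coherent with its own state

pool = TmemPool(capacity_pages=2)
key = ("VM-i", 0, 7)                              # invented (vm, object, page) key
assert pool.put(key, b"evicted page contents")    # PUT on eviction
assert pool.get(key) == b"evicted page contents"  # GET satisfies the page fault
assert pool.get(key) is None                      # already consumed: read goes to disk
```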

1.4 The Memory Management Problem with Virtualization and State-of-the-Art Solutions

The problem of dynamic memory management in a virtualized node appears when different active VMs executing different applications present different memory utilization behaviors that also change over time. The problem then boils down to dynamically reallocating memory among the VMs such that each VM has all the memory it needs (assuming there is enough memory in the node), or such that the available memory is dynamically reallocated proportionally to ensure a performance target (assuming memory pressure in the node). The reallocation of memory can be made possible through Memory Ballooning, Memory Hotplug and/or Tmem, but it is still necessary to control these mechanisms by designing and implementing algorithms built for this purpose.

When considering the problem of memory management across multiple nodes, the source of the problem is the same, but the scope of the possible solutions will include the reallocation of memory capacity across nodes using an appropriate memory abstraction (like Tmem), and might also include solutions that involve more complex memory architectures and management schemes.


When a new VM is created, it is allocated a portion of the physical memory. If the VM later starts executing applications that need more memory to execute without accessing persistent storage, then the VM is said to be under-provisioned of memory capacity. In other cases, some physical memory may still remain in the system, either unallocated by the hypervisor (fallow) or allocated to another VM that does not need it (idle). A VM with idle memory is said to be over-provisioned of memory. The occurrence of either or both of these cases (under- and over-provisioning) implies that the utilization of memory is sub-optimal. The appropriate thing to do in these cases would be to reclaim memory from over-provisioned VMs and make it available to the VMs that are under-provisioned, readjusting the memory allocation of VMs dynamically and improving memory utilization.
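As a naive illustration of this rebalancing idea (not one of the policies proposed in this thesis), the sketch below reclaims idle memory from over-provisioned VMs and redistributes it to under-provisioned ones in proportion to their deficits; the allocation/demand figures are invented and expressed in MB.

```python
def rebalance(allocations, demands):
    """Reclaim idle memory from over-provisioned VMs and grant it,
    proportionally to need, to under-provisioned ones (toy policy)."""
    surplus = {vm: allocations[vm] - demands[vm]
               for vm in allocations if allocations[vm] > demands[vm]}
    deficit = {vm: demands[vm] - allocations[vm]
               for vm in allocations if allocations[vm] < demands[vm]}
    pool = sum(surplus.values())             # idle memory reclaimed by the HV
    for vm, extra in surplus.items():
        allocations[vm] -= extra
    total_deficit = sum(deficit.values()) or 1
    for vm, need in deficit.items():
        grant = min(need, pool * need // total_deficit)  # proportional share
        allocations[vm] += grant             # any leftover stays fallow in the HV
    return allocations

print(rebalance({"VM1": 4096, "VM2": 1024}, {"VM1": 2048, "VM2": 3072}))
# {'VM1': 2048, 'VM2': 3072}: VM1's 2048 MB of idle memory cover VM2's deficit
```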

1.4.1 Memory Management in a Virtualized Node

Many research efforts seek to improve main memory management in a single node [120, 12, 13, 15, 27, 51, 63, 72, 80, 95, 101, 105, 125]. For example, Zhao et al. [125] proposed a mechanism to dynamically adjust the memory allocation by implementing a user-space program in the Dom0 (using Xen) to trigger inflation and deflation of the balloon driver in the VMs. This work is particularly interesting because it includes predictive capabilities based on an enhanced reuse distance approach [71]. They design a memory predictor that predicts the VM's working set size (WSS) based on Least Recently Used (LRU) histograms. The predicted working set size determines the target memory allocation, which is enforced in each VM by the respective balloon driver. Both Zhao et al. [125] and Liu et al. [63] predict the memory demands of the VMs: whereas Zhao et al. [125] predict the working set size of the VMs, Liu et al. [63] predict the total memory footprint, including swap.

The approach of estimating the WSS to drive the memory allocation of the VMs has been very common in other works as well [15, 47, 84]. When estimating the WSS, the objective is to allocate enough memory to the VMs for their working set to fit in. One problem with WSS estimation is that when applications do not use their whole WSS during different phases of their execution, memory ends up overallocated (that is, misused) during those times when part of the memory is not used. This problem is exacerbated when there are multiple applications running inside the VMs and variations in the applications' behavior over time start affecting the measurements of the WSS.

In [120], Zhang et al. opted to avoid a WSS estimation approach due to the difficulties of accurately estimating it at runtime for changing workloads, and instead decided to dynamically allocate memory through a pool of memory shared across the VMs that is managed by the host. In this scheme, memory is reclaimed from the host in an anticipated way, and made readily available to VMs that suddenly come under memory pressure. In this way, the delays associated with the reclaiming of memory by the balloon driver are avoided, preventing the VMs from evicting pages from memory. This avoids further accesses to persistent storage, since the data does not have to be fetched back from persistent storage again.

Approaches based on control engineering techniques [45, 83] have also been used to solve the memory management problem. The main issue with these approaches is that they need models of the quantitative relationship between an application's performance and its memory allocation, which is very difficult to obtain accurately, given all the factors that have an impact on this relationship. The use of reinforcement learning (RL) approaches thus becomes justified when grasping the complexity of the factors involved in this relationship.

Many learning-based solutions have been proposed to solve the resource allocation problem in cloud infrastructures [13, 87, 124, 21], but so far there have not been many efforts that use RL exclusively to solve the memory management problem. There are some notable RL-based solutions for the resource allocation problem that target CPU and memory (and other resources) within a virtualized node. In [87], the authors implemented a model-based RL algorithm for VM resource configuration, which included CPU time, virtual CPUs and memory. This algorithm is a Deep Q-learning (DQL) [79] algorithm using a neural network to approximate the value function.
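To show the shape of such a learning loop, here is a tabular Q-learning sketch for choosing ballooning actions; the cited work [87] uses Deep Q-learning with a neural network instead of a table, and the state encoding, action set and reward function below are invented for illustration only.

```python
import random
from collections import defaultdict

ACTIONS = (-256, 0, 256)        # MB to remove, keep or add through the balloon
q = defaultdict(float)          # (state, action) -> estimated value
alpha, gamma, eps = 0.1, 0.9, 0.1

def step(state, action):
    """Hypothetical environment: reward penalizes mis-allocation."""
    alloc, demand = state
    alloc = max(256, alloc + action)
    reward = -abs(alloc - demand) / 1024.0
    return (alloc, demand), reward

state = (2048, 1024)            # (allocated MB, demanded MB)
for _ in range(5000):
    if random.random() < eps:   # epsilon-greedy exploration
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: q[(state, a)])
    next_state, reward = step(state, action)
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    # Temporal-difference update toward the one-step lookahead target
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    state = next_state

print(max(ACTIONS, key=lambda a: q[(state, a)]))  # typically 0 once alloc == demand
```

A table like this only works for tiny, discrete state spaces; replacing it with a neural network (as in DQL) or moving to continuous actions (as in the DDPG-based contributions of this thesis) is what makes the approach applicable to real memory management.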

Motivation for Contributions for the Single-Node Memory Management Problem The first three chapters of this thesis present solutions to improve the utilization of mem- ory within a single node. The first solution to achieve this, called SmarTmem, focuses on improving memory resource utilization through Tmem, by providing a software stack that allows its user-space control and deploys high-level Tmem management policies. The motivation behind SmarTmem comes from the remote memory aggregation case, because increasing the efficiency of Tmem utilization within a single node was a necessary step in order to make a memory aggregation case work more efficiently. The results of SmarTmem show that it is possible to improve Tmem utilization for the cases in which Tmem is deployed in its frontswap/private mode. Future work could establish whether the approach of SmarTmem could also improve the performance of other settings in which Tmem is deployed. Settings of this type include cases where Tmem is used to manage memory in systems with heterogeneous memory hierarchies [110], systems that expose Tmem to user-space processes directly [107], and virtualized systems that use Tmem as a hypervisor-level secondary cache [52, 111, 75, 46]. The other two solutions for the memory management problem within a single node are closely related and are based on Continuous Action RL algorithms. One of these is called CAVMem (Continuous-Action Algorithm for Virtualized Memory Management) and it is a proof-of-concept mechanism to solve the memory allocation problem for memory exposed as RAM, while the other solution, called CARLEMM (Continuous-Action Reinforcement Learning for Memory Management), builds on the proof-of-concept by CAVMem to provide a memory allocation solution in an actual virtualized node. CARLEMM focuses on management of memory through the memory ballooning interface, which is a standard way to manage memory in cloud infrastructures. Currently, there are multiple solutions that target the RAM allocation problem, but this problem is difficult to solve because the memory demand of VMs change continuously. Many solutions for this problem are limited by the complex relationship between the memory allocated to a VM, the applications’ behavior and their performance, for which RL provides a good alternative. By using RL to solve the memory allocation policy, we seek to increase the efficiency of memory resource utilization in a single node, which would allow for better VM consolidation and better strategies for the overcommitment of resources [19, 4, 21]. All of these issues (memory utilization efficiency, consolidation, overcommitment) are im- portant for cloud service providers, because good solutions on memory management provides them with multiple opportunities to increase their revenue while still providing the desired QoS for their customers. It has been shown that large production cloud infrastructures of

service providers, like Google, only use around 50% of the memory resources that are allocated [88]. In other cases, such as Alibaba, the utilization of memory resources can be slightly more than 50% [14]. This means that there is a significant opportunity to increase the efficiency of memory resource utilization. Improving memory management is necessary to achieve this efficiency, and it also opens the door for better overcommitment policies [52, 21].

1.4.2 Memory Management across Virtualized Nodes

Computing is moving increasingly towards the cloud and to infrastructures that provide Software-as-a-Service (SaaS) [48], with a larger presence of data-intensive workloads. This has caused a shift in the design approach of computing servers, from a "share nothing" self-contained design to a "share something" approach [30]. This implies that the computing resources of the data center can be pooled across its nodes (totally or partially), and that different forms of resource disaggregation can take place, allowing the scaling process to proceed at different paces depending on the specific resource [6].

This approach allows the memory to be scaled out independently of the processing resources, which matters because scaling up the memory within a node is becoming increasingly prohibitive [108]. This "share something" approach to remote memory targets some of the challenges that are most relevant in today's data centers, among them:

• The increasing demand for memory due to the presence of more workloads that use a large amount of data

• The need to improve memory resource efficiency in the data center [40]

• The need to reduce the data center’s Total Cost of Ownership [55]

• Network latency is being reduced as new networking hardware becomes available, bringing it closer to the latency of DRAM technology [5, 33, 26, 49]

As more data-intensive workloads (i.e., applications with large memory footprints) are deployed in the cloud infrastructure, the computing nodes need to be provisioned with more cores and more memory capacity to perform their computations and maintain performance. The core count of the nodes' processors keeps scaling up, but the memory is unable to scale up at the same pace, resulting in a drop in capacity-per-core of 30% every year and yielding a compute-to-memory imbalance [61, 117].

However, provisioning the nodes of the cloud infrastructure with enough memory to fit the applications with the largest memory footprints (the worst case) results in memory being under-utilized (over-provisioned), because most of the time the nodes execute workloads with considerably smaller memory footprints that leave the large memory capacity idle [88, 14, 64]. Moreover, provisioning nodes with large amounts of memory implies a large investment cost. Such over-provisioned nodes result in memory being used in an imbalanced way across nodes [40], because some nodes will be under heavy memory pressure while the memory of other nodes remains idle. In addition to this memory resource inefficiency, over-provisioning increases the Total Cost of Ownership (TCO) of keeping the cloud data center running, since its nodes hold many cores and a huge amount of memory that is not being used [55].
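To make the compounding capacity-per-core trend mentioned above concrete: if capacity-per-core drops by 30% per year, then after n years a node retains

    C_n = C_0\,(1 - 0.30)^n = C_0 \cdot 0.7^n

of its original capacity-per-core C_0, i.e., it roughly halves every two years, since 0.7^2 = 0.49.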

In short, money is lost paying for the energy to keep the nodes running while they are not performing useful work. Thus, the use of remote memory becomes attractive, because it allows the memory to become external to the node, which avoids over-provisioned nodes and reduces the compute-to-memory imbalance.

Even though the idea of using remote memory has been around for more than 20 years, it was not feasible to deploy because of the high latency of the TCP/IP network links over which the first solutions were tested: the performance losses were too high [5]. But advances in networking technology have managed to reduce the latency of the communication links across the computing nodes in the cloud data center [33, 5, 26, 49, 6], by using technologies like Infiniband [40] or other (backplane-based) interconnection solutions. As a result, remote memory has become attractive once again.

In a cloud data center, remote memory can be implemented either in software or in hardware. Software-based approaches are simply known as "remote memory" in the literature [5], and they are related to a long line of efforts on distributed shared memory. These approaches enable access to remote memory over off-the-shelf, higher-latency network technology, with no hardware extensions, but they extend the software stack at every level to enable remote memory access.

Hardware-based approaches, on the other hand, are referred to as "disaggregated memory", and they involve additional hardware support at different architectural layers. In this thesis, we focus on this type of approach. Disaggregated memory can be implemented as [61]:

• Byte-addressable: requires the remote memory to be attached to the memory system of the processor.

• Block-device: it is accessed on a per-page basis, through a device interface like PCIe or similar.

In both cases, the physical memory address space of the processors must be partitioned in such a way that a subset of the physical addresses maps to regions of remote memory (illustrated in the sketch below). The communication mechanism between the node and the location of the remote memory can be customized and is up to the system designers.

When memory is disaggregated and becomes a pooled resource, it is possible to use different types of memory technology besides traditional DRAM, such as Storage-Class Memories (SCMs) [108], a term that refers generally to Non-Volatile Memories (NVMs) [77]. NVMs are a good option for pooled disaggregated memory because they provide higher density and higher capacity than traditional DRAMs, at a much lower cost [28]. However, NVMs have higher latency than DRAM (though still lower than Flash or disk devices), lower bandwidth and endurance limitations [108, 126, 28]. NVMs can be used as an additional level of the memory hierarchy, functioning as a cache between DRAM and permanent storage (even when not pooled) [108, 110], as a block device for swapped pages [126, 40], for increasing system resiliency through data replication [122], or as a back-up for system memory [96].

Remote memory capacity aggregation is also relevant for cases in which High-Performance Computing (HPC) applications are deployed [127], because many of these applications have a memory footprint well under the amount of memory available in server nodes.

By allowing remote memory aggregation, it is possible to increase the utilization of memory in nodes where low-memory-footprint HPC applications are running.
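As a rough illustration of the physical-address-space partitioning mentioned above, the following sketch routes accesses by physical address. The base address, window size and helper names are hypothetical, not those of any specific platform:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical layout: local DRAM occupies physical addresses below
     * REMOTE_BASE; a window above it maps to remote memory regions.     */
    #define LOCAL_DRAM_SIZE (64ULL << 30)    /* 64 GiB local              */
    #define REMOTE_BASE     LOCAL_DRAM_SIZE  /* start of the remote window */
    #define REMOTE_WINDOW   (256ULL << 30)   /* 256 GiB of remote memory  */

    static bool paddr_is_remote(uint64_t paddr)
    {
        return paddr >= REMOTE_BASE && paddr < REMOTE_BASE + REMOTE_WINDOW;
    }

    /* A bridge on the system bus (e.g. a NoC interface) would derive the
     * owning node and local offset from the global physical address.    */
    static uint64_t remote_offset(uint64_t paddr)
    {
        return paddr - REMOTE_BASE;
    }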

Motivation for Contributions for the Multi-Node Memory Capacity Aggregation

The solutions presented in this thesis for remote memory support, namely GV-Tmem and vMCA, belong to the disaggregated memory category. In this context, our contributions have the following characteristics:

• They rely on specialized hardware support that includes a dedicated interface on the system bus as the gateway to remote memory regions.

• The communication fabric consists of a customized high-bandwidth, low-latency Network-on-Chip (NoC) that addresses remote memory regions through a partial global physical address space.

• The nodes make use of the disaggregated memory through the kernel's swap interface, i.e., disaggregated memory is used as a swap device, but byte-addressable features are still exploited.

• In our solutions, OSs and applications make use of the disaggregated memory in a transparent way, through a layer of virtualization.

Our contributions for disaggregated memory support use the remote memory regions as a cache between DRAM and permanent storage for pages that have been swapped out from a node's main memory, making use of the remote memory in a similar way to a swap device (though not entirely, as will be explained shortly). The kernel decides which pages to swap out and, since we are using a virtualization layer, every page that gets swapped out belongs to one VM, which has exclusive access to it. When a VM swaps out a page, it invokes the page fault mechanism of its kernel, which results in the kernel attempting to write the page to permanent storage before deallocating it.

As will be explained in more detail in Chapters 5 and 6, in our approaches GV-Tmem and vMCA the hypervisor intercepts the kernel's attempts to swap out pages to permanent storage (a simplified sketch appears at the end of this section). The hypervisor then allocates a page in a local or disaggregated memory region (depending on availability) using hypervisor-level memory allocation constructs that exploit the byte-addressable capability of the remote memory system architecture, relying on the presence of a partially global physical address space. In this way, the node is able to effectively aggregate more memory capacity for itself.

By using disaggregated memory in GV-Tmem and vMCA as a swap device from the VM/kernel perspective, the disaggregated memory becomes transparent to the OS of the VMs. It is not necessary for the OSs to know the capacity or the location of the remote memory. This means that expensive modifications to the OS or to the applications are not required to support the disaggregated remote memory regions, which may even include NVMs [57].

In our contributions, Tmem is the kernel interface used to access the remote memory regions, and different approaches that exploit Tmem capabilities can be implemented. For example, Lorrillere et al. [66] proposed a mechanism, called PUMA, to improve memory utilization based on remote caching that pools VMs' memory across a data center. PUMA pools


VMs' unused memory for the benefit of other VMs running I/O-intensive workloads. PUMA makes use of an interface very similar to Tmem to access the cache present in a remote server. Zcache [68] is a backend for frontswap and cleancache that provides a compressed cache for swap and clean filesystem pages. Another related mechanism has been used by other state-of-the-art approaches to exploit the presence of remote memory regions in data centers [55]; it is roughly equivalent to using Zswap in frontswap and exclusive mode, by-passing the virtualization support. RAMster [68] is an extension of zcache that uses kernel sockets to store pages in the RAM of remote nodes. Similarly, RAMCloud [82] is a POSIX-like filesystem in which all data is stored in RAM across the nodes.

For GV-Tmem and vMCA, we use Tmem's interface to provide a new software stack that enables a node to aggregate memory capacity for itself, while at the same time providing mechanisms to make more efficient use of the available memory. In these contributions, the work is specifically motivated by the following goals:

• Improve memory resource efficiency across nodes by aggregating memory dynamically to the nodes

• Leverage the capabilities of new hardware architectures designed to support partial global physical address spaces across multiple nodes and fast interconnects [26, 49]
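To fix ideas before the detailed descriptions in Chapters 5 and 6, the following is a highly simplified sketch of the interception path described above: the hypervisor receives a page that the guest kernel would have swapped out and backs it with a local or remote page, depending on availability. All names are hypothetical, and the sketch omits the key-value bookkeeping of the real Tmem store.

    #include <string.h>

    typedef struct page { unsigned char data[4096]; } page_t;

    /* Hypothetical allocators over the partially global physical address
     * space: one for pages in local DRAM, one for pages in a remote
     * (disaggregated) memory region. Either returns NULL when exhausted. */
    extern page_t *alloc_local_tmem_page(void);
    extern page_t *alloc_remote_tmem_page(void);

    /* Invoked by the hypervisor when a VM's kernel tries to swap out a
     * page. Returns 0 on success; nonzero makes the kernel fall back to
     * its (virtual) disk device. */
    int tmem_intercept_swapout(const page_t *guest_page)
    {
        page_t *backing = alloc_local_tmem_page();
        if (backing == NULL)
            backing = alloc_remote_tmem_page();
        if (backing == NULL)
            return -1;  /* no capacity anywhere: swap to disk */

        /* The remote region is byte-addressable, so a plain copy works
         * for both local and remote backing pages. */
        memcpy(backing->data, guest_page->data, sizeof(backing->data));
        return 0;
    }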

1.5 Contributions and publications

This thesis presents contributions for the memory management problem in a single node and then across multiple nodes. To summarize, the contributions in this thesis are:

• A mechanism called SmarTmem for more efficient allocation of Tmem in a single virtualized node.

– A demonstration that the default way of allocating Tmem at the single-node level, used in state-of-the-art hypervisors, is not optimal.
– Development of high-level memory management policies to manage Tmem through SmarTmem in a single node.

Luis Garrido, Rajiv Nishtala, and Paul Carpenter. SmarTmem: Intelligent management of transcendent memory in a virtualized server. In RADR 2019: 1st Workshop on Resource Arbitration for Dynamic Runtimes, 2019 [35].

• A mechanism called CAVMem (Continuous-Action Algorithm for Virtualized Memory Management), developed as a proof-of-concept for the use of continuous-action reinforcement learning for RAM allocation in a simulation of a virtualized node.

– Formulation of the RAM management problem as a Markov Decision Process in a simulated environment that models memory management constraints.


– Development of a continuous-action, off-policy, model-free reinforcement learning algorithm for memory allocation in a simulated environment.
– Comparison of the CARL algorithm in CAVMem with non-continuous-action reinforcement learning algorithms like Q-Learning and Deep Q-Learning in the context of RAM management within the simulated environment.

Luis Garrido, Rajiv Nishtala, and Paul Carpenter. Continuous-Action Reinforcement Learning for Memory Allocation in Virtualized Servers. In 14th Workshop on Virtualization in High-Performance Cloud Computing (VHPC'19), International Conference on High Performance Computing, 2019 [37].

• A mechanism called CARLEMM (Continuous-Action Reinforcement Learning for Memory Management), based on the proof-of-concept provided by CAVMem, but enhanced to solve the memory allocation problem in a real computing node.

– Formulation of the RAM management problem as an MDP for a real virtualized computing node.
– A continuous-action reinforcement learning algorithm for RAM allocation in a real virtualized node.
– A flexible software stack for the implementation of CARLEMM and performance evaluation.

• A mechanism called GV-Tmem (Globally Visible Tmem) for remote memory capacity sharing across multiple computing nodes in a cloud infrastructure

– A software architecture to share memory capacity across nodes using Tmem.
– A two-tier mechanism for allocation and management of disaggregated memory.

Luis A. Garrido and Paul Carpenter. Aggregating and managing memory across computing nodes in cloud environments. In 12th Workshop on Virtualization in High-Performance Cloud Computing (VHPC'17), International Conference on High Performance Computing, pages 642–652. Springer International Publishing [36].

• A mechanism called vMCA (virtualized Memory Capacity Aggregation) for the sharing of memory capacity across nodes, enhanced for resiliency and efficiency in comparison to GV-Tmem.

– A software stack to aggregate memory capacity across multiple nodes, focused on resiliency and efficiency.


September 2014: PhD matriculation; work with the Euroserver prototype as the experimental platform (ultimately unsuccessful) and development of our research proposal.
20 July 2015: Presentation of thesis plan; development of a new experimental platform (successful) and the first mechanism for support of remote memory capacity.
22 June 2017: Presentation of GV-Tmem (Chapter 5) at VHPC 2017.
16 December 2017: Presentation of vMCA (Chapter 6) at ICPADS 2017.
24 May 2019: Presentation of SmarTmem (Chapter 2) at the RADR workshop.
20 June 2019: Presentation of CAVMem (Chapter 3) at VHPC 2019.
17 June 2019: PhD predefence.
September 2019: PhD defence.

Table 1.1: Thesis timeline

– A multi-level user-space mechanism for allocation and management of aggregated memory capacity, with special considerations for memory allocation within a node and memory distribution across nodes.
– A more complete analysis of high-level policies for memory aggregation and allocation.

Luis A. Garrido and Paul Carpenter. vMCA: Memory capacity aggregation and management in cloud environments. In IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS), December 2017 [34].

1.5.1 Thesis Outline and Timeline

Table 1.1 shows a rough timeline of the stages of the thesis project. Initially, the scope of our research was on aggregating memory capacity across nodes. During the course of this work, however, it became evident that, to provide an integral solution for memory management, new approaches for managing memory locally through Tmem and ballooning would increase the effectiveness of our remote memory aggregation solutions.

Figure 1.6 shows the structure of the rest of this thesis. Chapter 2 describes SmarTmem, which uses high-level policies to allocate Tmem inside a virtualized node. Chapter 3 develops CAVMem, a proof-of-concept continuous-action reinforcement learning approach for allocation of RAM via ballooning, again inside a single node. Chapter 4 provides the initial steps towards a complete implementation of the reinforcement learning approach on a real system. Next, Chapter 5 develops GV-Tmem, the first mechanism for the control of memory capacity sharing across multiple nodes. This is extended, in Chapter 6, to vMCA, which enhances GV-Tmem for resiliency and efficiency, and solves some of the issues that come with sharing memory across remote nodes. Finally, Chapter 7 concludes the thesis.


Figure 1.6: Thesis structure. The contributions are organized along two axes: the memory allocation mechanism (Tmem or ballooning) and the scope of memory allocation (one node or multiple nodes). SmarTmem (Chapter 2) uses Tmem within one node; GV-Tmem (Chapter 5) and vMCA (Chapter 6) use Tmem across multiple nodes; CAVMem (Chapter 3) and CARLEMM (Chapter 4) use ballooning within one node.

Chapter 2

SmarTmem: Tmem Management in a Virtualized Node

Managing memory capacity in virtualized environments is still a challenging problem. Many solutions based on memory ballooning and memory hotplug have been proposed and implemented, but these mechanisms are slow to respond to changes in the memory demands of VMs. As explained in Section 1.3, Tmem was introduced to improve responsiveness in memory provisioning, by pooling idle and fallow memory in the HV and making these physical pages available as additional memory for the VMs through a key-value store.

However, Tmem presents some limitations of its own. State-of-the-art hypervisors do not implement any efficient way to manage Tmem capacity, letting VMs compete for it in a greedy way by default, regardless of their actual memory demand. In this chapter, we demonstrate the need for intelligent memory capacity management for Tmem, and we present the design and implementation of SmarTmem, a mechanism that integrates coarse-grained user-space memory management with fine-grained allocation and enforcement at the virtualization layer. Our results show that our solution can improve the running time of applications from the CloudSuite benchmarks by up to 35% compared to the default Tmem allocation mechanism.


2.1 Introduction

Virtualization technology is prevalent across cloud service providers as it reduces capital and operational costs [8, 119, 18, 81, 123, 39]. As explained before, the HV [85, 94] in every node of the cloud infrastructure creates VMs within each node and manages the physical resources allocated to the VMs. The HV multiplexes the CPUs and I/O devices, and it controls the allocation of the physical memory capacity. In such environments, memory is often one of the most critical and scarce resources [99, 63].

As mentioned in Chapter 1, state-of-the-art HVs have multiple mechanisms to dynamically manage memory capacity [112, 97, 69], and Tmem [69] was introduced to improve memory reallocation responsiveness, since other mechanisms like memory ballooning are relatively slow to reallocate memory upon changes in memory demand [69]. Tmem has two modes of operation: 1) frontswap, which serves as a store for pages swapped out by the VMs, and 2) cleancache, which serves as a page cache for clean pages that were fetched from disk. In either mode of operation, Tmem pools all the idle and fallow memory pages in the node, and the pages get assigned to the VMs as Tmem when they need it, becoming accessible through a key-value store (a sketch of this interface is given after the contribution list below). If there are no free pages, any write to a Tmem page will fail, causing an access to the (virtual) disk device.

When multiple VMs use Tmem, the hypervisor assigns Tmem pages to the VMs in a greedy manner. By default, the VMs compete for the Tmem capacity, so if some VMs take a large amount of the available Tmem, the rest of the VMs will starve and will generate a large number of disk accesses, degrading performance across the VMs and the computing node [59].

This chapter addresses the issue of optimizing VM memory allocation using Tmem in a single computing node. SmarTmem is introduced here, a software stack for intelligent Tmem allocation in a single node. This chapter analyzes in depth the problem of Tmem optimization at the single-node level. To the best of our knowledge, this is the first effort to analyze Tmem capacity optimization in a single virtualized computing node.

SmarTmem employs a user-space process executing in Xen's privileged domain that implements intelligent Tmem management policies based on the memory utilization behavior of the VMs. Our results show that using high-level policies improves the running time of applications compared to the baseline implementation of Tmem. To summarize, the main contributions of this chapter are:

• We demonstrate that the default way of allocating Tmem in a single node, used in state-of-the-art HVs, is unable to adapt to changes in memory demand and to ensure fair and proportional allocation of Tmem when multiple VMs are active.

• We propose a software architecture called SmarTmem, for intelligent memory manage- ment of Tmem in a single virtualized computing node.

• We implement high-level Tmem management policies in SmarTmem and evaluate them using benchmarks from CloudSuite [31], and our results show up to 35% improvement over the default greedy approach of allocating Tmem.
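For concreteness, the following sketch shows the general shape of a Tmem-style key-value interface as described above. The exact Xen hypercall signatures differ, so these prototypes are illustrative only:

    #include <stdint.h>

    /* A Tmem page is named by a (pool, object, index) key rather than by
     * a machine address; the HV owns the backing page frames. */
    typedef struct {
        uint64_t pool_id;   /* one pool per VM and mode (persistent/ephemeral) */
        uint64_t object_id; /* e.g. a swap area or a file                      */
        uint32_t index;     /* page offset within the object                   */
    } tmem_key_t;

    /* put: copy a page into the store. Fails when the VM has no Tmem
     * capacity left, in which case the kernel swaps the page to disk.   */
    int tmem_put(tmem_key_t key, const void *page);

    /* get: copy a page back. For frontswap (persistent) pages this
     * succeeds if the page was put; cleancache (ephemeral) pages may
     * have been silently reclaimed by the HV in the meantime.           */
    int tmem_get(tmem_key_t key, void *page);

    /* flush: discard a page, returning its frame to the shared pool.    */
    void tmem_flush(tmem_key_t key);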


The rest of this chapter is organized as follows. Section 2.2 discusses the SmarTmem architecture and the high-level Tmem management policies. Section 2.3 describes our experimental framework. Section 2.4 presents the evaluation results. Section 2.5 expands on the related work. Finally, Section 2.6 states our conclusions and future work.

2.2 SmarTmem: memory management policies and architecture

This section describes the architecture of SmarTmem, a software stack designed for intelligent Tmem management in a single virtualized computing node. SmarTmem divides Tmem management between coarse-grained memory management in a user-space process running in a privileged domain and fine-grained allocation and enforcement in the HV.

2.2.1 SmarTmem Architecture

Figure 2.1 shows the architecture of SmarTmem, consisting of three components:

• HV support for SmarTmem
• Tmem Kernel Module (TKM)
• Memory Manager (MM) user-space process for Tmem allocation in a single virtualized computing node

Figure 2.1: SmarTmem architecture. The Hypervisor layer and the Tmem Kernel Module have been extended from state-of-the-art implementations, whereas the Memory Manager (MM) was created from scratch. The MM runs in Xen's privileged domain. (The figure depicts the MM in the privileged domain exchanging memory statistics and policy output with the TKM; the TKM communicates with the Xen HV through Tmem operations and hypercalls; and page faults generated in the VMs reach Tmem through each VM's kernel and TKM.)


Table 2.1: Memory statistics used in SmarTmem. The sampling interval is one second.

E_TMEM: Value used in the HV indicating that a put (or other Tmem op.) cannot succeed.
S_TMEM: Value used in the HV indicating that a put (or other Tmem op.) has succeeded.
node_info.free_tmem: Number of free pages available for Tmem.
node_info.total_tmem: Total number of pages available for Tmem. Constant throughout the lifetime of the node.
node_info.vm_count: Number of VMs registered.
vm_data_hyp[id].vm_id: Identifier of the VM within the HV.
vm_data_hyp[id].tmem_used: Number of pages of Tmem memory currently used by the VM.
vm_data_hyp[id].mm_target: Target number of pages allocated to the VM.
vm_data_hyp[id].puts_total: Total number of puts issued by the VM in the current sampling interval.
vm_data_hyp[id].puts_succ: Total number of successful puts issued by the VM in the current sampling interval.
memstats: Variable storing the last sampled statistics that the HV sent to the MM.
memstats.vm_count: Number of active VMs as seen by the MM.
memstats.vm[i].vm_id: Identifier of the VM within the MM.
memstats.vm[i].puts_total: Number of puts issued by a VM in the sampling interval.
memstats.vm[i].puts_succ: Number of puts of a VM that succeeded in the current sampling interval.
mm_out: Data structure that holds the output parameters of the MM policy.
mm_out[i].vm_id: VM identifier that maps a VM to its target allocation as calculated by the MM.
mm_out[i].mm_target: Memory allocation target as calculated by the policy in the MM.
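A rough C rendering of the structures in Table 2.1 follows; the field and type choices are ours and not the exact Xen definitions:

    #include <stdint.h>

    #define MAX_VMS 64  /* hypothetical bound on registered VMs */

    /* Per-VM accounting kept by the HV (vm_data_hyp[id] in Table 2.1). */
    struct vm_data_hyp {
        uint32_t vm_id;       /* identifier of the VM within the HV           */
        uint64_t tmem_used;   /* Tmem pages currently used by the VM          */
        uint64_t mm_target;   /* target allocation last set by the MM         */
        uint64_t puts_total;  /* puts issued in the current sampling interval */
        uint64_t puts_succ;   /* successful puts in the current interval      */
    };

    /* Snapshot sent to the MM once per one-second sampling interval
     * (memstats in Table 2.1). */
    struct memstats {
        uint32_t vm_count;
        struct vm_data_hyp vm[MAX_VMS];
    };

    /* Targets computed by the MM policy and returned via hypercall
     * (mm_out in Table 2.1). */
    struct mm_out_entry {
        uint32_t vm_id;
        uint64_t mm_target;
    };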

2.2.2 HV support for SmarTmem

This section describes the HV support for SmarTmem. A state-of-the-art HV (Xen, in this case) was extended to enable the functionality required by SmarTmem. The extensions to the HV have several roles: to generate the metrics used by the high-level policy, to enforce the target allocations, and to perform the fine-grained Tmem page allocation. This separation means that only a small amount of development, albeit specialized and intrusive, is needed at the HV level.

It is necessary for the HV to gather information about the Tmem capacity utilization of the VMs. This information is used by the MM to determine the dynamic reallocation of Tmem. Table 2.1 summarizes the data collected by the HV. When a VM maxes out its memory, any future access that attempts a swap to disk will generate a put operation to Tmem. The HV monitors the number of puts of each VM, as well as other Tmem operations. The HV also keeps track of whenever puts fail, and of the amount of memory used by the VMs. When a put fails, it means that there is no Tmem capacity to satisfy the request, or that the VM has exceeded the amount of Tmem it can take. Every time this happens, the VM that generated the put will swap the page to disk.

The HV gathers all of this memory utilization behavior and sends it to the TKM in the privileged domain via a virtual interrupt request (VIRQ). The HV collects this information in a set of data structures that were added to the section of the HV that provides Tmem support; they are passed to the user-space process through page-copy operations that copy the data from the corresponding memory page in the HV into a user-space page. The VIRQ is triggered by a timer in the HV that fires every time a fixed amount of time has passed; currently, this fixed amount is one second. The timer is only updated when a Tmem operation takes place (put, get, flush); otherwise no transmission of data from the HV to the user-space process occurs.

With this information, the MM calculates new target Tmem capacities for every VM and sends these targets back to the HV, again through the TKM, which generates a specific hypercall for this purpose. The data structures that store the memory utilization information in the HV are similar to those that store it in the user-space process, as are the data structures that store the memory capacity targets for each VM.

Algorithm 1 presents pseudo-code to illustrate the tasks carried out by the HV. When the target allocations reach the HV, it stores them and keeps them until the MM modifies them (line 3). Every time a VM attempts to get a Tmem page (through a put, line 4), the HV checks whether the current amount of Tmem used by the VM is less than its target (line 5). If so, the VM obtains a page from the Tmem pool: the HV allocates a Tmem page and copies the data contained in the VM's page into the Tmem page frame (line 10). SmarTmem only requires a single allocator, since all the Tmem is owned by the current node and no inter-node transfer of pages takes place. This is an important difference to keep in mind for the architectures for inter-node memory capacity sharing presented in Chapters 5 and 6. SmarTmem does not require any special hardware support and can be deployed on any commodity server that provides basic virtualization support, whereas the approaches presented in Chapters 5 and 6 do require specialized hardware support.

In case there is no free Tmem, or if the amount of Tmem used by the VM equals or exceeds its target, any put will fail (lines 5–8), forcing the VM to swap to disk. Every put issued by a VM will fail as long as the target is equal to or smaller than the current amount of Tmem in use. The HV can reclaim Tmem pages from a VM only very slowly, but pages can also be released when a VM explicitly flushes a page (lines 23–26). While the target remains smaller than the Tmem capacity in use, the VM will be unable to acquire more Tmem pages.

The values of vm_data_hyp[id].tmem_used and vm_data_hyp[id].puts_succ are incremented when a put succeeds (lines 10–13), and a page is allocated for the VM. The parameter vm_data_hyp[id].tmem_used is decremented when a VM releases pages (flush, lines 23–26), deallocating a Tmem page. The parameter vm_data_hyp[id].puts_total (line 15) is incremented when a put occurs, regardless of whether it succeeds or not.

A similar thing happens to vm_data_hyp[id].gets_total, which is incremented every time a get occurs. A get attempts to retrieve a page that was previously put into Tmem, but it can fail to do so, depending on how Tmem was configured. When using Tmem in its frontswap mode, a get cannot fail, since Tmem is treated as persistent storage; but when Tmem is used in its cleancache mode, pages previously put can be reclaimed by the HV for a different purpose without any notification. In this case, pages are ephemeral.

It is possible for a VM to use more Tmem than its target. This can happen because the targets are continuously modified, and the target for a VM might be reduced below the capacity it is using. This might occur because the VM is temporarily idle, or executing a phase of the application with reduced memory pressure with respect to previous ones.


Algorithm 1 Tmem allocation in the HV

1: function hypervisor_op(vm_data_hyp, id, op)
2:   tmem_used ← vm_data_hyp[id].tmem_used
3:   mm_target ← vm_data_hyp[id].mm_target
4:   if op == PUT then
5:     if tmem_used ≥ mm_target then
6:       return_value ← E_TMEM
7:     else if node_info.free_tmem == 0 then
8:       return_value ← E_TMEM
9:     else
10:      allocate_tmem_page(id)
11:      vm_data_hyp[id].tmem_used ← tmem_used + 1
12:      vm_data_hyp[id].puts_succ ← puts_succ + 1
13:      return_value ← S_TMEM
14:    end if
15:    vm_data_hyp[id].puts_total ← puts_total + 1
16:  else if op == GET then
17:    get_succ, page_got ← get_page_from_store()
18:    if get_succ == 1 then
19:      vm_data_hyp[id].gets_succ ← gets_succ + 1; return_value ← S_TMEM
20:    else if get_succ == 0 then return_value ← E_TMEM
21:    end if
22:    vm_data_hyp[id].gets_total ← gets_total + 1
23:  else if op == FLUSH then
24:    deallocate_tmem_page(id)
25:    vm_data_hyp[id].tmem_used ← tmem_used − 1
26:    return_value ← S_TMEM
27:  end if
28:  return return_value
29: end function

However, this VM will not be able to obtain additional pages until it releases enough pages to fall below its target, or until its target is increased.

2.2.3 Tmem Kernel Module (TKM)

The TKM used in SmarTmem is an extension of the baseline state-of-the-art module present in the kernel version used (3.19). The extended TKM functions as an interface between user-space processes and the Tmem implementation of the HV. The TKM provides support for the baseline Tmem interface through a series of hypercalls. It also provides additional support for the special VIRQs generated by the HV to initiate communication with user-space processes, which requires a series of custom-made hypercalls. The TKM forwards the memory statistics sent by the HV to a user-space process (the MM) through a socket interface. As mentioned before, the MM uses these statistics to calculate Tmem target allocations, and the TKM forwards this information from the MM back to the HV, for which a series of custom-made hypercalls were also developed.


2.2.4 Memory Manager Process for Tmem Allocation in a Virtualized Computing Node

The MM was created from scratch and holds the core of the functionality of SmarTmem. As mentioned in Section 2.2.2, the MM receives information from the HV regarding the way the VMs make use of their memory. The MM keeps track of this information across time, generating a history of how the VMs use Tmem, their Tmem operations and the overall system memory status. The MM uses this information to calculate a Tmem capacity target per VM according to custom-made high-level policies. The complexities associated with determining the Tmem allocations for every VM, and the storage of the VM utilization metrics, reside in the MM and not in the HV, for the following reasons:

• The HV has to be kept lightweight, and any modifications to the HV layer have to be localized in order not to affect its functionality in other respects.
• It maintains the flexibility of implementing high-level Tmem management policies. This allows for more ambitious data analysis strategies in the MM in order to implement more sophisticated policies, and allows the deployment of policies without requiring frequent HV recompilations.
• It allows policies to be deployed into production with ease, either after tuning policies already implemented or when introducing new ones.
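To make this division of labor concrete, here is a minimal sketch of the MM's main loop, assuming the TKM exposes the statistics through a socket-like descriptor; the message framing and helper names are ours:

    #define MAX_VMS 64  /* hypothetical bound, matching the earlier sketch */

    /* Abbreviated versions of the structures sketched after Table 2.1. */
    struct memstats     { unsigned vm_count; /* ...per-VM statistics... */ };
    struct mm_out_entry { unsigned vm_id; unsigned long mm_target; };

    /* Provided by the TKM glue code; names are ours, not the kernel's. */
    extern int  tkm_fd;                                     /* socket to TKM */
    extern int  recv_memstats(int fd, struct memstats *ms); /* blocks       */
    extern void run_policy(const struct memstats *ms, struct mm_out_entry *out);
    extern void send_targets(int fd, const struct mm_out_entry *out);

    /* One iteration per sampling interval: block until the HV pushes a
     * statistics snapshot through the TKM, compute new targets with the
     * active high-level policy, and send them back (send_targets is
     * expected to suppress transmissions when targets are unchanged).   */
    void mm_main_loop(void)
    {
        struct memstats ms;
        struct mm_out_entry out[MAX_VMS];

        while (recv_memstats(tkm_fd, &ms) == 0) {
            run_policy(&ms, out);
            send_targets(tkm_fd, out);
        }
    }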

2.2.5 High-Level Tmem Management Policies

Currently, we have implemented three policies in the MM besides the default greedy approach (greedy): 1) single-node static memory capacity allocation, 2) single-node reconfigurable static capacity allocation, and 3) single-node smart allocation policy. We now explain and analyze each of these policies.

Single-node Static Memory Capacity Allocation (sn-static-alloc)

This policy divides the available Tmem capacity equally across all Tmem-capable VMs, as described by Algorithm 2. This policy secures a fair share of Tmem for every VM when each VM has a similar demand for memory. The MM sends target allocations to the HV when it has calculated a new Tmem target. This feature is implemented within the send_to_hypervisor() function. If no changes are detected, then no transmission takes place, avoiding unnecessary communication overhead.

For the static allocation policy, the targets are only modified when a new VM is created and registers itself with Tmem, or when a VM is destroyed. After that, the targets remain the same as long as the number of active VMs does not change, regardless of the applications they are running. This policy is designed to avoid starvation of the available Tmem capacity, but it might allocate pages unnecessarily to a VM that does not need them.

Algorithm 2 Single-node Static Allocation Policy
1: function static_policy(memstats, node_info)
2:   num_vms ← memstats.vm_count
3:   ms_prev ← memstats.prev
4:   local_tmem ← node_info.total_tmem
5:   mm_target ← local_tmem/num_vms
6:   for i ← 1, num_vms do
7:     mm_out[i].vm_id ← memstats.vm[i].vm_id
8:     mm_out[i].mm_target ← mm_target
9:   end for
10:  send_to_hypervisor(mm_out)
11: end function

Single-node Reconfigurable Static Allocation (sn-reconf-static)

This policy divides the available Tmem capacity equally among the VMs that are actively using Tmem. That is, the policy monitors the activity of each Tmem-capable VM and allocates an equal share of the Tmem capacity to each VM that has performed at least one Tmem put, initially allocating no Tmem capacity to any VM. The pseudocode is given in Algorithm 3.

Algorithm 3 Single-node Reconfigurable Static Allocation Policy
1: function reconf_static(memstats, node_info)
2:   num_vms ← memstats.vm_count
3:   num_active_vms ← 0
4:   for i ← 1, num_vms do
5:     puts_failed ← memstats.vm[i].cumul_puts_failed
6:     if puts_failed > 0 then
7:       num_active_vms ← num_active_vms + 1
8:     end if
9:   end for
10:  local_tmem ← node_info.total_tmem
11:  for i ← 1, num_vms do
12:    mm_target ← local_tmem/num_active_vms
13:    mm_out[i].vm_id ← memstats.vm[i].vm_id
14:    mm_out[i].mm_target ← mm_target
15:  end for
16:  send_to_hypervisor(mm_out)
17: end function

If additional VMs start issuing puts, then the Tmem capacity allocation is reconfigured, and every VM that puts at least once will get an equal amount of the Tmem capacity. This remains so during the lifetime of the VMs. The main drawback of this approach is that it requires the VM to swap a number of times before getting any Tmem capacity, since its initial target allocation is zero: the latency between the time a VM performs its first Tmem operation and the time that the targets are reset is roughly one second. On the other hand, this policy prevents the HV from allocating memory to a VM unnecessarily, as might occur for static-alloc.

Single-node Smart Allocation (smart-alloc)

This policy assigns Tmem capacity to each VM depending on the memory demand measured by the HV. It monitors every VM and, when it detects that a VM cannot acquire more Tmem because it exceeded its target, the policy increases that target by a percentage P of the total local Tmem capacity, where P is constant. It is possible that all VMs have swapped during the last interval, each needing a target increase by a percentage P to avoid performance loss. Algorithm 4 shows the pseudo-code for smart-alloc.

Algorithm 4 uses the failed puts that have occurred in the last sampling interval as a measure of a VM's swap activity, instead of calculating the corresponding rate as in [34]. The sampling interval is fixed at one second. In this case, it is not necessary to establish communication with other nodes, nor to keep track of node IDs, and the total Tmem allocation of the node does not change (as opposed to [34]). Also, smart-alloc refrains from sending targets to the HV if they have not changed since the last modification, a condition evaluated by the function send_to_hypervisor in line 34.

If all the VMs have their allocation increased by a percentage P at every interval, eventually the Tmem pages could end up being over-allocated, i.e., the sum of the targets of every VM (sum_targets, lines 4 and 25) would be larger than the amount of local Tmem pages (line 27). If this happens, it is unlikely that the VMs will be able to meet their targets. To avoid this, we make sure that the following condition is met:

    \sum_{i=0}^{n-1} vm\_data_{MM}[i].mm\_target = local\_tmem \qquad (2.1)

When the pages are over-allocated, we reduce the target of every VM according to the following equation:

    target_{vm_i} = \frac{local\_tmem \times vm\_data_{MM}[i].mm\_target}{\sum_{j=0}^{n-1} vm\_data_{MM}[j].mm\_target} \qquad (2.2)

By enforcing Equations 2.1 and 2.2, the policy ensures that: 1) all local Tmem pages are always assigned to a VM, i.e., no Tmem pages remain unallocated, and 2) the total amount of local Tmem pages assigned to the VMs does not exceed the amount of Tmem pages available in the node. If these conditions are not enforced and Tmem over-allocation occurs, the VMs will compete for the excess pages, overriding the target allocations and becoming unable to meet their targets. Equation 2.2 is implemented in lines 27–33.

The policy decreases the target of a VM by a percentage P of its current target if it detects that the VM is using fewer pages than its target minus a threshold value (lines 16–21). This avoids premature target decrements, which might cause the targets to oscillate, resulting in an unstable policy.
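As an illustration, Equation 2.2 (lines 27–33 of Algorithm 4) can be implemented in the MM roughly as follows; the names follow the earlier sketches, and the integer arithmetic assumes page counts small enough that the product fits in 64 bits:

    #include <stdint.h>

    struct mm_out_entry { uint32_t vm_id; uint64_t mm_target; };

    /* Rescale per-VM targets so that their sum does not exceed the local
     * Tmem capacity, as in Equation 2.2. Integer division rounds down,
     * so the rescaled sum is at most local_tmem.                        */
    void rescale_targets(struct mm_out_entry *out, unsigned n,
                         uint64_t local_tmem)
    {
        uint64_t sum = 0;
        for (unsigned i = 0; i < n; i++)
            sum += out[i].mm_target;

        if (sum <= local_tmem || sum == 0)
            return;  /* no over-allocation: nothing to do */

        for (unsigned i = 0; i < n; i++)
            out[i].mm_target = (out[i].mm_target * local_tmem) / sum;
    }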

2.3 Benchmarking and Experimental Framework

We tested SmarTmem using nested virtualization. We used a VirtualBox image with Xen 4.5.1, with all VMs running Ubuntu 14.04 with kernel 3.19. The VirtualBox environment was executed with two processor cores, 6 GB of RAM, 2 GB of swap and a 32 GB hard drive. The physical system on which VirtualBox was executed had a 4-core Intel Core i7 processor running at 2.1 GHz with 8 GB of RAM, 4 GB of swap and a 320 GB hard drive. Even though this is not server-class hardware, it was readily available at the time of the experiments

and it proved to be enough to test the functionality and effectiveness of SmarTmem for the workloads and datasets used. The MM was implemented in C and runs in user space in the privileged domain.

We used the CloudSuite benchmarks [31] and a micro-benchmark called Usemem to evaluate SmarTmem. Usemem is a synthetic micro-benchmark that allocates an incremental amount of memory as it executes, starting from 128 MB and increasing in 128 MB increments. Once it allocates a region of memory, it traverses it linearly performing write/read operations. Once it completes a run through a region, it allocates a larger block, until it reaches 1 GB. At that point, Usemem stops increasing the allocation but continues to write/read the 1 GB of memory allocated until stopped. All of these benchmarks were chosen because they generate a significant amount of variable memory pressure on the VMs, which allows us to evaluate SmarTmem and the policies implemented.

To demonstrate the effectiveness of SmarTmem and the policies, we ran multiple VMs in different execution scenarios, described in Table 2.2. This table shows the scenario names, the VM parameters (CPU and RAM) and a description of the benchmark execution, together with the datasets used. Every scenario is executed five times with every policy. In every scenario, the amount of Tmem enabled was 1 GB, except for the Usemem Scenario, in which only 384 MB was enabled.

The scenarios created, the configurations of the VMs and the datasets were chosen in order to create a reasonable amount of memory pressure on the available hardware, usually within two or three times the initial memory capacity of the VMs [100]. The use of micro-benchmarks also provides important insight into the allocation capabilities of SmarTmem, because it provides us with a set of applications with regular access patterns and working set sizes that create a memory pressure profile that is possible to analyze and correlate with the allocation behavior of SmarTmem. In this way, it is possible to observe the effectiveness of SmarTmem in managing the available Tmem.
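For reference, a minimal program in the spirit of the Usemem description above would look as follows (this is our sketch, not the original micro-benchmark's code):

    #include <stdio.h>
    #include <stdlib.h>

    #define STEP  (128UL << 20)  /* grow in 128 MB increments */
    #define LIMIT (1UL << 30)    /* stop growing at 1 GB      */

    int main(void)
    {
        /* Runs until stopped externally, as in the scenarios of Table 2.2. */
        for (size_t size = STEP; ; size += STEP) {
            if (size > LIMIT)
                size = LIMIT;  /* keep touching 1 GB once reached */

            unsigned char *buf = malloc(size);
            if (buf == NULL) {
                perror("malloc");
                return 1;
            }

            /* Traverse the region linearly, writing then reading one byte
             * per 4 KB page, to build up steadily increasing memory pressure. */
            for (size_t i = 0; i < size; i += 4096)
                buf[i] = (unsigned char)i;
            unsigned long sum = 0;
            for (size_t i = 0; i < size; i += 4096)
                sum += buf[i];

            printf("touched %zu MB (checksum %lu)\n", size >> 20, sum);
            free(buf);
        }
    }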

2.4 Results and Discussion for SmarTmem

2.4.1 Results for Scenario 1

Figure 2.2 shows the average running times (less is better) and standard deviations for Scenario 1, with the different policies. We vary P for smart-alloc and compare the running times against the case without Tmem support (no-tmem).

Figure 2.3 shows the amount of Tmem capacity of each VM for (a) greedy and (b) smart-alloc with P = 0.75%. In Figure 2.3(a), VM3 reaches a Tmem capacity peak during the first run, while VM1 and VM2 cannot reach a fair share. During the second run, VM2 is unable to reach a fair share, while the other two VMs take a higher portion of Tmem. Figure 2.3(b) shows that smart-alloc maintains similar Tmem capacity across VMs with some adaptability, and it also shows the allocation target for VM3 and the way it is enforced.

In this context, fairness is defined qualitatively as a fair proportion of the available Tmem capacity with respect to the memory demand of the application running in the VM. In the cases analyzed so far, the memory demand of every VM exceeds the capacity of the node, which means that all the VMs must have an equal (or approximately equal) proportion of the available Tmem in order to reach fairness. The better the equal-proportion condition is met, the better fairness is achieved.


Table 2.2: List of scenarios used for benchmarking. In all cases, we deploy 3 VMs.

Scenario 1
  VM parameters: VM1, VM2, VM3: 1 GB RAM, 1 CPU
  Comments: All VMs execute in-memory-analytics once simultaneously, sleep for 5 seconds and execute it again. The dataset was taken from [43].

Scenario 2
  VM parameters: VM1, VM2, VM3: 512 MB RAM, 1 CPU
  Comments: All VMs execute graph-analytics once. The first two VMs launch the benchmark simultaneously, and the third one launches it 30 seconds later. They all use the same dataset, provided by [93], [92], [91].

Usemem Scenario
  VM parameters: VM1, VM2, VM3: 512 MB RAM, 1 CPU
  Comments: All VMs execute usemem. VM1 and VM2 start executing usemem simultaneously, and VM3 starts when VM1 and VM2 attempt to allocate 640 MB of memory. From this point on, all VMs run concurrently, and they are stopped simultaneously when VM3 attempts to allocate 768 MB.

Scenario 3
  VM parameters: VM1, VM2: 512 MB RAM, 1 CPU; VM3: 1 GB RAM, 1 CPU
  Comments: VM1 and VM2 execute graph-analytics and VM3 executes in-memory-analytics. VM1 and VM2 launch execution simultaneously, and VM3 launches it 30 seconds later. All VMs use the dataset from [93, 92, 91].

Figure 2.2: Running times for Scenario 1. SM refers to smart-alloc.

Figure 2.3: Utilization of the Tmem capacity (nod-tmem) by every VM, in number of pages, for Scenario 1: (a) greedy, (b) smart-alloc (P = 0.75%). The label target-VM3 refers to the target allocation of the third VM.

Some policies were not able to obtain performance benefits. reconf-static showed almost no improvement in most of the VMs. Something similar happens to static-alloc in the first run of VM2. The case of smart-alloc with P = 0.25% performed poorly in almost every case. The lack of improvement shown by some policies can be attributed to their lack of adaptability to changes in memory demand. For example, when using smart-alloc with P = 0.25%, the allocation targets increase at a slower pace, causing the VMs to swap more. This demonstrates the need for the policy to adapt quickly to the VMs' memory demand, for which it is necessary to tune the parameters of smart-alloc.

The best runtime performance (fastest) is obtained for smart-alloc (referred to as SM in Figure 2.2) with P = 0.75%. Its standard deviation barely overlaps with that of greedy, demonstrating its clear benefits. smart-alloc with P = 0.75% runs faster than no-tmem by a maximum of 35.7% (first run of VM3) and by a minimum of 28% (second run of VM2). It also runs faster than greedy by a maximum of 12.7%, corresponding to the first run of VM3, and by a minimum of 7.1%. These results demonstrate the need to manage Tmem with an adequate management policy.

2.4.2 Results for Scenario 2

The average running times for Scenario 2 are shown in Figure 2.4, and Figure 2.5 shows the use of Tmem capacity by the VMs. In this case, VM1 and VM2 take a lot of Tmem as they start executing, since their memory demand increases rapidly, putting significant pressure on the Tmem capacity. This is shown in Figures 2.5(a) and (b) for greedy and smart-alloc with P = 6%, respectively. In Figure 2.5(a), the third VM is unable to obtain a fair share of Tmem. But in Figure 2.5(b), when using smart-alloc, despite the fact that the first two VMs initially take up a large amount of Tmem capacity very quickly, the third VM is eventually able to obtain a fair amount. This shows that smart-alloc is at the same time adaptive and fair in how it allocates Tmem.

Figure 2.4: Running times for Scenario 2.

In this case, the best performance is obtained with smart-alloc with P = 6%, performing better than no-tmem by a minimum of 21% (for VM3) and a maximum of 28% (for VM1). It performs better than greedy by a minimum of 4.7% (for VM2) and a maximum of 9.6% (for VM3). This is in contrast to Scenario 1, for which the best result for smart-alloc was obtained with P = 0.75%. The static policies do not show any improvement in this scenario. These results demonstrate that fairness in Tmem allocation among VMs and quick adaptiveness to memory demand spikes are both necessary to improve performance. In this scenario, smart-alloc achieves both.


Figure 2.5: Tmem use of all VMs in Scenario 2 for (a) greedy and (b) smart-alloc with P = 6%.

2.4.3 Results for the Usemem Scenario

The average running times for the Usemem Scenario are shown in Figure 2.6. The usemem micro-benchmark is designed to generate a similar memory demand in each VM, giving insight into how the policies behave. The running times were approximately equal for all VMs when using static-alloc, for the same amount of memory allocated. Nevertheless, improvements were observed for reconf-static when allocating 640 MB for VM2, and consistently across all allocations for VM3. In this case, the static-alloc policy ensures fairness (not adaptiveness) in how Tmem is allocated, regardless of how much a VM is swapping, and reconf-static follows the same behavior depending on how many VMs are actually swapping (limited adaptiveness).


Figure 2.6: Running times for the Usemem Scenario.

Notice how static-alloc and reconf-static perform worse than greedy for VM1 and VM2, but better for the third VM across all memory allocations. This is because when only VM1 and VM2 are executing, the Tmem capacity is not under pressure, but the policies are still enforced, restricting access to Tmem unnecessarily. Remarkably, greedy performs significantly worse than no-tmem for the last allocation of usemem in VM3 (640 MB). However, all the other policies perform better than no-tmem for VM3.

Figure 2.7(a) shows that VM3 struggles to obtain Tmem pages when using greedy, as the Tmem capacity is under pressure. This is similar to the case presented in Figure 2.5(a) for Scenario 2. However, in Figure 2.7(b) and Figure 2.7(c), corresponding to reconf-static and smart-alloc with P = 2%, every VM is able to obtain a fair share of memory. smart-alloc allows VM1 and VM2 to take much more memory than reconf-static does, but still less than what greedy allows, as seen in Figures 2.7(a) and 2.7(c). In this case, smart-alloc exhibits more adaptiveness. reconf-static performs better when VM3 allocates 384 MB and 512 MB, as previously seen.

These results demonstrate that reconf-static and static-alloc, both of which are more oriented towards fairness, still present better performance than smart-alloc in these Usemem cases, despite not showing the best performance in Scenarios 1 and 2. When the third VM starts executing Usemem, smart-alloc achieves fair allocation at a slower pace. In Figure 2.7(c), the three VMs take more Tmem than the limit imposed by reconf-static, demonstrating the better adaptiveness of smart-alloc.

These results are interesting because they highlight a trade-off between the adaptiveness of a policy (how quickly it responds to changes in memory demand, as with smart-alloc) and its ability to allocate Tmem fairly (as with static-alloc and reconf-static). The more responsive a policy is, the more Tmem it will let a VM take; but as other VMs come under pressure, it will be harder to reach fairness.

2.4.4 Results for Scenario 3

The average running times for Scenario 3 are shown in Figure 2.8, and Figure 2.9 shows the Tmem capacity used by the three VMs for greedy, static-alloc, reconf-static and smart-alloc with P = 4%.

The graph-analytics benchmark starts by making use of a large amount of Tmem. For greedy, VM1 and VM2 each take up half of the available memory, leaving almost no memory available for when VM3 increases its demand. This helps explain why VM3 executes very slowly with greedy. Notice how static-alloc is very rigid (not adaptive), setting an upper bound for all three VMs. This seems to benefit VM3 very much and, surprisingly, VM1 and VM2 also improved.

In Figure 2.9(c), we see the case of reconf-static. This policy allows VM1 and VM2 to share half of the available capacity until VM3 starts swapping. By this time, the targets are reconfigured, but VM3 is still unable to reach a fair share, because pages are released by the VMs at a slower pace, despite the targets already being modified. Notice how smart-alloc allows VM1 and VM2 to take up a similar amount of Tmem compared to greedy, but reduces their share as soon as VM3 swaps; still, VM3 is unable to reach a fair share, allowing VM1 and VM2 to keep more Tmem. This helps explain the results observed in Figure 2.8, where VM1 and VM2 run faster with smart-alloc but slower with static-alloc, while the opposite holds for VM3. Again, this highlights the trade-off between adaptiveness and fairness.

All the policies improve over greedy consistently, except for reconf-static, which fails for the first two VMs. For VM1 and VM2, smart-alloc with P = 4% performs better than the other policies. For VM3, the best performance is obtained with static-alloc, by a very significant margin. The maximum and minimum improvements over no-tmem are 40% (in VM3 for static-alloc) and 22% (in VM for smart-alloc with P = 4%), respectively, while the maximum and minimum improvements over greedy are 35% (in VM3 for static-alloc) and 10.8% (in VM2 for smart-alloc with P = 4%), respectively.

2.5 Related Work

Many research efforts seek to optimize memory allocation by implementing better control strategies for memory ballooning [120, 100, 95, 125, 51, 15, 80, 63]. One similarity we have with Liu et al. [63] is that we also implement a user-space process to manage the allocation of memory capacity to every VM, although their communication mechanisms are slightly more complicated, since they need to communicate the allocation targets to every VM. In our case, the hypervisor is in charge of doing so, which significantly reduces the memory and communication overheads. Both Zhao et al. [125] and Liu et al. [63] predict the memory demands of the VMs: whereas Zhao et al. [125] predict the working set size of the VMs to dictate target memory allocations to the balloon driver, Liu et al. [63] predict the total memory footprint including swap. In contrast, rather than predicting memory requirements, we shift the core of the problem into directly reducing the amount of swapping to disk, thus keeping the data in system memory. In addition, we focus entirely on Tmem instead of memory ballooning.

Smith et al. [100] designed a system for memory resource management that includes an OS daemon implementing memory management policies, important modifications to the balloon driver so that the memory allocations dictated by the daemon are executed efficiently, and modifications to the paging mechanism of the hypervisor. SmarTmem uses an evaluation methodology similar to that of [100], in the types of benchmarks we use and in the amount of memory pressure we generate in the node to evaluate the effectiveness of our mechanism. However, our work differs from [100] because SmarTmem focuses on memory optimization through Tmem, which is a mechanism for hypervisor-level caching, and the policies in [100] do not focus on hypervisor-level caching.

Zhang et al. [120] also exploit the capabilities of ballooning, and they enhance some of its functionality in order to prevent paging by the VMs. They note the significant time delays needed to trigger the balloon driver and to reclaim idle memory before reallocation takes place, during which swap operations occur. To solve this, they introduce a host memory region shared across the VMs within the node, and they provide a mechanism to allocate and deallocate memory that takes or releases memory from this shared pool. In our work, we leverage the concept of a shared memory pool through the use of Tmem, which is designed to pool all idle and fallow memory in the computing node. However, we differ from [120] because we explicitly leverage the available Tmem capacity as a cache for pages that have been swapped out once a VM comes under memory pressure, effectively using Tmem as a hypervisor-level cache. Our approach also prevents swap operations to disk from taking place, while still leveraging the paging mechanism of the kernel, whereas the mechanism in [120] avoids it. Our approach is particularly relevant for Chapters 5 and 6, in which Tmem is used to access remote memory resources [36, 34] in cases where the remote memory is not available to the VMs as traditional DRAM.

Venkatesan et al. [110] use Tmem, in both cleancache and frontswap modes, in a computing node with non-volatile memory (NVM) alongside traditional DRAM. They use Tmem to access the NVM where the disk device would normally have been used. They divide the NVM space into a clean region (for clean cache pages) and a swap region (for processes that attempt to swap to disk). They implement a two-level cache for clean pages: the first-level cache is implemented in DRAM and the second-level cache is located in the NVM space, and they implement caching strategies to manage pages across these two levels. One difference between SmarTmem and Venkatesan et al. [110] is that we only make use of Tmem in its frontswap mode. The applications we run from CloudSuite are memory-intensive, and their processes dynamically allocate memory pages that are not backed by the filesystem. Thus, when data is evicted from memory, those pages have to be stored in frontswap.

2.6 Conclusions and Future Work

This chapter introduced a mechanism to efficiently manage memory resources through Tmem in a virtualized computing node executing multiple applications and VMs. Our results demonstrate the need to intelligently and efficiently allocate Tmem capacity to VMs in a

single node, showing significant performance improvements, of up to 35%, over the default greedy policy with which Tmem is currently managed. At the same time, we identified a trade-off between the policy's adaptiveness and its fairness. These results are also relevant for Tmem-based memory disaggregation solutions and for other systems in which Tmem is deployed [110, 34].

We demonstrated that the policies implemented in SmarTmem ensure fair capacity allocation in different scenarios, while being able to adapt to the changing memory demand of the VMs. The main contribution of this work is a framework (an architecture of the software stack) to optimize Tmem capacity; the management policies implemented so far were designed to demonstrate the performance improvements that can be obtained when optimizing Tmem.

Considering this, future work should develop policies that strike a better balance between adaptiveness and fairness, in order to better balance the performance of the applications inside the VMs. This implies adopting policies from other research works that implement some form of memory management, even for cases in which memory abstractions other than Tmem are used (ballooning, for example). However, porting policies in this way requires a thorough analysis of the dependency of those policies on the specific memory abstraction. Moreover, it is also of great interest to enhance SmarTmem to include the management of memory exposed to the VM as RAM. It will also be necessary to analyze its effectiveness with a larger set of cloud data center workloads, such as data caching frameworks and web search. In this context, SmarTmem should be extended to optimize other metrics such as energy costs and cost of ownership, to define the concept of fair allocation in a quantitative way, and to evaluate tail latency, which is particularly relevant for workloads such as web search.


Algorithm 4 Smart Allocation Policy

function smart-alloc-policy(memstats, node_info, P)
    local_tmem ← node_info.total_tmem
    num_vms ← memstats.vm_count
    sum_targets ← 0
    for i ← 1, num_vms do
        put_total ← memstats.vm[i].puts_total
        put_succ ← memstats.vm[i].puts_succ
        failed_puts ← put_total − put_succ
        if failed_puts > 0 then
            curr_tgt ← memstats.vm[i].mm_target
            incr ← (P × local_tmem) / 100
            mm_target ← curr_tgt + incr
        else
            curr_tgt ← memstats.vm[i].mm_target
            curr_use ← memstats.vm[i].tmem_used
            difference ← curr_tgt − curr_use
            if difference > threshold then
                mm_target ← ((100 − P) × curr_tgt) / 100
            else
                mm_target ← curr_tgt
            end if
        end if
        mm_out[i].vm_id ← memstats.vm[i].vm_id
        mm_out[i].mm_target ← mm_target
        sum_targets ← sum_targets + mm_target
    end for
    if sum_targets > local_tmem then
        factor ← local_tmem / sum_targets
        for i ← 1, num_vms do
            new ← factor × mm_out[i].mm_target
            mm_out[i].mm_target ← new
        end for
    end if
    send_to_hypervisor(mm_out)
end function
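For illustration, the following is a minimal Python sketch of Algorithm 4. The memstats and node_info structures, the THRESHOLD constant and the send_to_hypervisor() call are hypothetical stand-ins for the SmarTmem interfaces, not the actual implementation.

# Minimal Python sketch of Algorithm 4 (smart-alloc). The memstats/node_info
# dictionaries, THRESHOLD and send_to_hypervisor() are hypothetical stand-ins
# for the SmarTmem interfaces described in this chapter.
THRESHOLD = 1024  # pages; assumed value for illustration

def smart_alloc_policy(memstats, node_info, P):
    local_tmem = node_info['total_tmem']
    mm_out = []
    sum_targets = 0
    for vm in memstats['vms']:
        failed_puts = vm['puts_total'] - vm['puts_succ']
        curr_tgt = vm['mm_target']
        if failed_puts > 0:
            # VM is hitting its Tmem limit: grow its target by P% of capacity
            mm_target = curr_tgt + (P * local_tmem) // 100
        elif curr_tgt - vm['tmem_used'] > THRESHOLD:
            # VM is leaving Tmem unused: shrink its target by P%
            mm_target = ((100 - P) * curr_tgt) // 100
        else:
            mm_target = curr_tgt
        mm_out.append({'vm_id': vm['vm_id'], 'mm_target': mm_target})
        sum_targets += mm_target
    if sum_targets > local_tmem:
        # Rescale all targets so they fit in the available Tmem capacity
        factor = local_tmem / sum_targets
        for entry in mm_out:
            entry['mm_target'] = int(factor * entry['mm_target'])
    send_to_hypervisor(mm_out)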


Figure 2.7: Tmem use of all VMs in usemem for a) greedy, b) reconf-static and c) smart-alloc with P = 2%. The vertical lines in a) show the time when usemem completes execution.


Figure 2.8: Running times for Scenario 3.


Figure 2.9: Tmem use of all VMs in Scenario 3 for a) greedy, b) static-alloc, c) reconf-static and d) smart-alloc with P = 4%.

Chapter 3

Using continuous action Reinforcement Learning for Dynamic RAM allocation

The main focus of the previous chapter was to optimize memory through Tmem within a single node. This is relevant for systems in which Tmem is deployed, whether to aggregate memory across nodes, as we shall see in Chapters 5 and 6, or in systems with heterogeneous memory hierarchies [110]. But optimizing memory in a single node also requires optimizing the memory given to the VMs as RAM. The memory that becomes accessible to the VMs in this way has a different set of constraints, and is thus managed using a different set of mechanisms and variables. In these cases, it is very common to drive the management policy by running a time-consuming memory footprint analysis, or by wastefully overallocating memory and running fewer VMs. In many cases, this is done considering only the memory demand of each VM, without a node-wide view.

Ideally, a high-level control mechanism would dynamically find the best memory allocation for the VMs. There are many solutions to the dynamic memory allocation problem, some of which use machine learning in some form. This chapter introduces CAVMem (Continuous Action Algorithm for Virtualized Memory Management), a proof-of-concept mechanism for a decentralized dynamic memory allocation solution in virtualized nodes that applies a continuous action RL algorithm called Deep Deterministic Policy Gradient (DDPG).


3.1 Introducing CAVMem

In a virtualized node inside a cloud infrastructure, resource management is critical to drive performance. When multiple VMs are active in a node, its physical memory is allocated among the VMs in order to optimize throughput and prevent memory starvation. This memory allocation problem is difficult to solve because the memory demand of a VM changes continuously. Many solutions for this problem are limited by the complex relationship between the memory allocated to a VM, the applications' behavior and their performance, for which RL provides a good alternative.

Several RL approaches have been applied to the resource management problem in virtualized nodes [87], relying on the discretization of states and actions. Discretization introduces some limitations, mainly combinatorial explosion: as the number of states/actions grows, the problem becomes unsolvable [60]. In the context of memory allocation, discretization restricts the granularity at which memory can be allocated, limiting the opportunities for better optimization. Another limitation of current RL approaches for resource management is that they deploy a single agent responsible for re-allocating memory. Such a centralized approach has scalability and flexibility restrictions, since it introduces a traffic bottleneck and a single point of failure.

CAVMem, presented in this chapter, provides a distributed continuous action RL formulation, avoiding the limitations of discretization and centralization. To the best of our knowledge, this is the first such formulation of the memory allocation problem. CAVMem is initially designed with DDPG, but other RL algorithms were also implemented, namely Q-Learning [114] (QL) and Deep Q-Learning [79] (DQL). These are compared in a model environment that simulates certain aspects of a virtualized computing node. Application traces taken from SPEC benchmarks were used to model the memory demand of applications, assuming a uniform distribution of memory accesses across the memory capacity used by the application model. In summary, the contributions of this chapter are:

1. Formulation of the memory management problem as a distributed continuous action MDP. This formulation supports an unlimited and variable number of VMs.

2. Development of a continuous action off-policy model-free reinforcement learning algorithm for dynamic memory allocation.

3. Comparison between three reinforcement learning approaches: a) CAVMem with DDPG (continuous action space), b) CAVMem with QL (tabular, non-continuous action) and c) CAVMem with DQL (non-continuous action). We also compare against the static policy that divides memory equally.

In order to obtain linear scaling and be able to dynamically add and remove VMs, CAVMem has one agent per VM connected via a lightweight coordination mechanism. The agents learn how much memory to bid for or return, in a given state, so that each VM obtains a fair level of performance subject to the available memory resources. Our results in this chapter show that CAVMem with DDPG performs better than QL and a static allocation policy, and is competitive with DQL. However, CAVMem incurs significantly lower training overheads than DQL, making the continuous action approach a more cost-effective solution.


3.2 The Context for CAVMem: Memory Management and Reinforcement Learning

3.2.1 Memory Management in Virtualized Nodes

The dependence of cloud services on virtualization technology and the way virtualization is used to share a node's resources among VMs were explained in Chapter 1. For the particular case of memory, when a VM is created, it is allocated a portion of the node's memory. If the applications executing inside the VM increase their aggregated memory demand, it is very possible that this changing and increasing memory demand exceeds the VM's initial memory allocation. In this case, the VM may swap data to its (virtual) disk device(s), suffering a significant performance loss. Thus, it is necessary to rebalance the memory allocation and optimize overall system performance, minimizing the overall amount of swap operations that take place. This can be achieved by observing application performance in each VM and re-allocating memory for them to reach a performance objective. Analytical solutions to this problem require analyzing many VM-application sets and combinations. The number of possibilities is huge, making it intractable to obtain solutions in this way. Instead, heuristics are designed to exploit optimal allocations for known sets. Heuristics are also limited, since they sometimes do not generalize well to all possible combinations.

3.2.2 Reinforcement Learning Concepts

Reinforcement learning problems are those in which an agent acts in an environment (usually stochastic) by choosing one or more actions in a sequence of time steps. The objective of the agent is to maximize the reward it accumulates over time. These problems can be modeled as an MDP, which has the following elements:

• A state space, denoted by the set S, containing the possible states in which the environment/agent can be.

• An action space, denoted by the set A, containing the possible actions that the agent can take.

• A transition probability function, denoted by P : S × A × S → R. The number P_{i→i+1} ∈ [0, 1] represents the probability of the agent transitioning from a state s_i, by taking an action a_i in the current timestep i, to a state s_{i+1} in the following timestep (i + 1). The probability function is totally dependent on the environment and satisfies the Markov property [115].

• A reward function, denoted by R : S × A × S → R. This function offers an immediate evaluation of how good or bad action a_i is at timestep i. The source of the reward function, like that of the probability function, is the environment the agent is in.

The objective of the agent is to learn an optimal policy π : S → A that maximizes the discounted reward r_t^γ = Σ_{k=t}^{∞} γ^{k−t} r(s_k, a_k), starting from timestep t with discount factor γ. There are many different ways for the agent to learn the optimal policy to

take the right actions (value-based or policy-based) [109]. One of the most widely used value-based approaches relies on maximizing the (expected) state-action value function Q^π(s, a), expressed by the Bellman equation [109, 25]:

Q^π(s_t, a_t) = E[r(s_t, a_t) + γ Q^π(s_{t+1}, π(s_{t+1}))]    (3.1)

Equation 3.1 expresses the expected reward starting from a given state s_t, taking an action a_t, and then continuing to accumulate rewards by choosing actions according to the policy π from that time step onward. The idea is to estimate the state-action value function (or Q-function for short) through Equation 3.1 as an iterative update, as shown in Equation 3.2, which drives the sequential decision-making process.

Q_{i+1}(s_t, a_t) = E[r(s_t, a_t) + γ Q_i(s_{t+1}, π(s_{t+1}))]    (3.2)

In [78], the Q-function is estimated through a parametrized neural network (NN) function approximator, because the original approach [114] of updating the values of the Q-function iteratively is impractical: in many cases it takes too long for the Q-function to converge to the optimal values. The Q-function can then be represented as Q(s, a; θ) ≈ Q*(s, a), where Q*(s, a) is the optimal Q-function. By approximating the Q-function through a neural network, it is possible to train the network by minimising a sequence of loss functions given by [60, 78, 79]:

L_i(θ_i) = E_{s,a∼ρ(·)}[(y_i − Q(s_t, a_t; θ_i))^2]    (3.3)

in which s_t and a_t denote the state and action at the current timestep t, ρ(s, a) is a probability distribution over sequences of states s and actions a, and y_i is given by:

y_i = E[r(s_t, a_t) + γ Q^π(s_{t+1}, π(s_{t+1}); θ_{i−1})]    (3.4)

The approach of using function approximators also presented some limitations, and was generally avoided because of the lack of theoretical performance guarantees and because of instability issues [60]. In order to overcome these limitations, the Q-learning algorithm in [79] was extended with two additional features: 1) an experience replay buffer, and 2) an iterative update that adjusts the values of the Q-function toward target values that are updated only periodically, i.e., a separate target network for y_i. This is reflected in Equations 3.3 and 3.4, where the NN function approximator is trained (updated) every B timesteps. This value-based method of learning has been successfully used for problems in which the action space of the agent is discretized [79, 60]. Given the optimal values of the Q-function, it is then possible to have an agent that chooses its actions (i.e., follows a policy π) greedily with respect to the value function. Such a policy π would be represented by:

π_Q = arg max_{a∈A} Q(s, a)    (3.5)

But in continuous action spaces, Equation 3.5 cannot be evaluated in a straightforward way, and the whole value-based approach requires additional modifications. This consideration is particularly important for our RAM allocation problem, because the allocation of memory pages in computing nodes using virtualization software needs to be done continuously.
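For reference, the discrete-action learning rule behind Equations 3.2 and 3.5 (the basis of the QL baseline evaluated later in this chapter) can be sketched in a few lines of Python; the table dimensions, alpha and gamma below are illustrative assumptions, not the thesis parameters.

import numpy as np

# Minimal tabular Q-learning update (an iterative approximation of Eq. 3.2).
# num_states, num_actions, alpha and gamma are illustrative assumptions.
num_states, num_actions = 100, 3
alpha, gamma = 0.1, 0.99
Q = np.zeros((num_states, num_actions))

def q_update(s, a, r, s_next):
    # Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a')
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])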


Figure 3.1: Diagram of the actor–critic architecture for the DDPG.

Deep Reinforcement Learning for Continuous Action spaces

Some solutions for continuous action spaces that use RL algorithms rely on some form of discretization of the action space [87, 118], an approach that can sometimes be justified. In general, however, discretization greatly reduces the applicability of these algorithms [56, 60]. In the current work, we do not discretize the action space; instead, we allow memory to be allocated continuously, at page granularity.

In our case, we implemented Deep Reinforcement Learning (DRL) algorithms for continuous action spaces. Among the most popular algorithms for this type of problem are the policy gradient algorithms. Concretely, we implemented the Deep Deterministic Policy Gradient (DDPG) [60] algorithm, which combines an actor–critic architecture [54, 53] with the Deterministic Policy Gradient (DPG) [98] algorithm. Figure 3.1 shows the architecture of a single DDPG learning agent.

The DPG algorithm [98] makes use of a parametrized actor function µ(s; ϑ_µ), which specifies the current policy by deterministically mapping states to specific actions to drive the decision-making process, instead of using Equation 3.5. A performance objective is defined as in the discrete action space for Q-learning, in this case denoted by [103]:

J(µ_ϑ) = Q(s_t, µ_ϑ)    (3.6)

The main idea is then to adjust the parameters ϑ of the deterministic policy µ_ϑ in the


direction of the performance gradient ∇_ϑ J(µ_ϑ), expressed as [103, 60, 98]:

ϑ^{k+1} = ϑ^k + α ∇_ϑ J(µ_ϑ)    (3.7)

where α is a real number such that α ∈ [0, 1]. In this approach, the state-action value function Q(s, a) is also approximated through a neural network function approximator Q(s, a; θ), in a similar way to [79], represented by the critic network in the actor–critic architecture (Figure 3.1). However, the critic network is updated in a different way, by making use of "soft" target updates [60] rather than a straightforward copy of the weights. It is important to note that each target network has an architecture identical to its corresponding original network. Thus, the update of the target networks is performed as:

ϑ_target = τ ϑ_current + (1 − τ) ϑ_target    (3.8)

θ_target = τ θ_current + (1 − τ) θ_target    (3.9)

where θ and ϑ are the parameters of the critic (Q-function) and the actor (policy) networks, respectively. To the best of our knowledge, there is no published work that uses RL algorithms for continuous action spaces to solve memory allocation problems in this way.
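Equations 3.8 and 3.9 amount to an exponential moving average of the learned weights. A minimal sketch, assuming the network weights are held as lists of NumPy arrays:

# Soft target update (Eqs. 3.8 and 3.9): the target network slowly tracks the
# learned network. tau is typically small (e.g., 0.001 in the DDPG paper).
def soft_update(target_weights, current_weights, tau=0.001):
    for i in range(len(target_weights)):
        target_weights[i] = tau * current_weights[i] + (1.0 - tau) * target_weights[i]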

3.2.3 Employing DDPG for Memory Capacity Management in Virtualized Environments

Designing a high-level memory management policy that is both optimal and responsive is challenging. Choosing general heuristics or analytical models to design a policy for a node that runs a very complicated software stack, with different VMs, users, applications, datasets and performance metrics, requires analyzing a huge amount of data. By using an RL algorithm, we expect that an optimal policy for memory management can be obtained (or approximated) by the learning agent(s).

3.3 CAVMem: Algorithm for Virtualized Memory Management

In this section, we introduce CAVMem, a mechanism that uses continuous action RL for memory management in virtualized nodes. We formulate the problem as a Markov Decision Process. The MDP is formulated such that it does not impose limits on the number of VMs the system can manage. It creates an RL agent per VM, which has both a local view of the VM's state and a node-wide view of the node's state, indirectly passing to each VM information about the behavior of the other VMs. The reward is set to optimize system-wide performance while ensuring fairness in memory usage among the VMs. In the following subsections, we detail how the MDP is formulated, its advantages, and the learning mechanism.


3.3.1 Decentralized Strategy for Memory Management

Determining a priori the memory demands of VMs in virtualized servers is complicated, because the nodes seldom have prior knowledge of the applications on each VM or of the number of concurrently active VMs. Memory demand can be estimated by analyzing resource utilization in the node over time, but it is difficult to rely on these estimations because:

1. Traffic patterns across data centers imply that VMs are added and removed continuously, making it difficult to anticipate the number of active VMs and the respective memory signatures of their applications;

2. Applications have different execution phases at runtime and obtaining precise application signatures is difficult;

3. The combinatorial complexity of the aggregated signatures of different (or the same) applications complicates the matter even more.

Instead, another approach is to let the VMs ask for the memory they need. In this case, each VM monitors its own resource utilization and bids for memory independently, resulting in a decentralized solution.

Pros and Cons of Decentralized vs. Centralized

CAVMem introduces a decentralized approach to dynamic memory allocation, with a learning agent for each VM on the node, allowing each VM to have its own individual state space. The state space for each VM includes variables that determine memory availability and others that summarize the state of the node itself (see Table 3.1). The singular advantage that the decentralized approach provides over a centralized approach is scalability, which expresses itself in multiple ways. First, CAVMem is designed to adjust itself to the dynamic activation and deactivation of VMs, since it enforces no constraints on the number of VMs that can be present in the node. In most centralized approaches, the state space is defined in such a way that the addition of VMs becomes problematic beyond a certain number, although in reality this depends on how the MDP model is defined. Second, CAVMem removes the single point of control, which allows it to scale even beyond a single computing node.

3.3.2 Formulating the problem as an MDP

Defining the State space, S

Table 3.1 lists the five state variables for each agent, two of which belong to the VM while the other three are common to all VMs on the node. This state space definition gives the agent some information about the other active VMs through the node-related variables.


State information from computing node and VMs

Scope | Name | Range | Description
perVM | msr_t^{VMj} | [-1, 1] | RAM miss rate of VMj, rescaled to the range [-1, 1]; i.e., a value of -1 means zero miss rate.
perVM | M_t^{VMj} | [0, 1] | Fraction of total RAM allocated to VMj.
Node | avgMsr_t^{node} | [-1, 1] | Average RAM miss rate across all VMs, equal to (1/N) Σ_j msr_t^{VMj}.
Node | msrTgt_t^{node} | [-1, 1] | Target RAM miss rate for all VMs, given by Equation 3.11.
Node | totalMemUse_t^{node} | [0, 1] | Fraction of the node's physical RAM that is allocated to VMs, equal to Σ_j M_t^{VMj}.

Table 3.1: State inputs to the actor and critic networks of the DDPG learning agents.

Defining the Action space, A

The action a_t^{VMj} chosen for VMj in timestep t is referred to as the VM's memory bid, which ranges from -1.0 to 1.0. A positive bid is a request for more memory and a negative bid is an action to release memory. Concretely, the VM agent requests a total memory allocation equal to (1 + a_t^{VMj}) × M_t^{VMj}. As discussed later, it may not be possible to fully satisfy the request, for instance when there is insufficient memory for all requests or when the memory allocation for the VM would become too small.

Defining the Reward function, R

The reward function encourages a fair level of performance across the VMs by encouraging all VMs to suffer the same level of swapping over the time step. This is done through a penalty (negative reward) proportional to the absolute difference between the RAM miss rate of the VM, given by msr_{t+1}^{VMj}, and the miss rate target, msrTgt_{t+1}^{node}:

r_t^{VMj} = −|msr_t^{VMj} − msrTgt_t^{node}| × k    (3.10)

The parameter k, which we set to 1, affects the learning rate. The miss rate target for the next timestep is given by:

msrTgt_{t+1}^{node} = avgMsr_t^{node} × √(totalMemUse_t^{node})    (3.11)

Here, avgMsr^{node} is the average RAM miss rate across all active VMs, defined in Table 3.1. When it is non-zero and there is free memory, the factor totalMemUse^{node} encourages the VMs to use more memory, via a well-known square-root rule of thumb.
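A minimal Python sketch of Equations 3.10 and 3.11, assuming the state variables are available as floats:

import math

# Reward for VMj (Eq. 3.10) and the next miss rate target (Eq. 3.11).
# All inputs are the (rescaled) state variables of Table 3.1; k = 1 as in the text.
def reward(msr_vm, msr_tgt_node, k=1.0):
    return -abs(msr_vm - msr_tgt_node) * k

def next_miss_rate_target(avg_msr_node, total_mem_use_node):
    # The square-root factor encourages VMs to claim free memory when any exists
    return avg_msr_node * math.sqrt(total_mem_use_node)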

Environment Constraints for Decentralized Control

Figure 3.2 shows a high-level view of our system, which includes the VMs, the learning agents of each VM and an additional module called the Bid Analyzer (BA).



Figure 3.2: Diagram of the DDPG learning agents, one per VM.

The BA performs a series of high-level control tasks. First, it prevents VMs from bidding less than the minimum amount they need to function (set manually to 384 MB), and prevents them from bidding more than the memory actually available in the node. Additionally, it performs a bid arbitration process to prevent aggregate overallocation of memory. The BA also prioritizes bids from VMs that issue negative bids.
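The following sketch illustrates the kind of checks the BA performs; the data layout and the exact arbitration order are illustrative assumptions, not the thesis implementation.

MIN_ALLOC_MB = 384  # minimum working allocation per VM, as set in the text

def analyze_bids(bids, allocs_mb, free_mb):
    """bids: per-VM actions in [-1, 1]; allocs_mb: current allocations in MB."""
    requests = {}
    for vm, bid in bids.items():
        # A bid of a requests a total allocation of (1 + a) * current allocation
        req = (1.0 + bid) * allocs_mb[vm]
        requests[vm] = max(req, MIN_ALLOC_MB)  # floor: VM must stay functional
    # Negative bids (releases) are honoured first, freeing memory for the rest
    for vm in sorted(requests, key=lambda v: requests[v] - allocs_mb[v]):
        delta = requests[vm] - allocs_mb[vm]
        if delta > free_mb:  # cap positive bids to the memory actually available
            requests[vm] = allocs_mb[vm] + free_mb
            delta = free_mb
        free_mb -= delta
    return requests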

Action Space Exploration

A classic problem in RL is the exploration–exploitation dilemma [102], which captures the need to exploit the "optimal" solution obtained so far while also exploring potentially better actions. To explore in continuous action domains, the common approach is to use an Ornstein–Uhlenbeck process [11, 58]. This process is a noise signal that creates a Brownian motion around the deterministic action generated by the DDPG agent. At every timestep, a noise signal is added to the action with probability ε. In our implementation, there is initially a phase of aggressive exploration in which ε = 1. This probability is reduced over time until it reaches ε = 0.001. This process is called epsilon annealing, and it is used to transition from an explorative policy to an exploitative one [102].
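A minimal sketch of Ornstein–Uhlenbeck exploration with epsilon annealing; the theta/sigma values and the annealing schedule below are illustrative assumptions (theta = 0.15 and sigma = 0.2 are values commonly used with DDPG), not the thesis parameters.

import numpy as np

class OUNoise:
    """Mean-reverting noise process around the deterministic action."""
    def __init__(self, theta=0.15, sigma=0.2, mu=0.0):
        self.theta, self.sigma, self.mu = theta, sigma, mu
        self.x = 0.0

    def sample(self):
        # Random walk pulled back toward mu (a discretized OU process)
        self.x += self.theta * (self.mu - self.x) + self.sigma * np.random.randn()
        return self.x

noise, epsilon = OUNoise(), 1.0
def explore(action):
    global epsilon
    if np.random.rand() < epsilon:  # add noise with probability epsilon
        action = float(np.clip(action + noise.sample(), -1.0, 1.0))
    epsilon = max(0.001, epsilon * 0.9999)  # anneal epsilon from 1 toward 0.001
    return action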

3.4 Experimental Framework

CAVMem was implemented in Python 3.6 with Tensorflow 1.12 [3] on a system with a 2.3 GHz Intel Core i7 processor and 16 GB of RAM. CAVMem is designed with an underlying DDPG implementation, but we also tested DQL and QL implementations for comparison, as well as a static allocation policy that divides memory equally among VMs. Whenever CAVMem is

mentioned, it is assumed that the agents are DDPG implementations (continuous action). The neural networks for DDPG are fully-connected with two hidden layers of 64 units with ReLU activation functions, while the output layers use Tanh. The learning rates for the actor and critic networks are 0.0001 and 0.001, respectively, using the Adam optimizer. The exploration phase lasts for 200 episodes, where each episode has 100 steps. All experiments run for 1000 episodes. DQL has similar parameters, but it only has one learning rate of 0.001 for the Q-function approximator, with 3 discrete actions. The QL implementation also uses 3 actions, with each state space variable quantized into 10 discrete values, resulting in a state-action matrix of 300,000 entries.

The learning agents are deployed in an environment that simulates a virtualized node, including models of VMs and some aspects of system memory (capacity, RAM misses, utilization). This evaluation methodology allows us to assess the effectiveness of the MDP formulation without the intervention of specific hardware. We evaluate CAVMem in runtime scenarios consisting of different combinations of SPEC CPU 2006 benchmarks [44] and benchmarks from CloudSuite [31], summarized in Table 3.2. Every scenario that uses SPEC benchmarks runs two VMs with 6 GB of RAM, and each VM starts with 1 GB of RAM. We use SPEC and CloudSuite benchmarks in order to expose CAVMem to a wider set of workloads with increased variability in their memory demand profiles, allowing us to evaluate the effectiveness of CAVMem in more diverse scenarios.

In every scenario, three metrics are evaluated: 1) the average miss rate (pages/second), 2) the average miss rate deviation, and 3) the overhead (time spent, in seconds) associated with the training of each agent. Every metric is measured over the last 100 episodes of each experiment. It is desirable for the agent to minimize the average miss rate, since this avoids disk accesses on a real system. The simulated environment does not model execution time, but it does model miss rates on RAM, which on a real system translate into swap operations. In current systems, an increase in swap operations implies a degradation in the use of memory resources [42], which has an adverse impact on the execution time of the running applications. Since the agents are constrained to reduce the memory misses/swap operations, this implies an improvement in the execution time of the applications [120]. The miss rate deviation serves to compare how well the agent learns the desired behavior according to the reward function, and the overhead allows us to estimate the cost of deploying each agent.
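Under the hyperparameters just listed, the actor and critic networks could be built roughly as follows. This is a tf.keras sketch for illustration, not the thesis code; the original TF 1.12 graph-mode implementation differs, and the linear critic output is an assumption (Q-values are unbounded, so the Tanh output mentioned in the text is applied to the actor).

import tensorflow as tf

def build_actor(state_dim, action_dim=1):
    # Two fully-connected hidden layers of 64 ReLU units; Tanh bounds the bid to [-1, 1]
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(state_dim,)),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(action_dim, activation='tanh'),
    ])

def build_critic(state_dim, action_dim=1):
    # The critic scores (state, action) pairs; the inputs are concatenated
    state_in = tf.keras.Input(shape=(state_dim,))
    action_in = tf.keras.Input(shape=(action_dim,))
    x = tf.keras.layers.Concatenate()([state_in, action_in])
    x = tf.keras.layers.Dense(64, activation='relu')(x)
    x = tf.keras.layers.Dense(64, activation='relu')(x)
    q_value = tf.keras.layers.Dense(1)(x)
    return tf.keras.Model([state_in, action_in], q_value)

actor_optimizer = tf.keras.optimizers.Adam(1e-4)   # actor learning rate from the text
critic_optimizer = tf.keras.optimizers.Adam(1e-3)  # critic learning rate from the text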

3.5 Results for Evaluation

3.5.1 Results for Scenario 1

Figure 3.3 shows the average miss rates for Scenario 1. CAVMem with DDPG has a miss rate 10% larger than static, while DQL and QL are higher by 14.7% and 13.9%, respectively. Since the VMs are executing the same benchmark, an equal memory allocation is optimal, so static is the best in this case. Nevertheless, CAVMem performs best among the learning agents.

Figure 3.4 shows the average miss rate deviations for Scenario 1. CAVMem tracks the target better than all other approaches, with a miss rate deviation 82.2% smaller than static, and 66.1% and 75.4% smaller than DQL and QL, respectively.


Label | Benchmarks | Description
Scenario 1 | VM1, VM2: perlbench | Both VMs run perlbench repeatedly.
Scenario 2 | VM1, VM2: gcc | Both VMs run gcc repeatedly.
Scenario 3 | VM1: perlbench; VM2: gcc | VM1 runs perlbench repeatedly, VM2 runs gcc.
Scenario 4 | VM1: gcc; VM2: bzip2 | VM1 runs gcc repeatedly, VM2 runs bzip2.
Scenario 5 | VM1: soplex; VM2: perlbench | VM1 runs soplex repeatedly while VM2 runs perlbench.
Scenario 6 | VM1: web-search client; VM2: in-memory analytics; VM3: web-search server | VM1 runs CloudSuite's web-search client, VM2 runs CloudSuite's in-memory analytics and VM3 runs the web-search server. All VMs run the benchmark trace repeatedly.
Scenario 7 | VM1: web-search client; VM2: graph analytics; VM3: web-search server | VM1 runs CloudSuite's web-search client, VM2 runs CloudSuite's graph analytics and VM3 runs the web-search server. All VMs run the benchmark trace repeatedly.
Scenario 8 | VM1: data-caching client; VM2: graph analytics; VM3: data-caching server | VM1 runs CloudSuite's data-caching client, VM2 runs CloudSuite's graph analytics and VM3 runs the data-caching server. All VMs run the benchmark trace repeatedly.
Scenario 9 | VM1: graph analytics; VM2: in-memory analytics | VM1 runs CloudSuite's graph analytics and VM2 runs in-memory analytics. Both VMs run the benchmark trace repeatedly.
Scenario 10 | VM1, VM2: in-memory analytics | VM1 and VM2 run in-memory analytics. Both VMs run the benchmark trace repeatedly.
Scenario 11 | VM1, VM2: graph analytics | VM1 and VM2 run graph analytics. Both VMs run the benchmark trace repeatedly.

Table 3.2: List of scenarios used for evaluation.

Figure 3.3: Average Miss Rate for Scenario 1. Less is better.

Figure 3.5 shows the overheads for Scenario 1. DQL presents an overhead 10.64 and 15.0 times larger than CAVMem and QL, respectively, a major consideration when deploying DQL. QL presents the smallest overhead, while CAVMem's is 37% larger than QL's.


Figure 3.4: Average Miss Rate Deviation for Scenario 1. Less is better.

Figure 3.5: Overhead (in seconds) for each agent in Scenario 1. Less is better.

3.5.2 Results for Scenario 2

Figure 3.6: Average Miss Rate for Scenario 2

Figure 3.6 shows the average miss rate for each agent in Scenario 2. The results here are similar to Scenario 1. CAVMem has a miss rate 12.55% larger than static, while DQL and QL are higher by 17.8% and 82.04%, respectively. CAVMem performs the best among the learning agents, but static still outperforms them all. Figure 3.7 shows the miss rate deviation for Scenario 2. QL performs the worst, while CAVMem performs slightly better than the rest, though not by much. Figure 3.8 shows the overhead of the agents for Scenario 2. The results are very similar to those of Scenario 1.


Figure 3.7: Average Miss Rate Deviation for Scenario 2. Less is better

Figure 3.8: Average Overhead (in seconds) for each agent in Scenario 2. Less is better.

3.5.3 Results for Scenario 3

Figure 3.9: Average Miss Rate for Scenario 3. Less and equal is better.

Figure 3.9 shows the average miss rates for Scenario 3. QL and static fail to balance the miss rates, which is an undesirable result, since they both benefit one VM while harming the other. CAVMem and DQL keep the miss rates better balanced, at the cost of increasing the miss rate of both VMs. CAVMem's miss rate is 6.41% below DQL's for VM1 and 5.81% above DQL's for VM2, so CAVMem and DQL are competitive in performance.


Figure 3.10: Average Miss Rate Deviation for Scenario 3. Less is better.

Figure 3.10 shows the miss rate deviations for Scenario 3. Here, static is further away from the target, while CAVMem deviates by 67.6% more than DQL, even though they both minimize the miss rate to similar values.

Figure 3.11: Overhead (in seconds) for each agent in Scenario 3. Less is better.

Figure 3.11 shows the overhead of the agents for Scenario 3. DQL maintains an overhead 5.44 and 9.96 times larger than CAVMem and QL, respectively. CAVMem's overhead is 70% larger than QL's, but its performance is similar to DQL's. Thus far, CAVMem provides the best performance-overhead trade-off.

3.5.4 Results for Scenario 4

Figure 3.12: Average Miss Rate for Scenario 4. Less and equal is better.


Figure 3.12 shows the average miss rates for Scenario 4. QL and static present imbalanced miss rates for the VMs, while CAVMem and DQL yield similar values, with CAVMem's miss rate being 5.96% larger than DQL's.

Figure 3.13: Average Miss Rate Deviation for Scenario 4. Less is better.

Figure 3.13 shows the miss rate deviations for Scenario 4. DQL tracks the miss rate targets best, being 65.92% more accurate than CAVMem, while QL and static fail to do so.

Figure 3.14: Overhead (in seconds) for each agent in Scenario 4. Less is better.

Figure 3.14 shows the overhead of the agents for Scenario 4. The experiments show that DQL's overhead is 10.9 and 14.2 times larger than CAVMem's and QL's, respectively. CAVMem's overhead is larger than QL's by 28%, maintaining the cost-effectiveness of CAVMem.

3.5.5 Results for Scenario 5

Figure 3.15 shows the average miss rate achieved by each agent in Scenario 5. These results show a behavior similar to Scenarios 3 and 4: static maintains unbalanced miss rates among the VMs, while CAVMem and DQL have similar performance, with DQL being better by a small margin (CAVMem achieves a 3.62% higher miss rate). Figure 3.16 shows the miss rate deviation for Scenario 5. The behavior is very similar to Scenarios 3 and 4, with DQL still managing to track the miss rate target better than CAVMem. The mean deviation of CAVMem is 62.2% larger than DQL's, while QL and static deviate substantially from the desired target. Figure 3.17 shows the overhead of the agents for Scenario 5. These experiments consistently show the enormous overhead incurred by DQL, and the cost-effectiveness of CAVMem.


Figure 3.15: Average Miss Rate for Scenario 5. Less is better

Figure 3.16: Average Miss Rate Deviation for Scenario 5. Less is better

Figure 3.17: Overhead (in seconds) for each agent in Scenario 5. Less is better.

3.5.6 Results for Scenario 6

Figure 3.18 shows the average miss rate achieved by each agent in Scenario 6. These results show a behavior similar to Scenarios 4 and 5: static maintains unbalanced miss rates among the VMs, while CAVMem and DQL have similar performance. In this case, however, it is harder to establish which of the two performs better: VM1 and VM3 with CAVMem have 3.2% and 2.5% smaller average miss rates than with DQL, respectively, while VM2's average miss rate is 4.8% larger than with DQL. Figure 3.19 shows the miss rate deviation for Scenario 6. The behavior is very similar to Scenarios 4 and 5, with DQL still tracking the miss rate target better than CAVMem. The mean deviation of CAVMem is 57.9% larger than DQL's across all three VMs, while QL and static deviate substantially from the desired target.


Figure 3.18: Average Miss Rate for Scenario 6. Less is better

Figure 3.19: Average Miss Rate Deviation for Scenario 6. Less is better.

Figure 3.20: Overhead (in seconds) for each agent in Scenario 6. Less is better.

Figure 3.20 shows the overhead of the agents for Scenario 6. The overhead associated with DQL is 6.1 times larger than that of CAVMem.

3.5.7 Results for Scenario 7

Figure 3.21 shows the average miss rate achieved by each agent in Scenario 7. These results again show the inability of QL and static to minimize the miss rates in a balanced way. They also show the competitiveness between DQL and CAVMem: for VM1 and VM2, DQL achieves miss rates 2.3% and 8.1% lower than CAVMem, respectively, while for VM3, CAVMem achieves a miss rate 1.3% lower than DQL.


Figure 3.21: Average Miss Rate for Scenario 7. Less is better

Figure 3.22: Average Miss Rate Deviation for Scenario 7. Less is better

Figure 3.22 shows the miss rate deviation for Scenario 7. QL and static deviate very significantly from the desired targets during the execution, but DQL and CAVMem approximate them much better. DQL has the edge, keeping the miss rate deviation 46.2% smaller than CAVMem across the three VMs.

Figure 3.23: Overhead (in seconds) for each agent in Scenario 7. Less is better.

Figure 3.23 shows the overhead of the agents for Scenario 7: DQL's overhead is 6.8 times larger than CAVMem's, consistently highlighting the cost-effectiveness of CAVMem.


Figure 3.24: Average Miss Rate for Scenario 8. Less is better

3.5.8 Results for Scenario 8

Figure 3.24 shows the average miss rate achieved by each agent in Scenario 8. In this case, CAVMem performs slightly better than DQL, by 0.45% on average across the three VMs. Moreover, QL is more competitive in this case and is able to balance the miss rates of the three VMs, but it is unable to minimize them to the same degree as CAVMem or DQL. Quantitatively, QL has a miss rate above that of CAVMem by 14.8%, 18.3% and 25.3% for VM1, VM2 and VM3, respectively.

Figure 3.25: Average Miss Rate Deviation for Scenario 8. Less is better

Figure 3.25 shows the miss rate deviation for Scenario 8. In this case, CAVMem tracks the miss rate targets better than DQL, by an average of 17.1% across all VMs: CAVMem's deviation is 23.2% and 28.7% smaller for VM2 and VM3, but for VM1 the miss rate deviation is 0.6% smaller for DQL. The deviation of QL is significantly higher, by 77.2% on average, compared to CAVMem. Figure 3.26 shows the overhead of the agents for Scenario 8. In this case, the results show that DQL's overhead is 6.9 and 50.2 times larger than CAVMem's and QL's, respectively, while CAVMem's overhead is 5.5 times larger than QL's. Even though QL balances out the miss rates in this case with some limitations, it is still unable to match the performance of CAVMem or DQL. And even though CAVMem has a higher overhead than QL in this case, overall CAVMem remains the most cost-effective solution.


Figure 3.26: Overhead (in seconds) for each agent in Scenario 8. Less is better.

Figure 3.27: Average Miss Rate for Scenario 9. Less is better

3.5.9 Results for Scenario 9

Figure 3.27 shows the average miss rate achieved by each agent in Scenario 9. Interestingly, in this case DQL is unable to balance the performance of the VMs and maintains a miss rate difference of 45.6% between VM2 and VM1, which is very undesirable. On the other hand, CAVMem keeps the performance difference among the VMs within 14%, achieving both miss rate minimization and balancing.

Figure 3.28: Average Miss Rate Deviation for Scenario 9. Less is better

Figure 3.28 shows the miss rate deviation for Scenario 9. Considering that DQL is already unable to balance the miss rates (performance), Figure 3.28 shows that it is also not as effective as CAVMem at tracking the miss rate targets. In this case, DQL has a miss rate target deviation of 55.8% and 38.9% for VM1 and VM2, respectively. In Scenario 9, CAVMem has significantly outperformed the other agents and static.


Figure 3.29: Overhead (in seconds) for each agent in Scenario 9. Less is better.

Figure 3.29 shows the overhead of the agents for Scenario 9. This figure shows that, in addition to DQL's under-performance in miss rate balancing, minimization and target tracking, DQL also incurs an overhead 4.6 times larger than CAVMem's.

3.5.10 Results for Scenario 10

Figure 3.30: Average Miss Rate for Scenario 10. Less is better

Figure 3.30 shows the average miss rate achieved by each agent in Scenario 10. The best performing allocation strategy is static. All of the other agents, namely CAVMem, DQL and QL, generate average miss rates larger than static by 9.5%, 9.7% and 25.3%, respectively, across all of the VMs. CAVMem performs slightly better among the learning agents. However, VM1 and VM2 have unbalanced miss rates when running with CAVMem, even though their average is slightly lower. With CAVMem, VM1 and VM2 have miss rates larger than static by 13.1% and 5.9%, whereas with DQL the corresponding figures are 8.7% and 10.8%. Thus, DQL keeps the performance balanced among the VMs better than CAVMem in this scenario.

Figure 3.31 shows the miss rate deviation for Scenario 10. In this case, DQL tracks the miss rate targets better than all the other allocation strategies. As expected, static is not designed to track miss rate targets, but it is still able to minimize the miss rates better than DQL and CAVMem in this scenario. However, DQL's miss rate target deviation is 14.1% smaller on average across all the VMs compared to static. On the other hand, the deviation of CAVMem is 21.0% larger than static's on average across all the VMs, showing the effectiveness of DQL in this case.


Figure 3.31: Average Miss Rate Deviation for Scenario 10. Less is better.

As can be seen in Figure 3.31, QL performs the worst of all the learning agents.

Figure 3.32: Overhead (in seconds) for each agent in Scenario 10. Less is better.

Figure 3.32 shows the overhead of the agents for Scenario 10. As has been consistently shown so far, the overhead associated with DQL is extremely large compared to CAVMem and the other approaches: in this case, DQL's overhead is 8.6 and 63.9 times larger than CAVMem's and QL's, respectively. Any deployment of DQL has to consider the trade-off between its performance gains and its significant overhead.

3.5.11 Results for Scenario 11

Figure 3.33 shows the average miss rate achieved by each agent in Scenario 11. As in Scenario 10, the best performing allocation strategy is again static. All of the other agents, namely CAVMem, DQL and QL, generate average miss rates larger than static by 11.8%, 12.0% and 39.9%, respectively, across all of the VMs. CAVMem performs slightly better among the learning agents. However, the VMs have slightly more unbalanced miss rates with CAVMem than with DQL. For example, with CAVMem, VM1's miss rate is 6.9% larger than its miss rate under static, while VM2's is 16.7% larger; with DQL, the corresponding figures for VM1 and VM2 are 12.5% and 11.5%. Notice that on average CAVMem reduces the miss rate better across all the VMs, but DQL keeps them better balanced.


Figure 3.33: Average Miss Rate for Scenario 11. Less is better.

Figure 3.34: Average Miss Rate Deviation for Scenario 11. Less is better

Figure 3.34 shows the miss rate deviation for Scenario 11. In this case, the learning agents are not able to track the miss rate targets any better than static, despite the fact that static is not really attempting to do so. In this scenario, QL has very significant miss rate target deviations of 188.4% and 299.8% with respect to static for VM1 and VM2, respectively. DQL and CAVMem perform better than QL: DQL's deviations for VM1 and VM2 are 18.8% and 12.9% larger than static's, respectively, while CAVMem's deviations for VM1 and VM2 are 7.9% and 43.7% larger, respectively. Notice that DQL keeps the miss rate deviation better balanced across the VMs than CAVMem, but still performs slightly worse than static. Figure 3.35 shows the overhead of the agents for Scenario 11. Yet again, the overhead of DQL is massive, at 4.9 times the overhead of CAVMem and 67.5 times that of QL.

3.5.12 Discussion

We can summarize the findings as follows: 1) CAVMem with DDPG and DQL are able to minimize and balance the miss rates in a comparable way, better than static in most cases, while QL consistently fails to do so; 2) DQL tracks the performance target better than CAVMem with DDPG, while static and QL fail in most instances; 3) DQL has extremely high learning overheads compared to CAVMem with DDPG and QL, highlighting a performance-cost trade-off.


Figure 3.35: Overhead (in seconds) for each agent in Scenario 11. Less is better.

3.6 Related Work

Many solutions have been proposed to solve the resource allocation problem for cloud infrastructures [105, 101, 72, 27, 13]. However, the memory management problem has different constraints than a general resource allocation problem, due to the way memory is shared among applications/VMs. Whereas CPU can be time-multiplexed and scheduled, memory cannot be scheduled in a fine-grained way [76]; instead, it has to be allocated over a long period of time, ensuring the integrity of applications' data. Resource allocation schemes for the cloud therefore need a different approach for memory allocation within each individual node.

Some common approaches for the memory allocation problem in virtualized computing servers include working set size (WSS) estimation [15, 47, 125, 84]. When estimating the WSS, the objective is to allocate enough memory to the VMs for their working sets to fit in. The main problem with WSS estimation becomes evident when applications do not use their whole WSS during some phases of their execution, thereby overallocating memory (that is, misusing memory) during the times when part of the memory is not used. This problem is exacerbated when there are multiple applications running inside the VMs and temporal variations in the applications' behavior start affecting the measurements of the WSS.

The authors in [121] took a different approach by implementing a tax system that increases the cost of a VM with idle memory, in order for this VM to release the memory it is not using. This approach works by calculating the tax based on the measure of idle memory, which is a measure of the memory unallocated in the VM. The problem here is that once the application allocates all the memory it needs (its WSS), the memory will not be considered idle as long as it remains allocated, even if it is unused. Thus, it will still overallocate memory unnecessarily during some phases of execution. One advantage of this approach is that it is very adaptable, since it does not try to predict the WSS beforehand.

Approaches based on control engineering techniques [45, 83] have also been used to solve the memory management problem. The main issue with these approaches is their need to develop models of the quantitative relationship between an application's performance and its memory allocation. This is very difficult to do accurately, given all the factors that have an

impact on this relationship. The use of RL approaches thus becomes justified when grasping the complexity of the factors involved in this relationship.

Many RL-based solutions have been proposed to solve the resource allocation problem in cloud infrastructures [13, 87], but so far there have not been many efforts that use RL to exclusively solve the memory management problem, much less continuous-action RL solutions. There are, however, some notable RL-based solutions for the resource allocation problem that target CPU and memory (and other resources) within a virtualized node. In [87], the authors implemented a model-based RL algorithm for VM resource configuration, which included CPU time, virtual CPUs and memory. This algorithm is a DQL algorithm using a neural network to approximate the value function.

CAVMem differs from these RL-based solutions in three ways: 1) CAVMem uses continuous action RL exclusively for memory management, 2) CAVMem avoids discretization, and 3) it is decentralized.

3.7 Conclusions for this Chapter and Future Work

This chapter proposed CAVMem as a proof-of-concept of a distributed MDP formulation for the memory allocation problem in virtualized nodes. Moreover, CAVMem also offers a continuous action RL agent to solve the allocation problem, avoiding discretization and exploiting decentralization. Our results show that the DDPG agent of CAVMem performs similarly to the well-known DQL, but with much lower learning overhead, making it a very cost-effective solution. For future work, CAVMem should be deployed in a real computing node, and an exhaustive search of the parameter space of the learning agents is also necessary. Likewise, it is necessary to test CAVMem with more combinations of benchmarks and scenarios.


Chapter 4

Deploying Reinforcement Learning in a Virtualized Node for Dynamic Memory Management

The previous chapter gave us a strong footing for using a continuous action RL approach to solve the memory allocation problem in a distributed way, even though it was tested in a simulation environment that only modeled certain memory behaviors of a virtualized node.

The most important conclusion from CAVMem is that the formulation of the memory allocation problem as a distributed MDP supports learning and achieves the desired performance for all the learning agents we tried. In addition, CAVMem also demonstrated that the continuous action RL algorithm we used (DDPG) performs well with significantly less computational overhead.

Taking into account all the conclusions from CAVMem, the next step is to deploy CAVMem in a real virtualized node, making the enhancements and modifications necessary to adapt it to the real environment. This chapter introduces CARLEMM, a continuous action RL approach based largely on CAVMem, but with significant enhancements so that it can be deployed in a real virtualized node. So far, we have only been able to get it to work for synthetic workloads and for very specific use cases, so it is difficult at this point to assess its effectiveness in solving the RAM allocation problem in a more definite way. At the end of the chapter, we show the results obtained so far and demonstrate its usability and limitations for the cases analyzed.


4.1 Introduction to CARLEMM

A cloud infrastructure will have virtualized nodes running multiple virtual machines (VMs). The physical memory of each node must be dynamically reallocated among the VMs it is running in order to optimize throughput and maintain fairness. This is a difficult problem because VMs are continually being started, migrated or stopped, and because the applications within them also change over time.

The most straightforward way to manage memory capacity is for the users/managers of the infrastructure to specify in advance the amount of memory to allocate to each VM. However, users/managers seldom know exactly how much memory to request, which results either in over-provisioning of memory, which wastes resources, or under-provisioning, which results in poor application performance. Moreover, such a static request-based policy is unable to adapt to changes in memory demand over time.

Previous research efforts have provided solutions to this. For instance, the proposal of Chiang et al. estimates the working set size of the applications in each VM [15]. Much of the prior work along these lines is limited by the complexity of the relationship between the allocation of physical memory to a VM and its performance. Major advances have recently been made in RL, demonstrating RL to be a powerful method for solving many types of complex problems without having to build a precise model of the problem to solve. In Chapter 3, we explained the two main limitations of current state-of-the-art approaches, namely discretization and centralization, and why the continuous action RL approach formulated in this chapter and the previous one provides a good way to overcome those limitations.

This chapter presents CARLEMM (Continuous Action Reinforcement LEarning for Memory Management), which addresses both of these limitations by formulating the memory allocation problem as a distributed continuous action RL problem, following up the work presented in Chapter 3, but this time deploying the learning agents in a real virtualized node. As in Chapter 3, the learning agent at the core of CARLEMM is a DDPG agent [60]. To the best of our knowledge, this is the first time that a continuous action algorithm has been deployed in a real virtualized node for dynamic memory management. In summary, this chapter makes the following contributions:

1. We deploy a continuous action off-policy model-free RL algorithm in a real virtualized node to solve the memory allocation problem.

2. We develop a flexible software stack to implement CARLEMM and evaluate its performance and fairness using synthetic benchmarks.

4.2 Understanding the Context for CARLEMM

As explained in previous chapters, the HV running in every virtualized node of the cloud infrastructure distributes the memory capacity among the VMs it executes, but often tends to over-commit memory. Potentially, this can lead to a misuse of memory by a single VM, and thereby to memory pressure for the other VMs [67]. For this reason, memory management becomes an important problem, making research on dynamic memory management a relevant field. The most commonly deployed mechanisms to manage

and reallocate memory are memory ballooning [67], memory hotplug [97] and Transcendent Memory [69], of which the latter allows the aggregation of capacity across multiple computing nodes [36, 34]. In this chapter, we focus on memory reallocation via memory ballooning.

As explained in Section 1.3, memory ballooning [67] is a technology that enables the hypervisor to inflate or deflate the memory on a per-page basis across VMs using a balloon driver. However, the balloon driver makes the reallocation considering only local (per-VM) information, and it is unable by itself to make optimal decisions on when and by how much to reallocate memory in a system with multiple VMs with different memory requirements. As we explained in Section 3.2.1, the problem really becomes how to drive the inflation/deflation mechanism of ballooning, i.e., allocation/deallocation, in order to allocate memory to the VMs and prevent the misuse of memory resources (over- or under-provisioning of memory). As we explained, this can be done by estimating the working set size of the applications running inside a VM, by overallocating memory, by manually modifying the allocations of the balloon driver, or by using automated high-level policies to drive the balloon driver.

Many automated solutions have been proposed, but there are very few (if any) efforts that have focused on the memory allocation problem using RL. As explained in Section 3.2.1, RL approaches become particularly attractive due to the complexity of developing analytical performance models to drive the memory ballooning mechanism in a node with multiple VMs.

Section 3.2.2 already covered the theoretical background needed to understand the RL concepts (MDPs, continuous action, DDPG) employed in the development of CARLEMM. In general, CARLEMM builds on the RL concepts used to develop CAVMem, but with some important modifications to the definition of the state space, the reward function and the control flow of the memory allocation mechanism. All these modifications are introduced in order to make it work in a real virtualized node.
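For concreteness, a high-level policy can drive the balloon through standard management interfaces. The sketch below uses the libvirt Python binding as an illustrative assumption (the thesis environment may use a different toolchain, such as Xen's xl mem-set command, which plays the same role):

import libvirt  # libvirt Python binding; illustrative, not the thesis toolchain

def set_vm_memory(domain_name, target_mb):
    # Asking the hypervisor to change a running VM's memory triggers the
    # balloon driver inside the guest to inflate or deflate accordingly.
    conn = libvirt.open('qemu:///system')
    try:
        dom = conn.lookupByName(domain_name)
        dom.setMemoryFlags(target_mb * 1024,  # libvirt expects KiB
                           libvirt.VIR_DOMAIN_AFFECT_LIVE)
    finally:
        conn.close()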

4.3 CARLEMM: Continuous Action Reinforcement Learning for Memory Management

In this section, we introduce CARLEMM, a continuous action RL algorithm for memory management in virtualized computing servers. We formulated the memory allocation problem as an MDP that allows one learning agent to be bound to a VM, where the VM instances can be spawned at arbitrary times. Specifically, we instantiate a neural network per VM and allow it to have both a local view (itself) and a view of the node (based on an approximate understanding of the behavior of other VMs in the system). As with CAVMem, each learning agent bound to a VM monitors its resource utilization and the state of the VM, and bids for memory independently of the rest of the VMs, resulting in a decentralized solution for memory management. The advantages of having such a distributed and decentralized formulation were analyzed in Section 3.3.1. In the following subsections, we explain how CARLEMM builds on CAVMem, how the MDP is formulated for this case and how the RL mechanism works in a real virtualized node.


4.3.1 Formulating the Memory Management Problem in a Virtualized Node as an MDP

Every problem formulated as an MDP requires the definition of the state space together with its respective variables and domains. The state space is defined in such a way that part of the variables define the state of the VM (local) and another part define the state of the node (global). The partitioning of the state space into local and global variables follows the same scheme used for CAVMem, but for CARLEMM we have used a different set of variables to define the state space.

State information from computing node and VMs

Scope  | Variable Name      | Range  | Description
perVM  | usrtime^{VMj}      | [0, 1] | The percentage of time spent by the processor of VMj executing user space processes in relation to the sampling interval.
perVM  | M_t^{VMj}          | [0, 1] | Fraction of total RAM allocated to VMj.
perVM  | swap_used^{VMj}    | [0, 1] | The amount of swap space used by VMj.
Node   | avgUsrtime         | [0, 1] | The average percentage of time spent by all VMs in the node executing user space processes.
Node   | usrtimeTgt         | [0, 1] | Target percentage of time spent executing user space processes for all VMs, given by Equation 4.2.
Node   | totalMemUse        | [0, 1] | Fraction of the node's physical RAM that is allocated to all VMs, equal to Σ_j M_t^{VMj}.

Table 4.1: State inputs to the actor and critic networks of the DDPG learning agents.
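For concreteness, the state vector of Table 4.1 can be represented as a simple record. The sketch below (in Python, with illustrative field names that are not taken from the original implementation) shows the six normalized inputs fed to the actor and critic networks:

```python
from dataclasses import dataclass

@dataclass
class VMState:
    """The six normalized state inputs of Table 4.1 (illustrative sketch)."""
    usrtime: float        # per-VM: fraction of time spent in user space
    mem_fraction: float   # per-VM: fraction of total RAM allocated (M_t)
    swap_used: float      # per-VM: fraction of swap space in use
    avg_usrtime: float    # node: average user-space time over all VMs
    usrtime_tgt: float    # node: target percentage from Equation 4.2
    total_mem_use: float  # node: fraction of RAM allocated to all VMs
```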

State space, S

Table 4.1 lists the six state variables for each learning agent, three of which belong to the VM while the other three are common to all VMs on the node. This state space definition allows the agent to have some information about the other active VMs through the information related to the node. Notice that the variables used in this case are different from the ones used for CAVMem, even though the MDP formulation follows the same general principle. When comparing Table 4.1 to Table 3.1, it is clearly shown that some variables have changed in Table 4.1 for CARLEMM. For the current case, the variable msr_t^{VMj} was replaced by usrtime^{VMj}, avgMsr_t^{node} was replaced by avgUsrtime, msrTgt_t^{node} was replaced with usrtimeTgt, and swap_used^{VMj} was added as a new state variable.

The variables used to define the state of CAVMem were defined around the miss rate in RAM of a specific VM. Even though we still believe this is an attractive approach, it presented some limitations once we started developing CARLEMM. When attempting to measure msr_t^{VMj} on RAM in a real virtualized node, we find that this measurement is analogous to measuring the swap out rate and the swap in rate of the VM.


For example, in the case where a VM does not have enough memory to fit its data, it will start swapping out data from memory to the disk device, increasing the swap out rate of the VM. In this situation, the swap in rate increases if new data is brought in from disk, either because the VM is reading data from the filesystem or bringing in data that was previously swapped out. Therefore, the VM is unable to use its memory capacity to allocate the data that it will swap out, thus missing on a write access to memory, and then failing to find the data in memory when it tries to read it, resulting in a read from disk, i.e. a swap in. Thus, the msr_t^{VMj} variable used in CAVMem can be considered analogous to the swap out and/or swap in rates of the VM.

The main problem with using msr_t^{VMj} or a related swap out or swap in rate is their extremely noisy characteristic. Figure 4.1 shows the swap out rate for a VM in a system running a synthetic benchmark (explained in Section 4.7) that needs 1.2GB of memory to fit its data set in the VM. The benchmark reads and writes to this chunk of memory, while the VM is only given 1GB of memory. Notice that in the plot for the Swap Out Rate in Figure 4.1, the measurements span the whole domain of possible values for a static allocation of memory, meaning that the standard deviation of the noise of this signal is extremely high. This behavior of the Swap Out Rate is consistent across different memory allocations and different benchmark/memory allocation combinations as well. On the other hand, for the Percentage of User Space Time the standard deviation is far more controlled for a given allocation of memory, making it easier to correlate the values of a given Percentage of User Space Time to memory allocations using a specific benchmark.

Using a signal with such noisy characteristics makes it problematic for an RL agent to generate a state-action mapping, i.e. an action policy, that is stable and meets the desired objectives specified in the reward function. In order to solve this problem during the development of CARLEMM, we evaluated the implementation of digital filters, starting with a straightforward implementation of a moving average over a window of samples. The moving average improved the quality of the signal, but still not enough to cancel out the noise or to generate a stable policy. The moving average approach presented some limitations as well, because its window size could not grow too large: the responsiveness of the state variable to the actions issued by the agent would be significantly delayed (as well as the reward), introducing a whole new set of problems. Opting for more sophisticated filtering, like FIR filters, entailed similar problems.
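As an illustration of the filtering approach we evaluated, the sketch below shows a minimal moving-average filter in Python (the class name and window size are illustrative, not taken from the actual implementation). Note how the window size directly trades noise reduction against responsiveness:

```python
import collections

class MovingAverageFilter:
    """Smooths a noisy per-timestep measurement (e.g. the swap out rate)
    over a sliding window of the most recent samples."""
    def __init__(self, window_size):
        # A larger window cancels more noise but delays the signal's
        # response to the agent's actions (and hence delays the reward).
        self.window = collections.deque(maxlen=window_size)

    def update(self, sample):
        self.window.append(sample)
        return sum(self.window) / len(self.window)

# Example: smoothing a stream of noisy swap-out-rate samples.
f = MovingAverageFilter(window_size=10)
smoothed = [f.update(s) for s in (0.9, 0.1, 0.8, 0.0, 0.7)]
```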

It became evident that using msr_t^{VMj} as a state variable in a virtualized node had significant limitations. For this reason, it was necessary to introduce usrtime^{VMj}, which presented much less noise when measured and a better correlation with the memory allocation. In order to support this view, we ran several experiments to understand the relationship between the memory allocation and usrtime^{VMj}, which resulted in Figure 4.2. This figure was obtained by running a synthetic benchmark (explained in Section 4.7) with a constant memory allocation and evaluating the Percentage of User Space Time. After evaluating this chart, there was a strong motivation to replace miss-rate-related metrics with those related to the Percentage of User Space Time.


Figure 4.1: Visualizing the noise for the swap out rate measured from a VM with 1GB of RAM running an application that needs 1.2GB of memory (above), and the Percentage of User Space Time for the same VM and the same application (below).

Action space, A

The action a_t^{VMj} chosen for VMj in timestep t is referred to as the VM's memory bid, which ranges from -1.0 to 1.0. This definition is identical to the one used for CAVMem. A positive bid is a request for more memory and a negative bid is a request to release memory.


Figure 4.2: Plotting the relationship between memory allocations in MB and the percentage of time spent by the processor of a VM executing user space processes.

The learning agent of a VM requests a total memory allocation equal to (1 + a_t^{VMj}) × M_t^{VMj}. In many cases, it may not be possible to fully satisfy the request, for instance when there is insufficient memory for all requests or the memory allocation for the VM would become too small.

Action Space Exploration

To explore the continuous action domain in CARLEMM, the same approach as in CAVMem was used: implement an Ornstein–Uhlenbeck process [11, 58] to create a noise signal that in turn generates a Brownian motion around the deterministic action generated by the DDPG agent. At every timestep, a noise signal is added to the action with a probability of ε. Similarly to CAVMem, there is initially a phase of aggressive exploration in which ε = 1. This probability is annealed [102] until it reaches ε = 0.001 over 18000 steps, whereas in CAVMem it was reduced over 38000 steps.
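A minimal sketch of this exploration scheme is shown below in Python. The OU parameters theta and sigma and the exponential annealing schedule are assumptions for illustration (the text above only fixes the endpoints ε = 1 and ε = 0.001 over 18000 steps), not values taken from the implementation:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated noise that produces
    a Brownian-like motion around the actor's deterministic action."""
    def __init__(self, mu=0.0, theta=0.15, sigma=0.2):
        self.mu, self.theta, self.sigma = mu, theta, sigma
        self.x = mu

    def sample(self):
        # Discretized dx = theta*(mu - x)*dt + sigma*dW, with dt = 1.
        self.x += self.theta * (self.mu - self.x) + self.sigma * np.random.randn()
        return self.x

noise = OUNoise()
epsilon, eps_min, anneal_steps = 1.0, 0.001, 18000
decay = (eps_min / epsilon) ** (1.0 / anneal_steps)  # assumed exponential schedule

def explore(deterministic_action):
    """Add OU noise to the action with probability epsilon, then anneal epsilon."""
    global epsilon
    action = deterministic_action
    if np.random.rand() < epsilon:
        action = float(np.clip(action + noise.sample(), -1.0, 1.0))
    epsilon = max(eps_min, epsilon * decay)
    return action
```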

Reward function, R

To ensure a balance between the memory usage among the VMs and the memory bid, the reward function (shown in Equation 4.1) has two parts: (1) the absolute difference between the Percentage of User Space Time of the VM (usrtime^{VMj}) and the target percentage (usrtimeTgt), and (2) a penalty for making bids of large magnitudes. The first part ensures that the bids generated by the agent make the Percentage of User Space Time usrtime^{VMj} converge to a consistent percentage value (usrtimeTgt) across all VMs, and the latter part ensures that the agent does not generate large bids, in order to improve bidding stability.


Figure 4.3: Plot of Equation 4.1 showing the reward as a function of the user space time percentage and memory bid (both values normalized) of VMj for β = 1.0, k = 1.0.

r_{t+1}^{VMj} = −|usrtime_{t+1}^{VMj} − usrtimeTgt_{t+1}| × k − β × |a_t^{VMj}|    (4.1)

Both parameters k and β were set equal to 1.0. The target percentage is defined as:

usrtimeTgt_{t+1} = avgUsrtime_t / (avgUsrtime_t + (1.0 − avgUsrtime_t) × totalMemUse_t)    (4.2)

In Equation 4.2, totalMemUse determines the total amount of memory used by all VMs in the node. In turn, the average user space time percentage across all active VMs is defined as avgUsrtime = (1/N) Σ_{j=1}^{N} usrtime^{VMj}. Figure 4.3 shows the plot of Equation 4.1 for a target percentage value of usrtimeTgt = 0.5. Notice that the maximum value of Equation 4.1 is zero, and it occurs when both usrtime^{VMj} = 0.5 and a^{VMj} = 0.0.
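The two equations translate directly into code. The following Python sketch (function names are ours, for illustration) computes the target of Equation 4.2 and the reward of Equation 4.1, and reproduces the maximum shown in Figure 4.3:

```python
def usrtime_target(avg_usrtime, total_mem_use):
    """Equation 4.2: target user-space-time percentage (all inputs in [0, 1])."""
    return avg_usrtime / (avg_usrtime + (1.0 - avg_usrtime) * total_mem_use)

def reward(usrtime_vm, usrtime_tgt, bid, k=1.0, beta=1.0):
    """Equation 4.1: distance-to-target penalty plus a bid-magnitude penalty."""
    return -k * abs(usrtime_vm - usrtime_tgt) - beta * abs(bid)

# Maximum reward (0.0) occurs at usrtime = target with a zero bid,
# matching the surface plotted in Figure 4.3.
assert reward(0.5, 0.5, 0.0) == 0.0
```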

4.4 Environment Constraints for Decentralized Control

Figure 4.4 shows a high-level view of the learning agents for the memory management problem, including the computing node, a module called Bid Analyzer (BA) and the VMs. This high-level view of the learning agents remains the same as the one presented for CAVMem in Section 3.3.2.



Figure 4.4: Diagram of the DDPG learning agents in the Memory Management Problem, with one learning agent per VM. The real implementation remains identical to this high-level view.

4.4.1 Pre-processing Bids for Memory

For the case of CARLEMM, the BA is examined more closely. The BA checks the bids coming from the actor neural network of every learning agent before sending them to the balloon driver. The BA performs a series of high-level control tasks. First, it prevents VMs from bidding less than the minimum amount they need to function, as well as preventing the VMs from bidding more than the actual memory available in the node. The BA also enforces bid prioritization for VMs that issue negative bids as an additional optimization.

The BA also performs a bid arbitration process to prevent aggregated overallocation of memory. Since the learning agents bid for memory independently, this may lead to cases where one or more agents bid for more memory than the amount available. To prevent overallocation, we first check whether the requested bids exceed the total memory available in the node using Equation 4.3 (values are not normalized in this equation), where the subscript t is used to represent a variable at a specific timestep.

Σ_{j=1}^{N} [(a_t^{VMj} + 1) × M_t^{VMj}] = totalMemUse ≤ Total_Mem    (4.3)

In case Equation 4.3 is not met when the bids are analyzed by the BA, an arbitration process is called to adjust the bids for memory. This process is triggered whenever the amount of memory available for reallocation, i.e. the amount of spare memory in the node plus the aggregated memory of the VMs that will release memory (negative bids), is smaller than the aggregated amount of memory requested by the subset of VMs that are asking for memory; in that case, the bids of all VMs that are asking for memory are readjusted.


Intuitively, the arbitration process is only required for the VMs bidding positively, as the remaining VMs either do not change their memory allocation (zero bid case) or are bidding negatively. Concretely, Equation 4.4 summarizes the arbitration process and the proportional adjustment process, at time t, for the positive bids.

ā_t^{VMj} = a_t^{VMj} × SpareMemory / Σ_{j=1}^{P} (a_t^{VMj} × M_{t−1}^{VMj})    (4.4)

where P is the number of VMs that are bidding positively in timestep t, and SpareMemory represents the memory available in the node after processing the negative bids, defined as follows:

SpareMemory = Total_Mem − Σ_{i=1}^{N} M_t^{VMi} + Σ_{j=1}^{L} (a_t^{VMj} × M_t^{VMj})    (4.5)

in which L is the number of VMs that are bidding negatively and N is the total number of VMs. All the tasks of the BA so far are illustrated in Algorithm 5, and a small worked example of the proportional adjustment follows below. In Lines 12–13, the positive bids and negative bids are separated. The functions responsible for doing so generate two lists that are keyed with the IDs of each VM. In Lines 15–25, Algorithm 5 first calculates the de-allocations resulting from the negative bids, in order to obtain the amount of memory that will be freed. It also checks that the VMs do not bid less than the minimum allowed, modifying any offending bids (originally obtained from the actor neural network of the DDPG agent) in order to meet this constraint. In Lines 27–33, Algorithm 5 calculates how much additional memory the positive bids are requesting; in other words, how much memory is necessary to satisfy all the bids for memory. Lines 34 and 35 calculate the amount of spare memory in the system (memory that remains unallocated) and the amount of memory that can be reallocated, calculated as the sum of the spare memory and the memory that has been freed by negative bids. In Lines 36–38, Algorithm 5 checks if the amount of requested memory exceeds the amount of re-allocatable memory. If it does, the arbitration process is called, which readjusts the positive bids proportionally to the amount of memory in excess, in order to distribute the memory available for reallocation more fairly. In this way, a degree of fairness is provided in the memory distribution among the VMs.

Lines 39 and 40 then issue the bids to the Memory Manager (MM, as in previous chapters), so that it can pass them on to the balloon driver and the HV. Notice that the negative bids are issued first, so that memory is released before attempting to request more memory through the positive bids. It is not necessary to add synchronicity in this step, but it is an optimization that allows the balloon driver to respond faster.
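As a sketch of the proportional adjustment of Equation 4.4 (using dictionaries keyed by VM ID, mirroring the keyed lists of Algorithm 5; all names are illustrative), consider:

```python
def arbitrate(pos_bids, spare_mem, requested_mem):
    """Scale every positive bid by spare_mem / requested_mem (Equation 4.4),
    so the VMs share the reallocatable memory proportionally to their bids."""
    scale = spare_mem / requested_mem
    return {vm_id: bid * scale for vm_id, bid in pos_bids.items()}

# Example: two VMs together request twice the memory that can actually be
# reallocated, so both bids are halved.
adjusted = arbitrate({"vm1": 0.2, "vm2": 0.1}, spare_mem=512, requested_mem=1024)
# adjusted == {"vm1": 0.1, "vm2": 0.05}
```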

4.4.2 Heuristic for Assisted Learning

Even though it was shown that there is a good correlation between the memory allocations and the Percentage of User Space Time for cases in which the allocation remains static, the Percentage of User Space Time still presented responsiveness issues when the memory allocations vary.


Algorithm 5 Bid Analyzer (BA)

1: Let state be defined as state[VMID] = State, where State is defined as in Section 4.3.1
2: Let actions be a list of the form actions[VMID] = bid, where bid ∈ [−1.0, 1.0]
3: Let pos_bids be a list of the form pos_bids[VMID] = bid, where bid ∈ [−1.0, 1.0]
4: Let neg_bids be a list of the form neg_bids[VMID] = bid, where bid ∈ [−1.0, 1.0]
5: procedure Arbitration(actions, spare_mem, requested_mem)
6:     for each vm_id ∈ actions do
7:         actions[vm_id] ← actions[vm_id] × spare_mem / requested_mem
8:     end for
9:     return actions
10: end procedure
11: function bid_analyzer(actions, state)
12:     pos_bids ← get_positive_bids(actions)
13:     neg_bids ← get_negative_bids(actions)
14:     mem_to_release ← 0
15:     for vm_id ∈ neg_bids do
16:         bid ← neg_bids[vm_id]
17:         M_vm_id ← state[vm_id].M
18:         new_alloc ← calc_new_alloc(bid, state[vm_id])
19:         if new_alloc < min_alloc then
20:             bid ← adjust_bid(bid, state[vm_id])
21:             neg_bids[vm_id] ← bid
22:             new_alloc ← min_alloc
23:         end if
24:         mem_to_release ← mem_to_release + (M_vm_id − new_alloc)
25:     end for
26:     sum_allocs ← 0; requested_mem ← 0
27:     for vm_id ∈ pos_bids do
28:         bid ← pos_bids[vm_id]
29:         M_vm_id ← state[vm_id].M
30:         new_alloc ← calc_new_alloc(bid, state[vm_id])
31:         sum_allocs ← sum_allocs + new_alloc
32:         requested_mem ← requested_mem + (new_alloc − M_vm_id)
33:     end for
34:     spare_mem ← Total_Mem − sum_allocs
35:     realloc_mem ← spare_mem + mem_to_release
36:     if requested_mem > realloc_mem then
37:         pos_bids ← Arbitration(pos_bids, spare_mem, requested_mem)
38:     end if
39:     issue_bid(neg_bids)
40:     issue_bid(pos_bids)
41: end function

This highlighted other problems when attempting to correlate the memory allocations to the Percentage of User Space Time. This is analyzed in more detail in Section 4.8.1.

In order to deal with the responsiveness behavior of the Percentage of User Space Time, a heuristic was introduced to assist the learning agent. This heuristic, which we call Heuristic for Assisted Learning (HAL), was included as part of the action policy of the agent. Algorithm 6 shows the details of the heuristic.

Algorithm 6 Heuristic for Assisted Learning (HAL)
1: reward_array ← []
2: window_size ← 10
3: function heur_learning(reward_t^{VMj}, state^{VMj})
4:     usrtime^{VMj} ← state^{VMj}.usrtime
5:     usrtimeTgt ← state^{VMj}.usrtimeTgt
6:     reward ← reward_t^{VMj}
7:     action ← 0.0
8:     if |reward_array| == window_size then
9:         reward_array.pop(0)
10:     end if
11:     reward_array.append(reward)
12:     avgReward ← sum(reward_array)/|reward_array|
13:     if avgReward < −usrtimeTgt/2.0 then
14:         if usrtime^{VMj} < usrtimeTgt then
15:             action ← 4.0%
16:         else if usrtime^{VMj} > usrtimeTgt then
17:             action ← −4.0%
18:         end if
19:     else
20:         action ← actor_inference(state^{VMj})
21:     end if
22:     return action
23: end function

HAL starts by calculating the average reward obtained over the last ten timesteps of the execution, which it accumulates in an array (lines 8–12). Once the average reward is calculated, HAL checks if avgReward is smaller (that is, more negative) than the negative of half of the current User Space Time Target (line 13). This condition means that the reward is too far away from its maximum, which implies a large deviation from the performance target. If this condition occurs, HAL proceeds to check whether the usrtime^{VMj} of that VM is smaller (line 14) or larger (line 16) than the current target, and then generates an action of 4.0% or -4.0%, respectively (lines 15 and 17). However, if avgReward is not in fact smaller, then the action is generated from the actor neural network of the learning agent using the state of the VM (line 20).

4.5 The Control Flow for CARLEMM

As explained in Section 4.3.1, CARLEMM instantiates a learning agent for each VM, allowing each learning agent to bid for memory such that the Percentage of User Space Time of its VM equals the target percentage value. Figure 4.5 shows the control flow of CARLEMM considering all of its building blocks: the MM, the balloon driver, the bid analyzer, the HV and the learning agents.



Figure 4.5: Diagram of the DDPG learning agents in a virtualized node deployed to manage memory, where each VMj is bound to one learning agent. LA stands for "learning agent" in this figure.

The learning agent of each VM is implemented as a Deep Deterministic Policy Gradient (DDPG) algorithm [60]. The state variables for each agent are obtained from the VM it is bound to and from the node in which it executes. All the values of these variables are obtained through a specialized HV interface.


The HV used in this implementation of CARLEMM is Xen, which provides a specialized interface called Xenstore [2] that allows the exchange of information among all the VMs in the node. In CARLEMM, every VM runs a process that monitors the state of the VM and publishes the state information to a location in Xenstore. The MM reads the information from Xenstore through an API provided by Xen, does some pre-processing on this information and then sends it to the DDPG learning agents.

Each bid generated by the learning agent is sent to the balloon driver to allocate memory to the VM. The bid frequency for each agent is constrained by the reaction time of the balloon driver, because the balloon driver is relatively slow to allocate or deallocate, as explained in Section 1.3 of Chapter 1. To ensure that the requested bid is satisfied, the MM checks the memory allocated to each VM every time it reads the memory statistics from Xenstore. In every cycle, the MM makes sure that the last memory allocations that were issued have been met before sending the state information back to the learning agents. If these target allocations are not met, the MM counts the number of cycles in which the target allocations and the real allocations do not match, and after a certain number of cycles it transmits the state information back to the learning agents anyway. This is done to allow the balloon driver to meet the target allocations before receiving new targets from the learning agents. It counts the cycles in order to prevent deadlock, changing the targets in case the balloon driver is unable to meet them after a certain amount of time. Experimentally, we found that when the target allocations for the balloon driver are modified too frequently, faster than the time necessary for the balloon driver to converge to an allocation target, the balloon driver becomes unstable and presents erratic behavior.

Additionally, the MM also checks the updates made in Xenstore by the monitoring process inside the VM. This process also generates a timestamp field that provides the current and last times at which the status information was updated. When the MM reads the information from Xenstore, it checks whether the timestamps have increased with respect to the previous read before sending the information to the learning agents. This timestamp verification mechanism is used to guarantee that a state transition has occurred with respect to the previous timestep, preventing the sampling of the same state multiple times.

The MM thus implements two control flow mechanisms in the inner loop of the system: awaiting convergence of the balloon driver and ensuring state transitions by the VMs. Both of these features are indeed necessary to ensure a correct learning process, but in turn they increase the running time of our experiments significantly.
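A minimal sketch of this inner control loop is shown below in Python. The helper functions (read_xenstore_state, allocation_met, send_state_to_agents) and the cycle limit are hypothetical stand-ins for the MM's actual implementation:

```python
import time

MAX_WAIT_CYCLES = 10  # hypothetical bound before reporting state anyway

def inner_control_loop(vms):
    """Poll Xenstore for each VM: wait for the balloon driver to converge to
    the last target allocation and for a fresh state (newer timestamp) before
    forwarding the state information to the learning agents."""
    stalled = {vm: 0 for vm in vms}
    last_ts = {vm: 0.0 for vm in vms}
    while True:
        for vm in vms:
            st = read_xenstore_state(vm)          # stats published by the in-VM monitor
            if st.timestamp <= last_ts[vm]:
                continue                          # no state transition yet: do not resample
            if not allocation_met(vm) and stalled[vm] < MAX_WAIT_CYCLES:
                stalled[vm] += 1                  # balloon driver still converging
                continue                          # after MAX_WAIT_CYCLES, report anyway
            stalled[vm] = 0
            last_ts[vm] = st.timestamp
            send_state_to_agents(vm, st)          # agents may now issue new bids
        time.sleep(1.0)                           # sampling interval
```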

4.6 Software Stack for CARLEMM

CARLEMM acquires the necessary memory statistics through the Xenstore API [2] provided by the Xen HV. The software stack for CARLEMM, shown in Figure 4.6, consists of the HV, the MM and the learning agents, which run inside a separate Python process, all within the privileged domain of Xen. The DDPG learning agents are all implemented in Python using TensorFlow. The communication between the MM and the Python process is built using POSIX sockets.




Figure 4.6: Diagram of the software stack of CARLEMM in a virtualized computing node.

The interface with the balloon driver is handled by the MM, which is written in C. The MM uses the Libxl tools to change the VMs' memory allocation through the balloon driver interface; the Libxl tools are used to send target allocations to the HV layer. To gather the memory utilization statistics for every VM and other metrics from the HV, the MM periodically reads data from the Xenstore interface in the HV, using the appropriate API provided by Xen.
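For illustration, the effect of the MM's Libxl call corresponds to the xl mem-set command of the Xen toolstack. A hedged Python sketch of that effect (the real MM is C code calling Libxl directly) would be:

```python
import subprocess

def set_vm_memory_target(domain, target_mib):
    """Ask the balloon driver to move a VM to a new memory target via the
    xl toolstack; the real MM issues the corresponding Libxl call instead."""
    subprocess.run(["xl", "mem-set", str(domain), f"{target_mib}m"], check=True)

# Example: shrink the VM named "vm1" to a 1024 MiB target.
# set_vm_memory_target("vm1", 1024)
```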

4.7 Experimental Framework for CARLEMM

We evaluated CARLEMM on a platform consisting of one computing node with Xen 4.5, running Ubuntu 14.04 with kernel 3.19 as the OS in every VM. The node has an Intel i7 quad-core processor, 16 GB of RAM and 500 GB of storage.

Up until now, we have evaluated the effectiveness of CARLEMM by running a series of experiments with a synthetic benchmark we call "Constant Memory Stress" (CMS) inside a VM. This benchmark is designed to stress the memory subsystem by allocating a fixed amount of memory and periodically performing writes and reads to the allocated blocks. CMS performs the reads and writes following a Normal distribution across the address space of the allocated blocks. The parameters of the Normal distribution are specified when launching CMS: the mean and the standard deviation translate into the CMS process generating memory accesses around an address that corresponds to the mean, with a spread around that region consistent with the standard deviation. If a generated memory access falls beyond the largest or smallest address, the access is clipped to the corresponding largest or smallest value, respectively. A sketch of this access pattern is shown below.
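The following Python sketch illustrates the clipped-Normal access pattern just described (parameter names are ours; the real CMS is a separate benchmark program):

```python
import numpy as np

def cms(size_mib, mean_frac, std_frac, iterations):
    """Constant-Memory-Stress-like loop: allocate a fixed block and issue
    reads/writes at offsets drawn from a Normal distribution, clipped to
    the valid address range of the block."""
    buf = bytearray(size_mib * 1024 * 1024)
    n = len(buf)
    for _ in range(iterations):
        off = int(np.random.normal(loc=mean_frac * n, scale=std_frac * n))
        off = min(max(off, 0), n - 1)       # clip to [0, n-1]
        buf[off] = (buf[off] + 1) & 0xFF    # read-modify-write touches the page

# Example: stress a 1.2 GB block with accesses centered on its middle.
# cms(size_mib=1200, mean_frac=0.5, std_frac=0.1, iterations=10_000_000)
```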

In the experiments that we have analyzed so far, we launch one VM running CMS and set usrtimeTgt to a constant value. At this stage, what we are most interested in seeing is whether the learning agent can learn to meet the performance target as intended, how long it takes to converge to the target, and how well it achieves this objective.

The experiments were executed for 2000 episodes, where each episode has 100 steps. The real time it took for such an experiment to run was 60 to 72 hours. The parameters used for the learning agents are shown in Table 4.2. The values used were the ones that yielded the best results after a series of experiments.

Parameter                               | Value
Actor Learning Rate                     | 0.0001
Critic Learning Rate                    | 0.001
Size of Experience Replay Buffer        | 10000
Annealing Episodes                      | 180
Hidden Layers                           | 2
Size of First Hidden Layer              | 64
Size of Second Hidden Layer             | 64
Activation function of Hidden Layers    | ReLU
Activation function of output layer     | tanh
Total Episodes                          | 2000
Timesteps per Episode                   | 100

Table 4.2: Parameters for the neural networks of the actor and critic in every learning agent.

For the parameters k and β of Equation 4.1, we used a value of 1.0 for both, which yielded the best results experimentally. The parameter β is used to drive the agent towards avoiding actions that allocate or deallocate large amounts of memory in a single timestep. The parameter k acts as a gain factor on the difference between the VM's Percentage of User Space Time and the target percentage, driving the agent to minimize this difference.

4.8 Experimental Results for CARLEMM

4.8.1 Experiments without Using HAL

As explained in Section 4.7, most of the experiments we have run so far using CARLEMM consist of running CMS in one single VM with a fixed usrtimeTgt equal to 32.5. The first experiments were run without HAL, which is described in Section 4.4.2. These first results showed that the learning agent by itself was not able to learn a good policy, and would instead get stuck on the minimum and/or maximum values (in this case, the maximum), generating a large error and failing to learn how to generate useful actions. Figure 4.7 shows the Percentage of User Space Time and the Memory Allocation for the last 250 episodes of the experiment. The memory remains at its maximum, with a Percentage of User Space Time equal to 50.0 (its maximum as well), and it does not adjust whatsoever. The plot for the reward is also shown, which indicates that the agent is unable to maximize the reward (bring it closer to 0.0) in the timesteps presented.


(a) Percentage of User Space Time at every Timestep

(b) Memory Allocation at every Timestep

(c) Reward at every Timestep

Figure 4.7: Visualizing the behavior of the Percentage of User Space Time, memory allocation and the Reward at each Timestep according to Equation 4.1, for a VM running CMS over the last 250 episodes.


In order to provide more insight into the behavior of the learning agents of CARLEMM, Figure 4.8 shows four histograms in which the bins are the values of the Percentage of User Space Time rounded to the nearest integer. The bins are generated by sampling data gathered from CARLEMM over five consecutive episodes at different stages of the execution. The first histogram is taken around the 500th episode, the second around the 1000th episode, the third around the 1500th, and the fourth a bit before the 2000th episode, near the end of the experiment.

(a) Histogram sampled from the initial stage of learning. (b) Histogram sampled from the second stage of learning.

(c) Histogram sampled from the third stage of learning. (d) Histogram sampled from the last stage of learning.

Figure 4.8: Histograms for the Percentage of User Space Time without HAL sampled at 500 episode intervals from the beginning until the end of the execution.

Notice that in Figure 4.8(a), the Percentage of User Space Time is mostly at zero, and in the subsequent histograms it moves toward the maximum value, which corresponds to the results observed in Figure 4.7. Different targets were tried, as well as different parameters for CMS inside the VM, but the results were fairly similar and consistent.

(a) Percentage of User Space Time corresponding to specific memory allocations

(b) Memory Allocations Decreasing at very controlled intervals

Figure 4.9: Analyzing the temporal response of the Percentage of User Space Time to variations in Memory Allocations.

The reason why the agent failed to learn in these circumstances is related to the way the Percentage of User Space Time responds to variations in the memory allocation. Figure 4.9 illustrates this by plotting in Figure 4.9(a) the time response of the Percentage of User Space Time to the corresponding variation of the Memory Allocation, shown in Figure 4.9(b), for the same timesteps.

Notice that the memory allocation is reduced twice, each time very slightly. During the first reduction, around the 300th timestep, the Percentage of User Space Time remains at its maximum, meaning that the VM is executing its applications without any interruptions. But after the second reduction, at the 400th timestep, which is not a very significant drop either, the Percentage of User Space Time drops to zero and then increases slowly until it stabilizes around a value of 35%.

The second reduction of memory created a situation in which data previously allocated by a process inside the VM had to be evicted from memory, because the new allocation was not large enough to fit all the data. This triggered I/O operations to the swap device, which interrupted the execution of the application, thus reducing the Percentage of User Space Time abruptly. This abrupt reduction occurs even if the amount of data evicted is not very significant, or even if the evicted data does not belong to CMS (a possibility, if there are other processes inside the VM).

However, the Percentage of User Space Time slowly regains value as the data the application generated starts to be read back from disk, replacing data that is accessed less frequently, allowing the Percentage of User Space Time to settle at a value that is better correlated to the memory allocation. The main problem with this behavior is that it takes more than 600 timesteps to reach the stable value, which makes it difficult for the learning agent to capture the correlation between the Percentage of User Space Time, the memory allocation and the rest of the state variables. Therefore, we concluded that we needed to add a heuristic to compensate for the behavior of the Percentage of User Space Time and assist the learning process, to prevent the agents from getting stuck on extreme values and to better approximate the relationship between the Percentage of User Space Time and the rest of the state variables.

4.8.2 Experiments Using HAL

Figure 4.10 shows the same experiment, with one VM executing CMS with usrtimeTgt = 32.5, but this time enabling HAL, which is described in Section 4.4.2. Notice that in Figure 4.10(c), the values of the reward are able to reach their maximum towards the end of the run, although with some instability in the way the allocations are issued, as shown in Figure 4.10(b). The Percentage of User Space Time presents the behavior explained in the previous section, but with the help of HAL, the agent is able to bring its value towards the desired target. To reinforce this claim, Figure 4.11 shows four histograms in which the bins are the values of the Percentage of User Space Time rounded to the nearest integer; notice how the values approach the desired target. This shows that CARLEMM is able to learn to approximate the desired performance target, although it does not do so perfectly. In the last histogram, shown in Figure 4.11(d), the agent approximates the desired performance target very closely towards the end of the experiment, but still with some inaccuracies and instability.

The results with HAL are better because HAL prevents the learning agent from continuously issuing bids that keep the Percentage of User Space Time stuck at its extreme values. Due to the lack of immediate correlation between the Percentage of User Space Time and the memory allocation, it is necessary to ensure that the memory allocations remain around values that allow the Percentage of User Space Time to become more responsive, thus improving its correlation with the memory allocated and therefore improving the learning possibilities of the agent.


(a) Percentage of User Space Time at every Timestep

(b) Memory Allocation at every Timestep

(c) Reward at every Timestep

Figure 4.10: Visualizing the behavior of the Percentage of User Space Time, memory allocation and the Reward at each Timestep according to Equation 4.1, for a VM running CMS over the last 250 episodes with HAL enabled.


(a) Histogram sampled from the initial stage of learning. (b) Histogram sampled from the second stage of learning.

(c) Histogram sampled from the third stage of learning. (d) Histogram sampled from the last stage of learning.

Figure 4.11: Histograms for the Percentage of User Space Time with HAL enabled, sampled at 500 episode intervals from the beginning until the end of the execution.

Even with the introduction of HAL and the improvements to the learning agent, CARLEMM still presents some instability, and it takes around 72 hours just to learn to approximate the scenario with a constant target of usrtimeTgt = 32.5. Arguably, by letting the learning process run for longer, the policy could have been refined to approximate the value even better. However, more experiments are still pending, especially those with a variable target.

4.9 Related Work

Many RL-based solutions have been proposed to solve the resource allocation problem in cloud infrastructures [13, 87], but so far there have not been many efforts that use RL exclusively to solve the memory management problem, and much fewer continuous action RL solutions. Much of the related work for CARLEMM overlaps with the related work of CAVMem [37]. As explained at the beginning of this chapter, the design of CARLEMM is based on the conclusions obtained from CAVMem [37]. However, when deploying a continuous action RL agent in a real virtualized node, different constraints have to be considered due to the dynamics of the computing node. Some of these constraints are the responsiveness of the balloon driver, the times of convergence and state transitions for the VMs, the definition of the state of a VM (which can be defined with different sets of variables) and other issues related to the control flow of the computer system.

4.10 Conclusions for this Chapter and Future Work

This chapter proposed a method for dynamic memory allocation for virtual machines (VMs) called CARLEMM that applies continuous action RL, based on the proof-of-concept provided by CAVMem [37]. As was the case with CAVMem, CARLEMM addresses two key drawbacks of prior work. First, it uses continuous action RL, which avoids the unnecessary limitations introduced by discretization. Second, it formulates the problem in a distributed way, i.e. with one agent per VM, which supports an arbitrary number of VMs as well as the starting and stopping of VMs.

The most important contribution of CARLEMM presented in this chapter is that it is the first known attempt to deploy a continuous action RL agent to solve the memory allocation problem in a real virtualized node. The constraints and limitations of the real system are far more complicated than in the simulated environment used in Chapter 3, but our experiments showed important progress in CARLEMM's ability to approximate the performance targets reasonably well. However, for future work, more experimentation and revision is necessary to make CARLEMM work for more general cases and more reliably.


Chapter 5

Aggregating and Managing Memory Across Computing Nodes in Cloud Environments

In the chapters so far, this thesis has focused on the memory management problem within a single virtualized node. Some mechanisms were developed for RAM and Tmem, but the mechanisms for cross-node memory aggregation have not yet been described, even though this was initially the main purpose of this thesis work. In this chapter, we introduce the first of two mechanisms that were developed to achieve this.

The first of these mechanisms is called GV-Tmem (acronym for Globally Visible Transcendent Memory, as previously stated), which was the first effort to enable remote memory capacity sharing across a cloud infrastructure. GV-Tmem is a simplified version of a more thoroughly designed mechanism called vMCA, introduced in Chapter 6.

When a cloud infrastructure is able to support sharing of memory capacity across nodes, memory becomes a global resource, giving each node access to a large pool of memory that can be instrumental in avoiding costly swaps or VM migrations. GV-Tmem achieves this by exploiting Tmem, and it uses a user-space process to manage the memory available to a node. This user-space process is also in charge of distributing the aggregated memory across the computing infrastructure.


5.1 Introduction to GV-Tmem

Current data centres use large numbers of servers provisioned with their own computing resources. These servers are designed to share none of their resources, and they communicate over an Ethernet (or similar) network. Newer system architectures, such as Euroserver [26], try to improve over this resource isolation approach, allowing servers to share their resources through the memory address space. These systems provide a shared global physical address space, accessing memory at low latency using very fast interconnects. In these systems, the computing nodes of the infrastructure are seen as coherence islands, each of which has its own cache coherence enforced locally, with no global hardware coherence.

Even in infrastructures with these features, each node has its own HV, which distributes the node's memory among one or more VMs. The demand for memory resources generated by the VMs varies over time due to the different workloads they execute. To improve utilization of the memory capacity of the node, physical memory is often overcommitted, which causes a VM to have less memory than the amount it was configured with at boot time. As mentioned in Section 1.1, the physical memory given to a VM is usually adjusted using memory ballooning, memory hotplug or Tmem [69].

This chapter presents a mechanism, called GV-Tmem, that provides the first extensions to a state-of-the-art hypervisor to enable the sharing of memory capacity across nodes in a cloud computing infrastructure. GV-Tmem introduces changes to the hypervisor, keeping it small and self-contained. Most of the complexity is in a user-space memory manager process running in the privileged domain that supports memory management policies and inter-node communication. Our main contributions in this chapter are:

1. A software architecture to aggregate memory across nodes using Tmem.

2. A two-tier mechanism for allocation and management of aggregated memory.

The rest of the sections in this chapter are organized as follows. Section 5.2 gives the necessary background to provide a sound context for GV-Tmem. Section 5.3 explains GV-Tmem. Section 5.4 describes the experimental methodology and Section 5.5 shows our results. Section 5.6 compares with related work and Section 5.7 concludes the chapter and outlines future work.

5.2 Background for Remote Memory Aggregation

This section recaps virtualization and memory management in cloud data centers, highlighting the aspects of each most relevant to building GV-Tmem. It also recaps Xen's Tmem and outlines the hardware characteristics that GV-Tmem exploits. This section is important for understanding the contributions of both this chapter and the next.

5.2.1 Virtualization In Cloud Computing Infrastructures

There are many cloud service models available in the industry, such as SaaS (Software-as-a-Service), PaaS (Platform-as-a-Service) and IaaS (Infrastructure-as-a-Service). Considered one of the most important, IaaS allows users to dynamically access a configurable (seemingly unlimited) pool of computing resources. In IaaS, the customers share the underlying hardware computing resources of the cloud infrastructure in a transparent way. This sharing of resources is made possible by deploying commercially available HVs such as Xen [10], KVM [41] or VMWare [74].

As explained in previous chapters, the HV virtualizes the computing resources of the node, creates VMs and allocates memory for them. The HV is able to overcommit memory, and there are cases in which a VM may be over-provisioned or under-provisioned with memory, both of which are undesirable conditions. To prevent these cases, HVs have mechanisms, like ballooning, hotplug and Tmem, that enable them to reallocate memory using high-level policies [35, 15, 37].

However, when the memory demand keeps increasing, at some point the memory available on the node will not be enough to ensure acceptable performance for the applications running inside the VMs. When this occurs in a cloud infrastructure, some of the VMs that are creating the memory load in a node will either face a large performance drop (because they are blocked from execution or they are generating a large number of swap operations) or will be migrated to a different node that can handle the memory load better. Even though migration will be preferred in most cases, the reality is that both of these situations (swapping or migration) compromise the performance of the applications inside the VMs and consume a significant amount of energy.

An alternative to migration or costly swap operations would be to allow the nodes to pool their memory resources (at least partially) through their address space, disaggregating the memory from a node and allowing it to be aggregated to any node in the infrastructure in a more flexible way.

As we have mentioned, the HVs provide mechanisms for dynamic memory management, like memory ballooning and memory hotplug, but the interfaces provided by these mechanisms are not well suited for the aggregation of memory capacity across nodes, because memory that is made available to the VMs as RAM (as it would be through ballooning or hotplug) imposes important constraints on the coherency and consistency of that memory. On the other hand, Tmem provides a more flexible interface for aggregating memory across nodes and even for its use in heterogeneous memory hierarchies. However, for a mechanism to aggregate memory across nodes to become viable and more attractive than just letting a VM perform swap operations or migrate, the hardware needs to have certain features.

5.2.2 Hardware Support in Virtualized Nodes for Remote Memory Aggregation

In systems such as Venice [24] and EUROSERVER [26] (based on ARM processors), the processors in each node are connected in clusters via a local cache-coherent interconnect to local resources. In particular, the Euroserver architecture implements the UNIMEM (Unified Memory) model [26]. Figure 5.1 illustrates how UNIMEM works.

Systems that are based on UNIMEM usually consist of N processor cores (6 to 8 in current implementations), grouped in clusters connected via a local cache-coherent interconnect to local DRAM and I/O devices. Remote memory is visible through the global physical address space, and the inter-node interface together with the global interconnect supports communication among the nodes, routing the remote memory accesses to the appropriate node.



Figure 5.1: UNIMEM architecture with two nodes, i.e. two coherence islands.

There are other architectures that are also based on similar principles to UNIMEM but using different memory models, such as dRedBox [49] and Beehive [106]. The essential characteristics of all of these architectures are:

• Each node executes its own server stack, including hypervisor and OSs.

• Global physical memory address space with low-latency access.

• Routing is based on the global physical address (e.g. high-order bits); see the sketch after this list.

• Fast communication is provided across the nodes of the system, bypassing traditional network protocols (e.g. TCP/IP).
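To make the routing characteristic concrete, the following Python sketch (our own illustration; the shift amount is an assumed value, not taken from any of these architectures) shows how a home node could be derived from the high-order bits of a global physical address:

```python
NODE_ID_SHIFT = 40  # assumed: node ID encoded above a 1 TiB per-node window

def node_of(phys_addr):
    """Extract the home node of a global physical address (illustrative),
    so the interconnect can route the access to the owning node."""
    return phys_addr >> NODE_ID_SHIFT

# Example: an address in the second node's window routes to node 1.
assert node_of((1 << 40) + 0x1000) == 1
```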

5.3 GV-Tmem Design

The GV-Tmem stack consists of three software components:

• Extended Xen HV for GV-Tmem (Section 5.3.1)

• Tmem Kernel Module in the kernel of all domains to support GV-Tmem (Section 5.3.2)

• Memory Manager (MM) in user space in Dom0 for GV-Tmem (Section 5.3.3)

5.3.1 Xen Hypervisor with Extensions for GV-Tmem

As was the case for SmarTmem in Chapter 2, it was necessary to extend a state-of-the-art HV (Xen in this case as well) in order to support GV-Tmem. The HV extensions for GV-Tmem are minimal and localized in the Tmem subsystem. First, the HV enforces the memory allocation constraints determined by the MM. Second, it allocates and deallocates physical pages and passes ownership of blocks of pages in and out of the HV. Third, it collects information on Tmem utilization that it sends to the MM, similar to SmarTmem presented in Chapter 2.

Enforcing Local per-VM Memory Constraints

The HV constrains the local and remote Tmem consumption of its VMs, based on the allocation determined by the MM. The MM specifies the maximum number of pages a VM can use, but it does not differentiate between the amount of local or remote Tmem pages.

Page Allocation and Transfer of Ownership

GV-Tmem ensures that each physical page is owned by at most one HV. Tmem pages owned by an HV are allocated using a zoned Buddy allocator, with a zone for each node from which it owns at least one page. A Tmem put operation allocates the closest free page from the allocators; thus, it will preferentially allocate a local page over a remote page. A Tmem flush operation causes a page to be returned to the corresponding Buddy allocator, regardless of whether the page is remote or local. In other words, a flush of any kind does not change the ownership of the page.

A Grant hypercall is used when the MM receives ownership of a list of blocks, where each block is an appropriately aligned power-of-two number of pages. These are added as free blocks to the appropriate Buddy allocator. In contrast, a Request hypercall is used to release ownership of pages on behalf of another node. A sketch of the block format used by these transfers follows below.
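To make the block format concrete, the following Python sketch (our own illustration, not code from GV-Tmem) splits an arbitrary range of page frame numbers into the appropriately aligned power-of-two blocks that a Grant could carry:

```python
def split_into_aligned_blocks(start_pfn, npages):
    """Split a page range into blocks whose size is a power of two and whose
    start is aligned to that size, the block shape exchanged via Grant."""
    blocks = []
    while npages > 0:
        # Largest alignment permitted by the current start address...
        align = start_pfn & -start_pfn if start_pfn else 1 << npages.bit_length()
        # ...capped by the largest power of two that still fits in npages.
        size = min(align, 1 << (npages.bit_length() - 1))
        blocks.append((start_pfn, size))
        start_pfn += size
        npages -= size
    return blocks

# Example: pages 6..15 decompose into blocks (6, 2) and (8, 8).
assert split_into_aligned_blocks(6, 10) == [(6, 2), (8, 8)]
```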

Memory Statistics Gathered in GV-Tmem

The HV collects information about the Tmem utilization of the VMs. More specifically, the HV monitors:

• Number of active VMs

• Amount of total Tmem capacity available to the HV

• Amount of Tmem capacity in use by each VM

• Number of put, get and flush operations of each VM

• Number of failed put operations of each VM

The information gathered needs to be minimized in order to avoid excessive communication overheads between different contexts of execution, which arise when multiple page-copy operations take place from the HV to the user-space MM. The HV sends this information to the MM approximately every second by issuing a custom virtual interrupt request (VIRQ) to the privileged domain. The TKM (Section 5.3.2) captures the interrupt and forwards the information to the MM.

97 5. AGGREGATING AND MANAGING MEMORY ACROSS COMPUTING NODES IN CLOUD ENVIRONMENTS

5.3.2 Tmem Kernel Module (TKM) for GV-Tmem

The TKM in this case performs a similar function to its counterpart for SmarTmem in Section 2.2.3, and it is also an extension of the baseline TKM present in current Linux kernels, which only provides basic Tmem support. In most of the VMs, the TKM acts as an interface between the VM and the underlying Tmem support in the HV. But in the privileged domain (Dom0), the TKM acts as an interface between the HV and the node's MM using netlink sockets. The TKM needed to be extended, with respect to the case of SmarTmem, to support the different commands configured and implemented in the MM and the different hypercalls in the HV.

5.3.3 Dom0 User-space Memory Manager for GV-Tmem

The MM was created from scratch, as was the case for SmarTmem in Chapter 2, and bears most of the functionality that supports GV-Tmem. Each node has a user-space MM in Dom0. The MMs perform most of the work of GV-Tmem by cooperating to:

1. Distribute memory owned by each node among its guests

2. Distribute global memory capacity among nodes

3. Implement the flow of page ownership among nodes

4. Enable nodes to join and leave, and handle failures

Most of the complexity of GV-Tmem has been kept in the MM. It has been designed this way for the same reasons explained in Section 2.2.4 for the case of SmarTmem. Moreover, there is an additional requirement for the case of GV-Tmem (and for vMCA as well): GV-Tmem has to support communication between the MMs across remote nodes, which is implemented with TCP/IP sockets (not through the partial global physical address space). Adding socket communication support in the HV entails a series of complications, which may yield an overgrown and insecure HV.

Joining the GV-Tmem System

There is one MM Master (MM–M) that controls the system and distributes the global memory capacity. The messages passed among the MMs are listed in Table 5.1. Every node requires a configuration file, which provides the network addresses of all nodes, their mappings to a node ID and the credentials to establish secure connections. When a node R wishes to join the GV-Tmem system, it sends Register to the MM–M (see Table 5.1), which enables the node to partake in the memory resource pool created across the infrastructure.

Distributing Memory Owned by a Node Among Guests

The local MM determines the maximum number of pages for each VM. This is done using a policy that determines this maximum based on the statistics received from the HV. This is the first tier of the memory management strategy. Pages are distributed subject to a memory consumption limit, set by the MM–M using the Mem-Limit hypercall.


Command            | Direction  | Description                                   | Slave state

Distribution of global memory capacity
Statistics(S)      | SL→MT      | Send node statistics S to Master              | A
Grant-Any(n, x)    | SL→MT      | Request grant of n pages to slave x           | A
Grant-Fwd(n, x)    | MT→SL      | Forward request of n pages to y from x        | A
Force-Return(x)    | MT→SL      | Return pages located at x and disable it      | A
Mem-Limit(n)       | MT→SL      | Limit allocated pages to store local data     | A

Flow of page ownership
Grant(b, ···)      | SL/MT→SL   | Transfer ownership of blocks of pages         | A

Node state changes
Register           | SL→MT      | Register a new node                           | I→A
Leave-Req          | SL→MT      | Node requests to leave or shutdown            | A→L
Leave-Notify       | MT→SL      | MM–M notifies that the recipient has left     | L→I
Enable-Node(x, e)  | MT→SL      | Accept (e = 1) or reject (e = 0) pages at x   | A

Table 5.1: MM message types. SL: slave, MT: master, I: Inactive, A: Active, R: Recovery, L: Leaving.

Distributing Global Memory Capacity Among Nodes

Nodes in the Active state regularly send Statistics messages to the MM–M. These messages consist of the statistics gathered by the HV of each node and information regarding the utilization of, and requests for, remote memory by the node. In this way, the MM–M has a general view of the status of each node from the perspective of memory utilization. Based on these statistics and the global memory policy, the MM–M redistributes the memory among nodes when it receives a new Grant-Any message, which is a request to transfer ownership of a number of free physical pages to the node that requested memory. The MM–M can forward the request to another node by sending a Grant-Fwd message or give some of its own pages to the requesting node. This is the second tier of the memory management strategy.

Implementing the Flow of Page Ownership

The MM–M rebalances Tmem capacity without knowing the physical addresses of the pages. Ownership of physical pages is transferred in a peer-to-peer way using Grant messages, each of which passes a list of blocks of addresses, each block an appropriately aligned power-of-two number of physical page addresses. Once a node is granted ownership of remote pages, it has exclusive access to them, and it is free to allocate these pages to store data on behalf of its VMs. The other nodes in the system, including the donor node, will not perform any reads or writes to those pages, since a page can be owned by only one node at a time. Thus, there is no need to order writes among multiple nodes. With this way of distributing memory, the OS inside each VM is oblivious to the amount and location of the Tmem pages; only the HV has a clear view of these details. Each node is able to store any data that its VMs attempt to swap to disk, without any constraints regarding the nature of the data.


Leaving the GV-Tmem System

To cleanly shut down a node R that is in GV-Tmem, the following procedure must be followed.

1. Node R sends a Leave-Req message to the MM–M.

2. Upon receiving the Leave-Req, the MM–M sets the node to the Leaving state and sends Force-Return(R) to all nodes. The nodes return the pages at R that they own and will reject any such pages received in future Grant messages.

3. Node R frees all pages used by Tmem and returns ownership of all remote pages to their home nodes.

4. Periodically, each node sends Grant messages to node R to return ownership of the pages that it had borrowed.

5. Once the MM–M has received Statistics messages from all nodes indicating that each has disabled R and owns no pages at R, the MM–M moves R to Inactive and sends Leave-Notify to R.

6. At this point, node R may shut down.

5.3.4 Hardware Support for Memory Aggregation

As explained in Section 5.2.2, GV-Tmem is suited to exploit the functionality of UNIMEM-based architectures such as Euroserver [26], which possess certain hardware characteristics. Coupling these hardware features with a virtualization layer and the software stack provided by GV-Tmem, we can enumerate more clearly the requirements that the underlying virtualized node needs to provide in order to fully support GV-Tmem:

1. A fast interconnect, providing a synchronous interface across the system.

2. Direct memory access from the HV to all the memory available. Memory accesses from the HV could be either through load/store instructions or RDMA, bypassing TCP/IP or similar protocols that require OS functionality.

3. Remote access to a node’s pages is disabled on hardware boot. Access is enabled only when the node joins GV-Tmem by sending the Register message.

4. Given a physical address, it must be possible to extract the node ID, i.e., to identify each node and route packets toward it.
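The last requirement can be illustrated with a minimal C sketch, assuming the node ID occupies the high-order bits of the global physical address; the exact bit split (here, 1 TiB of address space per node) is a hypothetical platform choice, not a documented UNIMEM constant.

#include <stdint.h>

#define NODE_ID_SHIFT 40u   /* assumed: 1 TiB of address space per node */

/* Node that homes the given global physical address. */
static inline uint32_t node_of_paddr(uint64_t paddr)
{
    return (uint32_t)(paddr >> NODE_ID_SHIFT);
}

/* Offset of the address within its home node's local memory. */
static inline uint64_t local_offset(uint64_t paddr)
{
    return paddr & ((1ULL << NODE_ID_SHIFT) - 1);
}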

5.3.5 Putting Everything Together

So far we have described the elements of the GV-Tmem software stack and the hardware features it exploits for its correct functionality. We have also described the message passing between the nodes and the procedures necessary to enable the pooling of memory resources. Once every node inside the infrastructure implements the GV-Tmem software stack, it is able to take part in the pooling of memory capacity. Figure 5.2 shows the architecture of a fully deployed GV-Tmem system.

100 5.3 GV-Tmem Design


Figure 5.2: A diagram of the architecture of a GV-Tmem system fully deployed.

Figure 5.2 presents nodes "i", "j" and "k", where node "k" hosts the MM–M, with the respective communication links across the privileged VMs of every node, in which the corresponding MM runs. Every node implements the software stack and the functionality of every component. The software stack of node "k" is omitted from the figure for simplicity.

As can be seen, this is a complex architecture with many moving parts. It also has similarities with the architecture presented in Chapter 2 for SmarTmem. Even though the building blocks (HV support, a specialized kernel module and a user-space process) are seemingly common to SmarTmem and GV-Tmem, the design of the functionality and interfaces within each component differs dramatically, enabling the two architectures to solve different problems and making them distinct contributions altogether.


Node | CPU | Frequency | Memory
Node 1 | AMD FX Quad-Core | 1.4 GHz | 6 GB
Node 2 | Intel Core i7 | 2.10 GHz | 8 GB
Node 3 | Intel Xeon | 2.262 GHz | 64 GB

Table 5.2: Hardware characteristics of the nodes used to evaluate GV-Tmem

Scn. | VM Parameters | Description
Scn. 1 | VM1, VM2: 768 MB RAM, 1 CPU; VM3: 1 GB RAM, 1 CPU | All VMs execute in-memory-analytics, sleep 5 seconds and then execute it again. The data set used is from [43].
Scn. 2 | VM1, VM2, VM3: 512 MB RAM, 1 CPU | VM1 and VM2 execute usemem, and VM3 starts when VM1 allocates 640 MB.
Scn. 3 | VM1, VM2: 512 MB RAM, 1 CPU | Every VM executes graph-analytics once. They use the dataset provided by [93], [92], [91].

Table 5.3: List of scenarios used to evaluate GV-Tmem

5.4 Experimental Methodology

We tested GV-Tmem on a platform consisting of three nodes, which is consistent with UNIMEM-based architectures that currently have 4 to 8 nodes, with 6 to 8 processor cores per node [26]. Every VM runs Ubuntu 14.04 with Linux kernel 3.19.0+ as the OS, and Xen 4.5. The MMs in the nodes communicate using Ethernet TCP/IP sockets. Node 2 acts as the Master node and executes no VMs. Table 5.2 summarizes the hardware properties of the nodes.

The shared global address space was emulated using each node's local memory. Emulating remote accesses was the best choice in order to be able to analyze the impact of interconnect latency on the memory aggregation and management mechanisms, although the analysis of the impact of latency and non-uniform latencies is out of the scope of GV-Tmem for now. We modified Xen to start up using a portion of the physical memory capacity, equalling the emulated memory capacity of the node. The rest of the node's memory capacity was reserved to emulate remote memory. Whenever the HV performs an emulated remote access after remote memory becomes available, we add a delay in the HV lasting 50 µs to model hardware latency.

We evaluate GV-Tmem using CloudSuite 3.0 [31] and a microbenchmark we designed called usemem, which was also used for the evaluation of SmarTmem, as mentioned in Section 2.3. We execute at most three DomUs simultaneously, and refer to each set of DomUs as a scenario (or Scn). Table 5.3 shows the scenarios used. For Scenarios 1 and 3, all nodes have 1 GB of Tmem capacity. For Scn. 2, Node 2 has 1 GB of Tmem, while Nodes 1 and 3 have 384 MB. As was the case for SmarTmem in Chapter 2, the scenarios, VM configurations and datasets were chosen to generate a considerable amount of memory pressure within the VMs, producing memory footprints two to three times their initial memory capacity [100].


This chapter uses three memory management policies:

• greedy-local: Default policy used in Tmem with only local memory, which gives memory away on demand. No maximum values are set to limit the amount of memory taken by a VM.

• greedy-remote: An extended version of greedy-local using remote memory.

• TTM: A two-tier memory management strategy that allocates memory locally to each VM (first tier) depending on the node's statistics, and issues requests for remote memory (second tier) depending on the perceived memory pressure. The number of pages allocated to a VM is increased or decreased in steps of P% of the pages owned by the node (local or remote), as sketched after this list.
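The following is a hedged sketch of TTM's first-tier step, under the assumption that a single boolean "under pressure" signal drives the decision; the function name and signature are illustrative, not the actual implementation.

#include <stdint.h>

/* Move the VM's target in steps of P% of the pages the node owns. */
static uint64_t ttm_adjust_target(uint64_t target, uint64_t owned_pages,
                                  int under_pressure, double P)
{
    uint64_t step = (uint64_t)((double)owned_pages * P / 100.0);
    if (under_pressure)
        return target + step;                   /* grow toward demand */
    return target > step ? target - step : 0;   /* otherwise reclaim  */
}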

5.5 Results

5.5.1 Results for Scenario 1

Figure 5.3 shows the average running times of each VM for Scn. 1. The running time improves by an average of 19.4% and 23.5% in nodes 1 and 3, respectively, when going from greedy-local to greedy-remote. When implementing TTM with P = 2.0%, there is a further improvement of 6.0% and 4.0% over greedy-remote, demonstrating the need for memory management policies when there is significant memory pressure.


Figure 5.3: Running time for Scn. 1 in nodes 1 and 3. Time is in seconds (less is better).

Figure 5.4 shows the amount of Tmem capacity that each VM is able to use in node 3 for the three policies. With greedy-local (Figure 5.4(a)), VM3 cannot obtain a fair share of the available Tmem capacity in either iteration. With greedy-remote (Figure 5.4(b)), the VMs are able to get more total Tmem, but because of the lack of memory management policies, some VMs are still unable to obtain a fair share of Tmem. With TTM (Figure 5.4(c)), every VM is ensured a fair amount of the available Tmem, demonstrating that TTM is able to ensure fairness in the VMs' allocation of Tmem while improving the running times. As in Section 2.4, fairness is defined qualitatively as all the VMs having an equal proportion of the available Tmem, given the amount of memory pressure they impose on the node.


Figure 5.4: Tmem capacity (nod-tmem) obtained by every VM in node 3 for Scn. 1 under (a) greedy-local, (b) greedy-remote and (c) TTM (P=2.0%). The label target-VM3 refers to the target allocation of VM3.


Figure 5.5: Running time for Scn. 2 for nodes 1 and 3.

5.5.2 Results for Scenario 2: the Usemem Scenario

The average running times for Scn. 2 are shown in Figure 5.5. When enabling greedy-remote, VM3, VM2 and VM1 reach average performance improvements of 63%, 20% and 13%, respectively. When TTM is enabled, VM3 shows a maximum and minimum improvement of 27% and 5.5%, respectively, in node 1, and a maximum and minimum improvement of 51% and 9.3%, respectively, in node 3. However, VM1 and VM2 both experience a performance loss.

Figure 5.6 shows the remote memory capacity that each VM is using for the three policies. Figure 5.6(b) shows that VM3 struggles to obtain memory pages using greedy-remote. This is similar to the case in Figure 5.4(a), in which VM3 was unable to reach its fair share of Tmem. With TTM, VM3 obtains a larger amount of Tmem, improving its performance. Here, VM3's improvement comes at the expense of VM1 and VM2, balancing the Tmem pages every VM can have.

5.5.3 Results for Scenario 3

The average running times for Scn. 3 are shown in Figure 5.7. In this case, node 1 improves by a maximum and a minimum of 92.3% and 92.1%, respectively, when comparing greedy-local to greedy-remote. When enabling TTM, it improves further by a maximum and a minimum of 6.0% and 0.9%.


Figure 5.6: GV-Tmem: Tmem capacity (nod-tmem) obtained by every VM in node 3 for Scenario 2 under (a) greedy-local, (b) greedy-remote and (c) TTM (P=2.0%).


Figure 5.7: Running time for Scn. 3 for nodes 1 and 3.

In node 3, there is a maximum and minimum improvement of 84.4% and 83.1% when comparing greedy-local to greedy-remote. However, performance degrades by 10% with TTM compared to greedy-remote. When TTM is enabled, the VMs require more memory, but TTM enforces limits, albeit flexible ones, on the memory they can take, similar to what occurs in Scn. 2 for VM1 and VM2. When TTM is disabled, the VMs take memory without restraint, thus performing slightly better. This highlights the need for more adaptive memory management policies.

5.6 Related Work

Zcache [68] is a backend that provides a compressed cache for swap and clean filesystem pages. RAMster [68] is an extension of zcache that uses kernel sockets to store pages in the RAM of remote nodes. In contrast, GV-Tmem grants and releases blocks of pages at a larger granularity, reducing the amount of software communication between nodes, since we exploit the shared global address space. RAMster is implemented in the kernel, whereas our approach uses a user-space process in a privileged VM, providing greater flexibility for memory management.

RAMCloud [82] is a POSIX-like filesystem in which all data is stored in DRAM across nodes, placing the data in one or more nodes and introducing issues of global coherency and exclusivity of access. GV-Tmem aggregates memory by exploiting a global shared address space, without requirements for global coherency. Hecatonchire [104] achieves resource aggregation by decoupling virtual resource management from physical resources. It uses a mediation layer that arbitrates how applications access resources. We differ from [104] by making memory available to the hypervisor through a user-space process.

In [12], Bielski et al. propose a mechanism to extend the amount of memory of the VMs without relying on ACPI support. They use a custom driver to allocate memory buffers for the VMs that abstracts the actual memory locations, allowing memory to be provided by a local or a remote node and allowing the VMs to access disaggregated memory transparently. They exploit memory ballooning capabilities and, to increase the memory limits beyond the upper bounds, they employ memory hotplug. Our work differs from [12] mainly in that our mechanism for addressing remote (or even local) memory relies on the shared address space and the additional level of indirection provided by the HV, which means that we do not require a special allocator to address remote memory.

In [40], Gu et al. proposed a mechanism called INFINISWAP that works as a remote paging mechanism by introducing a custom block device that is used as swap space. It divides the swap space of each machine into many slabs, distributes them across the remote memory of many machines, and relies on an RDMA network for low-latency networking. This approach is somewhat similar to GV-Tmem in that both intercept pages on the swap path, but our work leverages Tmem (hypervisor-level caching) instead of implementing a customized block device.

5.7 Conclusions for this Chapter and Future Work

This chapter introduced GV-Tmem, a method that exploits Tmem to share memory capacity across multiple nodes. We evaluated GV-Tmem using CloudSuite, obtaining up to 51% performance improvement using simple memory management policies. The results demonstrate the effectiveness of GV-Tmem.

Future work will investigate how to integrate GV-Tmem with the resource management mechanisms of other cloud software. It is also necessary to develop more sophisticated two-tier global memory management policies, in order to improve adaptability and responsiveness to changes in memory demand. Aspects of resiliency and fault tolerance also need to be addressed, as well as the efficiency and optimal use of the available Tmem capacity. Both of these issues are addressed in Chapter 6.

Chapter 6

vMCA: Memory Capacity Aggregation and Management in Cloud Environments

The problem of aggregating memory across nodes has many implications. Besides exploiting the interconnection infrastructure and the capabilities of Tmem, it is also necessary to address issues related to the status of each node, the potential for memory leaks and the efficient use of Tmem capacity, among others.

This chapter presents a memory capacity aggregation mechanism for cloud environments called vMCA (virtualized Memory Capacity Aggregation), based on Xen's Tmem. This solution is an extension of GV-Tmem, presented in Chapter 5. vMCA distributes the system's total memory globally across multiple nodes with special consideration for the distribution of Tmem within a node, similar to SmarTmem presented in Chapter 2, but extended to consider the presence of remote memory.


6.1 Introduction

As explained in Section 5.1, cloud data center environments are built using "share nothing" servers, each provisioned with its own computing resources, communicating over a TCP/IP (or similar) network. New system architectures [70, 24, 49], however, have been proposed that present a shared global physical address space and use a fast interconnect to share physical resources through the memory hierarchy. An example is the UNIMEM (Unified Memory) memory model, which is an important feature of the "EuroEXA" family of projects [70, 89]. In such systems, remote memory is addressable at low latency using Remote DMA and/or load/store instructions. Cache coherency is only enforced inside a node (a coherence island), which avoids global coherence traffic and enables scalability to large numbers of nodes.

GV-Tmem and vMCA are developed for this same class of node hardware architecture. Both contributions are designed around Tmem in order to exploit two key capabilities of UNIMEM-like architectures: the shared global (across the cloud infrastructure) physical address space and the fast interconnects. In the contributions presented in this chapter and in Chapter 5, we have added the virtualization layer (essentially, the HV) to leverage these hardware capabilities in a more flexible way, without requiring modifications to commercially available OSs and runtime systems. In this way, the new hardware architectures can be used and different OSs can be deployed transparently to application and kernel developers. However, there are some important differences between GV-Tmem and vMCA.

This chapter describes vMCA (virtualized Memory Capacity Aggregation), an extension of Xen's Tmem that leverages virtualization software and a global address space to allow the whole system's memory capacity to be shared among nodes. vMCA extends and improves upon an earlier proposal known as GV-Tmem [36] in order to handle resiliency, and provides a deeper discussion and improvements of the memory aggregation and allocation policies. Like GV-Tmem, vMCA introduces minimal changes to the hypervisor to enforce constraints on local and remote page allocation. The complexity is situated in a user-space Memory Manager (MM) process that runs in the privileged domain of each node, which implements a resilient and reliable mechanism for distributing memory capacity across nodes according to a high-level memory management policy. The main contributions of this chapter are:

1. An extended software stack to aggregate memory capacity across multiple nodes in a resilient and efficient way.

2. An extension of the multi-level user-space mechanism for allocation/management of aggregated memory capacity developed for GV-Tmem, with special considerations for memory allocation within a node and memory distribution across nodes.

3. A more complete analysis of high-level policies for memory aggregation and allocation.

This chapter is organized as follows. Section 6.2 presents vMCA and its components, explaining its resiliency to failures and the necessary hardware support. Section 6.3 describes the experimental methodology and Section 6.4 shows the evaluation results. Section 6.5 compares with related work. Finally, Section 6.6 outlines future work and provides some conclusions.


6.2 Design of vMCA

The vMCA stack consists of three components:

• Hypervisor support for vMCA (Sect. 6.2.1)

• Tmem Kernel Module (TKM) in Dom0 for vMCA (Sect. 6.2.2)

• Memory Manager (MM) in Dom0 for vMCA (Sect. 6.2.3)

6.2.1 Hypervisor Support for vMCA

The Xen HV has been extended to support vMCA, with all extensions localized in the Tmem subsystem, which originally supports the features necessary to allocate and deallocate local Tmem pages on behalf of the VMs.

Page ownership

Each HV owns a subset of the physical pages in the system, which constitutes the pool of memory that it can use and allocate to its VMs. When a HV boots and first joins the vMCA system, it owns all of its local physical memory, i.e. all pages whose home node is that node. If it needs additional memory, it may request ownership of additional remote memory pages. Conversely, it may release ownership of some of its local or remote pages, so that they can be granted to another node that needs them. As described in Sect. 6.2.3, the MMs collectively ensure that each physical page in the system is owned by at most one HV. The only times when a page is not owned by any HV are when the page is in transit between HVs and, rarely, following an invalidation caused by a failed node (Sect. 6.2.3).

Page allocation

The pages owned by a HV are allocated using a zoned Buddy allocator, with a zone for each node (including itself) from which it owns at least one page, as in GV-Tmem. One improvement in vMCA with respect to GV-Tmem is the way the allocators are configured and ordered in the HV in order to optimize page allocation. In vMCA, the allocators are arranged in a tree data structure corresponding to the hierarchy of the system, in which the nodes are at the leaves, followed by, e.g., boards, chasses and racks. This design optimizes locality by ensuring that remote pages are taken from the closest possible node when a Tmem put occurs. In this scheme, a Tmem put traverses the tree to find the closest non-empty allocator and takes a page from it; the preferred (closest) node is one that shares a common parent with the node that issues the put. At this point, it is assumed that the local HV has ownership of the page. This implies that every node in the system implements a similar tree structure. A Tmem flush operation frees a page, returning it to the corresponding Buddy allocator.
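A minimal C sketch of this hierarchical lookup follows, assuming opaque page/zone types and a zone_alloc() helper; none of these names come from the actual vMCA implementation.

#include <stddef.h>

struct page;        /* opaque: a physical page            */
struct buddy_zone;  /* opaque: per-node Buddy allocator   */
struct page *zone_alloc(struct buddy_zone *z);  /* assumed helper */

struct alloc_node {
    struct buddy_zone  *zone;        /* non-NULL only at the leaves  */
    struct alloc_node  *parent;      /* NULL at the root (e.g. rack) */
    struct alloc_node **children;
    int                 nr_children;
};

/* Depth-first search of the subtree rooted at n, skipping the child
 * that was already searched on the way up. */
static struct page *subtree_alloc(struct alloc_node *n,
                                  struct alloc_node *skip)
{
    if (n->zone)
        return zone_alloc(n->zone);
    for (int i = 0; i < n->nr_children; i++) {
        if (n->children[i] == skip)
            continue;
        struct page *pg = subtree_alloc(n->children[i], NULL);
        if (pg)
            return pg;
    }
    return NULL;
}

/* Walk up from the local leaf, so the first page found comes from the
 * node sharing the nearest common ancestor with the requester. */
struct page *closest_page(struct alloc_node *local_leaf)
{
    struct alloc_node *prev = NULL;
    for (struct alloc_node *n = local_leaf; n; prev = n, n = n->parent) {
        struct page *pg = subtree_alloc(n, prev);
        if (pg)
            return pg;
    }
    return NULL;   /* the HV owns no free pages anywhere */
}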


Transfer of page ownership

In order to minimize communication overheads, page ownership is transferred with hypercalls in blocks of addresses, each an appropriately-aligned power-of-two number of pages. The HV supports four hypercalls to transfer page ownership. The Grant hypercall is used to receive ownership of a list of blocks, which are added to the appropriate Buddy allocator. In contrast, a Request hypercall asks the HV to release ownership of a number of pages on behalf of another node. In response, the HV takes the necessary blocks from its allocators using a simple heuristic (see Page allocation) that prefers large blocks located physically close to the target node. The Return hypercall is issued when another node wishes to shut down or leave vMCA. It asks the HV to release ownership of all pages that are physically located on that node. It returns free pages and, executing asynchronously, searches for pages in use by the Tmem clients and migrates their contents to free pages. Finally, the Invalidate-Xen hypercall is used if a remote node fails: it discards all free or allocated pages located in that node, and terminates all VMs that were using that data.
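As a summary of the four operations, the following C enumeration is purely illustrative; the actual sub-op codes inside the modified Xen Tmem subsystem are not documented here and may differ.

enum vmca_ownership_op {
    VMCA_GRANT,           /* receive ownership of a list of blocks     */
    VMCA_REQUEST,         /* release n pages on behalf of another node */
    VMCA_RETURN,          /* release all pages homed at a leaving node */
    VMCA_INVALIDATE_XEN,  /* discard all pages homed at a failed node  */
};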

Enforcing local per-VM Tmem constraints

The HV enforces the Tmem allocations determined by the MM. Algorithm 7 illustrates this mechanism. The do_tmem_put function is called when a VM attempts to store a page in Tmem through the corresponding hypercall. The HV checks the amount of Tmem the VM has (line 4) and returns −ENOMEM, denying the request, if the VM is already at the limit of its allocation. If not (lines 9–11), the Tmem page is given to the VM and the necessary data structures are updated. The mechanism that enforces the Tmem allocations of each VM as defined by the high-level policies is conceptually very similar to the one used in Algorithm 1 developed for SmarTmem, but the one developed for vMCA has important design enhancements. Mainly, the allocation mechanism is more complex in vMCA (details omitted from Algorithm 7 for simplicity), and there is an additional mechanism to keep track of the location of the remote memory in use by each node. Even though this information is not visible to the MM, it is critical for shutting down a node in a controlled way and for the resiliency mechanism that allows the node and its memory to be restored in case of failure. The details are explained in Section 6.2.3.

Tmem Statistics

Table 6.1 presents the most relevant Tmem statistics gathered by the HV and kept by the MM. The HVs send statistics to their respective MMs every second, and their size is minimized to prevent large communication overheads. Table 6.1 shows some similarities with Table 2.1 for the SmarTmem design. However, more fields needed to be added to implement vMCA, and we enumerate some of the most important modifications:

1. In the case of vMCA, we use put rates instead of numbers of puts.

2. We have added additional fields to the information gathered by the HV that allow the node to be identified.


Algorithm 7 Enforce memory allocation in the HV

1: function do_tmem_put(vm_data_hyp, id)
2:   tmem_used ← vm_data_hyp[id].tmem_used
3:   mm_target ← vm_data_hyp[id].mm_target
4:   if tmem_used ≥ mm_target then
5:     return_value ← −ENOMEM
6:   else if node_info.free_tmem == 0 then
7:     return_value ← −ENOMEM
8:   else
9:     vm_data_hyp[id].tmem_used += 1
10:    vm_data_hyp[id].puts_succ += 1
11:    return_value ← 1
12:  end if
13:  vm_data_hyp[id].puts_total += 1
14:  return return_value
15: end function

3. A variable was also added to keep track of the varying amount of total Tmem owned by the node. The update process for this variable is more complex than the one designed for SmarTmem.

6.2.2 Dom0 Tmem Kernel Module (TKM)

The Tmem client interface requires a kernel module in each guest domain. vMCA needs a kernel module in the privileged domain Dom0 that acts as an interface between the HV (using hypercalls) and the node’s MM.

6.2.3 Dom0 User-space Memory Manager

Each node has a user-space MM in its privileged domain, Dom0. The MMs cooperate to:

• Distribute memory owned by each node among its VMs

• Distribute global memory capacity among nodes

• Implement the flow of page ownership among nodes

• Enable nodes to join/leave vMCA, and handle failures

In the current design, one of the MMs is designated to be the Memory Manager Mas- ter (MM–M), which is responsible for system control and global memory capacity distribu- tion. The MMs communicate using a secure and reliable packet transport such as SSL/TLS. The commands passed among the MMs are listed in Table 6.2. In general, the inter-node messages used to support the functionalities of vMCA are fairly similar to the ones used to support GV-Tmem presented in Table 5.1. However, additional messages have been added to the repertoire of vMCA in order to support the resiliency features offered in this case.


Memory Statistic | Description
ENOMEM | Code used in the HV to signify that a put failed due to a lack of Tmem capacity
node_info | Data structure that holds general status information of the computer host
node_info.total_tmem | Total number of pages available to Tmem (free or allocated); varies as the node gains or relinquishes ownership of Tmem pages
node_info.id | Identifier of the node inside the vMCA-supported system
vm_data_hyp | Data structure that holds the parameters of all of the VMs within the HV
vm_data_hyp[id].vm_id | Identifier of the VM within Xen
vm_data_hyp[id].tmem_used | Number of Tmem pages currently used by the VM
vm_data_hyp[id].mm_target | Target number of pages for the VM, held by the HV and previously sent by the MM
vm_data_hyp[id].puts_total_rate | Total number of puts issued by the VM in the most recent period
vm_data_hyp[id].puts_succ_rate | Total number of successful puts issued by the VM in the most recent period
memstats | Data structure that holds the last sampled statistics that the HV sent to the MM
memstats.vm_count | Number of active VMs as seen by the MM
memstats.vm | Array where each entry holds statistics about an active VM
memstats.vm[i].vm_id | Identifier of the VM within the MM, as received from the HV
memstats.vm[i].puts_total_rate | Total number of puts that a VM has issued to the HV in the recent period
memstats.vm[i].puts_succ_rate | Total number of successful puts that a VM has issued to the HV in the recent period
memstats.total_alloc_pages | Total number of pages available in every zoned Buddy allocator of the HV
mm_out | Pointer to a data structure that holds the output parameters of the MM policy
mm_out[i].vm_id | VM identifier that maps a VM to its target allocation as calculated by the MM
mm_out[i].mm_target | Memory allocation target as calculated by the policy in the MM

Table 6.1: Summary of memory statistics used in the MM for vMCA
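For concreteness, the following is a minimal C rendering of the statistics in Table 6.1, assuming plain integer fields and a fixed upper bound on VMs; the real MM data structures may be laid out differently.

#include <stdint.h>

struct vm_memstats {
    uint32_t vm_id;            /* VM identifier, as received from the HV */
    uint64_t tmem_used;        /* Tmem pages currently used by the VM    */
    uint64_t mm_target;        /* target allocation set by the MM        */
    uint64_t puts_total_rate;  /* puts issued in the last period         */
    uint64_t puts_succ_rate;   /* successful puts in the last period     */
};

struct memstats {
    uint32_t node_id;            /* node identifier inside vMCA            */
    uint64_t total_tmem;         /* pages owned by the node (varies)       */
    uint64_t total_alloc_pages;  /* free pages across the zoned allocators */
    uint32_t vm_count;           /* active VMs as seen by the MM           */
    struct vm_memstats vm[64];   /* assumed fixed upper bound on VMs       */
};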

Joining the vMCA system

In order to join vMCA, a node requires a configuration file, which provides the network addresses of all the nodes, security credentials for secure connections, and the mapping from node ID to network address for all nodes. For managing locality, it also needs to know the location of each node in the NUMA hierarchy, so that this information can be passed to the allocators and to the mechanisms that keep track of the locations of Tmem in use in the HV. When a node R wishes to join vMCA, it first sends a message with a Register command to the MM–M (Table 6.2). The MM–M sets the node's state to Active (Figure 6.1) and sends an Enable-Node(R,1) command to all the registered nodes. Each node maintains a bitmap of the enabled nodes, in order to keep track of which nodes are active. When a node receives such a message, it checks whether it already has ownership of any pages whose home is node R, which would be a fatal error. This mechanism is necessary to prevent reads of pages homed at a node that has just entered the system, which would return garbage data that could compromise the functionality of the node performing the reads.
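A sketch of the enabled-node bitmap and the join-time check follows; a system of at most 64 nodes is assumed for brevity, and abort_fatal() is an assumed error handler, not part of the actual MM.

#include <stdint.h>
#include <stdbool.h>

void abort_fatal(const char *msg);   /* assumed: tear down the MM */

static uint64_t enabled_nodes;       /* bit x set => node x enabled */

static void set_node_enabled(uint32_t x, bool e)
{
    if (e)
        enabled_nodes |= (1ULL << x);
    else
        enabled_nodes &= ~(1ULL << x);
}

/* Handling Enable-Node(R,1): owning pages whose home is a node that
 * has only just joined would mean their contents are garbage. */
static void on_enable_node(uint32_t R, bool owns_pages_at_R)
{
    if (owns_pages_at_R)
        abort_fatal("stale ownership of pages homed at joining node");
    set_node_enabled(R, true);
}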

Distributing ownership of memory

The flow of memory pages across vMCA considers many aspects.


Command | Direction | Description | Slave state

Distribution of global memory capacity:
Statistics(S) | S→M | Send node statistics S to Master | Active
Grant-Any(n, x) | S→M | Request n pages to slave x | Active
Grant-Fwd(n, x, y) | M→S | Forward request of n pages coming from y to slave x | Active
Force-Return(n, x) | M→S | Disable node and return all pages located at slave x (for leaving) | Active

Flow of page ownership:
Grant(b, ···) | S/M→S | Transfer ownership of blocks of pages | Active

Node state changes:
Register | S→M | Register a new node | Inactive→Active
Leave-Req | S→M | Node requests to leave the system (e.g. for shutdown) | Active→Leaving
Leave-Notify | M→S | MM–M notifies that the recipient has left the system | Leaving→Inactive
Enable-Node(x, e) | M→S | Inform slave to accept (e = 1) or reject (e = 0) pages at node x | Active

Resiliency support:
Invalidate | S→M | Invalidate all pages located at sending node | Active/Leaving→Recovery
Invalidate(x) | M→S | Request recipient to invalidate pages at node x | Active
Invalidate-Notify(x) | M→S | MM–M notifies that the recipient has been invalidated | Recovery→Inactive

Table 6.2: MM message types. In the table, “S” stands for Slave and “M” for Master.

Distributing memory owned by a node among its local VMs: Each node's MM determines the maximum Tmem allocation for each VM, using a policy (Sect. 6.2.4) based on the Tmem statistics gathered by the HV. Pages are dynamically allocated to every VM subject to total (local plus remote) consumption limits, which are sent to the HV using a Mem-Limit hypercall. For every policy, the following condition must be met:

\sum_{i=1}^{m} vm\_data_{MM}[i].mm\_target = total\_tmem(t_i)    (6.1)

Eq. 6.1 means that at time t_i, the amount of total Tmem available to the node (total_tmem = local_tmem + remote_tmem) is equal to the sum of the allocations of all m active VMs. This implies that every Tmem page must be used and that Tmem pages are not overallocated. However, when calculating the allocation limits for every VM, it is possible that the sum of the allocations exceeds total_tmem(t_i). Representing total_tmem as a function of time t_i reflects that the amount of total Tmem of a node changes as it requests or gives away ownership of Tmem pages. This is a fundamental difference compared to SmarTmem, which only considers a constant amount of local Tmem, and it has important implications for the design of the software stack (hypercalls to support, data that needs to be stored, allocation mechanisms, etc.).


Overallocation of pages causes VMs to compete for the available Tmem capacity, defeating the purpose of the management policies. When overallocation occurs, the MM recalculates the targets of the VMs as follows:

tgt_{vm_i} = \frac{total\_tmem \times vm\_data_{MM}[i].mm\_target}{\sum_{i=1}^{m} vm\_data_{MM}[i].mm\_target}    (6.2)

Eq. 6.2, where tgt_{vm_i} is an abbreviated form of vm_data_MM[i].mm_target from Table 6.1, ensures that the allocation of each VM follows the policy's desired proportions while also satisfying Eq. 6.1 (a minimal sketch of this rescaling appears at the end of this subsection).

Distributing global memory capacity among nodes: All nodes in the Active state send statistics to the MM–M using the Statistics command. A node that needs memory sends a Grant-Any command to the MM–M, which then decides which of the nodes in the vMCA realm is best suited to satisfy the node's request for memory. The MM–M makes this decision based on the amount of available memory each candidate node has. In this scheme, the MM–M redistributes the memory based on the statistics and the global memory policy. The MM–M then sends Grant-Fwd commands, forwarding requests to a donor node to transfer ownership of a number of free physical pages to a requesting node.

Flow of page ownership: Conceptually, the flow of page ownership is identical in GV-Tmem and vMCA. The MM–M rebalances memory capacity without specific knowledge of the physical addresses, but the HV in vMCA needs to keep track of which nodes are borrowing pages from the local node and from which remote nodes the local node is borrowing pages. Ownership of global physical addresses is transferred peer-to-peer using Grant commands, after the MM–M has allowed the remote node to give page ownership to the node that requested the memory. A Grant command passes a list of blocks, each an appropriately-aligned power-of-two number of physical pages. The MM–M can also satisfy the request for memory of another node itself, in which case it sends a Grant command directly to the requesting node. To avoid race conditions during node shutdown/failure, the recipient checks the home node of each received block against the bitmap of enabled nodes (Sect. 6.2.3). If the home node of a block is disabled, then ownership is returned back to its home using a new point-to-point Grant.
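The rescaling of Eq. 6.2 can be sketched in a few lines of C, using integer arithmetic; the function name and array-based interface are assumptions for illustration.

#include <stdint.h>

/* When the per-VM targets sum to more than total_tmem, scale each
 * target down proportionally (Eq. 6.2) so that Eq. 6.1 holds again. */
static void rescale_targets(uint64_t mm_target[], int m, uint64_t total_tmem)
{
    uint64_t sum = 0;
    for (int i = 0; i < m; i++)
        sum += mm_target[i];
    if (sum <= total_tmem)
        return;   /* no over-allocation: nothing to fix */
    for (int i = 0; i < m; i++)
        mm_target[i] = (total_tmem * mm_target[i]) / sum;
}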

Leaving the vMCA system

Shutting down a node in vMCA requires a procedure similar to that of GV-Tmem, but with one additional step that checks that ownership of the pages of the node that wants to shut down has been returned to it. Note that if a node fails to retrieve the ownership of its pages or it shuts down in an unexpected way, then the procedure in Sect. 6.2.3 is followed. We reproduce the process for node shutdown in vMCA, with the additional step(s):

1. The node R that wishes to shut down sends Leave-Req to the MM–M.

2. The MM–M in response moves the node to the Leaving state (see Figure 6.1). It sends a Force-Return(R) command to all nodes in the system. Each recipient returns all of the pages at R that it owns and will reject any such pages received in future Grant commands.

3. Node R frees all pages used by Tmem and returns ownership of all remote pages to their home nodes using peer-to-peer Grant messages.

4. Each node sends a Grant command to node R to return ownership of the pages that it borrowed.

5. Once the MM–M has received Statistics from every node indicating that it has disabled R and owns no pages at R, the MM–M moves R to Inactive (Figure 6.1) and sends Leave-Notify to R.

6. At this point, node R checks that it has ownership of all its local non-leaked pages, officially leaves the vMCA system, and may proceed to shut down.


Figure 6.1: Node state transition diagram (at MM–M)

Resilient memory capacity aggregation

There are two aspects to ensuring resiliency in the face of node failures: 1) protecting the integrity of the data stored in Tmem, and 2) restoring lost system state.

Ensuring data integrity on slave failure: The Tmem interface guarantees that a get returns the data previously written by the corresponding put. In addition, if the pool is persistent, the get operation must succeed. Assume that node A has ownership of memory located at node R. If R fails, then this data is lost. If R reboots, it will restart and may even attempt to re-join vMCA. Since any get performed at A must never access the (incorrect) new contents of the physical page at node R, the following procedure must be followed after a failure:

1. When a node R boots, remote memory accesses to it must raise a hardware exception (Sect. 6.2.5); the same exception is raised when a node accesses a page from a failed node. If a remote access causes such an exception, the VM must be shut down if the page is in a persistent pool. If the pool is non-persistent, the get fails. On any failed access, read or write, the page is returned to the corresponding Buddy allocator, and it is disabled.

2. When node R is initialized without a previous clean shutdown, it sends Invalidate to the MM–M.

3. The MM–M moves the node to the Recovery state (Figure 6.1) and sends Invalidate(R) to all nodes. Each node then disables node R, and if it has ownership of pages with home R, it makes an Invalidate-Xen hypercall to invalidate the pages.

4. Once the MM–M has received Statistics from every node indicating that it has disabled R and owns no pages at R, the MM–M moves the node to the Inactive state and sends an Invalidate-Notify command to R.

5. Node R begins its normal initialization procedure upon receiving the Invalidate-Notify command.

Restoring lost pages on slave failure: After a node R fails, vMCA is unable to recover the pages that it owned. It is possible to approximate the number of pages lost from a home node using the statistics sent by the nodes to the MM–M, by adding up the number of pages owned and comparing against the number of physical pages at the node. However, since this is done using potentially outdated information from Statistics commands, the value will not be exact.

Restoring system state on MM–M failure: If the MM–M fails, the system will continue to work, except that a) no memory capacity will be redistributed, and b) nodes cannot enter or leave the system. When the MM–M restarts, it listens for connections from the other nodes. When a node connects and starts sending Statistics, it is added to the system in the Active state. A node that is in the process of leaving or invalidation will send a new Leave-Req or Invalidate command on re-connecting to the MM–M, which will restart the leaving or invalidation process. At any time, all nodes in the Active state are part of the global memory allocation.

6.2.4 Memory Management Policies

We tested three memory management policies: 1) greedy-local [36] (the single-node Tmem implementation), 2) greedy-remote [36], and 3) a two-level memory management policy (TLP), divided into a first and a second level of memory management, implemented for this work. greedy-remote, like greedy-local, allows the node's HV to give Tmem pages away to its VMs on demand without any constraints. In contrast to greedy-local, greedy-remote issues a Grant-Any to the MM–M requesting 1000 pages when the MM detects that the node needs memory.

Algorithm 8 shows the first level of TLP. It uses the statistics provided by the HV (memstats, in Table 6.1), iterating over the running VMs (line 5) and checking whether each VM has had failed puts since the previous statistics set (lines 6–9). Initially, all VMs have an equal portion of the node's available Tmem.


Algorithm 8 First-Level of TLP

1: function flm_policy(memstats, node_info, P, threshold)
2:   ttmem ← memstats.total_alloc_pages
3:   node_id ← node_info.id
4:   sumtgt ← 0
5:   for i ← 1, memstats.vm_count do
6:     ptr ← memstats.vm[i].puts_total_rate
7:     psr ← memstats.vm[i].puts_succ_rate
8:     puts_fail_rate ← ptr − psr
9:     if puts_fail_rate > 0 then
10:      ctgt ← memstats.vm[i].mm_target
11:      mm_target ← ctgt + (P ∗ ttmem)/100
12:    else
13:      ctgt ← memstats.vm[i].mm_target
14:      curr_use ← memstats.vm[i].tmem_used
15:      difference ← ctgt − curr_use
16:      if difference > threshold then
17:        mm_target ← ((100 − P) ∗ ctgt)/100
18:      else
19:        mm_target ← ctgt
20:      end if
21:    end if
22:    mm_out[i].vm_id ← memstats.vm[i].vm_id
23:    mm_out[i].mm_target ← mm_target
24:    sumtgt ← sumtgt + mm_target
25:  end for
26:  if sumtgt > ttmem then
27:    send_msg(Grant-Any(node_id, (sumtgt − ttmem)))
28:    for i ← 1, memstats.vm_count do
29:      nt ← (ttmem/sumtgt) ∗ mm_out[i].mm_target
30:      mm_out[i].mm_target ← nt
31:    end for
32:  end if
33:  send_hypercall(Mem-Limit(mm_out))
34:  send_msg(Statistics(memstats))
35: end function

If puts_fail_rate > 0, the MM increases the allocation of the VM by P% of the total amount of Tmem pages available to the node (lines 9–11). If puts_fail_rate = 0, the VM's allocation is reduced by P%, assuming that it currently exceeds its actual use by more than a threshold (difference > threshold); otherwise, it keeps its current allocation (lines 12–20).

The MM then assigns the target allocations to the corresponding data structures and accumulates their sum (lines 22–24). If the sum exceeds the amount of Tmem available, the MM does the following: 1) sends a Grant-Any command to the MM–M to request memory pages (line 27), and 2) readjusts the VMs' allocations according to Eq. 6.2 (lines 28–31). The MM sends the Grant-Any command because the node is under memory pressure, so it needs remote pages. There is no guarantee that it will get them; thus it is necessary to readjust the allocations until more pages become available. After this, the MM issues the Mem-Limit hypercall (line 33) and sends the Statistics message to the MM–M if the node is a slave (line 34).

Upon receiving the Grant-Any request, the MM–M decides how many pages to actually request and from which node(s) to take them. This is the second level of TLP, shown in Algorithm 9. The MM–M has an extended version of memstats, in which it stores its own statistics and those of the slave nodes, received through Statistics messages.

Algorithm 9 Second-level of TLP

1: function slm(memstats, GrAny, R, np ← {})
2:   for i ← 1, memstats.node_count do
3:     np[i].id ← memstats[i].node_id
4:     np[i].tmem ← memstats[i].total_tmem
5:   end for
6:   sort_nodes(np)
7:   for i ← 1, memstats.node_count ∧ GrAny.n > 0 do
8:     diff ← np[i].tmem − GrAny.n
9:     if diff > R then
10:      nr ← GrAny.n
11:      send_msg(Grant-Fwd(nr, np[i].id, GrAny.x))
12:      GrAny.n ← 0
13:    else if diff < R ∧ np[i].tmem ≥ R then
14:      nr ← np[i].tmem − R
15:      send_msg(Grant-Fwd(nr, np[i].id, GrAny.x))
16:      GrAny.n ← GrAny.n − nr
17:    end if
18:  end for
19: end function

Algorithm 9 uses a data structure (np) that stores the amount of Tmem pages available in each node along with the node IDs (lines 3–4). It then sorts the nodes in np (using merge sort) in descending order (line 6). This determines which node to forward the request to (lines 7–18), usually the one with the most memory available. The MM–M first checks that the potential donor node will keep enough memory for itself (above a threshold R). If the node is able to satisfy the request while meeting the threshold, the request is forwarded to it using a Grant-Fwd message (lines 9–12). If the node is unable to satisfy the whole request but has memory available, the MM–M can still forward the request (line 15), requesting fewer pages (line 14) so as to meet the node's threshold. A new node from np is then selected to request the rest of the remaining pages. From this process, it is clear that there is no guarantee that requests for pages will be satisfied.

6.2.5 Hardware Support for Memory Aggregation

vMCA requires the computing node to provide specific hardware features. vMCA and GV-Tmem require similar hardware support, and they have a similar high-level view when deployed, as shown in Figure 5.2 in Chapter 5, but vMCA requires additional features to support the new resiliency constraints. For convenience, we enumerate all of the hardware features necessary for vMCA:

1. Fast interconnect providing a synchronous interface to the NUMA distributed memory.


Node | CPU | Frequency | Memory
Node 1 | AMD FX Quad-Core | 1.4 GHz | 8 GB
Node 2 | Intel Core i7 | 2.10 GHz | 16 GB
Node 3 | Intel Xeon | 2.262 GHz | 64 GB
Node 4 | AMD FX-4300 Quad-Core | 3.8 GHz | 8 GB

Table 6.3: Hardware characteristics of the nodes used to evaluate vMCA.

2. Direct memory access from the HV to all the memory available: the HV has to be able to access transparently its local and remote memory through the same mechanisms, either through load/store instructions and/or through RDMA.

3. Remote access to a node's pages is disabled on hardware boot. Access is enabled when the node joins vMCA, when it sends the Register command. A series of hardware control mechanisms, such as interrupts and event controls, also needs to be implemented to support the resiliency constraints within vMCA.

4. Extraction of the node ID given a physical address, for instance depending on some higher-order bits.

6.3 Experimental Methodology

We evaluated vMCA on a platform consisting of four computing nodes; UNIMEM-based architectures usually have 4 to 8 nodes sharing the global physical address space. Every node and VM runs Ubuntu 14.04 with Linux kernel 3.19.0, and Xen 4.5. The MMs of the nodes communicate using TCP/IP sockets over Ethernet. Node 2 executes the MM–M. The hardware properties of the nodes are given in Table 6.3. Accesses to remote sections of the shared global address space are emulated using the node's local memory, as was done for GV-Tmem. Whenever the hypervisor performs an emulated remote access, we add a delay in the hypervisor lasting 50 µs to model a reasonable worst-case hardware latency.

We evaluate vMCA using the CloudSuite 3.0 benchmarks [31], with every VM having 1 vCPU and every node having 1 GB of Tmem initially available. We run at most three VMs simultaneously with memory-intensive applications that generate memory pressure on the node, and again refer to each set of VMs as a scenario, shown in Table 6.4. We run each scenario at least four times. As with SmarTmem and GV-Tmem in Chapters 2 and 5, the scenarios, VM configurations and datasets were chosen to generate a considerable amount of memory pressure within the VMs, producing memory footprints two to three times their initial memory capacity [100].
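The delay injection can be sketched as a simple busy-wait in the hypervisor; now_ns() stands in for whatever cycle/time source the HV provides, and the function name is an assumption rather than the actual implementation.

#include <stdint.h>

#define REMOTE_DELAY_NS 50000ULL   /* 50 µs modeled interconnect latency */

uint64_t now_ns(void);             /* assumed HV time source */

/* Spin until the modeled latency has elapsed before completing an
 * access that falls in the emulated remote range. */
static void emulate_remote_latency(void)
{
    const uint64_t deadline = now_ns() + REMOTE_DELAY_NS;
    while (now_ns() < deadline)
        ;   /* busy-wait: the delay is too short to schedule away */
}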


Scenario | VM RAM (MB) | Description
Scenario 1 | VM1: 768, VM2: 1024, VM3: 1024 | Every VM simultaneously executes in-memory analytics once, sleeps 5 seconds and then executes it again. The data set was taken from [43].
Scenario 2 | VM1: 768, VM2: 768, VM3: 768 | Every VM simultaneously executes graph analytics once. The data set was taken from [16, 92, 91], chosen to generate enough memory pressure.
Scenario 3 | VM1: 768, VM2: 768, VM3: 1024 | VM1 and VM2 execute graph analytics while VM3 executes in-memory analytics, all at once. Every VM executes its benchmark twice.
Scenario 4 | VM1: 768, VM2: 512, VM3: 512 | VM1 executes graph analytics, while VM2 and VM3 execute the client and server (memcached [32]) for data-caching, respectively.

Table 6.4: List of scenarios used to evaluate vMCA.

6.4 Results

6.4.1 Results for Scenario 1

Figures 6.2(a, b) show the average running times of each VM for Scenario 1 for all policies, showing average improvements of 11.94% and 12.9% in nodes 3 and 4, respectively, when going from greedy-local to greedy-remote. TLP shows an improvement of 7.3% (P = 2.0%) and 7.0% (P = 6.0%) in nodes 3 and 4 over greedy-remote.

Figures 6.2(c, d, e) present the Tmem capacity (remote and local) that each VM takes for all policies in node 3. For greedy-local and greedy-remote (Figures 6.2(c, d)), VM3 and VM4 take a smaller proportion of Tmem than VM2, but this disparity is reduced with TLP, as shown in Figure 6.2(e). TLP tries to be fair in the distribution of the Tmem capacity among the VMs. In this case, fairness is defined in the same way as in Sections 2.4 and 5.5.

For some values of P, TLP degrades the performance of in-memory analytics. Consider the first iteration of VM1 in node 3 for P = 1% (tlp-1, Figure 6.2(a)). With TLP, limits are enforced on the Tmem capacity of a VM. When a VM tries to go beyond its initial allocation, the hypervisor prevents it from taking pages until the MM updates its allocation target. With P = 1%, the targets increase more slowly than needed, forcing the VM to access the disk device. With higher P, the targets increase faster. But when P becomes too high, a VM might have excess memory allocated, making it unlikely for other VMs to take a fair share, thus degrading performance. In Figure 6.2(a), the performance improves when increasing P up to 2.0%, but degrades again as P keeps increasing. This highlights a trade-off between the adaptability of TLP to adjust to changes in memory demand and its ability to achieve a fair distribution of Tmem, similar to what was seen for SmarTmem.



Figure 6.2: Running time for Scenario 1 in nodes 3 (a) and 4 (b). Tmem capacity (nod-tmem) used by every VM in node 3 for Scenario 1 under all three policies (c: greedy-local, d: greedy-remote, e: TLP with P=2.0%). target-VM4 refers to the target allocation of VM4.


6.4.2 Results for Scenario 2

Figures 6.3(a, b) present the average running times of each VM for Scenario 2 for the three policies. They show average improvements of 50.84% and 45.96% in nodes 3 and 4, respectively, when going from greedy-local to greedy-remote. With TLP, there is a further improvement of 32.1% (P = 2.0%) and 31.8% (P = 2.0%) in nodes 3 and 4, respectively, over greedy-remote.

Figures 6.3(c, d, e) show the amount of Tmem taken by each VM on node 3 for each policy. With P = 2.0%, the MM enforces fairness in the Tmem allocation of the VMs, in contrast to greedy-local and greedy-remote. The difference in Tmem ownership between VM4 and the other VMs is significant for greedy-local but smaller for greedy-remote, because of the availability of remote memory. In both cases, the VMs compete for the Tmem capacity.

Figures 6.3(a, b) highlight the adaptability-vs-fairness trade-off. As P increases, the running times hit a minimum and then increase. For this scenario, this trend is maintained for large P, making P = 2.0% an optimal value.

6.4.3 Results for Scenario 3

Figures 6.4(a, b) show the average running times of each VM for Scenario 3 for the three policies. They show average improvements of 50.6% and 55.4% in nodes 3 and 4, respectively, when going from greedy-local to greedy-remote. When implementing TLP, there is a further improvement of 10.1% (P = 6.0%) and 15.6% (P = 4.0%) in nodes 3 and 4 over greedy-remote.

Figures 6.4(c, d, e) show the Tmem capacity taken by each VM for all policies in node 3. As remote memory becomes available to the node, the running time drops dramatically, while VM4, running in-memory analytics, takes the longest. The adaptability-vs-fairness trade-off is observed in Figures 6.4(a, b). Performance improves as P increases, but it can change suddenly with P, as in the second iteration of VM1 and VM2 for node 3 (Figure 6.4(a)). In both nodes, the second iteration of VM4 (in-memory analytics) performs the same for all policies because it runs on its own, after the others have completed (Figures 6.4(c, d, e)).

In this scenario, it is more difficult to obtain an optimal value of P. After VM2 and VM3 execute graph analytics, they keep a large part of the Tmem pages they acquired. The MM is unable to reallocate Tmem pages that are in use until the VMs flush them, but these VMs never actually flush. If the node runs more VMs with benchmarks that hold on to pages in this way, the Tmem pages may be inefficiently depleted.

6.4.4 Results for Scenario 4

Figures 6.5(a, b) show the average running times of each VM for Scenario 4 for all policies, showing average improvements of 63.5% and 76.8% in nodes 3 and 4, respectively, when going from greedy-local to greedy-remote. TLP shows improvements of 23.7% (P = 2.0%) and 37.5% (P = 2.0%) in nodes 3 and 4 over greedy-remote.

Figures 6.5(c, d, e) show the Tmem capacity of each VM for all policies in node 3. In all cases, VM4 (the memcached server) uses the same amount of Tmem, while VM2 (graph analytics) and VM3 (the data-caching client) compete for the Tmem capacity. With TLP (Figure 6.5(e)), fairness is achieved quickly, showing the adaptability-vs-fairness trade-off in Figures 6.5(a, b). In this case, P = 2.0% is optimal for most VMs, except for VM3 in node 4, for which P = 1.0% is better.



Figure 6.3: Running time for Scenario 2 in nodes 3 (a) and 4 (b). Tmem capacity (nod-tmem) used by every VM in node 3 for Scenario 2 under all three policies (c: greedy-local, d: greedy-remote, e: TLP with P=2.0%). target-VM4 refers to the target allocation of VM4.



Figure 6.4: Running time for Scenario 3 in nodes 3 (a) and 4 (b). Tmem capacity (nod-tmem) used by every VM in node 3 for Scenario 3 under all three policies (c: greedy-local, d: greedy-remote, e: TLP with P=6.0%). target-VM4 refers to the target allocation of VM4.



Figure 6.5: vMCA: Running time for Scenario 4 in nodes 3 (a) and 4 (b). Tmem capacity (nod-tmem) used by every VM in node 3 for Scenario 4 under all three policies (c: greedy-local, d: greedy-remote, e: TLP with P=2.0%). target-VM4 refers to the target allocation of VM4.


6.5 Related Work

Venkatesan et al. [110] use Tmem and non-volatile memory (NVM) to reduce accesses to disk by VMs in a system with DRAM and NVM, showing that Tmem is useful in systems with non-uniform memory hierarchies. The non-uniformity in our case is a consequence of the different memory hierarchies across the nodes, the amount of memory available to a node and the physical location of this memory.

Lorrillere et al. [66] proposed a mechanism, called PUMA, to improve memory utilization based on remote caching, which pools VMs' memory across a data center for the benefit of VMs with I/O-intensive workloads. PUMA uses an interface similar to Tmem to access the cache in a remote server. vMCA differs from [66] because the latter targets clean file-backed pages for I/O-intensive applications, while vMCA targets pages generated by the processes of the applications during their computations (not file-backed).

Zcache [68] is a backend for frontswap and cleancache that provides a compressed cache for swap and clean filesystem pages. RAMster [68] is an extension of zcache that uses kernel sockets to store pages in the RAM of remote nodes. Similarly, RAMCloud [82] is a POSIX-like filesystem in which all data is stored in RAM across the nodes. vMCA differs from all these approaches in that: 1) vMCA grants and releases memory capacity at a granularity larger than a single page (reducing communication between nodes), 2) vMCA is not entirely within the kernel, leveraging user-space flexibility, and 3) vMCA exploits a global shared address space with low-latency communication.

Hwang et al. [46] proposed Mortar, a mechanism that pools spare memory on a server and exposes it as a volatile data cache managed by the hypervisor using an interface similar to Tmem; it can evict objects from the cache in order to reclaim memory for other VMs. They test two use cases: the first uses the memcached protocol to aggregate free memory across the data center, and the second works at the OS level to cache and prefetch disk blocks. vMCA puts all the memory aggregation and management details in a user-space process, keeping the hypervisor small and secure.

In [5], Aguilera et al. pointed out that one of the challenges for systems that enable remote memory access is the situation in which a remote host crashes or fails, effectively causing a VM to lose access to the memory it was using at the remote location. In this work, we provide a mechanism to recover the memory capacity for the VM that was accessing a remote region, in order to prevent memory leakage problems. However, the problem of recovering the actual data that was lost upon remote host failure is out of the scope of this work. As proposed in [5], this could be done by enabling data replication at some level, but this is left for future work.

In [55], Lagar-Cavilla et al. proposed a mechanism to use remote memory that relies on zswap, which is related to the Tmem frontend zcache in frontswap mode. Their approach is totally software-based and does not require special hardware support for inter-node communication, i.e. their approach is network-latency agnostic. In this work, they use the remote memory regions as caches between DRAM and permanent storage, and they encrypt and decrypt the pages that are stored in remote memory, effectively protecting the data in remote memory regions for the cases in which remote hosts get compromised [5].
vMCA is similar to [55] in that it uses remote memory as a cache between DRAM and permanent storage, but it differs in several fundamental ways. First, vMCA leverages hardware support in two respects: a partially shared global physical address space and fast communication links among the nodes, which are important enablers for remote memory capacity support that keeps application performance within specific bounds [33]. Second, vMCA relies on the functionality provided by Tmem to make remote memory transparent to the VM without any encryption, avoiding the CPU time and the reduction in application performance that encryption would incur. Nevertheless, we still believe that securing the data in remote memory is necessary.
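To make concrete the granularity difference noted above (point 1 of the comparison with zcache, RAMster and RAMCloud), the following self-contained C sketch contrasts a per-page, Tmem-style put, as used by frontswap-style backends, with a coarse-grained capacity grant in the spirit of vMCA. It is an illustration only: the function names (tmem_put_page, mm_request_capacity) and the message accounting are our own simplifying assumptions, not the actual Xen or vMCA interfaces.

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

static long node_messages;  /* counts simulated inter-node round trips */

/* Page-granularity interface in the style of a frontswap backend:
 * every page stored remotely costs one round trip. */
static int tmem_put_page(uint64_t obj_id, uint32_t index, const uint8_t *page)
{
    (void)obj_id; (void)index; (void)page;
    node_messages++;
    return 0;
}

/* Coarse-grained grant in the spirit of vMCA: one round trip hands a VM
 * a whole block of remote pages, which it then consumes locally. */
static int mm_request_capacity(int vm_id, size_t num_pages)
{
    (void)vm_id; (void)num_pages;
    node_messages++;
    return 0;
}

int main(void)
{
    uint8_t page[PAGE_SIZE];
    memset(page, 0xAB, sizeof page);

    node_messages = 0;
    for (uint32_t i = 0; i < 1024; i++)   /* 4 MiB moved page by page */
        tmem_put_page(1, i, page);
    printf("per-page puts: %ld inter-node messages\n", node_messages);

    node_messages = 0;
    mm_request_capacity(1, 1024);         /* 4 MiB in a single grant */
    printf("single grant:  %ld inter-node message(s)\n", node_messages);
    return 0;
}

Under this simplification, moving 4 MiB costs 1024 inter-node round trips with per-page puts but only one with a block grant, which is why vMCA's coarser granularity reduces communication between nodes.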

6.6 Conclusions and Future Work

This chapter introduced vMCA, a resilient mechanism that exploits Tmem to aggregate memory capacity across multiple nodes, based on the work developed for GV-Tmem [36]. We evaluated vMCA using CloudSuite, obtaining up to 37.5% performance improvement when enabling memory aggregation and management policies. The results demonstrate the effectiveness of vMCA in improving the performance of the evaluated CloudSuite applications.
Future work will investigate how to integrate other resource management mechanisms into vMCA, creating a more complete solution for data centers. We need to consider reclamation of in-use Tmem pages, VM migration, and more adaptive and predictive memory management policies. As was the case with SmarTmem, future work will also need to analyze the effectiveness of vMCA with a larger set of workloads for cloud data centers. In this context, it will be necessary for vMCA to optimize other performance metrics, such as energy consumption, cost of ownership and tail latency, and to define a more general way to determine fairness quantitatively.
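As one candidate for such a quantitative measure of fairness (a suggestion on our part, not a metric defined in this thesis), Jain's fairness index could be computed over the per-VM ratios $x_i$ of allocated Tmem to target allocation:

\[
\mathcal{F}(x_1,\ldots,x_n) \;=\; \frac{\left(\sum_{i=1}^{n} x_i\right)^{2}}{n \sum_{i=1}^{n} x_i^{2}},
\qquad x_i = \frac{\text{Tmem allocated to VM}_i}{\text{Tmem target of VM}_i}
\]

The index equals 1 when every VM receives the same fraction of its target and tends to $1/n$ when capacity concentrates on a single VM, yielding a single number that a policy could optimize alongside throughput.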


Chapter 7

Conclusions

In Chapter 2, it was demonstrated that the demand-based approach to handing out Tmem pages to VMs in a virtualized node is sub-optimal, and that current state-of-the-art Tmem implementations need high-level strategies to optimize the use of the available Tmem in a system. SmarTmem provides the first known solution to this problem, obtaining up to 35% performance improvement over the state-of-the-art implementations of Tmem and opening the door for the introduction of further Tmem management policies.

Chapter 3 introduced a formulation of the dynamic memory re-allocation problem as a de-centralized and continuous-action MDP, and CAVMem proved that a continuous-action RL agent built on this formulation can in fact learn to track performance objectives, and that it may be a cost-effective solution compared to discrete-action RL algorithms (DQL) and tabular RL approaches (QL).

Chapter 4 built on the work presented in Chapter 3 to generate CARLEMM. The system dynamics add an important level of complexity when deploying RL-based algorithms in a real virtualized node, but CARLEMM demonstrated that the continuous-action RL approach with our novel MDP formulation learns to track performance objectives for a limited set of workloads, albeit with limitations regarding its stability.

Chapter 5 presented GV-Tmem, the first mechanism to aggregate memory capacity across multiple nodes in a cloud infrastructure. GV-Tmem was designed to exploit the hardware features provided by state-of-the-art UNIMEM-like hardware architectures, which provide support for a partial global physical address space, fast interconnects and local enforcement of memory coherency.

Chapter 6 built on the work of Chapter 5 to generate vMCA. As was the case with GV-Tmem, vMCA builds on the specialized features of UNIMEM-like hardware architectures, and it also includes resiliency support and efficiency constraints for the use of remote memory.

To close this chapter, the main contributions of this thesis are presented once again:

• A mechanism called SmarTmem for the optimal allocation of Tmem in a single virtualized node.

– A demonstration that the default way of allocating Tmem at the single node level, used in state-of-the-art hypervisors, is not optimal.


– Development of high-level memory management policies to manage Tmem through SmarTmem in a single node.

Luis Garrido, Rajiv Nishtala, and Paul Carpenter. SmarTmem: Intelligent management of transcendent memory in a virtualized server. In RADR 2019: 1st Workshop on Resource Arbitration for Dynamic Runtimes, 2019 [35].

• A mechanism called CAVMem (Continuous Action Agent for Virtualized Memory Management), developed as a proof-of-concept for the use of continuous-action reinforcement learning for RAM allocation in a simulation of a virtualized node.

– Formulation of the RAM management problem as a Markov Decision Process (MDP) in a simulated environment that models memory management constraints.

– Development of a continuous-action off-policy model-free reinforcement learning algorithm for memory allocation in a simulated environment.

– Comparison of the CARL algorithm in CAVMem with non-continuous-action reinforcement learning algorithms like Q-Learning and Deep Q-Learning in the context of RAM management within the simulated environment.

Luis Garrido, Rajiv Nishtala, and Paul Carpenter. Continuous-action reinforcement learning for memory allocation in virtualized servers. In 14th Workshop on Virtualization in High-Performance Cloud Computing (VHPC'19), International Conference on High Performance Computing, 2019 [37].

• A mechanism called CARLEMM (Continuous Action Reinforcement Learning for Memory Management), based on the proof-of-concept provided by CAVMem, but enhanced to solve the memory allocation problem in a real computing node.

– Formulation of the RAM management problem as an MDP for a real virtualized computing node.

– A continuous-action reinforcement learning algorithm for RAM allocation in a real virtualized node.

– A flexible software stack for the implementation of CARLEMM and its performance evaluation.

• A mechanism called GV-Tmem (Globally Visible Tmem) for remote memory capacity sharing across multiple computing nodes in a cloud infrastructure.

– A software architecture to share memory capacity across nodes using Tmem.

– A two-tier mechanism for allocation and management of disaggregated memory.

Luis A. Garrido and Paul Carpenter. Aggregating and managing memory across computing nodes in cloud environments. In 12th Workshop on Virtualization in High-Performance Cloud Computing (VHPC'17), International Conference on High Performance Computing, pages 642–652. Springer International Publishing [36].

• A mechanism called vMCA (virtualized Memory Capacity Aggregation) for the sharing of memory capacity across nodes, enhanced for resiliency and efficiency in comparison to GV-Tmem.

– A software stack to aggregate memory capacity across multiple nodes, focused on resiliency and efficiency.

– A multi-level user-space mechanism for allocation and management of aggregated memory capacity, with special considerations for memory allocation within a node and memory distribution across nodes.

– A more complete analysis of high-level policies for memory aggregation and allocation.

Luis A. Garrido and Paul Carpenter. vMCA: Memory capacity aggregation and management in cloud environments. In IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS), December 2017 [34].


Glossary

CARL Continuous Action Reinforcement Learning
CARLEMM Continuous Action Reinforcement Learning for Memory Management
CAVMem Continuous Action Agent for Virtualized Memory Management
CPU Central Processing Unit
DDPG Deep Deterministic Policy Gradient
DRAM Dynamic Random Access Memory
DQL Deep Q-Learning
GV-Tmem Globally Visible Tmem
HV Hypervisor
I/O Input/Output
MDP Markov Decision Process
MM Memory Manager
MM–M Memory Manager Master
NUMA Non-Uniform Memory Access
RAM Random Access Memory
RDMA Remote Direct Memory Access
RL Reinforcement Learning
SLA Service Level Agreement
TKM Tmem Kernel Module
TLP Two-Level memory management Policy
Tmem Transcendent Memory
TTM Two-Tier Memory Management
VM Virtual Machine
vMCA virtualized Memory Capacity Aggregation

Bibliography

[1] Size of the public cloud computing services market from 2009 to 2022. https://www.statista.com/statistics/273818/global-revenue-generated-with-cloud-computing-since-2009. ix, 2

[2] Xenstored. https://wiki.xen.org/wiki/XenStore. Accessed: 2016-06-06. 82

[3] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, OSDI'16, pages 265–283, Berkeley, CA, USA, 2016. USENIX Association. 51

[4] Orna Agmon Ben-Yehuda, Eyal Posener, Muli Ben-Yehuda, Assaf Schuster, and Ahuva Mu'alem. Ginseng: Market-driven memory allocation. In Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE '14, pages 41–52, New York, NY, USA, 2014. ACM. 13

[5] Marcos K. Aguilera, Nadav Amit, Irina Calciu, Xavier Deguillard, Jayneel Gandhi, Pratap Subrahmanyam, Lalith Suresh, Kiran Tati, Rajesh Venkatasubramanian, and Michael Wei. Remote memory in the age of fast networks. In Proceedings of the 2017 Symposium on Cloud Computing, SoCC ’17, pages 121–127, New York, NY, USA, 2017. ACM. 14, 15, 126

[6] Marcos K. Aguilera, Kimberly Keeton, Stanko Novakovic, and Sharad Singhal. Designing far memory data structures: Think outside the box. In Proceedings of the Workshop on Hot Topics in Operating Systems, HotOS '19, pages 120–126, New York, NY, USA, 2019. ACM. 14, 15

[7] B. W. Arden, B. A. Galler, T. C. O'Brien, and F. H. Westervelt. Program and addressing structure in a time-sharing environment. J. ACM, 13(1):1–16, January 1966. 5

[8] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. A view of cloud computing. Commun. ACM, 53(4):50–58, April 2010. 22

[9] Yonathan Bard. An analytic model of CP-67/VM-370. In Proceedings of the Workshop on Virtual Computer Systems, pages 170–176, New York, NY, USA, 1973. ACM. 5

[10] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Xen and the art of virtualization. SIGOPS Oper. Syst. Rev., 37(5):164–177, December 2003. 6, 95

[11] Enrico Bibbona, Gianna Panfilo, and Patrizia Tavella. The Ornstein–Uhlenbeck process as a model of a low-pass filtered white noise. Metrologia, 45(6):117, 2008. 51, 75

[12] Maciej Bielski, Alvise Rigo, and Renaud Pacalet. Dynamic guest memory resizing - paravirtualized approach. 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pages 181–186, 2019. 12, 106

[13] Xiangping Bu, Jia Rao, and Cheng-Zhong Xu. Coordinated self-configuration of virtual machines and appliances using a model-free learning approach. IEEE Trans. Parallel Distrib. Syst., 24(4):681–690, April 2013. 12, 13, 66, 67, 90

[14] Yue Cheng, Zheng Chai, and Ali Anwar. Characterizing co-located datacenter workloads: An Alibaba case study. In Proceedings of the 9th Asia-Pacific Workshop on Systems, APSys '18, pages 12:1–12:3, New York, NY, USA, 2018. ACM. 14

[15] Jui-Hao Chiang, Han-Lin Li, and Tzi-cker Chiueh. Working set-based physical memory ballooning. In Proceedings of the 10th International Conference on Autonomic Computing (ICAC 13), pages 95–99, San Jose, CA, 2013. USENIX. 12, 36, 66, 70, 95

[16] Eunjoon Cho, Seth A. Myers, and Jure Leskovec. Friendship and mobility: User movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '11, pages 1082–1090, New York, NY, USA, 2011. ACM. 120

[17] Tom Clark. Storage Virtualization: Technologies for Simplifying Data Storage and Management. Addison-Wesley Professional, 2005. 5

[18] Cloudstack. https://cloudstack.apache.org/. 22

[19] Maxime C. Cohen, Philipp Keller, Vahab Mirrokni, and Morteza Zadimoghaddam. Overcommitment in cloud services: Bin packing with chance constraints. In Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS '17 Abstracts, pages 7–7, New York, NY, USA, 2017. ACM. 13

[20] F. J. Corbató and V. A. Vyssotsky. Introduction and overview of the Multics system. In Proceedings of the November 30–December 1, 1965, Fall Joint Computer Conference, Part I, AFIPS '65 (Fall, part I), pages 185–196, New York, NY, USA, 1965. ACM. 5

[21] Eli Cortez, Anand Bonde, Alexandre Muzio, Mark Russinovich, Marcus Fontoura, and Ricardo Bianchini. Resource central: Understanding and predicting workloads for improved resource management in large cloud platforms. In Proceedings of the 26th Symposium on Operating Systems Principles, SOSP '17, pages 153–167, New York, NY, USA, 2017. ACM. 13, 14

[22] Antonio Cuomo, Giuseppe Di Modica, Salvatore Distefano, Antonio Puliafito, Massimiliano Rak, Orazio Tomarchio, Salvatore Venticinque, and Umberto Villano. An SLA-based broker for cloud infrastructures. Journal of Grid Computing, 11:1–25, 03 2012. ii

[23] Todd Deshane, Zachary Shepherd, Jeanna Matthews, Muli Ben-Yehuda, and Balaji Rao. Quantitative comparison of Xen and KVM. 01 2008. 6

[24] Jianbo Dong, Rui Hou, Michael Huang, Tao Jiang, Boyan Zhao, Sally A. McKee, Haibin Wang, Xiaosong Cui, and Lixin Zhang. Venice: Exploring server architectures for effective resource sharing. In 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 507–518. IEEE, 2016. 95, 108

[25] Gabriel Dulac-Arnold, Richard Evans, Peter Sunehag, and Ben Coppin. Deep reinforcement learning in large discrete action spaces. CoRR, abs/1512.07679, 2015. 46

[26] Yves Durand, Paul M. Carpenter, Stefano Adami, Angelos Bilas, Denis Dutoit, Alexis Farcy, Georgi Gaydadjiev, John Goodacre, Manolis Katevenis, Manolis Marazakis, Emil Matus, Iakovos Mavroidis, and John Thomson. Euroserver: Energy efficient node for European micro-servers. In Digital System Design (DSD), 2014 17th Euromicro Conference on, pages 206–213. IEEE, 2014. 4, 14, 15, 17, 94, 95, 100, 102

[27] Xavier Dutreilh, Sergey Kirgizov, Olga Melekhova, Jacques Malenfant, Nicolas Rivierre, and Isis Truck. Using reinforcement learning for autonomic resource allocation in clouds: Towards a fully automated workflow. In 7th International Conference on Autonomic and Autonomous Systems (ICAS'2011), pages 67–74, Venice, Italy, May 2011. 12, 66

[28] Assaf Eisenman, Darryl Gardner, Islam AbdelRahman, Siying Dong, Kim Hazelwood, Chris Petersen, Asaf Cidon, and Sachin Katti. Reducing DRAM footprint with NVM in Facebook. In Proceedings of the Thirteenth EuroSys Conference, EuroSys '18, pages 42:1–42:13, New York, NY, USA, 2018. ACM. 15

[29] Rick F. van der Lans. Data Virtualization for Business Intelligence Systems. 12 2012. 5

[30] Paolo Faraboschi, Kimberly Keeton, Tim Marsland, and Dejan Milojicic. Beyond processor-centric operating systems. In Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems, HOTOS'15, pages 17–17, Berkeley, CA, USA, 2015. USENIX Association. 14

[31] Michael Ferdman, Almutaz Adileh, Onur Kocberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, and Babak Falsafi. Clearing the clouds: A study of emerging scale-out workloads on modern hardware. In Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '12, pages 37–48, New York, NY, USA, 2012. ACM. 22, 30, 52, 102, 119

[32] Brad Fitzpatrick. Distributed caching with memcached. Linux J., 2004(124):5, August 2004. 120

[33] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. Network requirements for resource disaggregation. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, OSDI'16, pages 249–264, Berkeley, CA, USA, 2016. USENIX Association. 14, 15, 127

[34] L.A. Garrido and P. Carpenter. vMCA: Memory capacity aggregation and management in cloud environments. In IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS), December 2017. 19, 29, 37, 38, 71, 131

[35] Luis Garrido, Rajiv Nishtala, and Paul Carpenter. SmarTmem: Intelligent management of transcendent memory in a virtualized server. In RADR 2019: 1st Workshop on Resource Arbitration for Dynamic Runtimes, 2019. 17, 95, 130

[36] Luis A. Garrido and Paul Carpenter. Aggregating and managing memory across computing nodes in cloud environments. In Julian M. Kunkel, Rio Yokota, Michela Taufer, and John Shalf, editors, 12th Workshop on Virtualization in High-Performance Cloud Computing (VHPC'17), International Conference on High Performance Computing, pages 642–652, Cham, 2017. Springer International Publishing. 18, 37, 71, 108, 116, 127, 130

[37] Luis A. Garrido and Paul Carpenter. Continuous-action reinforcement learning for memory allocation in virtualized servers. In 14th Workshop on Virtualization in High-Performance Cloud Computing (VHPC'19), International Conference on High Performance Computing, pages 642–652, Cham, 2019. Springer International Publishing. 18, 91, 95, 130

[38] Charles T. Gibson. Time-sharing in the IBM System/360: Model 67. In Proceedings of the April 26-28, 1966, Spring Joint Computer Conference, AFIPS '66 (Spring), pages 61–78, New York, NY, USA, 1966. ACM. 5

[39] Chunye Gong, Jie Liu, Qiang Zhang, Haitao Chen, and Zhenghu Gong. The characteristics of cloud computing. In Proceedings of the 2010 39th International Conference on Parallel Processing Workshops, ICPPW '10, pages 275–279, Washington, DC, USA, 2010. IEEE Computer Society. 22

[40] Juncheng Gu, Youngmoon Lee, Yiwen Zhang, Mosharaf Chowdhury, and Kang G. Shin. Efficient memory disaggregation with infiniswap. In Proceedings of the 14th USENIX Conference on Networked Systems Design and Implementation, NSDI’17, pages 649–667, Berkeley, CA, USA, 2017. USENIX Association. 14, 15, 106

[41] Irfan Habib. Virtualization with KVM. Linux Journal, 2008(166), February 2008. 6, 95

[42] Jia Hao, Binbin Zhang, Kun Yue, Hao Wu, and Jixian Zhang. Measuring performance degradation of virtual machines based on the Bayesian network with hidden variables. Int. J. Communication Systems, 31, 2018. 52

[43] F. Maxwell Harper and Joseph A. Konstan. The MovieLens datasets: History and context. ACM Trans. Interact. Intell. Syst., 5(4):19:1–19:19, December 2015. 31, 102, 120

[44] John L. Henning. SPEC CPU2006 benchmark descriptions. SIGARCH Comput. Archit. News, 34(4):1–17, September 2006. 52

[45] Jin Heo, Xiaoyun Zhu, Pradeep Padala, and Zhikui Wang. Memory overbooking and dynamic control of Xen virtual machines in consolidated environments. In Proceedings of the 11th IFIP/IEEE International Conference on Symposium on Integrated Network Management, IM'09, pages 630–637, Piscataway, NJ, USA, 2009. IEEE Press. 12, 66

[46] Jinho Hwang, Ahsen Uppal, Timothy Wood, and Howie Huang. Mortar: Filling the gaps in data center memory. In Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE '14, pages 53–64, New York, NY, USA, 2014. ACM. 13, 126

[47] Stephen T. Jones, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Geiger: Monitoring the buffer cache in a virtual machine environment. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XII, pages 14–24, New York, NY, USA, 2006. ACM. 12, 66

[48] Svilen Kanev, Juan Pablo Darago, Kim Hazelwood, Parthasarathy Ranganathan, Tipp Moseley, Gu-Yeon Wei, and David Brooks. Profiling a warehouse-scale computer. In Proceedings of the 42nd Annual International Symposium on Computer Architecture, ISCA '15, pages 158–169, New York, NY, USA, 2015. ACM. 14

[49] Kostas Katrinis, Dimitris Syrivelis, D. Pnevmatikatos, Georgios Zervas, Dimitris Theodoropoulos, Iordanis Koutsopoulos, K. Hasharoni, D. Raho, C. Pinto, F. Espina, et al. Rack-scale disaggregated cloud data centers: The dReDBox project vision. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2016, pages 690–695. IEEE, 2016. 14, 15, 17, 96, 108

[50] K. T. Kearney, F. Torelli, and C. Kotsokalis. SLA*: An abstract syntax for service level agreements. In 2010 11th IEEE/ACM International Conference on Grid Computing, pages 217–224, Oct 2010. ii

[51] Jinchun Kim, Viacheslav Fedorov, Paul V. Gratz, and A. L. Narasimha Reddy. Dynamic memory pressure aware ballooning. In Proceedings of the 2015 International Symposium on Memory Systems, MEMSYS '15, pages 103–112, New York, NY, USA, 2015. ACM. 12, 36

[52] Sangwook Kim, Hwanju Kim, Joonwon Lee, and Jinkyu Jeong. Group-based memory oversubscription for virtualized clouds. J. Parallel Distrib. Comput., 74(4):2241–2256, April 2014. 13, 14

[53] Hajime Kimura and Shigenobu Kobayashi. An analysis of actor/critic algorithms using eligibility traces: Reinforcement learning with imperfect value functions. In Proc. 15th International Conf. on Machine Learning, pages 278–286, 1998. 47

[54] Vijay R. Konda and John N. Tsitsiklis. Actor-critic algorithms. In S. A. Solla, T. K. Leen, and K. Müller, editors, Advances in Neural Information Processing Systems 12, pages 1008–1014. MIT Press, 2000. 47

[55] Andres Lagar-Cavilla, Junwhan Ahn, Suleiman Souhlal, Neha Agarwal, Radoslaw Burny, Shakeel Butt, Jichuan Chang, Ashwin Chaugule, Nan Deng, Junaid Shahid, Greg Thelen, Kamil Adam Yurtsever, Yu Zhao, and Parthasarathy Ranganathan. Software-defined far memory in warehouse-scale computers. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '19, pages 317–330, New York, NY, USA, 2019. ACM. 14, 17, 126

[56] Alessandro Lazaric, Marcello Restelli, and Andrea Bonarini. Reinforcement learning in continuous action spaces through sequential monte carlo methods. In Advances in Neural Information Processing Systems, 2007. 47

[57] Min Lee, Vishal Gupta, and Karsten Schwan. Software-controlled transparent management of heterogeneous memory resources in virtualized systems. In Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, MSPC '13, pages 5:1–5:6, New York, NY, USA, 2013. ACM. 16

[58] Teng Li, Zhiyuan Xu, Jian Tang, and Yanzhi Wang. Model-free control for distributed stream data processing using deep reinforcement learning. Proc. VLDB Endow., 11(6):705–718, February 2018. 51, 75

[59] Xi Li, Pengfei Zhang, Rui Chu, and Huaimin Wang. Optimizing guest swapping using elastic and transparent memory provisioning on virtualization platform. Front. Comput. Sci., 10(5):908–924, October 2016. 22

[60] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. CoRR, abs/1509.02971, 2015. 44, 46, 47, 48, 70, 81

[61] Kevin T. P. Lim, Jichuan Chang, Trevor N. Mudge, Parthasarathy Ranganathan, Steven K. Reinhardt, and Thomas F. Wenisch. Disaggregated memory for expansion and sharing in blade servers. In ISCA, 2009. 14, 15

[62] Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. 1997. 5

[63] Haikun Liu, Hai Jin, Xiaofei Liao, Wei Deng, Bingsheng He, and Cheng-zhong Xu. Hotplug or ballooning: A comparative study on dynamic memory management techniques for virtual machines. IEEE Transactions on Parallel and Distributed Systems, 26(5):1350–1363, 2015. 7, 8, 12, 22, 36

[64] Huan Liu. A measurement study of server utilization in public clouds. In Proceedings of the 2011 IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing, DASC ’11, pages 435–442, Washington, DC, USA, 2011. IEEE Computer Society. 14

[65] Jiuxing Liu and Bulent Abali. Virtualization polling engine (VPE): Using dedicated CPU cores to accelerate I/O virtualization. In Proceedings of the 23rd International Conference on Supercomputing, ICS '09, pages 225–234, New York, NY, USA, 2009. ACM. 6

[66] Maxime Lorrillere, Julien Sopena, Sébastien Monnet, and Pierre Sens. PUMA: Pooling unused memory in virtual machines for I/O intensive applications. In Proceedings of the 8th ACM International Systems and Storage Conference, SYSTOR '15, pages 1–11, New York, NY, USA, 2015. ACM. 16, 126

[67] D. Magenheimer. Memory overcommit... without the commitment. In Extended Abstract for Xen Summit, June 2008. 7, 70, 71

[68] D. Magenheimer. Zcache and RAMster (oh, and frontswap too) overview and some benchmarking, 2012. 17, 105, 126

[69] David Magenheimer, C. Mason, D. McCracken, and K. Hackel. Transcendent memory and Linux. In Ottawa Linux Symposium, pages 191–200, July 2009. 9, 11, 22, 71, 94

[70] Manolis Marazakis, John Goodacre, Didier Fuin, Paul Carpenter, John Thomson, Emil Matus, Antimo Bruno, Per Stenstrom, Jerome Martin, Yves Durand, et al. Euroserver: Share-anything scale-out micro-server design. In Proceedings of the 2016 Conference on Design, Automation & Test in Europe, pages 678–683. EDA Consortium, 2016. 108

[71] R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Syst. J., 9(2):78–117, June 1970. 12

[72] Daniel A. Menasce and Mohamed N. Bennani. Autonomic virtualized environments. In Proceedings of the International Conference on Autonomic and Autonomous Systems, ICAS '06, pages 28–, Washington, DC, USA, 2006. IEEE Computer Society. 12, 66

[73] F.P. Miller, A.F. Vandome, and M.B. John. IBM CP-40. VDM Publishing, 2010. 5

[74] Dave Mishchenko. VMware ESXi: Planning, Implementation, and Security. Course Technology Press, Boston, MA, United States, 1st edition, 2010. 6, 95

[75] Debadatta Mishra and Purushottam Kulkarni. Comparative analysis of page cache provisioning in virtualized environments. In Proceedings of the 2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems, MASCOTS '14, pages 213–222, Washington, DC, USA, 2014. IEEE Computer Society. 13

[76] Debadatta Mishra and Purushottam Kulkarni. Comparative analysis of page cache provisioning in virtualized environments. In Proceedings of the 2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems, MASCOTS '14, pages 213–222, Washington, DC, USA, 2014. IEEE Computer Society. 66

[77] Sparsh Mittal and Jeffrey S. Vetter. A survey of software techniques for using non-volatile memories for storage and main memory systems. IEEE Trans. Parallel Distrib. Syst., 27(5):1537–1550, May 2016. 15

[78] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin A. Riedmiller. Playing Atari with deep reinforcement learning. NIPS Deep Learning Workshop 2013, abs/1312.5602, 2013. 46

[79] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, Feb 2015. 13, 44, 46, 48

[80] Germán Moltó, Miguel Caballer, Eloy Romero, and Carlos de Alfonso. Elastic memory management of virtualized infrastructures for applications with dynamic memory requirements. In International Conference on Computational Science, volume 18, 2013. 12, 36

[81] Openstack. http://www.openstack.org/. 22

[82] John Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich, David Mazières, Subhasish Mitra, Aravind Narayanan, Guru Parulkar, Mendel Rosenblum, Stephen M. Rumble, Eric Stratmann, and Ryan Stutsman. The case for RAMClouds: Scalable high-performance storage entirely in DRAM. SIGOPS Oper. Syst. Rev., 43(4):92–105, January 2010. 17, 105, 126

[83] Pradeep Padala, Kai-Yuan Hou, Kang G. Shin, Xiaoyun Zhu, Mustafa Uysal, Zhikui Wang, Sharad Singhal, and Arif Merchant. Automated control of multiple virtualized resources. In Proceedings of the 4th ACM European Conference on Computer Systems, EuroSys ’09, pages 13–26, New York, NY, USA, 2009. ACM. 12, 66

[84] P. Lu and K. Shen. Virtual machine memory access tracing with hypervisor exclusive cache. In 2007 USENIX Annual Technical Conference on Proceedings of the USENIX Annual Technical Conference, ATC'07, pages 3:1–3:15, Berkeley, CA, USA, 2007. USENIX Association. 12, 66

[85] Gerald J. Popek and Robert P. Goldberg. Formal requirements for virtualizable third generation architectures. Commun. ACM, 17(7):412–421, July 1974. 3, 5, 22

[86] Matthew Portnoy. Virtualization Essentials. SYBEX Inc., Alameda, CA, USA, 1st edition, 2012. 5

[87] Jia Rao, Xiangping Bu, Cheng-Zhong Xu, Leyi Wang, and George Yin. Vconf: A reinforcement learning approach to virtual machines auto-configuration. In Proceedings of the 6th International Conference on Autonomic Computing, ICAC ’09, pages 137– 146, New York, NY, USA, 2009. ACM. 13, 44, 47, 67, 90

[88] Charles Reiss, Alexey Tumanov, Gregory R. Ganger, Randy H. Katz, and Michael A. Kozuch. Heterogeneity and dynamicity of clouds at scale: Google trace analysis. In Proceedings of the Third ACM Symposium on Cloud Computing, SoCC '12, pages 7:1–7:13, New York, NY, USA, 2012. ACM. 14

[89] Alvise Rigo, Christian Pinto, Kevin Pouget, Daniel Raho, Denis Dutoit, Pierre-Yves Martinez, Chris Doran, Luca Benini, Iakovos Mavroidis, Manolis Marazakis, et al. Paving the way towards a highly energy-efficient and highly integrated compute node for the exascale revolution: The ExaNoDe approach. In 2017 Euromicro Conference on Digital System Design (DSD), pages 486–493. IEEE, 2017. 4, 108

[90] D. M. Ritchie and K. Thompson. The UNIX time-sharing system. Communications of the ACM, 17:365–375, 1974. 5

[91] Ryan A. Rossi and Nesreen K. Ahmed. An interactive data repository with visual analytics. SIGKDD Explor., 17(2):37–41, 2016. 31, 102, 120

[92] Ryan A. Rossi and Nesreen K. Ahmed. The network data repository with interactive graph analytics and visualization. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015. 31, 102, 120

[93] Ryan A. Rossi and Nesreen K. Ahmed. soc-twitter-follows - social networks. http://networkrepository.com/soc-twitter-follows.php, 2013. 31, 102

[94] Jyotiprakash Sahoo, Subasish Mohapatra, and Radha Lath. Virtualization: A survey on concepts, taxonomy and associated security issues. In Proceedings of the 2010 Second International Conference on Computer and Network Technology, ICCNT ’10, pages 222–226, Washington, DC, USA, 2010. IEEE Computer Society. 22

[95] Tudor-Ioan Salomie, Gustavo Alonso, Timothy Roscoe, and Kevin Elphinstone. Application level ballooning for efficient server consolidation. In Proceedings of the 8th ACM European Conference on Computer Systems, EuroSys '13, pages 337–350, New York, NY, USA, 2013. ACM. 12, 36

[96] Vasily A. Sartakov and Rüdiger Kapitza. NV-Hypervisor: Hypervisor-based persistence for virtual machines. In Proceedings of the 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN '14, pages 654–659, Washington, DC, USA, 2014. IEEE Computer Society. 15

[97] Joel H. Schopp, K. Fraser, and Martine J. Silbermann. Resizing memory with balloons and hotplug. In Proceedings of the Linux Symposium, pages 313–319, 2006. 7, 8, 22, 71

[98] David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, and Martin Riedmiller. Deterministic policy gradient algorithms. In Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML'14. JMLR.org, 2014. 47, 48

[99] José Simão, Jeremy Singer, and Luís Veiga. A comparative look at adaptive memory management in virtual machines. In Proceedings of the 2013 IEEE International Conference on Cloud Computing Technology and Science - Volume 01, volume 1 of CLOUDCOM '13, pages 452–457, Washington, DC, USA, 2013. IEEE Computer Society. 22

[100] Rebecca Smith and Scott Rixner. A policy-based system for dynamic scaling of virtual machine memory reservations. In SoCC, 2017. 30, 36, 37, 102, 119

[101] Ahmed A. Soror, Umar Farooq Minhas, Ashraf Aboulnaga, Kenneth Salem, Peter Kokosielis, and Sunil Kamath. Automatic virtual machine configuration for database workloads. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD ’08, pages 953–966, New York, NY, USA, 2008. ACM. 12, 66

[102] Richard S. Sutton and Andrew G. Barto. Introduction to Reinforcement Learning. MIT Press, Cambridge, MA, USA, 1st edition, 1998. 51, 75

[103] Richard S. Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Proceedings of the 12th International Conference on Neural Information Processing Systems, NIPS'99, pages 1057–1063, Cambridge, MA, USA, 1999. MIT Press. 47, 48

[104] Petter Svärd, Benoit Hudzia, Johan Tordsson, and Erik Elmroth. Hecatonchire: Towards multi-host virtual machines by server disaggregation. In Euro-Par 2014: Parallel Processing Workshops - Euro-Par 2014 International Workshops, Porto, Portugal, August 25-26, 2014, Revised Selected Papers, Part II, pages 519–529, 2014. 106

[105] G. Tesauro, N. K. Jong, R. Das, and M. N. Bennani. A hybrid reinforcement learning approach to autonomic resource allocation. In Proceedings of the 2006 IEEE International Conference on Autonomic Computing, ICAC '06, pages 65–73, Washington, DC, USA, 2006. IEEE Computer Society. 12, 66

[106] Chuck Thacker and MSR Silicon Valley. Beehive: A many-core computer for FPGAs (v5). MSR Silicon Valley, 2010. 96

[107] Aimilios Tsalapatis, Stefanos Gerangelos, Stratos Psomadakis, Konstantinos Papazafeiropoulos, and Nectarios Koziris. utmem: Towards memory elasticity in cloud workloads. In ISC Workshops, 2018. 13

[108] Dmitrii Ustiugov, Alexandros Daglis, Javier Picorel, Mark Sutherland, Edouard Bugnion, Babak Falsafi, and Dionisios Pnevmatikatos. Design guidelines for high-performance SCM hierarchies. In Proceedings of the International Symposium on Memory Systems, MEMSYS '18, pages 3–16, New York, NY, USA, 2018. ACM. 14, 15

[109] Hado Van Hasselt. Reinforcement learning in continuous state and action spaces. Reinforcement Learning, 12:207–251, August 2012. 46

[110] Vimalraj Venkatesan, Wei Qingsong, and Y. C. Tay. Ex-tmem: Extending transcendent memory with non-volatile memory for virtual machines. In Proceedings of the 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), HPCC ’14, pages 966–973, Washington, DC, USA, 2014. IEEE Computer Society. 13, 15, 37, 38, 43, 126

[111] Vimalraj Venkatesan, Y. C. Tay, and Qingsong Wei. Sizing cleancache allocation for virtual machines' transcendent memory. IEEE Transactions on Computers, 65:1949–1963, 2016. 13

[112] Carl A. Waldspurger. Memory resource management in VMware ESX Server. SIGOPS Oper. Syst. Rev., 36(SI):181–194, December 2002. 7, 8, 22

[113] Lizhe Wang, Gregor von Laszewski, Andrew Younge, Xi He, Marcel Kunze, Jie Tao, and Cheng Fu. Cloud computing: a perspective study. New Generation Comput., 28:137–146, 04 2010. 3

[114] Christopher John Cornish Hellaby Watkins. Learning from Delayed Rewards. PhD thesis, King’s College, Cambridge, UK, May 1989. 44, 46

[115] Steven D. Whitehead and Long-Ji Lin. Reinforcement learning of non-markov decision processes. Artificial Intelligence, 73(1–2):271–306, February 1995. 45

[116] Dan Williams, Hani Jamjoom, Yew-Huey Liu, and Hakim Weatherspoon. Overdriver: Handling memory overload in an oversubscribed cloud. In Proceedings of the 7th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE '11, pages 205–216, New York, NY, USA, 2011. ACM. 7

[117] Georgios S. Zervas, Hui Min Yuan, A. Saljoghei, Qianqiao Chen, and Vaibhawa Mishra. Optically disaggregated data centers with minimal remote memory latency: Technologies, architectures, and resource allocation [invited]. IEEE/OSA Journal of Optical Communications and Networking, 10:A270–A285, 2018. 14

[118] Chongjie Zhang, Victor Lesser, and Prashant Shenoy. A multi-agent learning approach to online distributed resource allocation. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI'09, pages 361–366, San Francisco, CA, USA, 2009. Morgan Kaufmann Publishers Inc. 47

[119] Qi Zhang, Lu Cheng, and Raouf Boutaba. Cloud computing: state-of-the-art and research challenges. Journal of Internet Services and Applications, 1(1):7–18, 2010. 22

[120] Qi Zhang, Ling Liu, Calton Pu, Wenqi Cao, and Semih Sahin. Efficient shared memory orchestration towards demand driven memory slicing. 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), pages 1212–1223, 2018. 12, 36, 37, 52

[121] W. Zhang, H. Xie, and C. Hsu. Automatic memory control of multiple virtual machines on a consolidated server. IEEE Transactions on Cloud Computing, 5(1):2–14, 2017. 66

[122] Yiying Zhang, Jian Yang, Amirsaman Memaripour, and Steven Swanson. Mojim: A reliable and highly-available non-volatile memory system. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '15, pages 3–18, New York, NY, USA, 2015. ACM. 15

[123] Zhizhong Zhang, Chuan Wu, and David W.L. Cheung. A survey on cloud interoperability: Taxonomies, standards, and practice. SIGMETRICS Perform. Eval. Rev., 40(4):13–22, April 2013. 22

[124] Ming Zhao, Lixi Wang, Yun Lv, and Jing Xu. Cross-layer optimization for virtual machine resource management. 2018 IEEE International Conference on Cloud Engineering (IC2E), pages 90–98, 2018. 13

[125] Weiming Zhao, Zhenlin Wang, and Yingwei Luo. Dynamic memory balancing for virtual machines. SIGOPS Oper. Syst. Rev., 43:37–47, July 2009. ACM. 12, 36, 66

[126] Guoliang Zhu, Kai Lu, Xiaoping Wang, Yiming Zhang, Pengfei Zhang, and Sparsh Mittal. SwapX: An NVM-based hierarchical swapping framework. IEEE Access, 5:16383–16392, 2017. 15

[127] Darko Zivanovic, Milan Pavlovic, Milan Radulovic, Hyunsung Shin, Jongpil Son, Sally A. McKee, Paul M. Carpenter, Petar Radojković, and Eduard Ayguadé. Main memory in HPC: Do we need more or could we live with less? ACM Transactions on Architecture and Code Optimization (TACO), 14(1):3, 2017. 15

Index

Markov Decision Process (MDP), 17
Memory ballooning, 8
Memory Manager Master (MM–M), 111
TKM (Tmem Kernel Module), 26
Tmem, 9
UNIMEM, 95
