ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE

Josep Sampé Domenech



WARNING. Access to the contents of this doctoral thesis and its use must respect the rights of the author. It can be used for reference or private study, as well as research and learning activities or materials in the terms established by the 32nd article of the Spanish Consolidated Copyright Act (RDL 1/1996). Express and previous authorization of the author is required for any other uses. In any case, when using its content, full name of the author and title of the thesis must be clearly indicated. Reproduction or other forms of for profit use or public communication from outside TDX service is not allowed. Presentation of its content in a window or frame external to TDX (framing) is not authorized either. These rights affect both the content of the thesis and its abstracts and indexes.

Enhancing the Programmability of Cloud Object Storage

JOSEP SAMPÉ DOMENECH

DOCTORAL THESIS 2018




Josep Sampé Domenech

Enhancing the Programmability of Cloud Object Storage

Doctoral Thesis

Advised by Dr. Pedro García López and Dr. Marc Sánchez Artigas

Department of Computer Engineering and Mathematics

Tarragona 2018




I STATE that the present study, entitled “Enhancing the Programmability of Cloud Object Storage”, presented by Josep Sampé Domenech for the award of the degree of Doctor, has been carried out under my supervision at the Department of Computer Engineering and Mathematics of this university, and that it fulfills all the requirements to be eligible for the International Doctorate Award.

Tarragona, September 20, 2018

Doctoral thesis supervisors

Dr. Pedro García López    Dr. Marc Sánchez Artigas


Acknowledgements

This dissertation has been written during my time at the “Arquitectures i Serveis Telemàtics (AST)” research group at Universitat Rovira i Virgili. As such, I would first like to thank my advisors Dr. Pedro García López and Dr. Marc Sánchez Artigas for their endless patience, help, and advice throughout my formation as a researcher. Without their teamwork as advisors and their open-minded perspective on what doing research means, I would not have been able to complete this thesis. I also thank all the people in the AST research group who shared these years with me, especially Raúl Gracia Tinedo, for his help and guidance, and Gerard París, for the moments we enjoyed together in the lab. Secondly, I would like to thank Ofer Biran, Gil Vernik and Dalit Naor for accepting me as an intern in their research group at IBM Haifa Research Labs. The months I spent in Israel were not only fruitful from a professional perspective, but also very enriching from a personal viewpoint. Last but not least, my deepest appreciation goes to my family and friends. I would especially like to thank my parents Amado and Teresa, my brother Xavi, his wife Montse, and the new member of my family, my nephew Aran, for their love and support.

Josep Sampé Domenech
Vilalba dels Arcs, September 2018

This work has been partially funded by the European Union Horizon 2020 Framework Programme, in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182), and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).

Abstract

In a world that is increasingly dependent on technology, digital data is generated at an unprecedented rate. This leads companies that require large storage space, such as Netflix or Dropbox, to use solutions where data is remotely maintained, managed, and backed up in an easy and cheap way. In particular, cloud object stores are widely adopted and increasingly used for storing these huge amounts of data, mainly thanks to their built-in characteristics, such as simplicity, scalability and high availability. Moreover, the evolution of cloud computing, in what refers, for example, to data analysis, makes cloud object stores an important actor in today’s cloud ecosystem.

However, cloud object stores face three main challenges: 1) Flexible management of multi-tenant workloads. Commonly, cloud object stores are multi-tenant systems, meaning that all tenants share the same system resources, which can lead to interference problems. Furthermore, it is complex to manage heterogeneous storage policies at massive scale. 2) Data self-management. Cloud object stores themselves do not offer much flexibility regarding data self-management by tenants. Typically, they are rigid, non-programmable systems, which prevents tenants from handling the specific requirements of their objects. 3) Elastic computation close to the data. Placing computations close to the data in the storage system can be useful to reduce data transfers. The challenge here is how to achieve elasticity in those computations without provoking resource contention and interference in the storage layer.

In this thesis, we present three novel research contributions that address the aforementioned challenges. First, we introduce the first Software-defined Storage (SDS) architecture for cloud object stores that separates the control plane from the data plane, allowing multi-tenant workloads to be managed in a flexible and dynamic way, for example, by applying different service levels of bandwidth to different tenants. Second, we design a novel policy abstraction called microcontroller that transforms common objects into smart objects, enabling tenants to programmatically manage their behavior; for example, a content-level access control microcontroller attached to a specific object can filter its content depending on who is accessing it. Finally, we present the first elastic data-driven serverless computing framework, which mitigates the resource contention problem of placing computation close to the data.


Thesis Publications

• Josep Sampé. Towards Cooperative Analytics in Disaggregated Big Data Clusters in 2nd URV Doctoral Workshop in Computer Science and Mathematics (eds Marc Sánchez Artigas and Aida Valls Mateu) (Publicacions URV, Tarragona, November 2015), pp.9-13. ISBN: 978-84-8424-399-1.

• Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, William Oppermann, Pietro Michiardi. IOStack: Software-defined Object Storage in IEEE Internet Computing (Volume: 20, Issue: 3, May-June 2016) Impact Factor: 1.929, Q2, pp.10-18, ISSN: 1089-7801.

• Josep Sampé, Pedro García-López, Marc Sánchez-Artigas. Vertigo: Programmable Micro-controllers for Software-defined Object Storage in 2016 IEEE 9th International Conference on Cloud Computing (CLOUD ‘16) CORE B (June 2016), pp.180-187. ISBN: 978-1-5090-2620-3.

• Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, Eran Rom. Crystal: Software-defined Storage for Multi-tenant Object Stores in 15th USENIX Conference on File and Storage Technologies (FAST ‘17) CORE A (February 2017), pp.243-256, ISBN: 978-1-931971-36-2.

• Josep Sampé, Marc Sánchez-Artigas, Pedro García-López, Gerard París. Data-driven Serverless Functions for Object Storage in 18th ACM/IFIP/USENIX Middleware Conference (Middleware ‘17) CORE A (December 2017), pp.121-133. ISBN: 978-1-4503-4720-4.

Other Publications

• Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, Marko Vukolic. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end in 2015 Internet Measurement Conference (IMC ‘15) CORE A (October 2015), pp.155-168. ISBN: 978-1-4503-3848-6.


• Yosef Moatti, Eran Rom, Raúl Gracia-Tinedo, Dalit Naor, Doron Chen, Josep Sampé, Marc Sánchez-Artigas, Pedro García-López, Filip Gluszak, Eric Deschdt, Francesco Pace, Daniele Venzano, Pietro Michiardi. Too Big to Eat: Boosting Analytics Data Ingestion from Object Stores with Scoop in 2017 IEEE 33rd International Conference on Data Engineering (ICDE ‘17) CORE A* (April 2017), pp.309-320, ISBN: 978-1-5090-6543-1.

• Josep Sampé, Gil Vernik, Marc Sánchez-Artigas, Pedro García-López. Serverless Data Analytics in the IBM Cloud accepted in 19th ACM/IFIP/USENIX Middleware Conference (Middleware ‘18) CORE A (December 2018), ISBN: 978-1-4503-6016-6



Table of contents

List of figures

List of tables

1 Motivation and Challenges
  1.1 Problem Statement
  1.2 Contributions of this Thesis
  1.3 Outline of this Dissertation

2 Storage and Programmability: Background and Definitions
  2.1 Overview of Storage Systems
  2.2 Block-based Storage
    2.2.1 Use Cases
    2.2.2 Examples and Existing Systems
  2.3 File-based Storage
    2.3.1 Use Cases
    2.3.2 Examples and Existing Systems
  2.4 Object-based Storage
    2.4.1 Variants
    2.4.2 Easy Access through RESTful API
    2.4.3 Data Management
    2.4.4 Use Cases
    2.4.5 Examples and Existing Systems
  2.5 OpenStack Swift
    2.5.1 Swift Architectural Overview
    2.5.2 Supported API Calls
    2.5.3 Extensibility


  2.6 The Active Storage Technique
  2.7 The so-called ‘Software-defined’ Paradigm
    2.7.1 Software-defined Storage

3 State of the Art
  3.1 Data Management and Processing
    3.1.1 Internal Management
    3.1.2 External Data Processing
    3.1.3 Discussion
  3.2 Software-defined Storage
    3.2.1 Commercial SDS Systems
    3.2.2 Research SDS Systems
    3.2.3 Discussion
  3.3 Active Storage
    3.3.1 Object-based Distributed File Systems
    3.3.2 Cloud Object Stores
    3.3.3 Discussion

4 Extending Multi-tenant Management
  4.1 Introduction
    4.1.1 Scope and Challenges
    4.1.2 Contributions
  4.2 Crystal Design
    4.2.1 Abstractions in Crystal
    4.2.2 System Architecture
  4.3 Control Plane
    4.3.1 Crystal DSL
    4.3.2 Distributed Controllers
  4.4 Data Plane
    4.4.1 Inspection Triggers
    4.4.2 Filter Framework
  4.5 Extending Crystal
    4.5.1 New Storage Automation Policies
    4.5.2 Global Management of IO Bandwidth
  4.6 Prototype Implementation
  4.7 Evaluation


    4.7.1 Testbed Characteristics
    4.7.2 Workload
    4.7.3 Evaluating Storage Automation
    4.7.4 Achieving Bandwidth SLOs
    4.7.5 Crystal Overhead
  4.8 Summary

5 Adding Self-management Support for Tenants
  5.1 Introduction
    5.1.1 Scope and Challenges
    5.1.2 Contributions
  5.2 The Management Policies: Microcontrollers
  5.3 Design Overview
    5.3.1 The Core: Runtime System
  5.4 Prototype Implementation
    5.4.1 Interception Middleware
    5.4.2 Microcontroller Engine
    5.4.3 Storlets Engine
  5.5 Extensibility
  5.6 Applications
  5.7 Evaluation
    5.7.1 Testbed Characteristics
    5.7.2 Workload
    5.7.3 Application characteristics
    5.7.4 Results
  5.8 Summary

6 Scaling-out Policy Enforcement in the Data Plane
  6.1 Introduction
    6.1.1 Scope and Challenges
    6.1.2 Contributions
  6.2 Design Overview
    6.2.1 Interception Software and Metadata Service
    6.2.2 Computation Layer
  6.3 Prototype Implementation
    6.3.1 Interception Software and Metadata Service


    6.3.2 Compute Layer
  6.4 Applications
  6.5 Evaluation
    6.5.1 Testbed Characteristics
    6.5.2 Swift Resource Contention
    6.5.3 Workload
    6.5.4 Application characteristics
    6.5.5 Results
  6.6 Summary

7 Conclusions & Future Work
  7.1 Overview of Contributions
  7.2 Future Research Directions

Bibliography

List of figures

1.1 Evolution of the global datasphere

2.1 High-level architecture overview of an OpenStack Swift deployment
2.2 High-level comparison between traditional and software-defined systems
2.3 High-level overview of the software-defined networking abstraction

4.1 Structure of the Crystal DSL
4.2 High-level overview of Crystal’s architecture on top of OpenStack Swift
4.3 Interactions among automation controllers, workload metric processes and the filter framework
4.4 Enforcement of compression/encryption filters
4.5 Dynamic enforcement of caching filter
4.6 Policy enforcement on real trace replays
4.7 Performance of the Crystal bandwidth differentiation service. 1 proxy/3 storage nodes, bandwidth control at proxy
4.8 Performance of the Crystal bandwidth differentiation service. 1 proxy/3 storage nodes
4.9 Performance of the Crystal bandwidth differentiation service. 2 proxy/6 storage nodes, bandwidth control at proxies
4.10 Performance of the Crystal bandwidth differentiation service. 1 proxy/3 storage nodes, bandwidth control at storage nodes
4.11 Performance overhead of filter framework metadata interactions and isolated filter enforcement
4.12 Pipelining performance for isolated filters


4.13 Traffic overhead of Crystal depending on the number of nodes, controllers and workload metrics

5.1 High-level architecture overview of Vertigo integrated alongside a common cloud object storage architecture
5.2 Microcontroller deployment example
5.3 Vertigo Runtime System integrated in OpenStack Swift
5.4 Memory and CPU consumption for running a new container with the Microcontroller Engine
5.5 Microcontroller base overhead on GET requests
5.6 Microcontroller base overhead on PUT requests
5.7 Microcontroller CPU usage for different workload bursts
5.8 Microcontroller execution times for the different applications
5.9 Request execution time breakdown for the different applications
5.10 Execution timeline breakdown for automated deletion and content-level access control applications
5.11 Execution timeline breakdown for automated prefetching and active storage orchestration applications
5.12 Comparison of bandwidth usage and total request time (TS vs. MC)
5.13 Soft link overhead

6.1 High-level architecture overview of Zion integrated alongside a common cloud object storage architecture
6.2 High-level architecture overview of a Zion Compute Node
6.3 Usual Swift behavior in a given storage node
6.4 Swift interference measurement in a given storage node
6.5 Zion runtime startup time
6.6 Zion base overhead
6.7 Compression function performance
6.8 Content-level access control function performance
6.9 Image resizer function performance
6.10 Signature verification function performance
6.11 Image resizer function scalability
6.12 Signature verification function scalability
6.13 Compression function scalability
6.14 Content-level access control function scalability

List of tables

2.1 Summary of the main storage systems characteristics

4.1 Main calls of Crystal controller, filter framework and workload metrics management

5.1 Application microcontrollers information

6.1 Application function information
6.2 Interactive data queries execution times


Acronyms

AOP Aspect Oriented Programming.

API Application Programming Interface.

CRUD Create, Read, Update and Delete.

DASD Direct Access Storage Device.

DSL Domain-specific Language.

ETL Extract-Transform-Load.

FC Fiber Channel.

GB Gigabyte.

GbE Gigabit Ethernet.

HDD Hard Disk Drive.

HDFS Hadoop Distributed .

HPC High-performance Computing.

HTML HyperText Markup Language.

HTTP Hypertext Transfer Protocol.

I/O Input/Output.

ID Identifier.


IFTTT If-This-Then-That.

IP Internet Protocol.

iSCSI Internet SCSI.

JVM Java Virtual Machine.

KB Kilobyte.

KBps Kilobytes per second.

LAN Local Area Network.

LRU Least Recently Used.

MB Megabyte.

Mbps Megabits per second.

MPI Message Passing Interface.

ms Millisecond.

NAS Network Attached Storage.

NFS .

OS .

OSD Object Storage Device.

PB Petabyte.

RAID Redundant Array of Independent Disks.

RDMS Relational Database Management Systems.

REST Representational State Transfer.

SAN Storage Area Network.


SATA Serial Advanced Technology Attachment.

SCSI Small Computer System Interface.

SDDC Software-defined .

SDN Software-defined Networking.

SDS Software-defined Storage.

SLA Service Level Agreement.

SLO Service-Level Objectives.

SMB .

SSD Solid-state Drive.

TB Terabyte.

TCP Transmission Control Protocol.

VM Virtual Machine.

WAN Wide Area Network.

WSGI Web Server Gateway Interface.


Chapter 1

Motivation and Challenges

Nowadays, data is experiencing an incredible growth in the digital universe. The adoption of new technologies by both common users and enterprises generates data in a never-before-seen proportion. IDC, in a collaborative report with Seagate [1], estimates that by 2025 the global datasphere will grow to 163ZB, more than five times the 30.1ZB of data generated in 2018 (Fig. 1.1a). Most of this data is generated for further processing, since data analysis extracts relevant information and thus increases its value. In this sense, IDC also estimates that by 2025, nearly 20% of the data will be critical1 to our daily lives and nearly 10% of that will be hypercritical2 (Fig. 1.1b).

Fig. 1.1 Evolution of the global datasphere: (a) unstructured data growth forecast; (b) data criticality over time

The data that makes up the global datasphere is commonly classified into two main types: structured and unstructured. While structured data represents that

1 Critical. Data known to be necessary for the expected continuity of users’ daily lives.
2 Hypercritical. Data with direct and immediate impact on the health and wellbeing of users.


information which has a predefined structure, such as the data stored in relational databases, unstructured data is information that either does not have a predefined data model or is not organized in a predefined manner, such as audio and video files or text logs. In this sense, some analyses [2] show how unstructured data is growing much faster than other kinds. Indeed, by 2022, 93% of all data in the digital universe is predicted to be unstructured [3]. At this moment, unstructured data is growing at a rate of 62% per year [4]. Consequently, organizations are moving away from traditional infrastructures. Nowadays, and increasingly so, storage systems have to deal with this huge and growing mix of unstructured data, which comes from, for example, social media, health care, or science, and includes audio, video, logs, health records, sensor data, and emails, to name a few. While traditionally this data was stored in structured block storage devices, the limitations of these systems in terms of simplicity and scalability have made object stores pervasive in today’s world. It is estimated that more than 80% of enterprise data will be stored in scale-out storage systems [3] in the next years. In this sense, object stores are specially designed to handle this huge data growth [5]. They can scale to hundreds of petabytes in a single namespace without suffering any sort of performance degradation, while providing high availability of the data by storing copies of the same object on multiple nodes. An object storage system is capable of storing files and metadata about files, which consists of the attributes of the actual data being stored. Many attributes are identifiers that make information easily sortable and analyzable, put it into context, and create value. Moreover, object stores are software-based systems, which allows them to be deployed on standard servers or cloud-based resources.

1.1 Problem Statement

The world is becoming software driven, but that does not necessarily mean programmable [6]. This is exactly what happens with cloud object storage systems. By analyzing the offering catalogs of the different cloud providers [7–11] and open-source solutions [12, 13], we noted that the available object storage systems do not offer many ways of extensibility, which inevitably limits the programmability of the system. Programmability unlocks the full potential of computing and increases innovation, and we think that cloud object stores lack it. The continuous growth of unstructured data makes it necessary for


cloud-based object stores to be more programmable, thus providing a new layer of storage automation and data management, both for storage administrators and tenants. To begin with, cloud object stores are commonly services with multi-tenant support. That is, the object store allows different instances of the service to be created for each tenant3. Multi-tenancy ensures data isolation and sharing among tenants, and among users of the same tenant, all of this over the same system software and infrastructure. Sharing the same infrastructure across all tenants is very beneficial in terms of both CAPEX4 and OPEX5. However, it is also one of its main drawbacks. Operating over the same software and infrastructure means that all the tenants of the storage service share the same system resources, which can lead to performance issues. In this sense, the first question we pose about programmability in cloud object storage systems is the following:

Question 1: Is there a lack of flexibility for handling multi-tenant workloads?

Despite their growing popularity, object stores are not well prepared for heterogeneity. Typically, a deployment of an object store uses a monolithic configuration setup, even when the same object store acts as a substrate for different types of applications with time-varying requirements. This results in all applications experiencing the same service level, even though the workloads from different applications can vary dramatically. Because tenants share system resources with other public tenants, performance can be wildly inconsistent. This performance unpredictability is aggravated by the fact that tenants have little control over where their data resides, so it is not rare for hot data from two separate tenants to end up on the same physical node. For example, while a social network application such as Facebook would have to store a large number of small- to medium-sized photos (KBs to MBs), a Big Data analytics framework would probably generate read and write requests for large files. It is clear that using a static configuration inhibits optimizing the system for such varying needs. In this sense, we think that a multi-tenant management technology is a required layer in cloud object stores. It will enhance the programmability of cloud object storage systems by adding a new software

3 A tenant is a group of users who share a common access to the storage instance.
4 Capital expenditure or capital expense (CAPEX) is the money a company spends to buy, maintain, or improve its fixed assets, such as hardware and systems.
5 Operating expense (OPEX) is an ongoing cost for running a product, business, or system.


management layer, which will make it possible to dynamically orchestrate the storage cluster by means of high-level policies. However, this approach is purely admin-oriented, and as a consequence, the next question we pose is:

Question 2: Are tenants able to manage their data in cloud object stores?

Unfortunately, in cloud object stores the tools given to users to manage their data are surprisingly limited. Typically, at the object level, users can only decide the expiration time of their data. When an object expires, that piece of data is automatically erased by the system and is no longer accessible. Beyond this, an object cluster is simply a data silo in which to put data. Cloud vendors offer the possibility to set policies at a higher level, such as the bucket level. However, these policies are limited to access control policies, which decide which users of the tenant can access the data contained in a specific bucket, and sometimes to object replication level policies, which ultimately guarantee the availability of the objects in a bucket. Moreover, some object storage systems also offer the possibility to activate an encryption mechanism to secure the data in a specific bucket. In spite of this, a new computing abstraction called serverless computing nowadays enables cloud services to process data with user-defined functions. Functions can be event-based or proactively invoked. In the case of object storage services, functions can be invoked as soon as the data is uploaded to the cluster. That is, it is possible to create a policy that launches a function on the upload event. However, the computation occurs in a decoupled way, having to transfer all the data from the storage to the compute cluster, and eventually storing the result back to the storage cluster if needed. These limitations prevent cloud object stores from easily adapting to users’ requirements. Serverless computing may solve some of the data management use cases when the data is uploaded to the cloud, but the model does not allow controlling the full lifecycle of the objects, leaving a spectrum of possible use cases unmanageable. In this sense, we think that a self-management layer for tenants is necessary in cloud object stores, in order to provide tenants with the ability to manage their data in a flexible way. Adding a new storage management layer, derived from Question 1, and a data management layer, derived from Question 2, are the two main research challenges of this thesis to enhance the programmability of cloud object stores. However, these enhancements usually mean overloading the storage cluster with new


functionalities and computation tasks that consume storage resources. Moving computation close to the data may inevitably produce resource contention. That is, there may not be enough compute resources left to manage the object storage system. Thus, it is inevitable to pose the following question:

Question 3: Are there enough compute resources in the storage layer for enhancing the programmability?

Storage management and data management usually involve running computation tasks within the storage cluster. However, data transformation tasks such as compression or encryption can uncontrollably increase CPU and memory usage, interfering with the normal operation of the storage service. In a storage cluster, compute resources are limited, and the idea behind its scalability is to scale up or scale out the resources when more storage space is required, not when CPU and memory are overloaded. Thus, to solve the more than likely elasticity and resource contention issues, one could think that the correct approach would be to decouple the compute capabilities from the storage. Indeed, this is how many systems are currently built. Typically in the cloud, vendors offer separate compute and storage clusters. Disaggregation, among other things, ensures correct management and scalability, and it greatly simplifies the deployment and maintenance of these services. Nevertheless, the systems proposed in response to Question 1 and Question 2 have some benefits that we want to preserve: data locality and inline processing6. In this context, data locality means that the computation is done where the data is, instead of moving the data to where the computation is. Placing computation close to the data significantly decreases data movement between storage and compute clusters, saving an enormous amount of limited bandwidth. Moreover, data locality together with inline processing may provide better performance in terms of execution time. The fundamental reason is that transferring input data to the compute cluster for processing, and then storing the results back to the storage cluster, may lengthen the total storage request time. Furthermore, it significantly increases disk read/write operations in the storage nodes. It can be said that moving data between clusters is only an acceptable approach when real-time response is not required. However, in today’s world, the Internet is

6 Inline processing refers to processing an object’s data stream synchronously, as it comes in/out from the storage cluster.


becoming faster, and more and more applications demand real-time processing capabilities and low-latency interconnection [1, 2]. In this instantaneous world, applications that use cloud object stores also require the shortest possible request time in order to guarantee a good user experience, and data locality and inline processing are two well-known key enablers to deliver it. In this sense, we think that a technology that guarantees data locality and inline processing, while preventing the elasticity and resource contention problems derived from in-situ computing tasks, is necessary to keep the normal operation of an enhanced cloud object store.
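To make the notion of inline processing concrete, the following is a minimal sketch (in Python, with an illustrative compression step) of a transformation applied to an object’s data stream while the request flows through the storage path, instead of shipping the object to a separate compute cluster:

```python
import zlib

def inline_compress(object_stream):
    """Compress an object's data stream synchronously, chunk by chunk,
    as it flows through a PUT request (inline processing)."""
    compressor = zlib.compressobj()
    for chunk in object_stream:           # chunks arrive from the client
        yield compressor.compress(chunk)  # transformed bytes continue to disk
    yield compressor.flush()              # emit any buffered remainder

# Illustrative usage inside a storage node: the transformed stream is
# persisted directly, avoiding a round trip to an external compute cluster.
request_chunks = iter([b'log line\n' * 1000] * 3)
compressed = b''.join(inline_compress(request_chunks))
```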

1.2 Contributions of this Thesis

In what follows, we relate the specific contributions of this thesis, which are derived from the previously posed questions.

Contribution 1: Add a multi-tenant management layer.

For the first contribution, we took as a basis the paradigm called Software-defined Storage (SDS). SDS is specially designed to dynamically manage the storage infrastructure by means of high-level policies, while enhancing the flexibility and programmability of the storage system. In this sense, the first contribution of this thesis is the design and implementation of Crystal, the first SDS architecture for cloud object storage systems that efficiently supports multi-tenancy and applications with evolving requirements. On the one hand, it provides a control plane for multi-tenant object storage, with flexible policies and their transparent translation into the enforcement mechanisms at the data plane. On the other hand, it provides an extensible data plane that offers a filter abstraction, which can encapsulate anything from arbitrary computations to resource management functionality, enabling concise policies for complex tasks. Moreover, we provide examples of policies for storage automation and I/O bandwidth control that demonstrate the design principles of our management layer.

Contribution 2: Add self-management support for tenants.

For the second contribution, we propose a new object management abstraction called microcontroller. Microcontrollers allow tenants to control object behavior in a flexible, dynamic, and programmatic way. Moreover, microcontrollers extend


the previous work in Contribution 1 with respect to fine-grained object management policies. While SDS enables the centralized management of the storage service and infrastructure, it is rather ill-suited to manage the singular requirements of objects, which are bothersome for storage administrators to understand, but natural for tenants. Treating such particularities as regular policies could lead an SDS centralized control plane to evaluate thousands of policies at runtime, slowing down the system for critical management tasks at the bucket level. In this sense, it turns out to be more natural to let tenants self-manage their data at the object level, while leaving decision making at a higher level to storage administrators. This vision can only be carried out with a powerful abstraction such as our microcontrollers. To wrap up, the second contribution of this thesis is the design and implementation of a novel model based on microcontrollers for the flexible self-management of objects by tenants. We also provide examples of microcontrollers, such as content-level access control, object prefetching, and policy-based automated object deletion, which demonstrate the feasibility of our novel fine-grained management layer. A sketch of the first of these examples follows.
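As an illustration only, the sketch below shows what such a content-level access control microcontroller could look like. The `MicroController` base class, the `on_get` hook, and the per-user column map are hypothetical stand-ins for the actual abstraction presented in Chapter 5.

```python
class MicroController:
    """Illustrative stand-in for the microcontroller base class of Chapter 5."""

class ContentLevelAccessControl(MicroController):
    """Filters a CSV object's content depending on who is accessing it."""

    # Hypothetical policy: the columns each user may read (None = all).
    ALLOWED_COLUMNS = {'doctor': None, 'researcher': [0, 2]}

    def on_get(self, user, object_stream):
        """Hook invoked inline on every GET request to the attached object."""
        columns = self.ALLOWED_COLUMNS.get(user, [])
        for line in object_stream:
            fields = line.rstrip('\n').split(',')
            if columns is not None:
                fields = [fields[i] for i in columns]  # drop restricted columns
            yield ','.join(fields) + '\n'

# A researcher only sees the anonymized columns of the object:
mc = ContentLevelAccessControl()
rows = ['id,name,age\n', '1,Alice,34\n']
print(list(mc.on_get('researcher', iter(rows))))  # ['id,age\n', '1,34\n']
```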

Contribution 3: Scale-out policy enforcement in the data plane.

For the third contribution, we propose a novel serverless computing platform whose abstractions are well suited to enabling the correct elasticity of policy enforcement in the data plane. Through the usage of serverless functions, this new framework allows systems that perform computations close to the data, like the previous ones built in this thesis, to better scale computing resources and to prevent resource contention problems. To do so, it offloads the computing resources of the storage nodes by moving computation tasks into an intermediate computing layer located in the storage path. This new distributed computing model enables administrator- and user-defined functions for serverless inline storage and data management, all within the storage infrastructure, thus reducing the data movement between storage and compute clusters needed in some current computing models. In this sense, the third contribution of this thesis is the design and implementation of a novel data-driven serverless computing framework that decouples the compute capabilities from the storage nodes, while ensuring elasticity, data locality, and inline processing. This model enables the previous contributions of this thesis to be less intrusive in the storage cluster. Finally, we provide examples of applications that demonstrate the feasibility and versatility of this framework, both for storage and data management.


1.3 Outline of this Dissertation

In the following, we provide a summary of the thesis chapters:

Chapter 2: Background. This chapter provides definitions and concepts that are required throughout the thesis.

Chapter 3: State of the Art. This chapter discusses the current literature regarding the main research areas related to the previously stated contributions. They include object management, software-defined storage, and active storage.

Chapter 4: Extending Multi-tenant Management. This chapter presents the first Software-defined Storage architecture whose core objective is to efficiently support multi-tenancy in object stores. It adds a filtering abstraction at the data plane and exposes it to the control plane to enable high-level policies at the tenant and bucket granularities.

Chapter 5: Adding Self-management Support for Tenants. This chapter presents a novel framework for object stores that allows tenants to self-manage their data through the deployment of per-object management policies. With these management policies, named microcontrollers, it is possible to operate on objects depending upon their state and content.

Chapter 6: Scaling out Policy Enforcement in the Data Plane. This chapter presents an innovative data-driven serverless computing framework for cloud object stores. It is a lightweight compute solution that allows users to create small, stateless functions that intercept and operate on data flows in a scalable manner, without the need to manage a server or a runtime environment.

Chapter 7: Conclusions and Future Directions. This chapter presents the conclusions that ensue from this work and a variety of possible future research lines.

Chapter 2

Storage and Programmability: Background and Definitions

In this chapter, we aim to provide the necessary concepts and definitions to properly understand the rest of this thesis. We first give some background on the operation and architecture of storage systems, paying particular attention to the cloud object storage variant. In this sense, we take OpenStack Swift as a paradigmatic example, since, together with Ceph, it may be the most adopted open-source object storage system. Second, we describe the computing capabilities of object stores, describing the ‘active storage’ technique. Third, we illustrate the principles and concepts behind the ‘software-defined’ paradigm, paying special attention to the ‘software-defined storage’ variant, which is fundamental to understand one of the contributions of this thesis. In all cases, we overview a variety of existing systems to provide the reader with a big picture of the programmability of storage systems.

2.1 Overview of Storage Systems

Storage systems can be divided into three different types depending on how data is stored and the purpose of their usage since, as we describe in the following sections, each type of storage is better suited to different circumstances. They include: 1) block storage, where data is stored and managed as blocks within sectors and tracks of the hard disk, with no metadata providing further context; 2) file storage, where data is stored as files organized into a hierarchical file system,


alongside a few attributes, such as size and name, providing context; and 3) object storage, where data is stored as objects in massively scalable containers with a globally unique identifier, and with the possibility of large amounts of metadata.

Table 2.1 Summary of the main storage systems characteristics

| Factors | Block Storage | File Storage | Object Storage |
|---|---|---|---|
| Storage fit | Performance-based primary or secondary storage | Capacity-based secondary storage | Highly reliable, cloud-scale, secondary storage |
| Amount of data | <500 TBs | <100 TBs, or 100-500 TBs for scale-out NAS | >500 TBs, and in many cases petabytes of data |
| Latency | <10 ms, or microseconds (flash) | Trades lower latency for data access simplicity | Latency-tolerant storage |
| Data type | Structured and unstructured data | Unstructured data | Unstructured data |
| Metadata | Fixed system attributes | Fixed file-system attributes | Custom metadata |
| Protocols | iSCSI, FC, SATA | NFS/SMB | REST over HTTP |
| Storage location | On premise or private cloud | On premise or private cloud | On premise, private, hybrid or public cloud |
| Client location | Centralized | Centralized | Centralized or geographically dispersed |

Table 2.1 [14] summarizes the main characteristics of each storage type, thus showing a comparison between them. The fields of the table are as follows. The storage fit, which indicates where or in what circumstances


each type of storage fits best. The amount of data or capacity that each one is designed for. The latency that clients experience against the storage service. The data type that each one best accepts. The ability to accept metadata. The protocols needed to access the storage service. The storage location where the storage system is deployed. And the client location from where clients access the storage service, that is, centralized (clients are located within the same network) or geographically dispersed (clients access through a Wide Area Network (WAN)).

2.2 Block-based Storage

Block storage is the native storage interface at the drive level, and it is known as the fastest storage technology. The most common example is a Hard Disk Drive (HDD), which, once formatted, writes out and reads in blocks by their block address. In block storage, data can be organized at a high level into a file system or an application-specific structure. However, at a low level, files are split into evenly sized blocks of data, each with its own address or block Identifier (ID) (e.g., sector number). HDDs can be split into volumes of fixed block sizes. Each volume, called a device, can be treated as an independent disk drive. This device offers a fixed storage capacity and can be mounted by the host Operating System (OS) as if it were a physical disk. Once mounted, it is possible to format it with a file system and start storing files on it, combine multiple devices into a RAID array, or configure a database to write directly to the block device. Some of the advantages of block storage devices are that they can be resized to accommodate growing needs, and they can be easily moved between machines. The main downside of block storage is that it has limited capability to handle metadata. Moreover, cloud vendors have products that can provision block storage devices of any size and attach them to virtual machines. Normally, these block-based storage devices include data encryption, replication, and deduplication as data management techniques.
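To illustrate what addressing data “by block” means in practice, here is a minimal sketch that reads a single fixed-size block from a raw device by its block ID; the device path and the 4 KB block size are assumptions for the example.

```python
import os

BLOCK_SIZE = 4096  # assumed 4 KB blocks; real devices may differ

def read_block(device_path: str, block_id: int) -> bytes:
    """Read one block from a raw block device by its block address."""
    fd = os.open(device_path, os.O_RDONLY)
    try:
        os.lseek(fd, block_id * BLOCK_SIZE, os.SEEK_SET)  # seek to block offset
        return os.read(fd, BLOCK_SIZE)                    # read exactly one block
    finally:
        os.close(fd)

# Hypothetical usage (requires privileges, e.g., on an unmounted device):
# data = read_block('/dev/sdb', 128)
```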

2.2.1 Use Cases

Block-based storage is useful in a wide variety of use cases due to the high performance it offers and its ability to dynamically create and manage volumes on HDDs. These use cases include databases and Virtual Machines (VMs).

12 Storage and Programmability: Background and Definitions

Databases: Block storage is common in mission-critical applications that demand consistently high performance. Thus, it is ideal for databases, since a database requires consistent Input/Output (I/O) performance and low-latency connectivity.

Virtual Machines: Hypervisor software uses block-based storage to host the file systems for the guest operating systems packaged inside virtual machine disk images.

2.2.2 Examples and Existing Systems

The most common examples of block storage are Storage Area Networks (SANs) and local disks (DASD). SANs are computer networks which provide access to consolidated, block-level data storage, and always expose a block storage interface to the operating systems and client applications. SANs include three different layers: the host layer, integrated by the servers; the fabric layer, integrated by the network devices; and the storage layer, integrated by the storage devices. In SANs, the host layer contains the file system, which uses the fabric layer to communicate with the storage layer. This occurs, for example, through the Fiber Channel (FC) or Internet SCSI (iSCSI) protocols. iSCSI is a TCP/IP-based protocol, providing data transfer and management to remote block storage devices over IP networks. As a SAN protocol, iSCSI extends SANs across local and wide area networks (Local Area Networks (LANs), WANs and the Internet), providing location-independent data storage retrieval with distributed servers and arrays. In SANs, data is transferred, stored, and accessed at the block level. As such, a SAN does not provide data file abstractions, only block-level storage and operations. However, file systems have been developed to work with SAN software, thus providing file-level access. These file systems are known as SAN file systems (or shared-disk file systems), and they include Red Hat Global File System 2 [15] and IBM General Parallel File System [16]. From the cloud storage perspective, cloud providers also offer block-based storage services. They focus mainly on providing durable and high-performance block storage capacity (storage volumes) to VMs. Typically, volumes are highly available and reliable, and can be attached to any running VM instance. Some of the commercial block storage options in the cloud include: IBM Cloud Block Storage [17], AWS Elastic Block Storage (EBS) [18], Azure Premium Storage [19], and Google Persistent Disks [20].


2.3 File-based Storage

File-based storage is the most straightforward way of storing data. Data is organized through files with a name and some metadata attributes. Then, it is stored in folders under directories and sub-directories. In file storage, data is accessed by file IDs over a shared network, and the storage server manages the data on disk. On the one hand, an ID is the path of the file, which always includes the server name or IP, the directory path, and the file name, for example: //local_data_server/documents/images/photo.jpg. On the other hand, data can be accessed through the network by using the Network File System (NFS) protocol for Unix or Linux, or the Server Message Block (SMB) protocol for Microsoft Windows. In file-based storage, servers use block storage with a local file system to organize the files. Thus, users only deal with the protocol and the file path, which makes this type of storage easy to use. Moreover, in file storage systems, metadata is composed of a few fixed file attributes, normally stored in the file system. These system attributes include the size of the file, and the dates of creation, last access, and last modification. Usually, as in block-based storage, file storage systems integrate advanced data management techniques, for example, data snapshotting, encryption, replication, compression, and deduplication.
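As a small illustration of these fixed system attributes, the sketch below retrieves them for a file identified by its path; the path itself is hypothetical.

```python
import os
import time

def file_attributes(path: str) -> dict:
    """Return the fixed system attributes a file system keeps for a file."""
    st = os.stat(path)
    return {
        'size_bytes': st.st_size,                      # size of the file
        'last_access': time.ctime(st.st_atime),        # last access date
        'last_modification': time.ctime(st.st_mtime),  # last modification date
        'metadata_change': time.ctime(st.st_ctime),    # inode/metadata change
    }

# Hypothetical usage over an NFS-mounted share:
# print(file_attributes('/mnt/local_data_server/documents/images/photo.jpg'))
```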

2.3.1 Use Cases

File-based storage makes sense in a wide variety of scenarios, including:

Local archiving: The ability to seamlessly accommodate scalability with a scale-out solution makes file-level storage a cost-effective option for archiving files in a small data center environment.

File sharing: The simplicity of file-level storage makes file sharing fit better on centralized storage servers. A single network storage appliance makes it possible to consolidate multiple file servers for simplicity, ease of management, and space, which is ideal for this use case.

2.3.2 Examples and Existing Systems

The most common example of file-based storage is a Network Attached Storage (NAS) unit connected to a computer network. In this sense, NAS systems are networked appliances which contain one or more storage drives, often arranged


into logical, redundant storage containers or RAIDs. For its implementation, it is possible to acquire proprietary devices (NAS-oriented software or appliances, such as NexentaStor and NetApp) or install open-source software (NAS-oriented distributions of Linux, such as FreeNAS, NAS4Free, CryptoNAS and NASLite) on commodity hardware appliances. A more advanced example of NAS is scale-out NAS. It is a type of storage that incorporates a distributed file system [21] that can scale a single volume with a single namespace across many nodes. Scale-out NAS file-level storage solutions can scale up to several petabytes, while handling thousands of clients. As capacity is scaled out, performance is scaled up. An example of a distributed file system is GlusterFS [22]. It is a software-based open-source file system that aggregates disk storage resources from multiple servers, facilitating the centralized management of data through a single global namespace. File-based storage is also offered by cloud vendors. It provides simple, scalable, elastic file storage for VMs, and is mainly used for sharing data between them in the cloud. Examples of file-based cloud storage systems include: Amazon Elastic File System (EFS) [23], IBM Cloud File Storage [24], and Azure Files [25].

2.4 Object-based Storage

As the name suggests, object-based storage stores data in entities known as objects. Each object typically includes the data itself, a variable amount of metadata, and a globally unique ID, all of this stored in a flat namespace. In object stores, data is organized within buckets, which at the same time offer a way of grouping objects. Moreover, object storage provides larger namespaces in contrast to block- or file-based storage, which nullifies name collisions. With this guarantee, it is possible to retrieve an object from the storage service by simply presenting its unique ID, thus making information much easier to find in a large pool of data. Object-based storage also offers much greater flexibility in terms of metadata. Unlike block and file storage, object storage allows storing large amounts of metadata alongside the data. The metadata provides information about the structure, definition, and administration qualities of the stored data. For instance, it is possible to customize the metadata of data that comes from certain applications, or tag data that has similar content. In terms of infrastructure, one of the key advantages of object storage services is scalability. The storage


system can scale without limits to accommodate the storage capacity to the demand. Scaling out an object architecture is as simple as adding additional nodes to the storage cluster, making it possible to reach petabytes of storage space.

2.4.1 Variants

Object-based storage has uses in both cloud storage and High-performance Computing (HPC). These different uses of the storage system have caused each application domain to evolve its own object storage ecosystem. As a result, the differentiating features include access protocols, performance, security, replication, reliability, and metadata servers [26]. For example, HPC distributed file systems commonly use Object Storage Devices (OSDs) as the data back-end. OSDs [27] can come in many forms, ranging from a single disk drive to a storage controller with an array of drives (disk, subsystem or appliance) that includes one or more HDDs, memory, CPU and networking capabilities; for example, the Seagate Kinetic unit or the Panasas storage blade [28]. Normally, OSDs are interconnected in a private scale-out SAN cluster through, for example, the Small Computer System Interface (SCSI) protocol. The difference between an OSD and a block-based device is the interface, not the physical media. Commonly, OSDs integrate an object storage interface [29], an extension of SCSI that implements the command set to operate over the objects, through, for example, the FC, InfiniBand or iSCSI protocols. In contrast, cloud object stores commonly use storage nodes as data back-ends, which are physical servers with commodity hardware. They usually include one or more HDDs or Solid-state Drives (SSDs) for storing the actual data. Object-based cloud storage systems are accessed through a RESTful Application Programming Interface (API), over an IP network: either a LAN for local connections, or a WAN for remote clouds. The RESTful API uses Hypertext Transfer Protocol (HTTP) requests to operate over the objects.

2.4.2 Easy Access through RESTful API

Simple access makes cloud object storage easy to use and, thus, widely adopted. While the communication with block- and file-based storage operates at the operating system level, object-based storage operates at the application level thanks to the RESTful APIs that these services provide. APIs make objects accessible via HTTP and facilitate management functions related to authentication, permissions, and data properties. Accordingly, the format of an object ID includes the server ID, the name of the bucket, and the name of the object, for example: http://server.url/img_bucket/photo.jpg. Moreover, cloud object stores usually provide client applications, based on the API definitions, to ease end users' interaction with the storage servers. From uploads to downloads and deletions, these tools help users manage their data.
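To make this concrete, the following sketch illustrates the access pattern just described, using plain HTTP verbs from Python. The endpoint reuses the example URL above; the token-based authentication header and the metadata header are illustrative assumptions of this sketch, not a specific vendor's API.

    import requests

    BASE = 'http://server.url/img_bucket'
    HEADERS = {'X-Auth-Token': 'secret-token'}  # hypothetical auth scheme

    # Upload (create or overwrite) an object, attaching custom metadata
    with open('photo.jpg', 'rb') as f:
        requests.put(BASE + '/photo.jpg', data=f,
                     headers=dict(HEADERS, **{'X-Object-Meta-Album': 'holidays'}))

    # Download the object by simply presenting its unique ID (the URL)
    photo = requests.get(BASE + '/photo.jpg', headers=HEADERS).content

    # Delete the object
    requests.delete(BASE + '/photo.jpg', headers=HEADERS)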

2.4.3 Data Management

Data management is another central operational point of any distributed storage system. As in block- and file-based storage, object stores always include data management techniques to orchestrate the storage system. It is important to note that these techniques may differ considerably depending on the system they are built upon, and sometimes these implementations make one service better than another. Next, we describe the main data management techniques in object stores:

Data reduction:

Deduplication: Deduplication technology shrinks the data footprint by eliminating redundant copies of data, thus reducing storage overhead. Data deduplication techniques ensure that only one unique instance of data is retained in storage. For example, a typical email system might contain 100 copies of the same 1 Megabyte (MB) file attachment. If the email platform is backed up or archived, all 100 copies are saved, requiring 100 MB of storage space. With data deduplication, only one copy of the attachment is stored; each subsequent copy is referenced back to the one saved. In this example, a 100 MB storage demand drops to 1 MB, thus saving 99% of the storage that would be required without this data reduction technique.

Compression: Compression reduces the size of every piece of data using a selectable algorithm. Compressing data can save storage capacity, speed up file transfer, and decrease costs for storage hardware and network bandwidth. Compression can be done standalone or in conjunction with data deduplication. The compression ratio depends on the compression algorithm and the type of data. For example, a simple 100 MB text-log data object could be compressed in the gzip format, which uses the DEFLATE algorithm [30], to around 10 MB, which represents a 90% space saving in the storage system.
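The following minimal sketch illustrates both data reduction techniques together: identical copies are detected by hashing their content, and only one compressed instance is kept while subsequent copies become references. It is an illustration of the idea, not any particular store's implementation.

    import gzip
    import hashlib

    store = {}       # content hash -> compressed unique instance
    references = []  # one entry per stored copy, pointing to the instance

    def put(data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in store:
            # First unique instance: compress it before storing
            store[digest] = gzip.compress(data)
        # Each subsequent copy is referenced back to the one saved
        references.append(digest)
        return digest

    def get(digest):
        return gzip.decompress(store[digest])

    attachment = b'the same attachment contents' * 1000
    for _ in range(100):   # 100 copies of the same attachment...
        put(attachment)
    # ...result in a single stored (and compressed) instance
    print(len(store), 'instance kept for', len(references), 'copies')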


Data protection:

To maintain the desired level of data availability, object storage systems provide a degree of data redundancy. Object replication is perhaps the simplest way of producing data redundancy, being suitable for storing small objects that are accessed frequently. Sometimes the replication degree is configurable; however, the most common configuration is a 3-way replication level for frequently accessed objects and a 1-way replication level for archival. Moreover, other redundancy schemes based on erasure codes [31] can reduce the storage and communication costs compared to replication, thus increasing data availability in the face of possible storage failures. Erasure coding is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces. Erasure coding divides an object into pieces and calculates multiple parities. In the event that the original file, or some of its pieces, are lost, the system can use the parities and the remaining pieces to recalculate the original data.
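As a toy illustration of the erasure coding idea, the sketch below splits an object into k equally sized fragments and adds a single XOR parity, which suffices to rebuild any one lost fragment. Production systems use stronger codes (e.g., Reed-Solomon) that tolerate several simultaneous losses.

    from functools import reduce

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def encode(data, k=4):
        size = -(-len(data) // k)  # fragment size (ceiling division)
        frags = [data[i * size:(i + 1) * size].ljust(size, b'\0')
                 for i in range(k)]
        parity = reduce(xor, frags)  # one redundant piece
        return frags, parity

    def recover(frags, parity, lost):
        # XORing the parity with the surviving fragments rebuilds the lost one
        survivors = [f for i, f in enumerate(frags) if i != lost]
        return reduce(xor, survivors, parity)

    frags, parity = encode(b'an object broken into fragments plus parity')
    assert recover(frags, parity, lost=2) == frags[2]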

Data placement:

Data placement is usually based on what are called storage policies. Therefore, a data placement policy is a pivotal element for the correct operation of the storage system. Normally, as described above, in object stores a 3-way replication level is applied when a data object enters the storage cluster. In other cases, for better data protection, an erasure coding algorithm is applied, which splits the object into a certain number of blocks, eventually stored in different nodes. In both cases, it is necessary to decide where to place the object replicas or blocks across a subset of storage nodes in order for them to be persisted. This inherently implies that the system has to take a decision about which replicas or blocks are assigned to which storage nodes. In this sense, object storage systems use different data placement techniques, such as hash rings, in such a way that data availability is ensured and load is balanced across all nodes.
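A simplified sketch of hash-ring-based placement follows: each node is hashed to several points on a ring, and an object's replicas go to the first distinct nodes found clockwise from the object's own hash. Node names and the number of virtual points are illustrative.

    import bisect
    import hashlib

    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    class Ring:
        def __init__(self, nodes, vnodes=64):
            # Several virtual points per node smooth the load distribution
            self.points = sorted((_hash('%s#%d' % (n, v)), n)
                                 for n in nodes for v in range(vnodes))
            self.keys = [h for h, _ in self.points]

        def place(self, obj, replicas=3):
            # Walk clockwise from the object's hash, picking distinct nodes
            idx = bisect.bisect(self.keys, _hash(obj))
            chosen = []
            while len(chosen) < replicas:
                node = self.points[idx % len(self.points)][1]
                if node not in chosen:
                    chosen.append(node)
                idx += 1
            return chosen

    ring = Ring(['node1', 'node2', 'node3', 'node4', 'node5'])
    print(ring.place('img_bucket/photo.jpg'))  # three distinct storage nodes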

2.4.4 Use Cases

Object-based storage is especially designed to face the huge growth of unstructured data occurring nowadays. Examples of unstructured data include text files, music, videos, logs and images. In this sense, some ideal use cases include: big data, backup, file synchronization and sharing, and web-scale applications.


• Big data: Thanks to near-infinite scaling capabilities, object storage has the ability to accommodate huge quantities of unstructured data.

• Backup storage: Scalability makes object stores prepared for the massive amounts of data that typically accompany archived backups.

• File sync and share: Tools like Dropbox, Microsoft OneDrive and Google Drive, built on top of object-based storage services, make cloud object stores widely used to store user data. This makes file synchronization and sharing an increasingly adopted use case among common users.

• Web-scale applications: As access to cloud object stores is through HTTP APIs, web applications fit naturally by storing their static web files (HTML) and resources (JavaScript, stylesheets, images and videos).

2.4.5 Examples and Existing Systems

As mentioned in Section 2.4.1, object-based storage systems can be differentiated by whether they are file systems for HPC or cloud-based storage systems. Object-based file systems are usually distributed file systems. In this section we provide examples of both approaches.

Distributed File Systems

As in file-based storage, scale-out NAS can also incorporate a distributed object-based file system that can scale with a single namespace across many nodes. Normally, these file systems are designed for high performance, reliability, and manageability in HPC environments. They support thousands of parallel clients at over a TB/s of aggregate I/O, providing tens of Petabytes (PBs) of storage. These file systems include Lustre [32] and the Panasas File System [33].

Cloud Storage Systems

Cloud object stores can be differentiated by whether they are products offered by cloud providers or open source software packages. Typically, all of these systems offer similar characteristics, such as high scalability, reliability, availability, durability, event notifications, cross-region replication, versioning and, sometimes, encryption. In this sense, commercial cloud object storage solutions include: IBM Cloud Object Storage [7], Amazon S3 [8], Google Cloud Storage [9], Azure Blob Storage [10], and EMC Elastic Cloud Storage [11].


On the other hand, open source cloud object stores include a handful of different systems with different characteristics. However, we next provide a couple of systems which, from our perspective, are the most adopted open source implementations thanks to the large communities that support them. These systems are Ceph [12] and OpenStack Swift [13].

2.5 OpenStack Swift

In this section, we provide an architectural and operational view of the OpenStack Swift object store. We take Swift as a paradigmatic example of a cloud object store since it is widely adopted by the community. Moreover, the systems we designed in this thesis are all based on it, due to the built-in extensibility it offers in contrast to other cloud object storage systems. Thus, it is essential to know how Swift is organized, its characteristics and particularities, and how it internally works, among other aspects. Swift is the object store project of the OpenStack [34] open source cloud computing platform. It offers cloud storage software, so that users can store and retrieve large amounts of data through its simple API (Section 2.5.2). Its built-in characteristics include scalability, durability, high availability and concurrency across the entire data set. It is designed to manage the storage of large amounts of unstructured data that nowadays grows without bound, in a cost-effective way on a long-term basis, across clusters of standard server hardware. Swift can store large amounts of metadata alongside the objects. Moreover, it is defined as a highly available, distributed, eventually consistent object store.

2.5.1 Swift Architectural Overview

Figure 2.1 shows a high-level overview of the components of a common OpenStack Swift deployment. These components [35], described below, include proxy, account, container and object servers, a hash ring which determines the data placement, and some standalone services (e.g., object auditing and object replication). The first component of a Swift architecture is the Proxy Server. It is responsible for tying together the rest of the Swift architecture. It acts as a gateway in front of the storage nodes where the actual data is stored. Moreover, it exposes the public API where clients issue their requests. For each request, it looks up the location of the account, container, or object in the ring and routes the request accordingly. The ring is a mapping between the names of entities stored on disk and their physical locations. Once built, a ring represents a ready-to-use storage policy, which provides a way to differentiate service levels.

Fig. 2.1 High-level architecture overview of an OpenStack Swift deployment

On the other hand, Account Servers are responsible for listings of containers, while the Container Servers' primary job is to handle listings of objects. Finally, an Object Server is a very simple blob storage server that can store, retrieve and delete objects stored on local devices. Objects are stored as binary files on the filesystem, with metadata stored in the file's extended attributes (xattrs). Each object is stored using a path derived from the object name's hash and the operation's timestamp.
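The sketch below illustrates, in simplified form, how such an on-disk path can be derived; the partition power, the per-cluster hash suffix and the device path are made-up values, and the real Swift implementation differs in its details.

    import hashlib
    import time

    HASH_SUFFIX = 'changeme'  # per-cluster secret suffix (hypothetical value)
    PART_POWER = 10           # 2**10 partitions in the ring (illustrative)

    def object_path(account, container, obj, device='/srv/node/sdb1'):
        name = '/%s/%s/%s' % (account, container, obj)
        digest = hashlib.md5((name + HASH_SUFFIX).encode()).hexdigest()
        partition = int(digest, 16) >> (128 - PART_POWER)
        timestamp = '%.5f' % time.time()  # the operation's timestamp
        return '%s/objects/%d/%s/%s/%s.data' % (
            device, partition, digest[-3:], digest, timestamp)

    print(object_path('AUTH_test', 'img_bucket', 'photo.jpg'))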

2.5.2 Supported API Calls

OpenStack Swift provides full object lifecycle management through its RESTful API. Details about all the available calls can be found in [36]. Each object's access path consists of exactly three elements: /account/container/object. The object is the exact data input by the user. Accounts and containers provide a way of grouping objects. Nesting of accounts and containers is not supported.


With the right user permissions, the Swift API provides calls for managing Swift accounts and the containers where the data is finally stored. These calls include, for example, listing or updating the metadata of an account, or creating, listing, and deleting containers. Moreover, from the object perspective, the API allows creating, updating, getting and deleting objects, as well as adding, updating, or deleting their metadata.
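As a hedged example, these calls can be issued with the python-swiftclient library; the authentication URL and credentials below are placeholders for a hypothetical deployment.

    from swiftclient import Connection

    conn = Connection(authurl='http://swift.example.com/auth/v1.0',
                      user='test:tester', key='testing')  # placeholders

    conn.put_container('img_bucket')  # create a container

    # Create an object together with custom metadata
    with open('photo.jpg', 'rb') as f:
        conn.put_object('img_bucket', 'photo.jpg', contents=f,
                        content_type='image/jpeg',
                        headers={'X-Object-Meta-Album': 'holidays'})

    # Get the object and its metadata back
    headers, body = conn.get_object('img_bucket', 'photo.jpg')

    # Update the metadata only, then delete the object
    conn.post_object('img_bucket', 'photo.jpg',
                     headers={'X-Object-Meta-Album': 'archive'})
    conn.delete_object('img_bucket', 'photo.jpg')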

2.5.3 Extensibility

Extensibility is another key advantage of OpenStack Swift. The Proxy, Account, Container and Object servers all allow introducing middlewares into their storage flow. A middleware is software that sits between the client-side request on the front-end and the back-end resource being requested. In the storage context, a middleware intercepts the requests to the storage system to run, for example, data transformations. In this sense, middlewares extend the capabilities of, and add new functionalities to, the object storage service. Swift includes many middlewares in its catalog [37]. For example, it is possible to introduce an encryption middleware into the data pipeline in order to guarantee the confidentiality of the data upon an upload request. Moreover, Swift also includes the object versioning middleware, which keeps multiple versions of the objects in case of overwrite or deletion, and the rate limiter middleware, which can limit the number of transactions per second at the account or container levels.
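A minimal sketch of a Swift-style WSGI middleware is shown below, following the paste-deploy filter pattern that Swift middlewares use; the middleware name and the response header it adds are hypothetical.

    class RequestTagger(object):
        """Sits in the pipeline between the API front-end and the back-end."""

        def __init__(self, app, conf):
            self.app = app    # next element in the storage pipeline
            self.conf = conf

        def __call__(self, env, start_response):
            # Here an encryption or versioning middleware would intercept
            # and transform the request or response body
            def tagging_start_response(status, headers, exc_info=None):
                headers.append(('X-Processed-By', 'request-tagger'))
                return start_response(status, headers, exc_info)
            return self.app(env, tagging_start_response)

    def filter_factory(global_conf, **local_conf):
        conf = dict(global_conf, **local_conf)
        def factory(app):
            return RequestTagger(app, conf)
        return factory

The filter would then be wired into the server's pipeline through its paste-deploy configuration file, for example in the proxy server.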

2.6 The Active Storage Technique

Nowadays, the common scenario in cloud computing is the disaggregation of storage capacity from compute, which, among other things, eases the deployment of services and their scalability. This approach, however, has two main drawbacks. On the one hand, when data has to be processed, the disaggregation forces moving all data from the storage cluster to the compute cluster and, sometimes, moving the results back to the storage cluster. As a consequence, this inevitably consumes a huge quantity of network bandwidth. On the other hand, disaggregation leaves an important amount of underutilized resources (CPU and memory) in the storage infrastructure, since storage systems are usually not compute-intensive services. In this sense, active storage [38] is the technique proposed to reduce the bandwidth requirement, and to leverage the underutilized system resources, by moving computation, or some parts of the computation, closer to the storage servers. By offloading some computing tasks to the storage nodes, close to the data that they manage, active storage makes it possible to substantially reduce the data movement across the network and, hence, the overall network traffic. At the same time, active storage leverages CPU and memory resources that would otherwise never be used. In other words, the key benefit of this technique is that, by performing data processing at the source, data does not need to be moved between storage and compute clusters. There are many active storage frameworks in the research community (see Section 3.3). One example, which works integrated within OpenStack Swift deployments, is the OpenStack Storlets [39] framework, developed by IBM [40]. It allows running computation tasks where the data is, leveraging the storage nodes' underutilized resources, thus reducing data movement between storage and compute clusters.

2.7 The so-called ‘Software-defined’ Paradigm

The 'software-defined' paradigm can be defined in general terms as management software that operates independently of the underlying hardware. It abstracts the hardware with a layer of software, which manages the hardware, usually in an automated manner, by means of high-level policies.

(a) Traditional systems (b) Software-defined abstraction

Fig. 2.2 High-level comparison between traditional and software-defined systems

Software-defined technologies usually include two main architectural entities: the control plane and the data plane. The control plane is the entity that orchestrates and makes intelligent decisions about the management policies. It is mainly formed by a centralized controller. On the other hand, the data plane is where the policies are enforced, modifying the hardware through which the data passes. While in traditional systems the control plane is collocated with the data plane (Fig. 2.2a), in the software-defined paradigm the control plane is abstracted from the data plane with a centralized controller (Fig. 2.2b). Leading the paradigm is the Software-defined Networking (SDN) technology, built to abstract the network architecture and make network devices dynamic and programmable (Fig. 2.3). In this sense, SDN proposes centralizing the network intelligence in one network component by disassociating the forwarding process of network packets (data plane) from the routing process (control plane).

Fig. 2.3 High-level overview of the software-defined networking abstraction

Following the SDN trend, more software-defined technologies appeared for different areas of computing. For example, the Software-defined Storage (SDS) technology is designed to abstract and orchestrate the underlying storage hardware (disks, arrays, or servers) through a software management layer (see Section 2.7.1). On the other hand, the Software-defined Data Center (SDDC) is designed to manage an entire data center, where services such as compute, network, storage, security and availability are pooled, aggregated and managed by intelligent policy-driven software, providing self-service, automation, and application and business management.

2.7.1 Software-defined Storage

As the name implies, SDS can be described, at a high level, as storage that is defined by software. However, its characteristics go further than this high-level description, which lacks several essential aspects of the currently intended meaning. The key cornerstone of any 'software-defined' technology is the abstraction of the control software from the underlying hardware, which allows centrally managing the infrastructure in an easy way. SDS follows the same paradigm, using a software-based control layer abstracted from the physical storage servers, disks or arrays. However, the actual meaning of SDS in today's context is even more complex. SDS is described as the technology that supports the overall architectural definition, configuration, and operation of a storage system [41]. SDS provides optimized and automated storage system control and administration to allow effective and efficient resource utilization. SDS usually uses virtualization to abstract and control the storage system. This eases the deployment and redeployment of infrastructure resources, and provides optimal alignment of the available resources to users' application requirements. Virtualization enables support for multiple Service Level Agreements (SLAs). Moreover, SDS is designed to simplify the storage architecture, thus reducing specialized components and skill requirements, and it is optimized for interoperability across hardware and software platforms. Hence, it provides greater storage infrastructure flexibility to share resources while maintaining the required SLAs, allowing users to make better use of their data. In short, SDS is a storage architecture for a wide variety of storage requirements, based on a set of software and hardware components, dynamically configurable to meet customers' workload requirements. To fully realize the potential of this technology, SDS implementations should incorporate the following characteristics:

• Programmatically administered: Programmable interfaces to support dynamic storage deployment, configuration and management, enabling policy-based automation of infrastructure storage resources (see the sketch after this list).

• Automation: Realization of autonomic data storage capabilities (provisioning, reconfiguration, etc.) to provide dynamic SLA configuration.

• Monitoring: Metric collection for the automation of the storage in order to guarantee and validate that the users' SLAs are met.

• Scalability: High scalability of storage virtualization and pooling, which is essential to adapt the storage resources to the demand.

• Interoperability: Generic storage infrastructure and control components. The abstraction of functionality across the underlying hardware eases systems integration and the configuration of infrastructure components.
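As a minimal illustration of the first two characteristics, the sketch below evaluates hypothetical rule-based policies against a monitoring snapshot; the rule format, metric names and actions are all made up for the example.

    RULES = [
        # (metric, threshold, action): e.g., grow the cluster when it is
        # more than 80% full, in line with the scalability characteristic
        ('cluster_used_ratio', 0.80, 'scale_out_one_node'),
        ('p99_get_latency_ms', 50.0, 'enable_ssd_caching'),
    ]

    def metrics_snapshot():
        # Hypothetical monitoring hook; a real SDS controller would pull
        # these values from its metric collection subsystem
        return {'cluster_used_ratio': 0.85, 'p99_get_latency_ms': 12.0}

    def enforce(rules):
        snapshot = metrics_snapshot()
        for metric, threshold, action in rules:
            if snapshot[metric] > threshold:
                # A real controller would call its provisioning API here
                print('%s=%.2f exceeds %.2f: applying %s'
                      % (metric, snapshot[metric], threshold, action))

    enforce(RULES)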

Chapter 3

State of the Art

This chapter aims at bringing the reader closer to the concrete research problems that motivate this thesis. As described below, we identified three main research areas: data management and processing in cloud object stores, software-defined storage, and active storage. For each one, we discuss the current state of the available systems and research works. For Contribution 1 (Add a multi-tenant management layer), we review Software-defined Storage (SDS) technologies, designed to manage the storage system automatically. Moreover, we also investigate the current state of active storage research works, as SDS may entail computation tasks placed close to the data, such as compression, encryption or any other data transformation process required to manage the storage system. For Contribution 2 (Add self-management support for tenants), we review the current state of data management and processing by tenants in cloud object stores. Furthermore, we also review SDS technologies to consider whether their abstractions can help to overcome this challenge. As object-based management policies could require computation tasks over the objects, it is also interesting to review the current state of active storage systems and research works. For Contribution 3 (Scale-out policy enforcement in the data plane), we review active storage technologies, paying special attention to whether they take care of the resources of the storage cluster. Moreover, it is also interesting to investigate the current state, scope, and limitations of an emerging technology called serverless computing, especially in what refers to managing and processing objects from a cloud object store.


3.1 Data Management and Processing

In our attempt to enhance the programmability of cloud object stores, we first review the current state of these systems in what refers to object management and processing by tenants, since object stores are the main scope of this thesis. In this sense, as these systems offer object management and processing techniques in different ways, we classify them in two main groups: first, those actions that a tenant can perform directly on the object store, as a standalone service (Internal Management); and second, those where a tenant needs the involvement of other cloud services to perform actions over the objects (External Data Processing).

3.1.1 Internal Management

Thanks to the potential of object stores, all of the big players in cloud services offer this type of system in their service catalog. These systems, described in Section 2.4.5, provide lifecycle management of objects in different ways. For example, Amazon S3 [8] allows changing the replication level and the replica placement (storage class), and setting expiration rules for the objects by means of simple policies. Similar to S3, EMC Atmos [42] allows the introduction of customizable metadata to determine the placement and the protection of the data. Following this trend, object management in Google Cloud Storage (GCS) [9] refers to lifecycle actions which, as before, can change the storage class, set expiration dates on objects, or even delete them. To this end, GCS has a limited set of lifecycle conditions (e.g., the age of the object) [43] that, once met, trigger these actions. In Microsoft Azure, as in GCS, the Blob Storage [10] lifecycle management consists of rule-based policies which can be used to move data to the best access tier or storage class, and to expire data at the end of its lifecycle. Regarding open-source solutions, in systems like Ceph [12] and OpenStack Swift [13], the object lifecycle management by tenants is limited to setting the expiration time of objects, normally based on a specific date or a certain number of days after object creation. However, open source systems are usually extensible in this sense. This is the case of Swift, which allows introducing middlewares [37] into the storage path (see Section 2.5.3). This approach allows storage administrators to inject new ways of data management, such as object versioning or data encryption, among others. Moreover, middleware extensions enable the integration of more powerful systems for processing data objects, commonly cataloged as active storage frameworks. As active storage represents a research field by itself, we extensively discuss these systems and others in Section 3.3. On the other hand, research works have focused on the security of the objects. For instance, in [44], the authors propose a novel content-level access control for objects in Swift. It allows filtering the content of an object depending on who is accessing it at each moment, although the system only allows processing JSON files. Other researchers focused their efforts on handling data sharing security in cloud object stores [45–48], based on different techniques, such as the use of public-key encryption.
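As an illustration of the rule-based lifecycle policies described above, the following snippet defines an S3 policy with the boto3 library that changes the storage class after 30 days and expires objects after a year; the bucket name and the 'logs/' prefix are hypothetical.

    import boto3

    s3 = boto3.client('s3')
    s3.put_bucket_lifecycle_configuration(
        Bucket='my-bucket',  # hypothetical bucket
        LifecycleConfiguration={'Rules': [{
            'ID': 'archive-then-expire-logs',
            'Filter': {'Prefix': 'logs/'},
            'Status': 'Enabled',
            # Change the storage class (replica placement) after 30 days...
            'Transitions': [{'Days': 30, 'StorageClass': 'GLACIER'}],
            # ...and expire the objects at the end of their lifecycle
            'Expiration': {'Days': 365},
        }]})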

3.1.2 External Data Processing

Traditionally, processing data from cloud object stores has been done in disaggregated clusters with compute capabilities. The main advantage of the disaggregation is that it ensures the correct scalability and elasticity of the system. Thus, from common on-premise servers to virtual machines, compute clusters allow tenants to deploy software that can process objects from a cloud object store. For example, the virtualization model consists of creating a VM, selecting the desired flavor (CPU, memory and hard disk size), and installing a guest operating system on it. Once ready, it is possible to install all the software required to process objects from an object store. Examples of these systems include Amazon EC2 [49] and IBM Cloud Virtual Servers [50]. However, the main drawback at this level of abstraction is that tenants are responsible for installing and maintaining all the servers or virtual machines, the operating system, and the required software.

Serverless Computing

Nowadays, a new technology called serverless computing has emerged to run computation tasks in an easy way. It is based on a computing abstraction called a function. A function is a piece of code that runs as a reaction to an event triggered by a cloud service, such as the upload of an object to an object store. This abstraction greatly simplifies running computing tasks over objects. Unlike traditional models, in serverless computing there is no need to manage servers or virtual machines; all of this is automatically provisioned by cloud vendors, and tenants just have to worry about the code of the function. Following the disaggregation trend, highly scalable serverless platforms such as AWS Lambda [51], and the open-source Apache OpenWhisk [52] deployed in IBM Cloud Functions [53], are becoming very popular these days to run asynchronous computing tasks over disaggregated object stores like Amazon S3 or IBM COS, respectively. Recent research works even use serverless computing frameworks for data-intensive tasks over disaggregated object storage services [54–60]. Moreover, to complement and add more value to serverless computing frameworks, Amazon, IBM, and other cloud providers offer a service called API Gateway. API Gateway enables functions to be easily exposed as RESTful endpoints. It is a mechanism to synchronously call functions, which might read data from an object store and output transformed content to the client. Other technologies have appeared that tap into the simplicity of serverless computing. One of these is Amazon Lambda@Edge [61], which allows running functions near the users. In this case, edge functions are mainly designed for header and metadata manipulation, and for lightweight computations and data transformations. The main application of this technology is to accelerate those use cases that do not strictly require communication with the back-end, thus providing low-latency services. For example, web site data, which is commonly stored in a cloud object store, can benefit from these edge functions to dynamically adapt its content without the need to communicate with the storage back-end on each request. On the other hand, some research works propose novel systems to perform data analytics at the edge [62, 63].
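The sketch below shows what such an event-driven function can look like on AWS Lambda, reacting to an S3 upload event and storing a resized copy back to the object store; the bucket layout, the 'thumbnails/' prefix and the use of the Pillow library are assumptions of this example.

    import io

    import boto3
    from PIL import Image  # Pillow, assumed to be bundled with the function

    s3 = boto3.client('s3')

    def handler(event, context):
        # The S3 event carries the bucket and key of the uploaded object
        record = event['Records'][0]['s3']
        bucket = record['bucket']['name']
        key = record['object']['key']

        # Move the data from the storage cluster to the compute cluster
        body = s3.get_object(Bucket=bucket, Key=key)['Body'].read()

        # Process it: resize the uploaded image
        img = Image.open(io.BytesIO(body))
        img.thumbnail((128, 128))
        out = io.BytesIO()
        img.save(out, format='JPEG')

        # Move the result back to the object store
        s3.put_object(Bucket=bucket, Key='thumbnails/' + key,
                      Body=out.getvalue(), ContentType='image/jpeg')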

Interactive Queries

Inevitably, there is a still-increasing adoption of cloud object stores as data back-ends, mainly motivated by their built-in characteristics, such as simplicity, high scalability and low costs. In this sense, nowadays it is very common to put large datasets directly into cloud object stores, instead of loading them into databases. Thus, because of the need to run analytics, new services have emerged to process this data from cloud object stores [64, 65]. Commonly, most of these services are specially designed to query data from cloud object stores in an SQL-like fashion. For example, IBM Cloud SQL Query [66] is an interactive query service that makes it easy to directly analyze data on IBM Cloud Object Storage (COS) using standard SQL. This means that it is possible to store data on IBM COS and query that data as with an SQL database. This approach is similar to Amazon Athena [67], which operates over S3. Moreover, Amazon Redshift Spectrum [68] or Facebook Presto [69] can also provide interactive queries over large-scale object repositories like S3. In particular, Presto offers SQL interactive queries where all processing is in memory and pipelined across the network between stages. Usually, these services are serverless, meaning that there is no infrastructure to manage: no setup, servers, or data warehouses.

3.1.3 Discussion

Reviewing the most important cloud object stores, we can see that these systems, as standalone services, have a limited set of customizable characteristics regarding the objects' lifecycle management. In most of these services, tenants can only set expiration times for the objects. On the other hand, those solutions that offer multiple storage tiers (e.g., data archival) also allow tenants to move data between them, either manually or by means of policies. This permits changing the replication level and the replica placement of the objects. Normally, cloud object stores are lightweight services deployed on commodity hardware servers. From this fact, one can infer that adding new capabilities to the storage system may inevitably increase resource usage, which may result in an inconsistent storage operation. Computing on objects from cloud object stores has always been done in disaggregated compute clusters. However, the complexity of the traditional model, where computation takes place on physical servers or virtual machines, is clearly huge for non-expert tenants, as they are responsible for installing and maintaining the entire software stack. To add simplicity to the cloud computing area, a new computing abstraction called serverless computing has emerged recently. Serverless platforms like AWS Lambda are mainly designed for asynchronous event-driven computing tasks over disaggregated storage resources. Concretely, it is only possible to launch functions on the HTTP PUT event, that is, when an object is uploaded. For example, imagine a simple image resizing function. In this case, when an image is uploaded to S3, the upload event triggers a Lambda function that resizes the image. All of this is done in a disaggregated compute cluster, eventually storing the resulting resized image back to the object store. On the other hand, API Gateway allows proactively invoking functions. This mechanism can be useful for a number of applications. But it also incurs extra overhead, as functions are not in the read/write path from/to the object store. In contrast, in interactive query frameworks, the pipelined execution avoids unnecessary I/O and the associated latency overhead. However, these systems cannot provide generic function computations, since they are focused on a specific task.


Unfortunately, disaggregation means intensive network usage. Currently, serverless platforms do not support the interception of cloud object storage requests and their corresponding inline processing. As a result, this approach does not allow managing the complete lifecycle of objects, which is what we pursue in order to overcome the challenges of this thesis. Nevertheless, serverless computing principles and abstractions may be a good approach in our intent to scale out the policy enforcement in the data plane. Open source cloud object storage solutions obviously allow more flexibility in terms of extensibility. In addition to everything aforementioned about object lifecycle management and object processing, open source permits researchers to create frameworks that can extend the programmability of object stores like Swift. Some of these systems, such as active storage frameworks, extend the capabilities of the storage system by allowing tenants to perform computation tasks directly within the storage cluster, where the data is (see Section 3.3). Thus, we can conclude that, despite all of the computing abstractions we reviewed in this section, either on-premise computing, VMs, functions, or active storage tasks, cloud object stores, as standalone services, lack the necessary flexibility in programmability, concretely in storage and data management.

3.2 Software-defined Storage

Software-defined Storage (SDS) is nowadays the main paradigm in what refers to storage management. Similar to what happened with Software-defined Networking (SDN) [70, 71], adopting software-centric, policy-based management models for storage systems is increasingly accepted as an effective technique to reduce storage management costs [72]. For this reason, in recent years SDS has been applied to different storage substrates, such as file systems, block storage, and object stores. In this section, we study some of these SDS systems, grouping them into two categories: on the one hand, commercial SDS systems, and on the other hand, research SDS systems.

3.2.1 Commercial SDS Systems

In the industry, most systems provide a simple rule-based policy model. For instance, EMC ScaleIO [73] enables administrators to build pools of hardware-specific storage (e.g., volume p → SSD) and IBM Spectrum [74] offers to adjust the data redundancy scheme enforced on a particular storage volume (e.g., volume v → 3 replicas). Normally, these commercial products offer a graphical interface for administrators to define policies. Regarding automation, such products have focused more intensely on solving the resource automation problem in block storage, which has traditionally been one of the most time-consuming administration tasks in multi-tenant storage platforms. Among others, systems like EMC ScaleIO, VMWare vSAN [75, 76], IBM Spectrum, ECP [77], EMC ViPR [78, 79], or its open-source branch named CoprHD [80], offer advanced storage virtualization technologies that enable administrators to aggregate new hardware into virtual storage pools, create containers and volumes on top of them, and specify static data layout properties such as redundancy (e.g., geo-replication, coding). Often, the automation capabilities of these systems also encompass the configuration of the network between client VMs and the storage back-end. Many commercial SDS products also enable administrators to set up data volumes with specific services to be executed on the data. For example, VMWare vSAN can execute inline deduplication, compression, and data encryption at rest on volumes if necessary. Nutanix ECP and IBM Spectrum also incorporate data services such as data protection and configurable availability policies. To deliver such computing services on data, commercial SDS products provide a layer that implements such features and/or exploits native functionalities already built into storage appliances. On the other hand, although many SDS products integrate monitoring services for the storage system, dynamic provisioning is a less common feature. Only IBM Spectrum seems to offer features related to automatically scaling cluster resources depending on the demand and performance of workloads [81]. Other commercial systems, such as OpenIO [82], perform dynamic orchestration of computing tasks on top of the most suitable storage resources to optimize workloads. OpenIO builds a control loop that monitors the storage resources to take efficient scheduling decisions.

3.2.2 Research SDS Systems

In the research community, software-defined storage has been studied in a plethora of works. In this section, however, we review some of those systems that provide full SDS capabilities on storage clusters.


IOFlow [83] was the first research work presenting a complete SDS architecture. IOFlow enables end-to-end (e2e) policies to specify the treatment of I/O flows from VMs to shared storage. This was achieved by introducing a queuing abstraction at the data plane and translating high-level policies into queuing rules. In this sense, IOFlow provides rule-based policies to enforce potentially complex routing primitives (create/remove queues, configure token buckets) on specific storage flows at the data plane. The automation capabilities of IOFlow are mainly related to data services that can be implemented in the form of "stages". The authors demonstrate the flexibility of IOFlow with two use cases: distributed bandwidth control, which demonstrates the coordinated flow control capabilities across several stages, and a malware scanning service that requires stage I/O routing capabilities. Thus, the first use case represents an example of dynamic resource provisioning, whereas the second use case can be classified as static data service automation. IOFlow requires changes at the storage server and hypervisor level to intercept flows of I/O requests, meaning that it is not designed to be system transparent. The control plane of IOFlow is centralized but extensible, as it can accommodate new algorithms or bandwidth policies to control I/O routing across the different stages. Accordingly, the data plane is extensible, as new stages can be added and discovered by the control plane.

sRoute [84] presents an advanced platform for executing computing services on data flows based on sSwitches (an evolution of the "stages" notion in IOFlow). The control plane can enforce rule-based policies that describe forwarding rules to route I/Os from clients to the storage back-end through an arbitrary number of sSwitches. By intercepting and classifying I/O flows, sSwitches may embed data services such as customized replication (static data service automation) or tail latency control (dynamic provisioning), to name a few. The control plane of sRoute is extensible, as it can accommodate new algorithms to enforce policies, such as bandwidth control or caching [85]. Interestingly, sRoute proposes the notion of a "delegate function" as a means of distributing control rules across data plane stages, thus partially decentralizing the control plane. At the data plane, I/O flows can be intercepted at several sSwitches to perform operations on them, normally related to the classification, redirection, and prioritization of block requests without impacting client VMs. The data plane of sRoute is extensible, as new sSwitches can be integrated into the data plane following an API. In this sense, while sRoute still requires modifications at the storage stack to add new sSwitches to a target storage server (which prevents system transparency), a difference with IOFlow is that sSwitches may also reside in other servers. This means that some sSwitches do not require code changes on the target storage system, as they are added as external services where I/O flows may be routed.

Finally, Retro [86] is a framework for implementing resource management policies in multi-tenant distributed systems. Retro is not classified as SDS, although it may be, since it separates the controller from the mechanisms needed to implement it. A major contribution of Retro is the development of abstractions to enable policies that are system- and resource-agnostic. Thanks to a flexible DSL, in Retro administrators can program complex policies that achieve resource fairness or latency guarantees. Retro is focused on solving dynamic resource management problems across multiple tenants, including typical problems of storage management (bandwidth, latency, etc.). However, it is not designed to offer other storage automation or provisioning services, such as configurable data layouts or computations on storage requests (e.g., compression, encryption), commonly found in SDS systems. Retro's control plane is extensible not only with new programmable policies, but also with new resources added at the data plane. In this sense, Retro enables "control points" to be easily incorporated in a system for managing generic system resources thanks to Aspect-Oriented Programming (AOP) interception. This means that the target system does not require changes in its code and it remains oblivious to the fact that a certain resource is being orchestrated by Retro (system transparency). Moreover, this makes Retro capable of controlling different target systems (HDFS, HBase, etc.) instead of being tailored to a specific one.

3.2.3 Discussion

As we introduced in Section 2.7.1, the absence of a common definition and standard for SDS has promoted the development of diverse system designs with different conceptions and objectives [87–89]. Reviewing most commercial SDS systems, we may conclude that they are mainly focused on storage provisioning and virtualization for block-based storage systems. In general, commercial products do not provide system transparency. That is, their main objective is to make it easier for administrators to virtualize, configure, and provision storage resources and services. However, to achieve this objective, such SDS products require installing a proprietary storage stack containing the SDS functionality. Such a stack does not only involve the storage back-end; in some systems, client VMs are also required to install proprietary software to operate against the storage back-end. Regarding dynamic provisioning, only IBM Spectrum offers automatic scalability based on demand. Perhaps due to its complexity, these products are still far from incorporating approaches already existing in the literature related to SLO-based auto-scaling [90–93]. Similarly, extensibility is a property not present in the aforementioned systems, as they offer packaged software instead of a platform to develop new data services or control algorithms on top of it. On the other hand, research SDS systems are centered around the concepts of I/O interception and dynamic provisioning. Whereas commercial systems lack publicly available technical information about their internals, SDS research works provide more technical specifications, thus allowing us to better analyze the particularities of each one. In this sense, we reviewed three different SDS systems which provide policy-based storage automation, each one designed for slightly different purposes, such as I/O interception in IOFlow, or multi-tenant dynamic resource management in Retro. However, as in the case of commercial systems, these works are also focused on providing SDS capabilities for block storage devices, mainly used by virtual machines. Although the scope of this thesis differs from block-based storage, the characteristics and abstractions of these systems may provide a helpful perspective for our research challenges. To conclude, we reviewed how SDS systems provide, to a greater or lesser extent, transparency, extensibility, programmability, I/O interception, and dynamic provisioning. We think that these are all essential in a complete SDS solution, and should be present in our system. Moreover, as we already discussed, all of the previous SDS systems are mainly focused on block-based storage systems and, unfortunately, their characteristics are not directly applicable to object-based stores. In this sense, in the first contributions of this thesis we present the first multi-tenant SDS framework for cloud object stores.

3.3 Active Storage

Moving computation close to the data to benefit from data locality is nowadays a commonplace idea. In databases, stored procedures and co-processors [94–96] have standard interfaces and have already been in use in many production environments. Moreover, the early concept of the active disk [97–100], that is, an HDD with computational capacity, was borrowed by distributed file system designers in HPC environments two decades ago.


In the context of storage systems, computation close to the data has been studied as active storage. Active storage [97] has led to a plethora of research works proposing platforms for different storage systems [101–104], for example, for object-based distributed file systems and for cloud-based object stores, which we study in this section. Many of these works proved significant data transfer reductions and optimizations derived from data locality.

3.3.1 Object-based Distributed File Systems

Most of the works in active storage have been focused on the context of object-based distributed file systems, which use Object Storage Devices (OSDs) as data back-end [105, 106]. For example, [38] presented an active storage implementation integrated in the Lustre file system that provides flexible execution of code close to the data in user space. In another work [107], the authors created a prototype on top of a parallel file system (PVFS) to carry out data analytic computations as part of I/O operations. This work is focused on HPC environments, using the Message Passing Interface (MPI) [108] on the client side for communication. In [109], the authors created an active storage framework for OSDs on top of the Panasas file system, which allows running synchronous and asynchronous tasks over the objects. This framework uses VM engines to execute tasks downloaded from client applications. Regarding the self-management of data by tenants, in [110–112] the authors proposed a framework for file systems and object stores that consists of a programmable storage stack through which all files pass before being stored onto disk. This stack is extensible by adding plugins, including compression, encryption, snapshots, redundancy and integrity checking. Although the authors do not explicitly define the framework as active storage, these plugins are in fact active storage tasks, since they run close to where the data sits. To benefit from the storage stack, users can activate the plugins by means of file attributes. Attributes are set on each file and each directory to convey storage policy decisions to the storage system. Moreover, the system allows users to modify the files' attributes, thus permitting users to have full control of their data, not only during the creation, but also throughout the existence of the file.

Resource aware systems

While the research works discussed above propose active storage frameworks for different distributed file systems, a common point in all of them is the lack of resource management (CPU and memory) when running active storage tasks over the objects. They all argue that offloading computation close to the data improves the overall computation execution time, significantly reducing the data movement between clusters. However, most of these systems are not aware of the available resources in OSDs. Existing studies have neglected the impact of resource contention when the same OSD processes concurrent I/O requests, which happens frequently in practice. In this sense, Oasis [113] enables users to transparently process OSD objects, and supports different processing granularities: at the per-volume or per-object levels. Oasis can partition computation tasks between the client host and the OSD dynamically, depending upon the OSD workload, which is not a common approach in active storage research works. This feature helps prevent resource contention in OSDs. On the other hand, in [114] the authors analyzed the impact of resource contention on active storage systems. Based on their analysis, they created a framework called Dynamic Operation Scheduling Active Storage (DOSAS), which dynamically offloads the active processing operations between storage and compute nodes, according to the state of the system environment. They built the framework using the PVFS2 [115] parallel file system. Finally, by evaluating their architecture, the authors observed that resource contention is a critical problem for active storage systems.

Policy-based systems

Most of the research works in active storage are based on the request-driven model, where tasks are executed upon an object request to an OSD, intercepting the data flow of the object. In the request-driven model, active storage tasks can be executed in three different circumstances: 1. Implicit invocation, where the users are not aware of the active storage tasks enabled by a system administrator; 2. Associative invocation, where the users have to associate the objects with the desired active storage tasks, so that, upon an object request, the task is always executed; and 3. Transient invocation, where the users have to explicitly specify the active storage task to run in the object request. Although the request-driven model allows computing close to the data upon an object request, [116] introduced the concept of policy objects to extend the functionality of OSDs. In this work, the authors proposed a hybrid approach that combines the request-driven and policy-driven models. With the policy-driven model, and through the usage of policy objects with an associative approach, it is possible to specify under which OSD conditions active storage tasks are executed over the objects. This work is focused on the self-management of storage, and tasks are executed offline, like batch applications, when the conditions of the policy objects are met. A policy object is a set of conditions that can be evaluated as a Boolean value. The system allows associating a method with one or more policies, but a policy can be associated with only one method. This policy-driven model keeps the associations between policy objects and active storage tasks in a centralized table, and the system has to query this table periodically to evaluate the conditions of the policy objects.
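The sketch below captures the essence of this policy-driven model: a policy object is a conjunction of conditions evaluated to a Boolean, and a control loop periodically queries the association table to run the batch task when its policy holds. All condition and task names are hypothetical stubs, not the system described in [116].

    import time

    class PolicyObject(object):
        def __init__(self, conditions):
            self.conditions = conditions  # zero-argument predicates

        def holds(self):
            # A policy object evaluates to a single Boolean value
            return all(cond() for cond in self.conditions)

    # Hypothetical monitoring and task stubs
    def osd_cpu_usage():
        return 0.1

    def object_age_days(name):
        return 45

    def compress_object(name):
        print('compressing', name)

    # Centralized association table: one task linked to one policy object
    associations = [
        (PolicyObject([lambda: osd_cpu_usage() < 0.2,
                       lambda: object_age_days('photo.jpg') > 30]),
         lambda: compress_object('photo.jpg')),
    ]

    def control_loop(period=60, rounds=1):
        # The system periodically evaluates the conditions of the policy
        # objects and runs the associated tasks offline, like batch jobs
        for _ in range(rounds):
            for policy, task in associations:
                if policy.holds():
                    task()
            time.sleep(period)

    control_loop(period=0)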

3.3.2 Cloud Object Stores

The literature shows how active storage has been well studied in the context of object-based distributed file systems. However, it should be noted that, as we already discussed in Section 2.4.1, these distributed file systems differ from cloud object stores in how the data is accessed (for example, iSCSI vs. HTTP), and in their internal architecture. So, although they store the data as objects, they are different variants of object-based storage. In this sense, there are some research works that propose active storage frameworks for cloud object stores. In particular, we review those proposed for OpenStack Swift. On the one hand, we have the OpenStack Storlets framework. It extends Swift with the ability to run user-defined computations, called Storlets, close to the data, in a secure and isolated manner [40, 39]. A Storlet is a compiled and packaged piece of code, currently in the Java or Python programming languages, that can be uploaded to Swift as any other object. Once uploaded, Storlets can be proactively invoked over data objects, that is, the user must explicitly indicate in the object request headers which Storlet to run every time. In this sense, one single Storlet at a time is allowed to execute upon an object request. The Storlets framework intercepts the data pipeline and runs computations inline, modifying the data flow when the objects are uploaded to or downloaded from Swift. Storlets run within containers, thus providing sufficient program isolation and a guarantee that any arbitrary code will run safely without compromising the storage system. Moreover, Storlets may run both on proxy and storage nodes.
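As a brief illustration of this header-driven invocation, a client can request an object and name the Storlet to run on its data stream; the endpoint, token and Storlet name below are hypothetical placeholders, while X-Run-Storlet is the invocation header used by the framework.

    import requests

    resp = requests.get(
        'http://swift.example.com/v1/AUTH_test/images/photo.jpg',
        headers={
            'X-Auth-Token': 'placeholder-token',
            'X-Run-Storlet': 'grayscale-1.0.jar',  # uploaded Storlet object
        })

    # The body is the object's data stream, transformed inline by the
    # Storlet as it flows through the proxy or storage node
    data = resp.content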


On the other hand, there is the ZeroVM framework, a lightweight container-based virtualization platform that provides deterministic process execution and isolation [117]. ZeroVM consists of the virtualization of applications, which are then placed where the data sits, thus leveraging data locality. This provides the ability to transform or process data in situ, instead of moving data to where the application is. The main difference with Storlets is that ZeroVM applications run asynchronously, upon an explicit invocation (like stored procedures or batch tasks). In contrast, Storlets run synchronously, processing the data stream of the requested object. As in the case of Storlets, two of the key benefits of running applications in containers are: 1) ZeroVM instances can start in around 5 ms, which means that the system is very quick in the application spawning phase; and 2) ZeroVM applications are isolated and secured within containers, so it is not possible for them to interfere with the storage system.

3.3.3 Discussion

Active storage may help to improve the self-management of data by tenants of object stores. Although active storage is designed to perform computations close to the data, we can leverage the abstractions proposed by these systems to run object management techniques over the objects in a cloud object store. In the context of distributed file systems, we reviewed some works that enable active storage capabilities in object stores. However, most of them are designed to be transparent to the users, who are not aware of the tasks that are going to be applied. Other systems allow the involvement of users, who can proactively invoke the appropriate active storage tasks over the objects. Although this is a step towards the self-management of data, these tasks are focused on performing data transformations, and not on data management. In the context of cloud object stores, the Storlets and ZeroVM frameworks enable OpenStack Swift deployments to perform computation close to the data in different ways. They are a good starting point for our research, and provide a good vision of how active storage is managed in cloud object stores. However, most of these systems, both for distributed file systems and cloud object stores, follow the request-driven model, where active storage tasks are executed upon an object request, or by proactively invoking the task itself. Despite this, we found one research work where active storage tasks are executed when some conditions are met (policy-driven) [116]. Although the authors propose a batch-oriented model for self-managing an OSD-based storage system, its abstractions may provide a good perspective on how to overcome the research challenge of adding self-management support for tenants.


Nevertheless, one drawback of these systems is that they can inevitably produce resource contention in the storage cluster. In this sense, distributed object-based file systems and cloud object stores follow similar approaches. For example, ZeroVM internally determines which machines in the cluster contain a replicated copy of an object and then randomly chooses one to execute a ZeroVM process. On the other hand, in OpenStack Storlets, a Storlet application is executed on one of the servers that contain a copy of the object, or alternatively, on a proxy server. While all of these systems work well as proofs-of-concept for data-local computing, it is quite possible for active storage tasks to be spread unequally across the storage cluster, resulting in an inefficient use of the available compute resources and, even worse, producing resource contention against the storage service. In the challenge of scaling out policy enforcement in the data plane, we pursue solving these problems. In this sense, we reviewed some interesting related research works that provide different ways of facing resource contention [113, 114]. These works tried to solve resource contention issues by limiting the execution of active storage tasks, or by delegating them to external services. They advocate running tasks at the server side and/or at the client side, depending on the state of the storage cluster. Even so, they cannot solve the scalability problem that makes these systems less attractive for elastic cloud settings. By analyzing the cloud object storage catalog, we can see that cloud providers do not offer active storage capabilities in their cloud object storage services. At this point, one may infer that active storage has some inherent downsides that preclude its general applicability in the Cloud: 1. Compute tasks are limited by the available resources at storage nodes, which hinders scalability; and 2. It requires resource management to limit access to scarce or sensitive resources at storage nodes. As a consequence, it is more than likely that these problems regarding scalability and resource contention have prevented cloud providers from adopting active storage techniques in their object storage services. They advocate instead for the disaggregation of compute from storage, which provides simplicity and better scalability. Moreover, the fact that these services are proprietary solutions has prevented researchers from investigating active storage solutions for them. These final reflections provide a good perspective on how we have to overcome the challenges of this thesis. Especially in the third contribution, since we want to provide compute capabilities within the storage infrastructure while mitigating scalability and resource contention problems.


Chapter 4

Extending Multi-tenant Management

4.1 Introduction

Despite their growing popularity, cloud object stores are not well prepared for heterogeneity. Usually, they use static configurations without distinction between tenants with time-varying requirements [119, 120]. This inevitably causes all tenants to share the same service level, even when the workloads from different applications can dynamically vary over time. For example, the storage requirements of a social network can dramatically differ from those of a big data analytics application. While a social network, such as Instagram, stores large amounts of small- to medium-sized photos and videos (KB- to MB-sized objects), a big data analytics framework would probably be a read-intensive application for large files (GB-sized objects). But not only this; beyond the particular needs of a type of workload, the requirements of applications can also vary greatly. For example, an archival application may require transparent compression, annotation, and encryption of the archived data. In contrast, a Big Data analytics application may benefit from the computational resources of the object store infrastructure to eliminate data movement and enable in-place analytics capabilities [121, 119], thus benefiting from the data placement. Supporting such a variety of requirements in a cloud object store is challenging

The results presented in this chapter are published in [118]


because, in current systems, custom functionalities are normally hard-coded into the storage system implementation due to the absence of a true programmable layer, making it difficult to manage and maintain the system as it evolves.

4.1.1 Scope and Challenges

In this chapter, we argue that Software-defined Storage (SDS) is a compelling solution to these problems. As in SDN, the separation of the “data plane” from the “control plane” is the best-known principle in SDS [83, 122, 85, 123, 84]. Such separation of concerns is the cornerstone of supporting heterogeneous applications in data centers. However, the application of SDS fundamentals to cloud object stores is not trivial. Among other things, it needs to address two main challenges: A flexible control plane. The control plane should be the key enabler that makes it possible to support multiple applications separately using dynamically configurable functionalities. Since the de facto way of expressing management requirements and objectives in SDS is via policies, they should also dictate the management rules for the different tenants in a shared object store. This is not easy since policies can be very distinct. They can be as simple as a calculation on an object, such as compression, or as complex as the distributed enforcement of per-tenant I/O bandwidth limits. Further, as a singular attribute of object storage, such policies have to express objectives and management rules at the tenant, container and object granularities, which requires a largely distinct form of policy translation into the data plane compared with prior work [83, 85, 84]. Identifying the necessary abstractions to concisely define the management policies is not enough. If the system evolves over time, the control plane should be flexible enough to properly describe the new application needs in the policies. An extensible data plane. Although the controller in all SDS systems is assumed to be easy to extend [83, 85, 84], data plane extensibility must be significantly richer for object storage; for instance, it must enable performing “on the fly” computations as objects arrive at and depart from the system to support application-specific functions like sanitization, Extract-Transform-Load (ETL) operations, caching, etc. This entails the implementation of a lightweight, yet versatile computing layer, which does not exist today in SDS systems. Building up an extensible data plane is challenging. On the one hand, it requires new abstractions that enable policies to be succinctly expressed.


On the other hand, these abstractions need to be flexible enough to handle heterogeneous requirements, that is, from resource management to simple automation, which is not trivial to realize.

4.1.2 Contributions

To overcome the rigidity of object stores we present Crystal: the first SDS architecture for object storage to efficiently support multi-tenancy and heterogeneous applications with evolving requirements. Crystal achieves this by separating policies from implementation and unifying an extensible data plane with a logically centralized controller. As a result, Crystal allows the system to be dynamically adapted to the needs of specific applications, tenants and workloads. We highlight two aspects of Crystal, though it has other assets. First, Crystal presents an extensible architecture that unifies individual models for each type of resource and transformation on data. For instance, global control of a resource such as I/O bandwidth can be easily incorporated as a small piece of code. A dynamic management policy like this is materialized in the form of a distributed, supervised controller, which is the Crystal abstraction that enables the addition of new control algorithms (Section 4.5.2). In particular, these controllers, which are deployable at runtime, can be fed with pluggable per-workflow or resource metrics. Examples of metrics are the number of I/O operations per second and the bandwidth usage. An interesting property of Crystal is that it can even use object metadata to better drive the system towards the specified objectives. Second, Crystal’s data plane abstracts the complexity of individual models for resources and computations through the filter abstraction. A filter is a piece of programming logic that can be injected into the data plane to perform custom calculations on object requests. Crystal offers a filter framework that enables the deployment and execution of general computations on objects and groups of objects. For instance, it permits the pipelining of several actions on the same object(s), similar to stream processing frameworks [124]. Consequently, practitioners and systems developers only need to focus on the development of storage filters, as their deployment and execution is done transparently by the system (Section 4.5.1). To our knowledge, no previous SDS system offers such a computational layer to act on resources and data. We evaluate the design principles of Crystal by implementing two use cases on top of OpenStack Swift:


One that demonstrates the automation capabilities of Crystal, and another that enforces I/O bandwidth limits in a multi-tenant scenario. These use cases demonstrate the feasibility and extensibility of Crystal’s design. The experiments with real workloads and benchmarks are run on a 13-machine cluster. Our experiments reveal that policies help to overcome the rigidity of object stores while incurring small overhead. Also, defining the right policies may yield performance and cost benefits to the system.

4.2 Crystal Design

Crystal seeks to efficiently handle workload heterogeneity and applications with evolving requirements in a shared object store. To achieve this, Crystal separates high-level policies from the mechanisms that implement them at the data plane, to avoid hard-coding the policies in the system itself. To do so, it uses three abstractions: filter, inspection trigger, and controller, in addition to policies.

4.2.1 Abstractions in Crystal

Filter. It is a piece of code that a system administrator can inject into the data plane to perform custom computations inline, on incoming object requests. In Crystal, this concept is broad enough to include computations on object contents (e.g., compression, encryption), data management like caching or pre-fetching, and even resource management such as bandwidth differentiation (Fig. 4.1). A key feature of filters is that the instrumented system is oblivious to their execution and needs no modification to its implementation code to support them.

Fig. 4.1 Structure of the Crystal DSL. The general form is FOR [TARGET] WHEN [TRIGGER CLAUSE] DO [ACTION CLAUSE]; example policies:
P1: FOR TENANT T1 WHEN OBJECT_TYPE=DOCS DO SET COMPRESSION WITH TYPE=LZ4, SET ENCRYPTION (storage automation policy)
P2: FOR CONTAINER C1 WHEN GETS_SEC > 5 AND OBJECT_SIZE < 10M DO SET CACHING ON PROXY TRANSIENT (data management policy)
P3: FOR TENANT T2 DO SET BANDWIDTH WITH GET_BW=30MBps (globally coordinated resource management policy)

Inspection trigger. This abstraction represents information accrued from the system to automate the execution of filters. There are two types of information sources.


The first type corresponds to the real-time metrics obtained from the running workloads, like the number of GET operations per second on a data container or the I/O bandwidth allocated to a tenant. As with filters, a fundamental feature of workload metrics is that they can be deployed at runtime. The second type of source is the metadata from the objects themselves. Such metadata is typically associated with read and write requests and includes properties like the size or type of objects.

Controller. In Crystal, a controller represents an algorithm that manages the behavior of the data plane based on monitoring metrics. A controller may contain a simple rule to automate the execution of a filter, or a complex algorithm requiring global visibility of the cluster to control a filter’s execution under multi-tenancy. Crystal builds a logically centralized control plane formed by supervised and distributed controllers. This allows an administrator to easily deploy, on the fly, new controllers that cope with the requirements of new applications.

Policy. Our policies should be extensible to really allow the system to satisfy evolving requirements. This means that the structure of policies must facilitate the incorporation of new filters, triggers and controllers. To succinctly express policies, Crystal abides by a structure similar to that of the popular If-This-Then-That (IFTTT) service [125]. This service allows users to express small rule-based programs, called “recipes”, using triggers and actions. For example, a simple compression policy could be set as in Listing 4.1.

Listing 4.1 Example of a trigger, action and recipe

TRIGGER: compressibility of an object is > 50%
ACTION: compress
RECIPE: IF compressibility is > 50% THEN compress

An IFTTT-like language can reflect the extensibility capabilities of the I/O system; at the data plane, triggers and actions are translated into our inspection triggers and filters, respectively. At the control plane, a policy is a “recipe” that guides the behavior of control algorithms. Such an apparently simple policy structure can express different policy types. On the one hand, Fig. 4.1 shows storage automation policies that enforce a filter either statically or dynamically based on simple rules; for instance, P1 enforces compression and encryption on document objects of tenant T1, whereas P2 applies data caching on small objects of container C1 when the number of GETs/second is > 5.


On the other hand, such policies can also express objectives to be achieved by controllers requiring global visibility and coordination capabilities at the data plane. That is, P3 tells a controller to provide at least 30MBps of aggregated GET bandwidth to tenant T2 under a multi-tenant workload.

4.2.2 System Architecture

Fig. 4.2 presents Crystal’s architecture, which consists of a control plane and a data plane.

Fig. 4.2 High-level overview of Crystal’s architecture on top of OpenStack Swift (control plane: DSL policies, controllers, workload metric processes, the Crystal API, a messaging service and the metadata layer; data plane: the metrics middleware and filter framework at the Swift proxy and storage nodes, exchanging object requests, monitoring events and metadata flows)

Control Plane. Crystal provides administrators with a system-agnostic Domain-specific Language (DSL) to define SDS services via high-level policies. The DSL “vocabulary” can be extended at runtime with new filters and inspection triggers. The control plane includes an API to compile policies and to manage the lifecycle and metadata of controllers, filters and metrics (see Table 4.1). Moreover, the control plane is built upon a distributed model.


Although logically centralized, the controller is split into a set of autonomous micro-services, each running a separate control algorithm. Other micro-services, called workload metrics, close the control loop by exposing monitoring information from the data plane to controllers. The control loop is also extensible, given that both controllers and workload metrics can be deployed at runtime. Data Plane. Crystal’s data plane has two core extension points: inspection triggers and filters. First, a developer can deploy new workload metrics at the data plane to feed distributed controllers with new runtime information about the system. The metrics framework runs the code of metrics and publishes monitoring events to the messaging service. Second, data plane programmability and extensibility is delivered through the filter framework, which intercepts object flows in a transparent manner and runs computations on them. A developer integrating a new filter only needs to contribute the logic; the deployment and execution of the filter is managed by Crystal.

4.3 Control Plane

The control plane allows writing policies that adapt the data plane to manage multi-tenant workloads. It is formed by the DSL, the API, and the distributed controllers.

4.3.1 Crystal DSL

Crystal’s DSL hides the complexity of low-level policy enforcement, thus achieving simplified storage administration (Fig. 4.1). Its structure contains: Target: The target of a policy represents the recipient of a policy’s action, and it is mandatory to specify it in every policy definition. To meet the specific needs of object storage, targets can be tenants and containers. This enables high management and administration flexibility. Trigger clause (optional): Dynamic storage automation policies are characterized by the trigger clause. A policy may have one or more trigger clauses, separated by AND/OR operands, that specify the workload-based situation that will trigger the enforcement of a filter on the target. Trigger clauses consist of inspection triggers, operands (e.g., >, <, =) and values. The DSL exposes both types of inspection triggers: workload metrics (e.g., GETS_SEC) and request metadata (e.g., OBJECT_SIZE<512).


Action clause: The action clause of a policy defines how a filter should be executed on an object request once the policy takes place. The action clause may accept parameters after the WITH keyword, in the form of key/value pairs that are passed as input to customize the filter execution. Retaking the example of a compression filter, we may decide to enforce compression using a gzip or an lz4 engine, and even select the compression level. To cope with object stores formed by proxies/storage nodes like Swift, our DSL enables explicit control of the execution stage of a filter with the ON keyword. Also, dynamic storage automation policies can be persistent or transient; a persistent action means that once the policy is triggered the filter enforcement remains indefinitely (the default), whereas actions to be executed only during the period in which the condition is satisfied are transient (keyword TRANSIENT, P2 in Fig. 4.1).

The vocabulary of our DSL can be extended on-the-fly to accommodate new filters and inspection triggers. That is, in Fig. 4.1 we can use keywords COMPRESSION and DOCS in P1 once we associate “COMPRESSION” with a given filter implementation and “DOCS” with some file extensions, respectively. Moreover, the Crystal DSL has other features:

1. Specialization of policies based on the target scope, so that if several policies apply to the same request, only the most specific one is executed (e.g., a container-level policy is more specific than a tenant-level one).

2. Pipelining several filters on a single request (e.g., compression + encryption) ordered as they are defined in the policy, similar to stream processing frameworks [124].

3. Grouping, which enables enforcing a single policy on a group of targets; that is, we can create a group like WEB_CONTAINERS to represent all the containers that serve Web pages.

Table 4.1 shows the available API calls to Crystal, which include calls to manage the Crystal controller, the filter framework, and the workload metrics. In this sense, Crystal offers a DSL compilation service via API calls. Crystal compiles simple automation policies as target→filter relationships at the metadata layer (a sketch of this compilation step follows Table 4.1). Next, we show how dynamic policies, with a WHEN clause, use controllers to enforce filters.


Crystal Controller:
• add_policy / delete_policy / list_policies — Policy management API calls. For storage automation policies, the add_policy call can either directly enforce the filter or deploy a controller to do so. For globally coordinated policies, the call sets an objective at the metadata layer.
• register_keyword / delete_keyword — Calls that interact with the Crystal registry to associate DSL keywords with filters or inspection triggers, or to coin new terms to be used as trigger conditions (e.g., DOCS).
• deploy_controller / kill_controller — Calls used to manage the lifecycle of distributed controllers and workload metric processes in the system.

Filter Framework:
• deploy_filter / undeploy_filter / list_filters — Calls for deploying, undeploying and listing filters associated with a target. The deploy/undeploy_filter calls interact with the filter framework at the data plane to enable/disable filter binaries to be executed on a specific target.
• update_slo / list_slo / delete_slo — Calls to manage “tenant objectives” for coordinated resource management filters. For instance, bandwidth differentiation controllers take this information as input in order to provide an aggregated I/O bandwidth share at the data plane.

Workload Metrics:
• deploy_metric / delete_metric — Calls for managing workload metrics at the data plane. These calls also manage the workload metric processes that expose data plane metrics to the control plane.

*For the sake of simplicity, we do not include call parameters in this table.

Table 4.1 Main calls of the Crystal controller, filter framework and workload metrics management APIs
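As referenced above, the following minimal Python sketch illustrates how a simple automation policy may be compiled into a target→filter relationship and later resolved per request. The function and key names (compile_policy, pipeline:<target>) are hypothetical, and a plain dictionary stands in for the Redis-backed metadata layer of the actual prototype (see Section 4.6).

# Minimal sketch of compiling a static automation policy into a
# target -> filter relationship, in the spirit of Crystal's metadata layer.
# A dict stands in for the Redis store used by the prototype.
metadata_store = {}

def compile_policy(target, filter_name, params=None, stage="proxy"):
    """Persist a 'FOR <target> DO SET <filter>' policy as metadata."""
    key = f"pipeline:{target}"  # e.g., 'pipeline:T1' or 'pipeline:T1/C1'
    metadata_store.setdefault(key, []).append({
        "filter": filter_name,
        "params": params or {},
        "stage": stage,  # execution point: 'proxy' or 'storage_node'
    })

def filters_for_request(tenant, container):
    """Resolve the pipeline; a container-scope policy wins over tenant scope."""
    return (metadata_store.get(f"pipeline:{tenant}/{container}")
            or metadata_store.get(f"pipeline:{tenant}", []))

# Example: P1 from Fig. 4.1 compiled for tenant T1.
compile_policy("T1", "compression", {"type": "lz4"})
compile_policy("T1", "encryption", stage="storage_node")
print(filters_for_request("T1", "C1"))  # -> compression, then encryption

Note that the lookup function also captures the policy specialization rule of our DSL: a pipeline registered at container scope shadows the tenant-scope one.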

4.3.2 Distributed Controllers

Crystal resorts to distributed controllers, in the form of micro-services, which can be deployed in the system at runtime to extend the control plane [126–128]. We offer two types of controllers: automation and global controllers.


On the one hand, the Crystal DSL compiles dynamic storage automation policies into automation controllers (e.g., P2 in Fig. 4.1). Their lifecycle consists of consuming the appropriate monitoring metrics and interacting with the filter framework API to enforce a filter when the trigger clause is satisfied. On the other hand, global controllers are not generated by the DSL; instead, by simply extending a base class and overriding its computeAssignments method, developers can deploy controllers that contain complex algorithms with global visibility and continuous control of a filter at the data plane (e.g., P3 in Fig. 4.1). To this end, the base global controller class encapsulates the logic i) to ingest monitoring events, ii) to disseminate the computed assignments across nodes¹, and iii) to get the Service-Level Objectives (SLOs) to be enforced from the metadata layer (see Table 4.1). This allowed us to deploy distributed I/O bandwidth control algorithms (Section 4.5).
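As a rough illustration of this extension point, the sketch below subclasses a hypothetical base global controller and overrides computeAssignments. The base-class internals shown here (the SLO values, the capacity figure) are illustrative assumptions rather than the actual Crystal API; the real base class also ingests monitoring events and disseminates the returned assignments via the messaging service.

# Hypothetical sketch of a global controller: only computeAssignments is
# overridden by the developer.
class BaseGlobalController:
    def get_slos(self):
        # Assumed helper: read per-tenant SLOs (MBps) from the metadata layer.
        return {"T1": 70, "T2": 20, "T3": 5}

    def computeAssignments(self, info):
        raise NotImplementedError

class SimpleBandwidthController(BaseGlobalController):
    def computeAssignments(self, info):
        """info maps tenant -> number of ongoing transfers (simplified)."""
        slos = self.get_slos()
        capacity = 125  # assumed aggregate capacity in MBps (1GbE link)
        spare = max(0, capacity - sum(slos.values()))
        share = spare / len(info) if info else 0
        # Each active tenant gets its guaranteed SLO plus an equal spare share.
        return {t: slos.get(t, 0) + share for t in info}

controller = SimpleBandwidthController()
print(controller.computeAssignments({"T1": 2, "T2": 1, "T3": 1}))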

Fig. 4.3 Interactions among automation controllers, workload metric processes and the filter framework

Extensible control loop: To close the control loop, workload metric processes are micro-services that provide controllers with monitoring information from the data plane. While running, a workload metric process consumes and aggregates events from one workload metric at the data plane.

¹For efficiency reasons, global controllers disseminate assignments to data plane filters also via the messaging service.


For the sake of simplicity [128], we advocate separating workload metrics not only per metric type, but also by target granularity. Controllers and workload metric processes interact in a publish/subscribe fashion [129]. For instance, Fig. 4.3 shows that, once initialized, an automation controller subscribes to the appropriate workload metric process, taking into account the target granularity. The subscription request of a controller specifies the target in which it is interested, such as tenant T1 or container C1; this ensures that controllers do not receive unnecessary monitoring information from other targets. Once the workload metric process receives the subscription request, it adds the controller to its observer list. Periodically, it notifies the activity of the different targets to the interested controllers, which may trigger the execution of filters.
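The following minimal sketch captures this publish/subscribe interaction; class and method names are hypothetical, and a plain callback replaces the filter framework API that a real automation controller would invoke.

# Sketch of the extensible control loop: a workload metric process keeps an
# observer list of controllers, each subscribed to a single target, and
# periodically notifies them with the activity of that target only.
class WorkloadMetricProcess:
    def __init__(self, metric_name):
        self.metric_name = metric_name
        self.observers = {}  # target (e.g., 'T1', 'C1') -> [controllers]

    def subscribe(self, target, controller):
        self.observers.setdefault(target, []).append(controller)

    def publish(self, target, value):
        # Controllers never see monitoring noise from other targets.
        for controller in self.observers.get(target, []):
            controller.notify(self.metric_name, target, value)

class AutomationController:
    """Enforces a filter when its trigger clause is met (e.g., GETS_SEC > 5)."""
    def __init__(self, target, threshold, enforce):
        self.target, self.threshold, self.enforce = target, threshold, enforce

    def notify(self, metric, target, value):
        if value > self.threshold:
            self.enforce(target)  # would call the filter framework API

metric = WorkloadMetricProcess("GETS_SEC")
ctrl = AutomationController("C1", 5, lambda t: print(f"SET CACHING on {t}"))
metric.subscribe("C1", ctrl)
metric.publish("C1", 7)  # trigger clause satisfied -> filter enforced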

4.4 Data Plane

At the data plane, we offer two main extension hooks: inspection triggers and a filter framework.

4.4.1 Inspection Triggers

Inspection triggers enable controllers to dynamically respond to workload changes in real time. Specifically, we consider two types of introspective information sources: object metadata and monitoring metrics. First, some object requests embed semantic information related to the object at hand in the form of metadata. Crystal enables administrators to enforce storage filters based on such metadata. Concretely, our filter framework middleware (see Section 4.4.2) is capable of analyzing the HTTP metadata of object requests at runtime to execute filters based on the object size or file type, among others. Second, Crystal builds a metrics middleware to add new workload metrics on the fly. At the data plane, a workload metric is a piece of code that accounts for a particular aspect of the system operation and publishes that information. In our design, a new workload metric can inject events into the monitoring service without interfering with existing ones (Table 4.1). Our metrics framework allows developers to plug in metrics that inspect both the type of requests and their contents (e.g., compressibility [130]). We provide the logic to abstract developers from the complexity of request interception and event publishing.
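As an illustration of what such a workload metric may look like, the sketch below counts GET requests per target and periodically publishes an aggregate event. The prototype publishes via RabbitMQ (Section 4.6); here a plain callable stands in for the broker so the sketch stays self-contained, and all names are hypothetical.

import time
from collections import Counter

# Sketch of a data-plane workload metric: the metrics middleware would call
# on_request() for every intercepted request; aggregates are flushed to the
# monitoring service once per interval.
class GetsPerSecondMetric:
    def __init__(self, publish, interval=1.0):
        self.publish = publish          # callable standing in for the broker
        self.interval = interval
        self.counts = Counter()
        self.last_flush = time.monotonic()

    def on_request(self, method, target):
        if method == "GET":
            self.counts[target] += 1
        if time.monotonic() - self.last_flush >= self.interval:
            for tgt, n in self.counts.items():
                self.publish({"metric": "GETS_SEC", "target": tgt,
                              "value": n / self.interval})
            self.counts.clear()
            self.last_flush = time.monotonic()

metric = GetsPerSecondMetric(publish=print)
for _ in range(3):
    metric.on_request("GET", "C1")
time.sleep(1.1)
metric.on_request("GET", "C1")  # window elapsed -> event published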


4.4.2 Filter Framework

The Crystal filter framework enables developers to deploy and run general-purpose code on object requests. Crystal borrows ideas from the active storage literature [97, 38] as a means of building filters to enforce policies. Our framework achieves flexible execution of filters. First, it enables easy pipelining of several filters on a single storage request. Currently, the execution order of filters is set explicitly by the administrator, although filter metadata can be exploited to avoid conflicting filter ordering errors [131]. Second, to deal with object stores composed of proxies/storage nodes, Crystal allows administrators to define the execution point of a filter. To this end, the Crystal filter framework consists of i) a filter middleware, and ii) filter execution environments.

Filter middleware: Our filter middleware intercepts data streams and classifies incoming requests. Upon a new object request, the middleware at the proxy performs a single metadata request to infer the filters to be executed on that request depending on the target. If the target has associated filters, the filter middleware sets the appropriate metadata headers in the request to trigger the execution of filters through the read/write path. Filters such as compression, which change the content of data objects, may receive special treatment. If we create a filter with the reverse flag enabled, the execution of the filter when the object was stored is always undone upon a GET request. This means that we may activate data compression during certain periods, but tenants will always download decompressed objects. To this end, prior to storing an object, we tag it with extended metadata that keeps track of the enforced filters with the reverse flag set. Upon a GET request, the filter middleware fetches such metadata from the object itself to trigger the reverse transformations on it prior to the execution of regular filters.
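The reverse-flag mechanism can be pictured with the following minimal sketch, assuming a single compression filter; the metadata key and function names are hypothetical, and gzip merely stands in for the pluggable filter logic.

import gzip

# name -> (apply on PUT, undo on GET, reverse flag)
FILTERS = {"compression": (gzip.compress, gzip.decompress, True)}

def on_put(data, pipeline):
    """Apply the pipeline and record reversible filters in extended metadata."""
    applied = []
    for name in pipeline:
        apply_fn, _, reverse = FILTERS[name]
        data = apply_fn(data)
        if reverse:
            applied.append(name)
    return data, {"x-crystal-reverse": ",".join(applied)}

def on_get(data, obj_metadata):
    """Undo reversible filters (in reverse order) before regular filters run."""
    tagged = [n for n in obj_metadata.get("x-crystal-reverse", "").split(",") if n]
    for name in reversed(tagged):
        _, undo_fn, _ = FILTERS[name]
        data = undo_fn(data)  # tenants always download the original content
    return data

body, meta = on_put(b"highly redundant data" * 10, ["compression"])
assert on_get(body, meta) == b"highly redundant data" * 10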

Filter execution environments: Currently, our middleware can support two filter execution environments:

• Isolated filter execution: Crystal provides an isolated filter execution environment to perform general-purpose computations on object streams with high security guarantees. To this end, we extended the Storlets framework [39] with pipelining and stage execution control functionalities.


Storlets provide Swift with the capability to run computations close to the data in a secure and isolated manner, making use of Docker containers [132]. Invoking a Storlet on a data object is done in an isolated manner so that the data accessible by the computation is only the object’s data and its user metadata. Also, a Docker container only runs filters of a single tenant.

• Native filter execution: The isolated filter execution environment trades off communication capabilities and interception flexibility for higher security. For this reason, we also contribute an alternative way to intercept and execute code natively. As with Storlets, a developer can deploy code modules as native filters at runtime by following simple implementation guidelines (a minimal skeleton is sketched below). However, native filters can i) execute code at all the possible points of a request’s lifecycle, and ii) communicate with external components (e.g., the metadata layer), as well as access storage devices (e.g., SSDs). As Crystal is devised to execute trusted code from administrators, this environment represents a more flexible alternative.
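A minimal skeleton of such a native filter module might look as follows; the hook names (on_request, on_response) and the cache example are illustrative assumptions, not the prototype’s actual interface.

# Hypothetical skeleton of a native filter module loaded at runtime.
class NativeFilter:
    def __init__(self, params):
        self.params = params  # key/value pairs from the policy's WITH clause

    def on_request(self, request):
        # Runs natively in the request path, so it may talk to external
        # components (e.g., the metadata layer) or local devices (e.g., SSD).
        return request

    def on_response(self, response):
        return response

class CacheLookupFilter(NativeFilter):
    """Illustrative native filter: serve GETs from a local SSD-backed cache."""
    cache = {}  # stand-in for an SSD-backed object cache

    def on_request(self, request):
        if request["method"] == "GET" and request["path"] in self.cache:
            request["cached_body"] = self.cache[request["path"]]
        return request

f = CacheLookupFilter(params={})
print(f.on_request({"method": "GET", "path": "/T2/C1/obj"}))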

4.5 Extending Crystal

In this section we describe the benefits of Crystal’s design by extending the system with new data management filters, and a novel distributed control of I/O bandwidth for OpenStack Swift.

4.5.1 New Storage Automation Policies

Goal: To define policies that enforce filters, like compression, encryption or caching, even dynamically:

Listing 4.2 Example of DSL policies

P1: FOR TENANT T1 WHEN OBJECT_TYPE=DOCS DO SET COMPRESSION ON PROXY, SET ENCRYPTION ON STORAGE_NODE
P2: FOR CONTAINER C1 WHEN GETS_SEC > 5 DO SET CACHING

Data plane (Filters): To enable such storage automation policies, we first need to develop the filters at the data plane. In Crystal this can be done using either the native or the isolated execution environment. The next code snippet shows how to develop a filter for our isolated execution environment.


A system developer only needs to create a class that implements an interface (IStorlet), providing the actual data transformations on the object request streams (iStream, oStream) inside the invoke method. To wit, we implemented the compression (gzip engine) and encryption (AES-256) filters using Storlets, whereas the caching filter exploits SSD drives at proxies via our native execution environment. Then, once these filters were developed, we installed them via the Crystal filter framework API.

Listing 4.3 Java Storlet skeleton

public class StorletName implements IStorlet {
    public void invoke(ArrayList<StorletInputStream> inputStream,
                       ArrayList<StorletOutputStream> outputStream,
                       Map<String, String> parameters, StorletLogger log)
            throws StorletException {
        // Develop filter logic here
    }
}

Data plane (Monitoring): Via the Crystal API (see Table 4.1), we deployed metrics that capture various workload aspects (e.g., PUTs/GETs per second of a tenant) to satisfy policies like P2. Similarly, we deployed the corresponding workload metric processes (one per metric and target granularity) that aggregate such monitoring information to be published to controllers. Also, our filter framework middleware is already capable of enforcing filters based on object metadata, such as object size (OBJECT_SIZE) and type (OBJECT_TYPE). Control Plane: Finally, we registered intuitive keywords for both filters and workload metrics at the metadata layer (e.g., CACHING, GET_SEC_TENANT) using the Crystal registry API. To achieve P1, we also registered the keyword DOCS, which contains the file extensions of common documents (e.g., .pdf, .doc). At this point, we can use such keywords in our DSL to design new storage policies.
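For illustration, registering such vocabulary through the API of Table 4.1 may look like the sketch below. The endpoint path, port and payload format are assumptions made for this example; only the call name (register_keyword) comes from Table 4.1.

import json
from urllib import request as urlrequest

API = "http://crystal-controller:9000"  # assumed address of the Django API

def register_keyword(keyword, kind, definition):
    """Hypothetical wrapper around the register_keyword call of Table 4.1."""
    payload = json.dumps({"keyword": keyword, "kind": kind,
                          "definition": definition}).encode()
    req = urlrequest.Request(f"{API}/registry/keywords", data=payload,
                             headers={"Content-Type": "application/json"})
    return urlrequest.urlopen(req)

# 'CACHING' names a filter implementation; 'DOCS' groups common document
# file extensions so it can be used in trigger clauses (OBJECT_TYPE=DOCS).
# register_keyword("CACHING", "filter", "caching_filter")
# register_keyword("DOCS", "object_type", [".pdf", ".doc", ".docx"])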

4.5.2 Global Management of IO Bandwidth

Goal: To provide Crystal with means of defining policies that enforce a global I/O bandwidth SLO on GETs/PUTs:

Listing 4.4 Example of a DSL bandwidth policy

P3: FOR TENANT T1 DO SET BANDWIDTH WITH GET_BW=30MBps


Data plane (Filter): To achieve global bandwidth SLOs on targets, we first need to locally control the bandwidth of object requests. Intuitively, bandwidth control in Swift may be performed at the proxy or storage node stages. At the proxy level this task may be simpler, as fewer nodes need to be coordinated. However, this approach is agnostic to the background tasks (e.g., replication) executed by storage nodes, which impact performance [86]. We implemented a native bandwidth control filter that enables enforcement at both stages. Our filter dynamically creates threads that serve and control the bandwidth allocation of individual tenants, either at proxies or storage nodes, and garbage-collects control threads that are inactive for a certain timeout. Moreover, it has a consumer process that receives bandwidth assignments from a controller to be enforced on a tenant’s object streams. Once the consumer receives a new event, it propagates the assignments to the filter, where they immediately take effect on current transfers. Data plane (Monitoring): To build the control loop, our bandwidth control service integrates individual monitoring metrics per type of traffic (i.e., GET, PUT, REPLICATION); this makes it possible to define policies for each type of traffic if needed. In essence, monitoring events contain a data structure that represents the bandwidth share that tenants exhibit at proxies or per storage node disk. We also deployed workload metric processes to expose these events to controllers. Control plane: We deployed Algorithm 1 as a global controller to orchestrate our bandwidth differentiation filter. Concretely, we aim at satisfying three main requirements: i) a minimum bandwidth share per tenant, ii) work-conservation (do not leave resources idle), and iii) equal shares of spare bandwidth across tenants. The challenge is to meet these requirements considering that we control neither the data access patterns of tenants nor the data layout of Swift [133, 134]. To this end, Algorithm 1 works in three stages. First, the algorithm tries to ensure the SLOs of the tenants specified in the metadata layer by resorting to function minSLO (requirement 1, line 5). Essentially, minSLO first assigns a proportional bandwidth share to tenants with guaranteed bandwidth. Note that such assignment is done in descending order based on the number of parallel transfers per tenant, given that tenants with fewer transfers have fewer opportunities to meet their SLOs. Moreover, minSLO checks whether there exist overloaded nodes in the system.


In the affirmative case, the algorithm tries to reallocate bandwidth of tenants with multiple transfers from overloaded nodes to idle ones. If no reallocation is possible, the algorithm reduces the bandwidth share of tenants with SLOs on overloaded nodes.

Algorithm 1 computeAssignments pseudo-code embedded into a bandwidth differentiation controller
1: function computeAssignments(info):
2:     ◃ Retrieve the defined tenant SLOs from the metadata layer
3:     SLOs ← getMetadataStoreSLOs();
4:     ◃ Compute assignments on current tenant transfers to meet SLOs
5:     SLOAssignments ← minSLO(info, SLOs);
6:     ◃ Estimate spare bw at proxies/storage nodes based on current usage
7:     spareBw ← min(spareBwProxies(SLOAssignments), spareBwSN(SLOAssignments));
8:     spareBwSLOs ← {};
9:     ◃ Distribute spare bandwidth equally across all tenants
10:    for tenant in info do
11:        spareBwSLOs[tenant] ← spareBw / numTenants(info);
12:    end for
13:
14:    ◃ Calculate assignments to achieve spare bw shares for tenants
15:    spareAssignments ← spareSLO(SLOAssignments, spareBwSLOs);
16:    ◃ Combine SLO and spare bw assignments on tenants
17:    return SLOAssignments ∪ spareAssignments;
18:
19: end function

Second, once Algorithm 1 has calculated the assignments for tenants with SLOs, it estimates the spare bandwidth available to achieve full utilization of the cluster. Note that the notion of spare bandwidth depends on the cluster at hand, as the bottleneck may be either at the proxies or at the storage nodes. Algorithm 1 builds a new assignment data structure in which the spare bandwidth is equally assigned to all tenants. The algorithm proceeds by calling function spareSLO to calculate the spare bandwidth assignments. Note that spareSLO receives the SLOAssignments data structure, which keeps the node bandwidth already reserved according to the SLO tenant assignments. The algorithm outputs the combination of SLO and spare bandwidth assignments per tenant. While more complex algorithms can be deployed in Crystal [135], our goal in Algorithm 1 is to offer an attractive simplicity/effectiveness trade-off, validating our bandwidth differentiation framework.
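On the enforcement side, the per-tenant control threads of our bandwidth filter can be pictured as a simple pacing loop over an object stream. The sketch below is a deliberate simplification (sleep-based pacing rather than the prototype’s thread machinery), and the class and method names are hypothetical.

import time

# Sketch of local, per-tenant bandwidth control on an object stream. The
# consumer process would call update() whenever the global controller
# publishes a new assignment, taking effect on current transfers.
class TenantThrottle:
    def __init__(self, rate_mbps):
        self.rate = rate_mbps * 1024 * 1024  # bytes per second

    def update(self, rate_mbps):
        self.rate = rate_mbps * 1024 * 1024

    def paced_chunks(self, stream):
        for chunk in stream:
            start = time.monotonic()
            yield chunk
            # Sleep so that sustained throughput does not exceed self.rate.
            min_duration = len(chunk) / self.rate
            elapsed = time.monotonic() - start
            if elapsed < min_duration:
                time.sleep(min_duration - elapsed)

throttle = TenantThrottle(rate_mbps=30)  # e.g., P3's GET_BW=30MBps
chunks = (b"x" * 65536 for _ in range(8))
t0 = time.monotonic()
total = sum(len(c) for c in throttle.paced_chunks(chunks))
print(f"{total / (time.monotonic() - t0) / 1024 / 1024:.1f} MBps")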


4.6 Prototype Implementation

The code of Crystal is publicly available on GitHub [136]. The prototype has been tested on different releases of OpenStack Swift, concretely from Kilo to Ocata. The Crystal API is implemented with the Django framework. The API manages the system’s metadata in a Redis 3.0 in-memory store [137]. We found that co-locating both Redis and the Swift proxies on the same servers is a suitable deployment strategy. As we show next, this is especially true as only the filter middleware in proxies accesses the metadata layer (once per request). We resort to PyActor [138] for building distributed controllers and workload metric processes that can communicate among themselves. For fault tolerance, the PyActor supervisor is aware of all the instantiated remote micro-services (either at one or many servers) and can spawn a new process if one dies. We built our metrics and filter frameworks as standard WSGI middlewares in Swift. The code of workload metrics is dynamically deployed on Swift nodes, intercepts the requests and periodically publishes monitoring information via the RabbitMQ 3.6 message broker. Similarly, the filter framework middleware intercepts a storage request and redirects it via a pipe either to the Storlets engine or to a native filter, depending on the filter pipeline definition. As both filters and metrics can run on all Swift nodes, in the case of server failures they can be executed on other servers holding object replicas.
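For reference, the general shape of such a middleware is sketched below: a standard paste-deploy WSGI filter of the kind Swift chains in its proxy pipeline. The hook body is illustrative; the real middlewares publish monitoring events or redirect the stream to a filter execution environment.

# Minimal sketch of a Swift-style WSGI middleware, the packaging used by
# Crystal's metrics and filter frameworks.
class CrystalLikeMiddleware:
    def __init__(self, app, conf=None):
        self.app = app          # next component in Swift's paste pipeline
        self.conf = conf or {}

    def __call__(self, environ, start_response):
        # Interception point: inspect the request before Swift handles it
        # (account/container/object can be parsed from environ['PATH_INFO']).
        environ.setdefault("crystal.intercepted", True)
        return self.app(environ, start_response)

def filter_factory(global_conf, **local_conf):
    """Paste-deploy entry point, as declared in the proxy-server config."""
    def factory(app):
        return CrystalLikeMiddleware(app, local_conf)
    return factory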

4.7 Evaluation

Next, we evaluate the prototype of Crystal in an OpenStack Swift deployment (Ocata version). Our evaluation addresses the challenges of Section 4.1.1 by showing that: i) Crystal can define policies at multiple granularities, achieving administration flexibility; ii) the enforcement of storage automation filters can be dynamically triggered based on workload conditions; iii) Crystal achieves accurate distributed enforcement of I/O bandwidth SLOs on different tenants; and iv) Crystal has low execution/monitoring overhead.

4.7.1 Testbed Characteristics

We ran our experiments on a 13-machine cluster formed by 9 Dell PowerEdge 320 nodes; 2 of them act as Swift proxy nodes (28GB RAM, 1TB HDD, 500GB SSD) and the rest are Swift storage nodes (16GB RAM, 2x1TB HDD).


There are 3 Dell PowerEdge 420 nodes (32GB RAM, 1TB HDD) that were used as compute nodes to execute workloads. Nodes in the cluster are connected via 1 Gigabit Ethernet (GbE) switched links.

4.7.2 Workload

We resort to well-known benchmarks and replays of real workload traces. First, we use ssbench [139] to execute stress-like workloads on Swift. ssbench provides flexibility regarding the type (CRUD) and number of operations to be executed, as well as the size of the files generated; all of these parameters can be configured. To evaluate Crystal under real-world object storage workloads, we collected the following traces [140]: i) The first trace captures 1.28 Terabytes (TBs) of a write-dominated (79.99% write bytes) document database workload storing 817K car testing/standardization files (mean object size of 0.91MB) over 2.6 years at Idiada, an automotive company. ii) The second trace captures 2.97TB of a read-dominated (99.97% read bytes) Web workload consisting of requests related to 228K small data objects (mean object size of 0.28MB) from several Web pages hosted at the Arctur datacenter over 1 month. We developed our own workload generator to replay a part of these traces (12 hours), as well as to perform experiments with controllable request rates. Our workload generator resorts to SDGen [141] to create realistic contents for data objects based on the file types described in the workload traces.

4.7.3 Evaluating Storage Automation

Next, we present a battery of experiments that demonstrate the feasibility and capabilities of storage automation with Crystal. To this end, we make use of synthetic workloads and real trace replays (Idiada, Arctur). These experiments have been executed at the compute nodes against 1 proxy and 6 storage nodes. Storage management capabilities of Crystal. Fig. 4.4 shows the execution of several storage automation policies on a workload related to containers C1 and C2 belonging to tenant T1. Specifically, we executed a write-only synthetic workload, composed of 4 PUT operations per second of 1MB objects, in which data objects stored at C1 consist of random data, whereas C2 stores highly redundant objects. Due to the security requirements of T1, the first policy defined by the administrator is to encrypt its data objects (P1).


Fig. 4.4 shows that the PUT operations of both containers exhibit a slight extra overhead due to encryption, given that the policy has been defined at the tenant scope. There are two important aspects to note about P1: First, the execution of encryption on T1’s requests is isolated from the filter executions of other tenants, providing higher security guarantees [39] (Storlet filter). Second, the administrator has the ability to enforce the filter at the storage node in order not to overload the proxy with the overhead of encrypting data objects (ON keyword).

Fig. 4.4 Enforcement of compression/encryption filters (object transfer time in seconds per PUT request for containers C1 and C2 under a synthetic workload of 4 PUTs/second of 1MB objects, as policies P1, P2 and P3 are successively enforced)

After policy P1 was enforced, the administrator decided to optimize the storage space of T1’s objects by enforcing compression (P2). P2 enforces compression at the proxy node to minimize communication between the proxy and storage nodes (ON PROXY). Note that the enforcement of P1 and P2 demonstrates the filter pipelining capabilities of our filter framework; once P2 is defined, Crystal enforces compression at the proxy node and encryption at storage nodes for each object request. Also, as shown in Section 4.4, the filter framework tags objects with extended metadata to trigger the reverse execution of these filters on GET requests (i.e., decryption and decompression, in that order). However, the administrator realized that the compression filter on C1’s requests exhibited higher latency and provided no storage space savings (incompressible data). To overcome this issue, the administrator defined a new policy P3 that essentially enforces only encryption on C1’s requests. After defining P3, the performance of C1’s requests exhibits the same behavior as before the enforcement of P2. Thus, the administrator is able to manage storage at different granularities, such as tenant or container.


Furthermore, the last policy also proves the usefulness of policy specialization; policies P1 and P2 apply to C2 at the tenant scope, whereas the system only executes P3 on C1’s requests, as it is the most specialized policy. Dynamic storage automation. Fig. 4.5 shows a dynamic caching policy (P1) on one tenant. The filter implements the Least Recently Used (LRU) eviction algorithm, and exploits SSD drives at the proxy to improve object retrievals. We executed a synthetic oscillatory workload of 1MB objects (gray area) to verify the correct operation of automation controllers.

Fig. 4.5 Dynamic enforcement of the caching filter (average PUT/GET latency and workload intensity, in operations per second, over time; policy P1: FOR TENANT T2 WHEN GET_SEC_TENANT > 5 DO SET CACHING ON PROXY TRANSIENT)

In Fig. 4.5, we show the average latency of PUT/GET requests and the intensity of the workload. As can be observed, the caching filter takes effect when the workload exceeds 5 GETs per second. At this point, the filter starts caching objects at the proxy SSD on PUTs, as well as looking up the SSD to retrieve potentially cached objects on GETs. In this sense, the usage of Crystal for dynamic storage automation provides some benefits. First, the filter provides performance benefits for object retrievals; when the caching filter is activated, object retrievals are, in the median, 29.7% faster compared to non-caching periods. Second, we noted that the cost of executing asynchronous writes on the SSD upon PUT requests may be amortized by offloading storage nodes; that is, the median PUT latency is 2% lower when caching is activated. A reason for this may be that storage nodes are mostly free to execute writes, as a large fraction of GETs are being served by the proxy’s cache. In conclusion, Crystal’s control loop enables dynamic enforcement of storage filters under variable workloads. Moreover, native filters in Crystal allow developers to build complex data management filters.


Managing real workloads. Next, we show how Crystal policies can handle real workloads (12 hours). That is, we compress and encrypt documents on a replay of the Idiada trace (write-dominated), whereas we enforce caching of small files on a replay of the Arctur workload (read-dominated). Fig. 4.6a shows the request bandwidth exhibited during the execution of the Idiada trace. Concretely, we executed two concurrent workloads, each associated with a different tenant. We enforced compression and encryption only on tenant T2. Observably, tenant T2’s transfers are over 13% and 7% slower compared to T1 for GETs and PUTs, respectively. This is due to the computation overhead of enforcing filters on T2’s document objects. As a result, T2’s documents consumed 65% less space compared to T1 thanks to compression, and they benefited from higher data confidentiality thanks to encryption.

Fig. 4.6 Policy enforcement on real trace replays: CDFs of transfer performance (MBps) for the PUTs and GETs of tenants T1 and T2 under (a) the Idiada workload (T2 with compression+encryption) and (b) the Arctur workload (T2 with caching), with insets showing the file size distributions of each trace

Fig. 4.6b shows tenants T1 and T2, both concurrently running a trace replay of Arctur. By executing a dynamic caching policy, T2’s GET requests are, in the median, 1.9x faster compared to T1. That is, as the workload of Arctur is intense and almost read-only, caching was enabled for tenant T2 during most of the experiment. Moreover, because the requested files fit in the cache, the SSD-based caching filter was very beneficial to tenant T2. The median write overhead of T2 compared to T1 was 4.2%, which suggests that our filter efficiently intercepts object streams for performing parallel writes to the SSD. Our results with real workloads suggest that Crystal is practical for managing multi-tenant object stores.


4.7.4 Achieving Bandwidth SLOs

Next, we evaluate the effectiveness of our bandwidth differentiation filter. To this end, we executed an ssbench workload (10 concurrent threads) on each of the 3 compute nodes in our cluster, each one representing an individual tenant. As we study the effects of replication separately (in Fig. 4.10 we use 3 replicas), the rest of the experiments were performed using one-replica rings. Request types. Fig. 4.7 plots two different SLO enforcement experiments on three different tenants for PUT and GET requests, respectively (enforcement at the proxy node). Appreciably, the execution of Algorithm 1 exhibits near-exact behavior for both PUT and GET requests. Moreover, we observe that tenants obtain their SLO plus an equal share of spare bandwidth, according to the expected policy behavior defined by the colored areas. This demonstrates the effectiveness of our bandwidth control middleware for intercepting and limiting both request types. We also observe in Fig. 4.7 that PUT bandwidth exhibits higher variability than GET bandwidth. Concretely, after writing 512MB of data, Swift stopped the transfers of tenants for a short interval; we will look for the causes of this in our next development steps.

Fig. 4.7 Performance of the Crystal bandwidth differentiation service: bandwidth per tenant (MBps) over time for PUT and GET requests of 128MB files, against the expected per-tenant shares as SLOs are defined (T1: SLO=70, T2: SLO=20, T3: SLO=5). 1 proxy/3 storage nodes, bandwidth control at the proxy

Impact of enforcement stage. An interesting aspect to study in our framework is the implications of enforcing bandwidth control at either the proxies or the storage nodes.


In this sense, Fig. 4.8 shows the enforcement of SLOs on GET requests at both stages. At first glance, we observe in Fig. 4.8 that our framework makes it possible to enforce bandwidth limits at both stages. However, Fig. 4.8 also illustrates that the enforcement at storage nodes presents higher variability compared to proxy enforcement. This behavior arises from the relationship between the number of nodes to coordinate and the intensity of the workload at hand. That is, given the same workload intensity, a smaller number of nodes (e.g., proxies) offers higher bandwidth stability, as a tenant’s requests are virtually a continuous data stream, which is easier to control. Conversely, each storage node receives a smaller fraction of a tenant’s requests, as storage nodes are normally more numerous than proxies. This means that storage nodes have to deal with shorter and discontinuous streams that are harder to control. But enforcing bandwidth SLOs at storage nodes enables controlling background tasks like replication. Thus, we face a trade-off between accuracy and control that may be solved with hybrid enforcement schemes.

Fig. 4.8 Performance of the Crystal bandwidth differentiation service: GET bandwidth per tenant (MBps) for 128MB files when enforcing SLOs at the proxy versus at the storage nodes (T1: SLO=70, T2: SLO=20, T3: SLO=5). 1 proxy/3 storage nodes

Mixed tenant activity, variable file sizes. Next, we execute a mixed read/write workload using files of different sizes: small (8MB to 16MB), medium (32MB to 64MB) and large (128MB to 256MB) files. Besides, to explore scalability, in this set of experiments we resort to a cluster configuration that doubles the size of the previous one (2 proxies and 6 storage nodes). Appreciably, Fig. 4.9 shows that our enforcement controller achieves bandwidth SLOs under mixed workloads.


Moreover, the bandwidth differentiation framework works properly when doubling the storage cluster size, as the policy provides tenants with the desired SLO plus a fair share of spare bandwidth, especially for T1 and T2. However, Fig. 4.9 also illustrates that the PUT bandwidth provided to T1 is significantly more variable than for the other tenants; this is due to various reasons. First, we already mentioned the increased variability of PUT requests, apparently due to write buffering. Second, the bandwidth filter seems to be less precise when limiting streams that require an SLO close to the node/link capacity. Moreover, small files make the workload harder to handle for the controller, as more node assignment updates are potentially needed, especially as the cluster grows. In the future, we plan to continue the exploration and mitigation of these sources of variability.

Fig. 4.9 Performance of the Crystal bandwidth differentiation service: bandwidth per tenant (MBps) under a mixed read/write workload with heterogeneous file sizes (T1: SLO=70, T2: SLO=50, T3: SLO=30). 2 proxies/6 storage nodes, bandwidth control at proxies

Controlling background tasks. An advantage of enforcing bandwidth SLOs at storage nodes is that we can also control the bandwidth of background processes via policies. To wit, Fig. 4.10 illustrates the impact of replication tasks on multi-tenant workloads. In Fig. 4.10, we observe that during the first 60 seconds of this experiment (i.e., no SLOs defined) tenants are far from having a sustained GET bandwidth of ≈ 33Mbps, meaning that they are significantly affected by the replication process. The reason is that, internally, storage nodes trigger hundreds of point-to-point transfers to write copies of already stored objects to other nodes belonging to the ring. Note that the aggregated replication bandwidth within the cluster reached 221Mbps.


enforce SLOs from second 60 onwards, the objectives are not achieved, especially for tenants T2 and T3, until the replication bandwidth is under control. As soon as we deploy a controller that enforces a hard limit of 5MBps on the aggregated replication bandwidth, the SLOs of tenants are rapidly achieved. We conclude that Crystal has potential as a framework to define fine-grained policies for managing bandwidth allocation in object stores.

[Plot: bandwidth per tenant (MBps) vs. time (seconds) for GET and replication bandwidth control at storage nodes with 128MB files; intra-cluster replication capped at SLO=5, tenants T1 (SLO=70, expected=71.67), T2 (SLO=20, expected=21.67 and 57.5) and T3 (SLO=5, expected=6.67, 42.5 and 100). The plot shows that it is very difficult to guarantee SLOs until the replication bandwidth is controlled.]

Fig. 4.10 Performance of the Crystal bandwidth differentiation service. 1 proxy/3 storage nodes, bandwidth control at storage nodes

4.7.5 Crystal Overhead

Filter framework latency overheads. A relevant question to answer is what performance costs our filter framework introduces to the regular operation of the system. Essentially, the filter framework may introduce overhead at i) contacting the metadata layer, ii) intercepting the data stream through a filter, and iii) managing extended object metadata. We show this in Fig. 4.11. Compared to base Swift (SW), Fig. 4.11 shows that the metadata access (MA in the boxplots) of Crystal incurs a median latency penalty between 1.5ms and 3ms. For 1MB objects, this represents a relative median latency overhead of 3.9% for both GETs and PUTs. Naturally, this overhead becomes slightly higher as the object size decreases, but it is still practical (8% to 13% for 10KB objects). This confirms that our filter framework minimizes communication with the metadata layer (i.e., 1 query per request). Moreover, Fig. 4.11 shows that an


in-memory store like Redis fits the metadata workload of Crystal, especially if it is co-located with the proxy nodes.

[Boxplots: (a) PUT and (b) GET request execution times (seconds) for 1 Byte, 10KB, 1MB and 10MB objects under five configurations. SW: Swift, MA: Metadata Access, NP: NOOP Filter, CZ: Compression Zero Data, CR: Compression Random Data.]

Fig. 4.11 Performance overhead of filter framework metadata interactions and isolated filter enforcement.

Next, we focus on the isolated interception of object requests via Storlets, which trades off performance for higher security guarantees (see Section 4.4). Fig. 4.11 illustrates that the median isolated interception overhead of a void filter (NOOP) oscillates between 3ms and 11ms (e.g., 5.7% and 15.7% median latency penalty for 10MB and 1MB PUTs, respectively). This cost mainly comes from injecting the data stream into a Docker container to achieve isolation. We must also consider filter implementation effects, or even the data at hand. To wit, columns CZ and CR depict the performance of the compression filter for highly


redundant (zeros) and random data objects. Visibly, the performance of PUT requests changes significantly (e.g., for objects ≥ 1MB), as compression algorithms exhibit different performance depending on the data contents [141]. Conversely, decompression in GET requests is not significantly affected by data contents. Hence, to improve performance, filters should be enforced under the right conditions. Finally, our filter framework enables managing extended metadata of objects to store a sequence of data transformations to be undone on retrievals (see Section 4.4). We measured that reading/writing extended object metadata takes 0.3ms/2ms, respectively, which constitutes a modest overhead. Filter pipelining throughput. Next, we want to further explore the overhead of isolated filter execution. Specifically, Fig. 4.12 depicts the latency overhead of pipelining multiple NOOP Storlet filters. As pipelining is a new feature of Crystal, it required a separate evaluation.

[Boxplots: GET execution time (seconds) when pipelining 1 to 4 NOOP filters, for 1 Byte, 10KB, 1MB and 10MB objects.]

Fig. 4.12 Pipelining performance for isolated filters.

Fig. 4.12 shows that the latency cost of intercepting a data stream through a pipeline of isolated filters is acceptable. To support this claim, each additional filter in the pipeline incurs 3ms to 9ms of extra latency in the median. This is slightly lower than passing the stream through the Docker container for the first time. The reason is that pipelining tenant filters is done within the same Docker container, so the cost of injecting the stream into the container is paid only once. Therefore, our filter framework is a feasible platform to dynamically compose and pipeline several isolated filters. Monitoring overheads. To understand the monitoring costs of Crystal, we provide a measurement-based estimation for various configurations of monitoring


nodes, workload metrics and controllers. To wit, the monitoring traffic overhead O related to |W| workload metrics is produced by a set of nodes N. Each node in N periodically sends monitoring events of size s to the MOM broker, which are consumed by |W| workload metric processes. Then, each workload metric process aggregates the messages of all nodes in N into a single monitoring message. The aggregated message is then published to a set of subscribed controllers C. Therefore, we can do a worst-case estimation of the total generated traffic per monitoring epoch (e.g., 1 second) as: O = |W| · [s · (2 · |N| + |C|)]. We also measured simple events (e.g., PUT_SEC) to be s = 130 bytes in size.
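To make the estimation concrete, the following minimal sketch evaluates the formula above for one of the configurations discussed next; the class and method names are ours, introduced purely for illustration.

public class MonitoringOverhead {

    // Worst-case traffic per monitoring epoch, in bytes:
    // O = |W| * s * (2*|N| + |C|)
    static long trafficPerEpoch(int metrics, int eventSizeBytes, int nodes, int controllers) {
        return (long) metrics * eventSizeBytes * (2L * nodes + controllers);
    }

    public static void main(String[] args) {
        // Largest configuration reported below: |W| = 20, s = 130 bytes,
        // |N| = 50 nodes and |C| = 100 subscribed controllers.
        long bytes = trafficPerEpoch(20, 130, 50, 100);
        System.out.printf("%.1f KBytes/sec%n", bytes / 1000.0);  // prints 520.0 KBytes/sec
    }
}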

[Plots: estimated monitoring traffic (KBytes/sec) vs. number of controllers C (1 to 100), with event size s = 130 bytes; left, a single workload metric (W=1) for N = 1, 50 and 100 nodes; right, N = 50 nodes for W = 1, 10 and 20 workload metrics.]

Fig. 4.13 Traffic overhead of Crystal depending on the number of nodes, controllers and workload metrics.

Fig. 4.13 shows that the estimated monitoring overhead of a single metric is modest; in the worst case, a single workload metric generates less than 40KBps in a 100-machine cluster with |C| = 100 subscribed controllers. Clearly, the dominant factor of traffic generation is the number of workload metrics. However, even for a large number of workload metrics (|W| = 20), the monitoring requirements in a 50-machine cluster do not exceed 520KBps. These overheads seem lower than those of existing SDS systems with advanced monitoring [86].

4.8 Summary

Crystal is a Software-defined Storage architecture that pursues an efficient use of multi-tenant object stores. Crystal addresses unique challenges in providing the necessary abstractions to add new functionalities at the data plane that can


be immediately managed at the control plane. For instance, it adds a filtering abstraction to separate control policies from the execution of computations and resource management mechanisms at the data plane. Also, extending Crystal requires low development effort. We demonstrate the feasibility of Crystal on top of OpenStack Swift through two use cases that target automation and bandwidth differentiation. Our results show that Crystal is practical enough to be run in a shared cloud object store.


Chapter 5

Adding Self-management Support for Tenants

5.1 Introduction

Nowadays, commercial cloud object stores offer a limited set of actions for the management of data by tenants. Typically, tenants can only decide the storage class where the data will be stored and the expiration time of the objects, either manually or by means of policies [8, 42, 43, 10]. Although some cloud object stores also offer the possibility to activate encryption mechanisms [7], these are all the self-management techniques available to tenants. On the other hand, open source systems like Swift provide more flexibility for incorporating new characteristics into the storage service through middlewares [37, 44]. In the same manner, but in the context of file systems, there are some research works [110, 112] that enable administrators to introduce middlewares in the storage stack through which all files pass before being stored on disk. However, in any case, these middlewares are usually static functionalities installed by storage administrators, so tenants cannot fine-tune the storage service with new management and computing tasks adapted to their own specific requirements. On the other hand, as we demonstrated in Chapter 4, SDS abstracts the storage service and the storage infrastructure from its management by adding a centralized control layer that orchestrates the data plane. This approach significantly increases the flexibility and programmability of cloud object stores.

The results presented in this chapter are published in [142].


However, SDS systems [74, 77, 78], including our previous work, are not designed to manage the data that they store. Generally, SDS technologies are architected for storage administrators [83, 84, 86], who decide the most appropriate storage management policies. Typically, the granularity of these policies is at the per-tenant or even per-bucket level. Thus, although the controller in all SDS systems is assumed to be easily scalable, adapting a "classical" SDS solution to accept policies at the per-object level significantly increases its complexity. The number of storage policies grows considerably when the granularity changes, which becomes far harder to manage from an administrator's perspective. Therefore, SDS abstractions are not suitable for managing tenants' data at such a fine granularity. It does not make much sense for storage administrators to deal with the data management particularities of hundreds or thousands of tenants. In spite of this, other technologies provide a complementary layer of data management for cloud object stores. This is the case of serverless computing [51, 53]. It allows, to a certain extent, the self-management of data by tenants. However, this model does not allow tenants to control the full lifecycle of the objects, as functions are not located in the storage path, but in a disaggregated compute cluster, thus losing the advantages of data locality. With serverless functions, it is possible to run functions automatically when objects are uploaded to the storage service, or to proactively invoke them through an external gateway. In any case, serverless computing leaves a wide range of possible applications unmanageable, namely those that require synchronous handling.

5.1.1 Scope and Challenges

In this chapter, we argue that a fine-grained data management layer, with a new object-based policy abstraction, is a compelling solution to overcome the challenge of adding self-management support for tenants in cloud object stores. In contrast to SDS, which is not suitable for handling data at the per-object level due to its built-in characteristics, our system must provide tenants with the ability to manage the particularities of their data. Different objects may require different treatment, and thus, different management. For example, a specific dataset provided by a tenant may need 1) a personalized data processing pipeline of different tasks, such as data obfuscation and encryption, and 2) a specific management, to guarantee, for example, data availability over a concrete period of time. To achieve this goal, it is required to address the following research challenge.


Flexible and programmatic support for data self-management by tenants: This entails the design of a lightweight and extensible per-object management distributed architecture model. Furthermore, to manage objects at a low level, this new model requires a new policy abstraction, flexible enough to manage any kind of object, and programmatic in order to grant tenants the appropriate flexibility to code anything they require. For performance reasons, these programmatic policies must be executed where the data is, leveraging the underutilized storage resources (data locality). Also, to enable different types of data management, the platform should support synchronous and asynchronous policy execution in an isolated execution environment to guarantee, as far as possible, that the storage service is not disturbed. Finally, as object storage systems usually store 3 copies of each object, the system should provide strong consistency in order to guarantee correct operation when the same object is accessed at the same time through different copies. This is not trivial to implement in object storage systems, since they are normally eventually consistent.

5.1.2 Contributions

To increase the automation, flexibility, and programmability of object stores, we propose here Vertigo, a novel distributed framework that allows tenants to manage their data in a flexible and dynamic way, making use of a new programmatic policy abstraction called microcontroller. Microcontrollers enable tenants to manage the singularities of objects in a flexible way. Thus, they can be attached to objects for the appropriate management. Microcontrollers act as object wrappers that control object behavior. It is for this reason that we introduce the concept of smart objects in cloud object stores. On the other hand, Vertigo leverages the underutilized computing resources of the storage nodes to deploy microcontrollers that intercept the objects' lifecycle. The distinguishing feature of microcontrollers is that they can react to changes made to the state of an object, permitting the implementation of sophisticated management policies, like the automated deletion of an object based, for example, on its access history. In addition, microcontrollers can be used to orchestrate active storage tasks, and even to manipulate objects based on their content. Typically, object stores operate at the bucket or object levels, mainly acting as simple repositories of data. One of the outstanding features of Vertigo is that it


can work at the content level, which is essential to allow an object to adapt its behavior depending on who is accessing it, its state, and the nature of its content. The result is an unprecedented amount of flexibility for automation and storage programmability. Finally, we evaluate the design principles of Vertigo by showing five use cases on top of OpenStack Swift. These use cases demonstrate the feasibility of our system and the extensible programmability that microcontrollers provide to cloud object stores, allowing tenants to self-manage their data in a flexible way.

5.2 The Management Policies: Microcontrollers

The entire architectural model revolves around the concept of microcontroller, and therefore, it is critical to define what a microcontroller is. As a working definition: a microcontroller is a data behavior or management policy that is associated with one or more objects, and with one or more storage operations. As cloud object stores operate through HTTP RESTful APIs, microcontrollers can be associated with any of their available HTTP methods, that is, GET, PUT, POST, HEAD, and DELETE. Thus, upon an object request to one of them, the associated microcontrollers (if any) are executed in a sandboxed environment. Objects in an object store are composed of an identifier, the actual data of the object, and the metadata that describes its content. Vertigo objects can also be composed of microcontrollers, which, at a very high level, can be seen as wrappers. Microcontrollers follow the active storage approach, that is, they are executed close to the data within the storage infrastructure. However, they are not actually active storage tasks. Active storage tasks usually modify the current input or output data flow of the objects. Instead, microcontrollers are the policies that define the behavior of the objects. It should be noted that, as we will describe below, one characteristic of microcontrollers is that they are able to instrument more than one active storage task for object processing. The programmability that microcontrollers provide, in contrast with descriptive high-level policies, allows tenants to create advanced forms of object control. Vertigo provides two different types of microcontrollers: synchronous and asynchronous. A synchronous microcontroller blocks the incoming or outgoing request until it completes its execution. In the synchronous model, microcontrollers are executed in real time, which means that the lifecycle of


objects is intercepted. In contrast, asynchronous microcontrollers are event-driven, that is, they run when an event occurs. The request that triggered the event does not wait for their completion.

5.3 Design Overview

The complete architecture is depicted in Fig. 5.1. As expected, the data plane includes the traditional cloud object storage system which manages objects and their associated metadata.

[Diagram: the Vertigo Runtime System deployed within the storage data plane. Each node (gateways 1..N and storage nodes 1..N) runs the Microcontroller Engine and the Active Storage Engine in sandboxed environments on the input/output path, with an extensible API gateway to external cloud services and an interception middleware over the storage API.]

Fig. 5.1 High-level architecture overview of Vertigo integrated alongside a common cloud object storage architecture

Within the data plane, each node, whether a storage gateway where tenants connect to perform operations or a storage node where the actual data is stored, includes the Runtime System. The Runtime System is the entity that intercepts all types of an object's lifecycle requests. It contains the microcontroller and active storage engines, responsible for running microcontrollers and active storage tasks in a safe and sandboxed environment, respectively. Moreover, the Microcontroller Engine has direct access to external cloud services through an extensible gateway module (API), including the object storage system itself. Thanks to this gateway, a microcontroller, and therefore tenants, can take advantage


of external cloud services, for example, for: 1) persisting the state of the objects in an external database; 2) operating over other objects in the object store; 3) storing an access log of a specific object; or 4) sending a message to a message queue in order to notify other services of the actions performed over the objects, among others. Microcontrollers have access to all the relevant request information, such as the user name or id, the tenant name, and all the headers of the request. Moreover, they can access the metadata of the requested object. In brief, microcontrollers have access to all of the environment information except the actual data of the object. This is because microcontrollers are management policies and not active storage tasks. Regarding the associative approach of Vertigo, a microcontroller can be associated with more than one object. Also, an object can have more than one microcontroller associated with it. In the same way, in the storage service there may be microcontrollers not associated with any object, and also objects with no associated microcontroller. Thus, the relationship between microcontrollers and objects is N:M. In this sense, Fig. 5.2 illustrates the deployment of a microcontroller that deletes an object after x GET requests.

[Diagram: a microcontroller catalog (Microcontrollers A, B, C and the Autodelete microcontroller with parameter {"delete_after":x_gets}) and their N:M associations with Objects 1-5 in the storage data plane; e.g., Autodelete is deployed on Object 1 with {"delete_after":16} and on Object 2 with {"delete_after":10}.]

Fig. 5.2 Microcontroller deployment example

The idea behind our approach is to develop a generic delete microcontroller, and then associate it with different objects. In this example, what changes is the parameter value associated with the target object and the autodelete microcontroller: {"delete_after":16} for Object 1, and {"delete_after":10} for Object 2. Thanks to this declarative approach, it is then possible to reuse the same microcontroller for different objects.

Examples. To better understand the role of the Vertigo components, consider a microcontroller that counts the number of GET requests to an object. To deploy it, the code of this microcontroller must first be loaded into the object store


as a common object, and then the Vertigo Runtime System must be told that an onGet trigger is associated with this registered microcontroller for the targeted object (KEY=value) or objects. From that moment, every GET request to this object will be intercepted and routed to this microcontroller, eventually executed within the Microcontroller Engine of the specific tenant. A more advanced example could be a microcontroller that performs actions on the data plane when some condition is met. For example, the previous microcontroller could delete an object after 5 GET requests. In this case, the microcontroller would instrument the object store API to perform such a delete action on the data plane. Finally, to get a full perspective of the potential of Vertigo, a more sophisticated example could be a microcontroller that instruments an active storage task. For example, we could enforce the compression of an object if it is not very popular (few accesses in the last days) and its content type is text. In this case, when the appropriate conditions are met, the microcontroller executes the compression task within the Active Storage Engine to perform such an operation.
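As an illustration of the first two examples, the following minimal sketch combines the GET counter with the conditional delete. Only the IMicrocontroller interface and the invoke(Context, Api) entry point are fixed by the framework (see Section 5.4.2); the accessors ctx.parameters, ctx.object and api.swift are hypothetical names we use for exposition.

public class AutoDelete implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        // "delete_after" is the per-object parameter set at deployment time.
        int limit = Integer.parseInt(ctx.parameters.get("delete_after"));
        // Atomically bump the read counter kept in the object metadata.
        int reads = ctx.object.incrementMetadata("get_count");

        ctx.request.forward();  // serve this read normally
        if (reads >= limit) {
            // Last permitted read: remove the object (all replicas).
            api.swift.delete(ctx.object.getPath());
        }
    }
}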

5.3.1 The Core: Runtime System

The Vertigo Runtime System contains two major components for managing the different types of microcontrollers: the Microcontroller Engine and the Active Storage Engine. Both components are located within the storage system nodes, either gateways or storage nodes, and work in an isolated and sandboxed environment (Linux containers) that intercepts calls to the object store. Microcontroller Engine. It takes care of the management (installation, deployment and configuration) of microcontrollers. Installation means uploading and registering the microcontroller code in the cloud object storage system. The microcontroller is stored as a common object but in a specific storage bucket, specially created for storing microcontrollers. Deployment refers to linking an installed microcontroller to one or more objects with one or more triggers. Triggers include the operations that may be intercepted on the selected object or objects: PUT, GET, POST, HEAD or DELETE. Finally, configuration refers to setting up the parameters and the necessary metadata of an already installed microcontroller, e.g., the maximum number of reads, or the compression ratio if the microcontroller orchestrates a compression task. Once deployed, microcontrollers are triggered in reaction to lifecycle events of objects in the data plane. By default, microcontrollers do not have arbitrary


external Internet access. However, one fundamental part of the Microcontroller Engine is the external gateway module, which enables microcontrollers to communicate with external cloud services. The gateway provides an extensible API to access, for example, a Redis database, a MongoDB database, or a message broker like RabbitMQ. As stated before, thanks to this API, microcontrollers can store information in a persistent way by transparently benefiting from object and request metadata. For example, a microcontroller can maintain an access counter or store the timestamp of the last access in the object metadata. In short, a microcontroller can take one or all of the following actions after intercepting a lifecycle request of an object (a sketch combining several of them follows the list):

• It can add, update, or delete the metadata of the current object.

• It can cancel the request, or rewire the request to another object.

• It can orchestrate one or multiple active storage tasks. That is, when the appropriate conditions are met, tasks such as compression, encryption or filtering can be executed for data reduction or protection techniques.

• It can perform POST or DELETE requests to the cloud object store. For example, for deleting or for updating the metadata of any other object within the storage system.

• It can communicate with external services through the external gateway module. For example, for accessing a RDMS or a NoSQL database.
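The sketch below exercises three of these actions at once: it updates the object metadata, notifies an external service through the gateway, and then lets the request proceed. The accessors ctx.object.setMetadata and api.mq.publish are hypothetical names for the metadata interface and a message-broker binding of the gateway.

public class AccessTracker implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        // Record the last access time in the object's extended metadata.
        ctx.object.setMetadata("last_access", String.valueOf(System.currentTimeMillis()));
        // Notify other services through the external gateway module.
        api.mq.publish("object-events", ctx.object.getPath() + " read by " + ctx.request.user);
        ctx.request.forward();  // the request itself proceeds unmodified
    }
}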

Active Storage Engine. It takes care of the management of active storage tasks, being responsible for their installation, deployment and configuration. These tasks are stateless components that process the input and output data flows from and to an object. Typically, active storage tasks, like compression, are used as data reduction techniques in cloud object stores. They also include extraction, transformation and load (ETL) mechanisms. Moreover, other tasks like data encryption or data obfuscation are used for data protection.

5.4 Prototype Implementation

We implemented Vertigo [143] on top of OpenStack Swift [13], and modified the open source OpenStack Storlet framework [40] to suit our requirements (see Section 5.4.3 for further details).


As aforementioned, the core component of Vertigo is the Runtime System. Fig. 5.3 shows the integration of this component within a common Swift deployment. For security reasons, each tenant in the object store has its own isolated Runtime System. Taking Swift as the basis, it is mainly composed of three elements: the Interception Middleware, the Microcontroller Engine, and the Storlets Engine.

[Diagram: the per-tenant Vertigo Runtime System in Swift. The Interception Middleware sits in the middleware layer of the proxy and object services and routes HTTP calls to the Microcontroller Engine (microcontrollers running as threads of a Java VM) and the Storlet Engine (Storlets running in Java VMs).]

Fig. 5.3 Vertigo Runtime System integrated in OpenStack Swift

5.4.1 Interception Middleware

We built a new Swift interception middleware with two major tasks: first, to deploy (and undeploy) microcontrollers to particular objects, and second, to load microcontrollers into the Microcontroller Engine. For the deployment of a microcontroller, a user must issue a POST request to the object to be managed. Upon this action, a trigger header of the type onGet, onPut, onPost, onHead, or onDelete must be included to tell the framework which lifecycle event must be intercepted. Moreover, a microcontroller may have specific parameters related to the managed object. This information is uploaded to the system by attaching the parameter names and their values to the request. Finally, this information is stored alongside the object, in its metadata.


To better understand this, consider a tenant that wants to deploy the previous automated deletion microcontroller to a given object. To do so, a Swift POST request with the header "X-Vertigo-Onget:delete-1.0.jar" is needed to deploy the microcontroller delete-1.0.jar with the onGet trigger. In this case, the microcontroller needs to know how many reads of the object are allowed, via the parameter "delete_after". The parameter list is a JSON-formatted string, easily readable by the microcontroller. For example, with the curl Unix tool, the command would be the one shown in Listing 5.1.

Listing 5.1 Example of deploying a Vertigo microcontroller

curl ... $STORAGE_URL/bucket_name/object.csv \
     -X POST \
     -H "X-Vertigo-Onget:delete-1.0.jar" \
     -H "X-Vertigo-mctype:sync" \
     -d "{'delete_after':'16'}"

The interception middleware is responsible for reading the JSON information which contains the parameters, and for storing its content in the metadata of the object. In the same POST request, it must be specified whether the microcontroller will run synchronously or asynchronously with respect to the main request. Note that it is possible to deploy more than one microcontroller per object. In this case, the order of microcontroller execution is the order of deployment. On the other hand, a microcontroller can be undeployed with a special header added in a POST request: "X-Vertigo-Onget-Delete:delete-1.0.jar". This delete operation removes the microcontroller from the trigger list, and its parameter list from the metadata of the object.

Microcontroller deployment granularity. Although OpenStack Swift stores objects in a flat name space, it is possible to simulate a hierarchical structure within a single bucket by adding forward slash characters (/) to the object name. Each folder between the bucket name and the object name is called a pseudo-folder in Swift. In Vertigo, we take advantage of this logical hierarchy upon the deployment of a microcontroller to an object. Thus, Vertigo allows the deployment operation to act at different granularities: at the per-object, pseudo-folder and bucket levels. In this sense, when a tenant associates a microcontroller with a bucket or pseudo-folder, the deployment operation is offloaded to all objects inside it. As an example of this process, if a user wants to associate a microcontroller with


all the objects inside "acc_1/cont_2/pf_3", the POST operation should point to this pseudo-folder, adding an asterisk at the end, as shown in Listing 5.2.

Listing 5.2 Example of associating a microcontroller to a pseudo-folder

curl ... $STORAGE_URL/acc_1/cont_2/pf_3/* \
     -X POST \
     -H "X-Vertigo-Onget:counter.jar"

For all the available triggers, except for the onPut type, when a microcontroller is deployed on a bucket/pseudo-folder, the deployment information is stored in the metadata of the available objects in it, in addition to the metadata of the bucket/pseudo-folder itself. This enables the inheritance of microcontrollers, in such a way that when a user uploads an object to a bucket/pseudo-folder with microcontrollers deployed, all of them are inherited by the new object. Therefore, after the deployment operation (Listing 5.2), when a user uploads an object to "acc_1/cont_2/pf_3", the microcontroller counter.jar would be automatically deployed on it. The same occurs with the undeployment operation. It works at the per-object, pseudo-folder, and bucket levels. When a user undeploys a microcontroller from a bucket/pseudo-folder, the operation is offloaded to all the objects inside it, and the inheritance of the microcontroller is deleted. Regarding PUT operations (onPut trigger), the deployment information of the bucket/pseudo-folder is stored in two different places. First, as before, it is stored in the metadata of the bucket/pseudo-folder itself. Second, the deployment information is kept in the memory of the proxy nodes. The interception middleware makes use of Memcached [144] for keeping the associations between objects and microcontrollers in a sort of cache. This approach speeds up information retrieval. In this way, with Memcached, Vertigo does not need to make a sub-request to Swift in order to know the microcontroller execution list when an object enters the storage cluster.

After the microcontroller deployment process, when a request arrives for an object, the middleware checks whether the object has microcontrollers associated with the trigger related to the request. For example, if the request is a GET, the middleware checks the onGet trigger of the object. If there are microcontrollers deployed, the middleware automatically starts the Microcontroller Engine and sends it the list of microcontrollers to execute, as well as the object metadata, the request metadata and the microcontrollers' specific parameters.


One of the main characteristics of Vertigo is that a microcontroller can run synchronously or asynchronously with respect to the object request. The asynchronous model permits reducing the overhead produced by Vertigo. Imagine a microcontroller that stores, in an external database, a log of the users that access an object. In this case, it is possible to launch the microcontroller asynchronously and continue the main object request execution, following the default storage path. Thus, the main request does not need to wait for the overhead produced by the microcontroller when a value is stored in an external database. Moreover, with Vertigo it is also possible to run microcontrollers in a semi-synchronous way. This means that a microcontroller can run part of its code synchronously, return control to the interception middleware, and then run the rest of the code asynchronously. This approach also reduces the overhead produced by the Microcontroller Engine.
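A minimal sketch of this semi-synchronous pattern follows; the asynchronous facility (here a hypothetical ctx.runAsync helper) is our own illustrative name, since the framework interface shown in Listing 5.3 does not prescribe one.

public class SemiSyncLogger implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        final String user = ctx.request.user;  // synchronous part: capture request data
        ctx.request.forward();                 // return control to the middleware
        // Asynchronous part: the external write happens off the request path.
        ctx.runAsync(() -> api.db.store(user));
    }
}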

5.4.2 Microcontroller Engine

The Microcontroller Engine is a daemon process that provides isolation and safety thanks to a sandboxed environment based on Linux Docker containers. As a result, microcontrollers are sandboxed: they have no direct network access, no system execution capabilities, no thread creation capabilities, and no access to the file system. In the current implementation of Vertigo, the daemon process is based on Java. In this sense, each daemon process consists of a Java Virtual Machine (JVM) with a thread pool, where each microcontroller is executed in a different thread inside the same JVM. The Microcontroller Engine is shared by all the users of the same tenant, but for each request there is a different instance of the microcontroller. When it receives the microcontroller list and all the metadata and parameters, it runs all the microcontrollers in the appropriate order. As stated above, the isolation provided by the Docker containers does not allow microcontrollers to have any kind of external access. However, we built a gateway that provides access to external cloud services. The basic microcontroller skeleton is as follows:

Listing 5.3 Vertigo microcontroller Java skeleton

public class mc implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        // microcontroller code
    }
}


Microcontrollers have two main parameters: the Context and the API. On the one hand, the Context provides access to the related request information. As stated in the previous section, this information includes the object metadata, the request metadata and the microcontrollers' specific parameters. Moreover, by using the context, it is possible to control the current request with one of these commands: forward, cancel and rewire. The forward command tells the interception middleware to continue the request normally, without taking any action. The cancel command cancels the request and returns an error message. Finally, the rewire command allows rewiring the request to another object. On the other hand, the API parameter provides access to external services. By default, it includes access to: 1) the Storlet Engine, for executing Storlets (via the run-storlet command); 2) the logging engine, which is useful for debugging microcontrollers; and 3) OpenStack Swift, which allows microcontrollers to perform POST and DELETE operations over it. Although the first implementation of the API provides basic communication to microcontrollers, we built it to be easily extensible. It allows adding support for other external services, for example RabbitMQ, Redis or MySQL, and even for services outside the local cluster, such as Amazon Redshift [145]. By default, when a synchronous microcontroller is executed, the interception middleware remains waiting until it receives one command to continue the request execution (forward, cancel, rewire or run-storlet). Once the microcontroller sends one of them to the interception middleware, the request is processed accordingly. In the case that more than one microcontroller is executed, the priority of the received commands is the following:

Listing 5.4 Microcontroller response priority

cancel > rewire > run-storlet > forward

As a simple example of this functionality, we can implement a synchronous microcontroller that restricts access to an object to a specific user. With this example, we can see the flexibility of our framework to bring out a lot of extra functionality in cloud object stores. This simple microcontroller provides finer-grained access control than the default Swift bucket ACLs. In this case, after the microcontroller is associated with the onGet trigger of an object, only the specified user will be able to access the object. Any other user will get the "Error 401" message, as shown in Listing 5.5.


Listing 5.5 Example of an access control microcontroller

public class ac implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        if (ctx.request.user.equals("admin")) {
            ctx.request.forward();
        } else {
            ctx.request.cancel("Error 401");
        }
    }
}
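Listing 5.5 generalizes easily to richer conditions. As a hedged sketch of one of the advanced variants discussed next, the following microcontroller only grants access during a fixed time window; everything except the forward/cancel commands is plain Java.

import java.time.LocalTime;

public class TimeWindowAcl implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        int hour = LocalTime.now().getHour();
        if (hour >= 8 && hour < 18) {
            ctx.request.forward();            // inside the allowed window
        } else {
            ctx.request.cancel("Error 401");  // reject out-of-hours accesses
        }
    }
}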

Moreover, following the previous example, we can implement more advanced access control microcontrollers; for example, to only allow access to an object at specific hours of the day (as in the sketch above), or a given number of times. Also, a microcontroller can be used to manage Storlets, thus enabling implicit active storage tasks over the objects, for example, to run a compression Storlet on an object when a user performs a PUT request. Finally, a simple example of an asynchronous microcontroller could be one which stores, in an external database, the username of all the users that access an object, thus maintaining an access log for the object. In this case, the microcontroller would be associated with the onGet trigger of the object. As the request needs to be neither modified nor controlled in any form, the microcontroller can run its code asynchronously:

Listing 5.6 Example of a microcontroller that writes to an external database

public class log implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        api.db.store(ctx.request.user_name);
    }
}

Object metadata cache. One characteristic of microcontrollers is that they can modify objects' metadata. By default, Swift stores 3 copies of each object. However, Swift is an eventually consistent system. This means that at a given time, the replicas of an object may have different data and metadata. As microcontrollers are executed where the data is, modifying only the local copy of the metadata implies waiting for Swift to replicate the modified value to all the copies of the object. This is not suitable in terms of request parallelism, since there may be multiple requests to the same object at the same time, and even


on different storage nodes. Moreover, this is a problem if the modified value is required by parallel requests to the same object. To overcome this issue, the Microcontroller Engine uses an external Redis database to manage the metadata of the objects. Thus, when a microcontroller is executed, it initially loads the object metadata into Redis, and then makes all metadata modifications over it. If there are other requests to the same object, the other microcontrollers will use this previously-loaded metadata. When the microcontroller terminates, it flushes the metadata from Redis to all the object replicas. The metadata is kept in the cache following the LRU replacement algorithm. With this distributed cache, we guarantee that the object metadata added by the microcontrollers is always consistent across all replicas, even if there are multiple requests at the same time. It even makes it possible to perform atomic operations over the metadata fields.
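A sketch of how such atomic operations can be realized over the Redis-backed cache, assuming each object's metadata is mapped to a Redis hash (the key layout and the use of the Jedis client are our illustrative choices):

import redis.clients.jedis.Jedis;

public class MetadataCache {
    private final Jedis redis = new Jedis("localhost", 6379);

    // Atomically increments a metadata field; concurrent requests hitting
    // different replicas all observe the same counter, since HINCRBY is
    // atomic on the Redis server.
    public long increment(String objectPath, String field) {
        return redis.hincrBy("vertigo:md:" + objectPath, field, 1);
    }
}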

5.4.3 Storlets Engine

This is the third component of the Runtime System, and it is responsible for processing the actual data of the objects. We leveraged the OpenStack Storlets framework to implement this component. The Storlets framework extends Swift with the capability to run computations close to the data in a secure and isolated manner, making use of Docker [132] as the application container. With Storlets, a developer can write code, package and deploy it as a Swift object, and then explicitly invoke it on data objects as if the code were part of the Swift pipeline. Although Storlets have been the basis for implementing active storage tasks in Vertigo, we were forced to extend their functionality. As the current Storlets framework only supports one (explicit) Storlet per request, we modified it to enable the pipelining of Storlets in the same request. We also enabled implicit calls, metadata management, and orchestration, among other things. For instance, a simple composite function like the one in Listing 5.7 cannot be implemented with the current Storlets framework. However, it can be easily achieved with Vertigo and its original microcontroller abstraction.

Listing 5.7 Storlet composition example

(define (compose f1 f2)
  (lambda (x) (f1 (f2 x))))
(define grep-unzip (compose grep unzip))


This allows active storage to be abstracted into smaller units that can be treated as disposable pieces, which could be priced as a utility, among other benefits. Moreover, with our modified version, Storlets can also run in parallel; it is not necessary for one to finish before running another. For example, as we show in Section 5.7.2, with Vertigo it is possible to run a Storlet that transcodes a PDF document into plain text while, at the same time, another Storlet runs a GREP command over the resulting data flow to find, for example, some relevant phrases.

5.5 Extensibility

In Section 5.3.1 we described some relevant actions that a microcontroller can perform to manage objects. However, its functionality can be extended in diverse ways by a storage administrator. On the one hand, as we already discussed, the external gateway can be extended to accept any other cloud service, exposing its communication API to the microcontroller. On the other hand, the functionality of microcontrollers can be extended by adding new middlewares to the Swift pipeline. Middlewares enable new features in the storage service, such as encryption or object versioning (see Section 2.5.3). In this sense, microcontrollers can directly benefit from Swift middlewares for managing the data. Thus, it is possible to extend the storage functionality by adding new middlewares to the storage path, but with the capability to interact with microcontrollers. In this direction, any of the available Swift middlewares in the official catalog [37] can be managed by a microcontroller. However, to demonstrate their integrability with Vertigo, we created two middlewares which add new capabilities to the storage service. These middlewares are the object prefetching middleware [146], for prefetching objects in the Swift proxy servers, and the object linking middleware [147], for making soft links between Swift objects. They are activated and managed by request metadata headers, so microcontrollers can easily interact with them. Moreover, it is possible to limit the operation of these middlewares to work only from microcontrollers, and not from external user requests, which prevents non-expert users from misusing them.

Object prefetching. The main objective of prefetching objects is to speed up GET operations, thus improving the overall object request time. This process is done before the object is known to be needed, and it loads the object into a low-latency cache. Storage nodes commonly use HDDs to store the information;


however, their mechanical nature limits their overall performance. This is why caches store objects in memory or on SSDs. By default, OpenStack Swift does not provide a way to prefetch objects into a proxy server cache. This middleware enables object prefetching managed by microcontrollers. Thanks to our approach, it is possible to preload objects into a proxy server cache before users request them. Caching objects in the proxy server substantially improves the overall request time. For instance, a microcontroller can prefetch some related objects when another one is requested, or when the microcontroller detects that the object will be highly requested.

Object linking. In Swift, data replication and data placement are provided by the Rings. Each Ring is a storage policy that contains the storage nodes and the hard disks that it can use to store the information. In Swift, each bucket has an associated storage policy, assigned when the bucket is created. However, it is not possible to change it during the bucket's lifetime. The main drawback of this approach is that, if it is necessary to change the replication level or the data placement of an object, that object must be moved to another bucket. Doing this operation means that its original ID changes. That is, the object_id of acc/bucket_1/object would change to account/bucket_2/object. This is not a drawback if the tenant is aware of the change, but in cases where the data movement has to be transparent to tenants and users, it implies that they can no longer access the object, because the original ID is lost. This middleware provides a way to move objects between buckets, leaving a symbolic link at the original location of the object. This symbolic link is a zero-byte object that points to the new object location. Then, when a user accesses the symbolic link, the request is rewired to the original object. This simple feature enables microcontrollers to perform automatic object migration, based, for example, on the object's access history. We evaluated the middleware in Section 5.7, showing a minimal overhead when Vertigo gets the link object to obtain the location of the original object.

5.6 Applications

In this section, we show some of the applications that our framework can support. Like many other object stores, Swift, at the finest level, works at the object level, acting as a simple repository of data. One of the outstanding features of Vertigo is


that with simple programming abstractions, we can operate at the "content level" and, for instance, automate the management of objects according to their state or the content itself. Here, we provide some of these examples. All of them have been implemented over Swift and are extensively evaluated in the next section. Automated Deletion. A representative application of data management is self-deletion, where certain objects self-destruct after a certain period of time or number of GET operations. Such objects are meaningful for security applications. For instance, data protection and privacy laws in Europe1 demand the deletion of personal data after a given retention time. While components of the storage service may be put in place to periodically discover which objects are eligible for deletion, a more natural approach is to associate an automated deletion policy with each sensitive object and let it destroy itself2. This model offers several advantages over a centralized approach. For instance, it is more robust, since it does not depend on the effectiveness of scheduled discovery jobs or any logically centralized management of deletions. On the contrary, each object decides for itself when to self-destruct without interfering with the rest, making the system more robust against failures and attacks. Moreover, because an object may have dependencies on other objects, the microcontroller can also delete them when the main object meets the conditions to be self-destructed. As a basic example, we built a microcontroller with an onGet trigger to implement objects that delete themselves after being read a limited number of times. These limited-read objects could be used to keep personal data only for the number of runs that are absolutely necessary for their processing. Active Storage Orchestration. Another feature of our framework is that it can bring computation close to data as another active storage framework. Although this idea is not a new concept, we wanted to show it here as a property of Vertigo. This capability has been inherited from the OpenStack Storlet Engine. However, our framework permits the pipelining of consecutive computations, which is not possible with the current Storlet Engine. This capability is also important for data management, as it allows performing a sequence of transformations on an object to enforce a storage policy. For instance, a tenant may associate an onPut

1EU law applies to the processing of personal data as defined in article 2 of Directive 95/46/EC, namely to any information relating to an identified or identifiable natural person.
2We note that an object does not delete itself immediately, but rather stays available until all replicas are deleted due to eventual consistency.


microcontroller with a large document of type "application/xml" to detect the differences with a previous version of this document and afterwards use gzip to compress the resulting deltas. Moreover, these types of microcontrollers may be very useful to orchestrate active storage tasks which implement ETL (extract, transform and load) functions. With our microcontrollers, we can provide an intermediary transformation layer between a raw object storage service and (big) data analytics frameworks. As a simple example, we built a pipeline of two active storage tasks orchestrated by an onGet microcontroller. The input object is a PDF file that first goes through a transcoding task to convert it to a text file, which is then input to a grep task that outputs only the lines matching a query. grep has been utilized to micro-benchmark systems like Hadoop and Spark [148], which shows the potential of our framework for data transformations. Content Level Access Control. With Vertigo, it is very easy to implement advanced forms of access control. Typically, access control in object stores operates at the granularity of buckets, and hence, once an object is accessible to some party, they get the full content of the object. Swift also follows this "all or nothing" approach, where access to objects inside a bucket is enforced through access control lists. Such an access control mechanism may be insufficient in many cases, in particular when objects contain sensitive content. As an exercise to show another capability of our framework, we show how content level access control can be realized very easily in Swift thanks to our microcontroller abstraction. By "content level", we mean that Swift users will be able to access certain parts of an object based on their credentials. To give a concrete example, consider the publicly available Adult dataset from the UCI Machine Learning Repository [149], which contains about 48,000 rows of census information. Each row contains attributes like race, sex and marital-status, which, combined with explicit identifiers such as the SSN3 that identify the record holders, may leak sensitive information about a given individual. As a result, the records of this object should be accessed differently depending upon the user role. For instance, while a "police agent" should be able to access all fields and issue an SQL query: Q1: SELECT SSN, age, sex, education, marital-status, race, relationship, capital-gain, native-country FROM adult_data.csv

3As the dataset does not contain explicit identifiers, we added a random SSN to each row.


a "census analyst" could be restricted to run SQL queries on a smaller view: Q2 : SELECT SSN, age, education, capital-gain, native-country FROM adult_data.csv To implement this example, we deployed a microcontroller to an onGet trigger to enforce content level access control on the object adult_data.csv. We defined a simple access policy that depending on the use role, "FBI agent" or "census analyst", it allows to run queries on all the fieldsQ ( 1) or just onto smaller projection view (Q2). This simple access policy is stored as the parameters of the microcontroller, as explained in Section 5.4.1. When a GET request comes for the object adult_data.csv, the target object server first checks the Swift ACL. If the object is accessible by that user, the microcontroller then reads the content level policy, and orchestrates an Storlet (like the previous use case) which execute the SQL query over the data, only if the user has the appropriate role. The main point of this example is that it shows how our framework enables an object to change its behavior to suit the requirements of a given application, thanks to the set of microcontrollers that specify how the object behaves. Automated Object Migration. By using the object linking middleware de- scribed above, Vertigo also provides a platform for migrating objects in an automated way. One motivation to move the data to another replication levels could be to reduce the storage costs. For example, as previously explained, in Swift is not possible to change the replication level of a bucket, so when we desire to change the replication of an object or a group of objects, it is needed that tenants explicitly move all the objects to another bucket. One example of automated migration could be to move that data which is infrequently used, after few days, to a less-replication bucket (archiving). With the decentralized architecture of Vertigo, each object is managed itself by microcontrollers. In this case, we implemented a basic microcontroller that stores the last access timestamp of an object. As said, this metadata (the timestamp) is stored with the object itself, and will remain with it during its lifetime in the storage cluster. The microcontroller also keeps the access ratio as the state of the object. Moreover, with the microcontroller parameters, a tenant can set an specific access ratio threshold to the object, so that when the ratio decay below the threshold, it moves the object to a less-replication bucket, leaving a soft link in the original location. Normally, less-replication tiers have higher latencies, thus, the overhead added with our soft link implementation is not important. UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech


Automated Prefetching. Among other features, Vertigo also provides a platform for managing the storage hierarchy. One clear example of this is prefetching. Prefetching adds efficiency because it actively preloads objects into a cache, and as a result, it can minimize disk I/O operations. The distinguishing aspect of Vertigo is that prefetching can be done per object to suit specific application requirements, instead of system-wide. This flexibility is useful for applications that put different degrees of emphasis on performance and latency. For instance, an access to an HTML file may preload objects of other pages. As a basic example, we implemented a simple Web prefetching mechanism. When a user stores an HTML file for the first time, a microcontroller deployed with an onPut trigger instruments a Storlet to parse the object and identify the embedded objects. The result of this process is a list of the Swift objects which compose each specific HTML document. Such a list is stored in JSON format in the metadata of the object, enabling the onGet microcontroller to preload all the embedded objects into Memcached when the HTML file is fetched as a result of a cache miss. Caching is done at the proxy servers for fast retrieval and to lighten the read load on the object servers.
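A minimal sketch of the onPut microcontroller follows, again in the style of Listing 5.8; the api.storlets.run and ctx.object.metadata accessors are assumed names for illustration, not the exact Vertigo API:

import java.util.List;

public class PrefetchIndexer implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        // Run the HTML-parsing Storlet over the incoming object to obtain
        // the list of embedded resources (images, scripts, ...)
        List<String> embedded = api.storlets.run("HtmlParserStorlet", ctx.object);

        // Persist the list, JSON-encoded, in the object metadata so that the
        // onGet microcontroller can preload these objects into Memcached
        ctx.object.metadata.put("X-Embedded-Objects", toJson(embedded));

        // Let the PUT request continue along the write path
        ctx.request.forward();
    }

    private String toJson(List<String> names) {
        // Minimal JSON array encoding, sufficient for this sketch
        return "[\"" + String.join("\", \"", names) + "\"]";
    }
}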

5.7 Evaluation

We integrated the prototype of Vertigo in an OpenStack Swift deployment, and evaluated it in terms of flexibility, performance and overhead. In this sense, we ran micro-benchmarks and measured the resource usage of the Microcontroller Engine, in addition to the overheads associated with microcontrollers in both GET and PUT requests. Moreover, we implemented the microcontrollers necessary to test all the previous applications. We evaluated them by running each one with a specific setup described in Section 5.7.2. To do so, we measured the execution times and bandwidth improvements associated with the execution of the related microcontrollers.

5.7.1 Testbed Characteristics

Our experimental testbed consisted of a client host with 2 VCPUs and 4GB RAM. On the server side, we deployed Vertigo on an 8-machine rack with an OpenStack Swift (Ocata version) installation formed by 2 proxy nodes (Dell PowerEdge R320 with 32GB RAM) and 7 storage nodes (Dell PowerEdge R320 with 16GB RAM).


All of these machines, including the client, were connected via a GbE network. All the server-side machines ran Ubuntu Server 16.04. The client host ran the Ubuntu 16.04.1 CloudImage [150].

5.7.2 Workload

Here, we describe the specific workload configuration used for each application.

Automated Deletion. We utilized a random 100MB file, even though, in this case, the size of the file is actually not important because there is no data modification: the microcontroller only removes the object when some condition is met. This means that the overhead of this application will be the same independently of the size of the object. We established in the microcontroller metadata that the object is to be deleted after 100 reads (a minimal sketch of this microcontroller is given at the end of this section).

Active Storage Orchestration. For this experiment, we used a single PDF document of 100MB. More precisely, the onGet microcontroller instrumented the grep Storlet to return all lines of this document that start with "a" (regex: "ˆa"), after being converted into text with the transcoder Storlet. As discussed before, this is a very nice example of active storage orchestration in which a pipeline of two active storage tasks is built at runtime. Notice that the grep Storlet needs to have all the text before filtering it with the regular expression (due to the grep library that we have used in the Storlet), so our interception middleware waits until this Storlet starts to return the answer. For this application we implemented a single microcontroller deployed to the onGet trigger of the PDF object. Its functionality consists in orchestrating both Storlets.

Content Level Access Control. We used the dataset described in Section 5.6, but we extended it to 100MB. As explained before, the dataset content is restricted according to the type of user that makes a GET request over the file. To this aim, we used Swift user roles to return only specific fields. In this case the microcontroller, deployed to the onGet trigger of the dataset, reads from its metadata the fields allowed for the user role (e.g. age, education, capital-gain, native-country), and then orchestrates the SQL Storlet to filter the data.

Automated Object Migration. For this experiment we used a random document of 100MB, and we created two buckets with different storage policies (in Swift, the storage policy determines the replication level and the replica placement). The microcontroller is configured to calculate the access ratio of the


past 10 minutes, and we set a threshold of 1 (all of this information is stored in the microcontroller metadata). That is, when the access ratio of the object is less than 1 in the past 10 minutes, the object will be automatically moved to the other bucket, leaving a soft link in the original location.

Automated Prefetching. We took as an HTML file the Google HTML5 slide template [151], which consists of a single HTML document (index.html) and several embedded objects, including 10 image files of around 3.5MB and some JavaScript scripts. Since it is a presentation file, it is susceptible to simultaneous reading by multiple users (think of a university lecture), so we believed it to be a good example of caching with Web prefetching. Although more sophisticated prediction techniques for Web prefetching should be applied in practice, the present example is sufficient to show the inventive aspects of Vertigo. To generate the workload, we used the Apache JMeter [152] toolkit.
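As a reference for the deletion workload above, the following is a minimal sketch of the automated deletion microcontroller; the metadata accessors and api.swift.delete helper are assumed names, not the exact Vertigo API:

public class AutomatedDeletion implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        // Serve the object first; the bookkeeping below does not modify data
        ctx.request.forward();

        // Read and bump the per-object access counter kept as metadata
        int reads = Integer.parseInt(
                ctx.object.metadata.getOrDefault("X-Read-Count", "0")) + 1;
        ctx.object.metadata.put("X-Read-Count", Integer.toString(reads));

        // Tenant parameter: delete after N reads (100 in this experiment)
        int maxReads = Integer.parseInt(ctx.parameters.get("max-reads"));
        if (reads >= maxReads) {
            api.swift.delete(ctx.object.name);
        }
    }
}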

5.7.3 Application characteristics

Table 5.1 shows information about the microcontrollers necessary for running the previous applications. The first column gives the name of the microcontroller, and the second column gives the approximate number of lines of code (Java instructions) required to implement it. Moreover, the third column (type) gives its synchronicity: synchronous or asynchronous. Finally, the fourth column gives the size of each compiled microcontroller. From this table, it can be seen that our microcontrollers are very lightweight for the proposed applications, and that with few lines of code, it is possible to add new object management functionalities to a cloud object store.

Table 5.1 Application microcontrollers information

Microcontroller                  Lines of code   Type    Size
Automated Deletion               ≈ 5             sync    1.6K
Active Storage Orchestration     ≈ 3             sync    1.4K
Content Level Access Control     ≈ 8             sync    1.7K
Automated Object Migration       ≈ 24            async   2.2K
Resource Extractor               ≈ 4             sync    1.5K
Automated Prefetching            ≈ 6             async   1.7K


5.7.4 Results

All the tests were run 200 times and the results were averaged to create the plots in the next figures. Moreover, for those applications that can be compared to how they would work without a microcontroller, such as the content level access control, we evaluated the following two configurations: traditional storage or baseline configuration (TS), namely, Swift without Vertigo, and the microcontrolled version of Swift (MC).

Base Overhead. As stated in Section 5.4.2, the Microcontroller Engine is executed inside a Docker container; therefore, we evaluated the base overhead of running containers in a Swift storage node. It is critical to measure it because the microcontrollers are object wrappers which are executed where the data is. Thus, the framework can only make use of the resources of the storage node where the microcontroller is executed. Because our implementation launches one Engine for each different tenant of Swift, this experiment consists of launching a new container every 5 seconds, and measuring the impact on both memory and CPU. We used the ubuntu:16.04 [153] Docker image as the base image of the Microcontroller Engine.


Fig. 5.4 Memory and CPU consumption for running a new docker container with the Microcontroller Engine

Fig. 5.4 shows the results of this experiment. As we can see, launching a new container entails a peak of around 20% of CPU utilization, but almost null CPU utilization to keep it running. In contrast, the memory consumption increases linearly with the number of launched containers. We measured that each one consumes, at least, 30MB of memory. Although this is not a significant memory overhead, this result means that the number of different tenants that can execute


microcontrollers in a storage node, at a given instant, will be limited by the total RAM memory. In one of our storage nodes, with 16GB of RAM, we could launch around 450 Docker containers. As another base experiment, we measured the overhead added by launching a microcontroller with Vertigo. The overhead is the time needed by the interception middleware to launch a microcontroller inside the Microcontroller Engine (Docker) and get a response from it. For this experiment we used the microcontroller shown in Listing 5.8, whose only instruction returns the control to the interception middleware without doing any other action.

Listing 5.8 Example of a No-Operation Vertigo Microcontroller

public class base implements IMicrocontroller {
    public void invoke(Context ctx, Api api) {
        // Return control to the interception middleware without any action
        ctx.request.forward();
    }
}

We first evaluated the overhead added to GET requests. To do so, we launched 200 requests to different objects with this microcontroller deployed on the onGet trigger. The results, depicted in Fig. 5.5, show that almost all requests have around 4 milliseconds (ms) of overhead.

Fig. 5.5 Microcontroller base overhead on GET requests

In contrast, for measuring the overhead of PUT requests, we deployed the microcontroller to the onPut trigger of a bucket, and we made 200 object PUT requests to it. For this experiment we calculated the overhead both with and without the Memcached system. The results, depicted in Fig. 5.6a, show that almost all requests have 17ms of overhead without the metadata caching system. The extra overhead relative to the GET requests, as explained in Section 5.4.1, is due to the fact that the interception middleware needs to query, for each PUT request, the parent container in order to know the microcontrollers


to be applied (querying means making a HEAD request to the Swift container or pseudo-folder). In the second case, we used the Memcached system, which stores the metadata that the interception middleware needs to put the microcontrollers into execution. With Memcached, the interception middleware only needs to make one HEAD request to the Swift parent bucket; it then stores the information in memory, so that the rest of the requests that come in to Swift for the same bucket can use this cached information. As depicted in Fig. 5.6b, the median overhead with the caching system is around 4ms. The base overheads shown in Fig. 5.5 and Fig. 5.6 concern all applications whose microcontrollers interfere in the object request, that is, synchronous microcontrollers. Asynchronous microcontrollers do not produce overhead, as the request is not blocked.

Fig. 5.6 Microcontroller base overhead on PUT requests: (a) without Memcached; (b) with Memcached
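The caching strategy just described is a classical cache-aside lookup. The following Java sketch illustrates the pattern with the spymemcached client; the key layout and the headSwiftContainer() helper are assumptions for illustration, and the actual interception middleware is not implemented this way:

import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

public class TriggerMetadataCache {
    private final MemcachedClient cache;

    public TriggerMetadataCache(String host) throws Exception {
        this.cache = new MemcachedClient(new InetSocketAddress(host, 11211));
    }

    // Returns the microcontroller triggers defined on the parent bucket
    public String getTriggers(String account, String bucket) {
        String key = account + "/" + bucket + "/microcontrollers";
        String triggers = (String) cache.get(key);
        if (triggers == null) {
            // Miss: one HEAD request to the parent bucket, then cache it
            triggers = headSwiftContainer(account, bucket); // hypothetical
            cache.set(key, 300, triggers); // keep for 5 minutes
        }
        return triggers;
    }

    private String headSwiftContainer(String account, String bucket) {
        // Placeholder for the actual HEAD request to Swift
        return "{}";
    }
}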

Microcontroller resource consumption. Second, we evaluated the resource consumption of microcontrollers. As they are executed where the data is, and as they can only leverage the storage node resources, it is critical to measure the resource consumption to show the impact of microcontrollers on the system. For this experiment, we used the SSBench [139] toolkit, which is built to test Swift deployments. We also prepared a special cluster configuration for the experiments, leaving only one proxy server and one storage node, in such a way that all requests go to the same storage node. This configuration provides a better perspective of the resource usage of microcontrollers in a single node. To do so, we used the automated prefetching microcontroller, since it is the one


that spends the most time in its execution, and therefore consumes more resources for longer. We deployed this microcontroller to the onGet trigger of a bucket where SSBench would PUT and GET the objects for the benchmark. Fig. 5.7 shows the results. As we did not observe high memory usage in any of the experiments, we only depict the CPU usage in the plot.


Fig. 5.7 Microcontroller CPU usage for different workload bursts: (a) from 40 to 70 req./s; (b) from 80 to 150 req./s; (c) from 150 to 250 req./s

We first configured SSBench in such a way that it performs from 40 up to 70 requests per second. In total it made 10,000 GET requests against Swift for this experiment. Fig. 5.7a shows the results. As depicted, the Docker container for this configuration consumes between 3% and 5% of the total available CPU. Due to the low CPU consumption observed, we decided to increase the workload. For the second experiment, we set SSBench so that it performs from 80 up to 150 requests per second. In this case, the workload contains a total of 20,000 GET requests. As depicted in Fig. 5.7b, the launched microcontrollers consume between 7.5% and 10% of CPU, which is still a low CPU consumption. Trying to find a limit in the number of microcontroller executions, we set another workload that performs from 150 up to 250 requests per second. In this case, the workload


contains a total of 30,000 GET requests. As depicted in Fig. 5.7c, with this configuration the microcontrollers consume between 10% and 18% of CPU. These experiments clearly demonstrate how lightweight microcontrollers are.

Microcontroller execution time. Third, we evaluated the microcontrollers' execution time, that is, the whole time that the Microcontroller Engine spends executing all the microcontroller code. The plots depicted in Fig. 5.8 are the results of 200 executions of each microcontroller. Overall, the median execution time is very low; none of them reaches 1ms. The automated deletion microcontroller (Fig. 5.8a) spends 0.55ms on its execution, the active storage orchestration microcontroller (Fig. 5.8b) around 0.12ms, the content level access control one (Fig. 5.8c) 0.28ms, and the static resource extractor (Fig. 5.8e) 0.14ms. In contrast, the automated object migration (Fig. 5.8d) and automated prefetching (Fig. 5.8f) microcontrollers are closer to 1ms; both spend around 0.9ms. The reason for these longer execution times is that, in contrast to the previous microcontrollers, these last two are more complex, and they need to perform more actions to complete their execution. In any case, this experiment shows how lightweight the microcontrollers are for the proposed applications.

Fig. 5.8 Microcontroller execution times for the different applications: (a) automated deletion; (b) active storage orchestration; (c) content level access control; (d) automated object migration; (e) static resource extractor; (f) automated prefetching


Application execution time. Fig. 5.9 shows the execution time breakdown for the different applications. The plots also show a comparison between how the applications behave with a traditional storage (TS) system (without Vertigo) versus how they behave using microcontrollers (MC). We evaluated here the synchronous applications, in addition to the automated prefetching (asynchronous), because the automated object migration (asynchronous) does not provide any improvement compared to TS. We split the execution time into response time, transfer time and process time. The response time includes the time from processing the corresponding GET request until the first byte of the object is received by the client host. The transfer time stands for the time elapsed between the reception of the first and the last byte of the object at the client host. The process time includes all the time that the client host spends running some computation over the data.

Fig. 5.9 Request execution time breakdown for the different applications: (a) automated deletion; (b) content level access control; (c) automated prefetching; (d) active storage orchestration


Overall, the system overhead is very low in all applications. For automated deletion, the overhead is only 41ms. For the content level access control use case, the TS configuration needs to download the whole file before filtering it, whereas the MC configuration only requires transferring 30% of the data. In this case, the main source of overhead comes from the fact that the interception middleware remains waiting for the SQL Storlet to send the query results back to the client host, as depicted in Fig. 5.9b. Despite this, the MC configuration saves 1.6 seconds in comparison with the TS one. In the automated prefetching application, the fact that all the images are preloaded into the proxy server saves around 54ms in the whole execution time (Fig. 5.9c). Finally, Fig. 5.9d reports the execution time of the active storage orchestration use case. With no microcontrollers (TS), the client needs to download the complete PDF file, transcode it to text, and then apply the grep filter. By performing this operation on the server side with Vertigo, the client host saves around 1.86 seconds, as Swift only needs to transfer the result of the grep Storlet.

Timeline. The timeline refers to everything that happens when a request arrives at the cluster. We discuss the behavior of the requests both from the perspective of the client side (response and transfer) and from the server side (middleware, microcontroller and Storlets). In this case, we measured the automated deletion, active storage orchestration, content level access control, and automated prefetching applications. For the rest of the applications that use synchronous microcontrollers and do not manage any Storlet (as is the case of the automated object migration), the timeline would be similar to that of automated deletion. For the rest of the applications that use asynchronous microcontrollers, the timeline would be similar to that of automated prefetching. The results are depicted in Figs. 5.10 - 5.11.


Fig. 5.10 Execution timeline breakdown for the automated deletion and content-level access control applications


For automated deletion, the onGet microcontroller does not need to call any Storlet, as shown in Fig. 5.10a. Hence, when the microcontroller informs the interception middleware about this fact, Swift can start sending data to the user. In parallel, the microcontroller updates the number of accesses and checks whether it is necessary to delete the object. For the content level access control use case, the interception middleware must wait for the SQL Storlet to start, introducing some overhead into the system, as shown in Fig. 5.10b. Despite this, the microcontrolled version of this use case is still more efficient due to the data reduction that the Storlet provides, as discussed before.


Fig. 5.11 Execution timeline breakdown for automated prefetching and active storage orchestration applications

For the automated prefetching use case, the resulting timeline plot (on GET requests) is shown in Fig. 5.11a. This use case is particular because the microcontroller is only executed if the web page is not in the cache, adding 4ms of latency only to the first GET request. As shown in the figure, the microcontroller is executed asynchronously. Upon the first GET request, the microcontroller loads the static resources into the proxy server cache. In this way, when the browser interprets the HTML file, and also for the next 199 GET requests, all resources will already be preloaded into the cache, saving I/O bandwidth and time. In the timeline of the active storage orchestration use case, the microcontroller adds very low overhead; however, in this case, it is necessary to take into account the Storlets that it puts into execution. The Grep Storlet needs to have all the text before filtering it with the regular expression, and hence, the middleware remains idle until that Storlet starts to return the result of the processing, adding a little overhead when a user requests an object.

Bandwidth Usage. In those applications where a microcontroller orchestrates data reduction techniques through Storlets, the usage of Vertigo translates into


bandwidth savings and, consequently, into a reduction of the total request time. This is the case of the content level access control and active storage orchestration applications. Fig. 5.12 illustrates the total volume of data received by the client host, and the total request time in both configurations (Traditional Storage vs. Microcontrolled). As clearly shown in this figure, Vertigo can save significant bandwidth in both applications. For instance, for content level access control, it can avoid transferring 64.5 MB of content, depending on the user role, as a result of filtering out sensitive information. Such a property is very important, since it increases the scalability of a Swift deployment by transforming underutilized CPU cycles into bandwidth savings.


Fig. 5.12 Comparison of bandwidth usage and total request time (TS vs. MC) for (a) content level access control and (b) active storage orchestration

Automated object migration. The automated object migration application is an offline management task, and as a consequence it does not offer any kind of improvement beyond the automatic movement of the objects. However, it allows us to measure the overhead added by a Vertigo soft link.

Fig. 5.13 Soft link overhead

Fig. 5.13 shows that to get a zero-byte object and route the request to the original one, Swift spends around 23ms. As explained in Section 5.4.1, the information about the new location of the object is stored for a time in a cache, so for subsequent requests, the overhead is close to 0ms.


5.8 Summary

This chapter presents a new distributed architecture for cloud object storage that enables the self-management of data by tenants. Our architecture introduces the novel concept of object-based microcontrollers as a decentralized mechanism to transparently extend and intercept object stores. Our microcontrollers are executed in a sandboxed environment in the storage nodes, and they can intercept any lifecycle request on the desired objects. We demonstrated how microcontrollers increase the programmability and data management capabilities of object stores with concrete examples: prefetching, active storage orchestration, content-level access control, and automated deletion. Furthermore, we demonstrated that our interception framework is very lightweight, achieving low overheads. In this sense, object-based microcontrollers can become a useful programming abstraction for extending object storage systems.


Chapter 6

Scaling-out Policy Enforcement in the Data Plane

6.1 Introduction

In Chapters 4 and 5 we aimed to extend the programmability of cloud object stores by adding a multi-tenant management layer for administrators, and a data self-management layer for tenants, respectively. The resulting systems make use of a technique called active storage, which benefits from the data locality of the storage service by placing computation tasks close to where the data is stored. In particular, active storage has two major benefits: on the one hand, it leverages the underutilized compute resources of the storage infrastructure, and on the other hand, it considerably reduces the network traffic. In spite of this, when the compute requirements are high, the well-proven practice of collocating compute and storage has been shown to be unsuitable for cloud storage infrastructures such as object stores, principally for two main reasons. First, elasticity: by collocating compute and storage, computing resources can only scale out with the number of storage nodes. It is thus impossible to provision compute power independently of the storage, and hence fulfill the promise of elasticity. Second, resource contention: one of the main concerns of tenants is response time, in particular for inline real-time services. However, running multiple application functions at the same time in the storage nodes can lead to resource contention problems. And what is worse, it

The results presented in this chapter are published in [154]


can terribly affect the performance of other tenants who share the same storage infrastructure. While resource contention has been addressed to a certain extent by limiting the resources used by compute tasks [114, 113], it cannot be solved with storage resources alone.

6.1.1 Scope and Challenges

To overcome these limitations, in this chapter we argue that the new promise of serverless computing can represent an alternative yet powerful solution to these problems. A key insight is that a serverless execution model is ideal for rapidly scaling compute capabilities up and down without the need to manage any server. Moreover, it usually makes use of Linux containers to run tasks in an isolated manner, which at the same time simplifies scalability. In this sense, serverless computing fits perfectly in terms of portability, since our previous works are also built on Linux containers. Regarding cloud object stores, serverless computing [51, 52, 155, 156] is designed following the event-driven model, where functions are triggered in reaction to PUT requests. Once a file or files are uploaded to a bucket, the associated functions are executed. Functions run in disaggregated compute clusters, having to move the data from the storage cluster to process it, eventually storing back the results. Although functions can also be proactively invoked by using a gateway, and through a customizable API, these main characteristics represent a problem for the systems built in Chapters 4 and 5. Both Crystal and Vertigo are designed to intercept all the object lifecycle requests, and to process the data inline and in situ, thus running computation tasks close to the data. In this sense, our focus is on a data-driven serverless computing system integrated into the object storage infrastructure, and able to intercept all the object lifecycle requests, which is not possible with current systems. As the main challenge of this chapter, this ambitious goal requires the design of the first data-driven serverless computing platform for object stores.

Design of a data-driven serverless computing model: Although nowadays serverless computing frameworks allow processing data from cloud object stores, they do not allow for inline processing, thus not taking advantage of data locality. In this sense, characteristics such as data locality, programmability, simplicity, extensibility, elasticity and scalability should be the cornerstone of our system. This entails the design of a novel serverless framework capable of being


integrated within the storage infrastructure, in the storage path, but decoupled from the storage nodes. First, it requires the capability of synchronously processing the data, with no timeout limitations¹. Second, it requires a distributed monitoring system for 1) routing the requests to the appropriate computing node, and 2) provisioning the correct level of elasticity to the system. Other characteristics are shared with the previous systems built in this thesis. For example, the system must be easily usable by non-expert cloud tenants. Also, it must provide a simple API in order to create, modify and delete functions, and to attach them to and detach them from objects. And finally, as in Vertigo, since object stores usually store 3 copies of each object, the system must provide high consistency in order to guarantee correct operation when the same object is accessed at the same time through different copies, which is not trivial to implement in an object storage system.

6.1.2 Contributions

In this chapter, we present Zion, a novel data-driven serverless computing framework for cloud object stores. We aim to solve the scalability and resource contention problems of active storage, while benefiting from data locality to reduce latency by placing computations close to the data. Our model is data-driven and not event-driven, because tasks are located in the data pipeline, and intercept the data flows that arrive at and return from the object store. Our computing framework makes the previous systems built in this thesis less intrusive in the storage infrastructure, while keeping data locality. In practice, there are many jobs that require functions to be part of the data pipeline to provide optimal response times, for example, those that require synchronous interaction between tenants and cloud object stores, as is the case of Crystal and Vertigo. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, content verification, and access control, which are better suited to a data-driven serverless computing model. Most of these use cases are hard or impossible to implement in current serverless platforms due to operational requirements, for instance, because they need to transparently intercept incoming requests to the object store, or because they require interactive communications through the standard object store’s API.

¹ Among other limits, serverless computing platforms have function timeouts. Usually, they allow configurable execution times up to 10 minutes.


In contrast, our model is a lightweight solution that allows tenants to create small, stateless functions that intercept data flows in a scalable manner, without the need to manage a server or a runtime environment. We coded a prototype implementation of Zion for OpenStack Swift, where the serverless compute layer lies between the proxy and storage nodes. This has allowed us to maximize per-worker write and read performance to the storage nodes. Finally, we demonstrate the feasibility of our model, and we show how our solution scales up with minimum overhead and no resource contention through different practical use cases. With free containers available, Zion’s overhead is around 9ms, which is well amortized by the functions’ execution time, typically in the seconds (or minutes) range for the possible data-driven use cases.

6.2 Design Overview

Zion has been designed for scalable, data-driven execution of small functions in object stores, and thus all components revolve around the object storage service. Zion’s design therefore assumes that the underlying object store follows the “classical” architecture of load balancers distributing the workload evenly across the gateways or proxies, and a large pool of storage nodes, which are in charge of storing the objects on the hard disks. As a first overview, Fig. 6.1 shows a diagram of Zion’s architecture. To not interfere with plain PUT/GET requests or storage management tasks, such as replication and failover, Zion lays out a disaggregated computing layer between the storage and gateway nodes for executing the functions. Zion also integrates a metadata service and interception software running in the storage gateways, which inspects incoming requests and reroutes them to the compute tier if necessary. Moreover, as we can see, this model is even less intrusive than Crystal or Vertigo, as it does not require integrating or modifying any component in the storage nodes.

6.2.1 Interception Software and Metadata Service

The first component of the system is the interception layer, which is integrated in the storage gateways (depicted as a router in Fig. 6.1). The major aim of this software is to manage the deployment of functions, the association of triggers to these functions, and their execution when a request matches a trigger.



Fig. 6.1 High-level architecture overview of Zion integrated alongside a common cloud object storage architecture

A trigger is a combination of a URL, with prefix and suffix filters (similar to AWS Lambda for Amazon S3), and an HTTP method (GET, PUT, POST, HEAD, or DELETE). This interception mechanism is enough for many use cases. By specifying the suffix .txt as a filter, for instance, Zion can run a compression function on all GET requests for text objects. The list of available triggers is the following.

• onPut, onPost, onHead, and onDelete, which cause the execution of the associated function whenever a PUT, POST, HEAD, or DELETE request is received, respectively. As an example, the onPut trigger can be useful to process an object before it is written to the object store, and even to discard its storage, as processing is part of the write path and not asynchronous as in AWS Lambda.

• onBeforeGet: a function associated with this trigger is run when a user performs a GET request to the storage service. This is one of the two cases associated with GET requests. In this case, the function is executed before forwarding the request to the storage node, and hence the function cannot process the targeted data object. However, this trigger can be useful in


many use cases like HTTP header processing, URL rewriting, temporary redirects, etc. (a minimal sketch of such a function follows this list).

• onAfterGet, which causes any function associated with it to run on an incoming GET request to the storage service. In this case, the function intercepts the storage node’s response, and therefore it can dynamically manipulate the object’s content.
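To illustrate the onBeforeGet trigger, the following is a minimal sketch of a rewiring function in the style of the function listings in Section 6.3; the ctx.request.headers, ctx.request.rewire, and ctx.request.forward accessors are assumed names, not the exact Zion API:

public class MobileRedirect implements IFunction {
    public void invoke(Context ctx, API api) {
        // Rewire mobile clients to a lightweight variant of the object
        String agent = ctx.request.headers.get("User-Agent");
        if (agent != null && agent.contains("Mobile")) {
            ctx.request.rewire("web/index-mobile.html");
        } else {
            // Otherwise, continue to the storage node as usual
            ctx.request.forward();
        }
    }
}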

The metadata service is the second architectural component of Zion. Metadata for triggers is pre-processed and indexed efficiently in the metadata service to guarantee a small, O(1) request matching overhead, on the order of µsecs. If there is any match, this layer is also responsible for redirecting the input flow, as the object is read, to an available worker. Non-intercepted data flows rapidly follow the default storage path and bypass the serverless compute layer as usual. It is worth noting here that the fact that a serverless execution model can quickly spawn new workers is what makes it possible to intercept and process data flows “on-the-fly” without collocation. With VMs, interception would be significantly more difficult to achieve, as VMs can take minutes to start.

6.2.2 Computation Layer

The third architectural component is the serverless computation layer. The computation layer is a pool of containers which puts the functions into execution.

Functions

A function is the unit of computation code which can process the data. In our model, functions are data-driven, that is, they are focused on intercepting the data flow and processing the data inline, as the object comes in or goes out of the storage cluster. Because of this, in our model, the response time of the functions (time to first byte) must be fast so as not to affect the user experience. In addition to processing the data stream, we integrated into functions the flexibility of microcontrollers. Thus, functions can perform most of the same actions, such as storing information in a persistent way (e.g., an access counter, the timestamp of the last access, etc.). Concretely, a function can take one or all of the following actions after intercepting a lifecycle request of an object:

• It can add, update, or delete the metadata of the current object.


• It can generate new requests to the object storage service. This includes: GET an object, for example, a dependency needed to process the main stream; PUT an object, for example, a subset of the main object as it is processed; DELETE an object; and POST metadata to another object.

• It can generate new requests to other services (e.g., RabbitMQ, MongoDB), for example, to store some relevant information extracted from the main data object stream.

• It can update the request/response headers of the request.

• It can cancel the request, or rewire the request to another object.

Functions may make use of third-party library dependencies in order to achieve a specific behavior. Once developed, functions should be packed together with these dependencies within a TAR file. Therefore, in conjunction, functions and their dependencies must be lightweight, in such a way as to minimize the time needed to transfer the function package from the storage system to the compute node. Once packed, a function is uploaded as a regular object to the object store. An interesting feature of Zion is that it allows the user to set up the CPU and memory requirements, and a timeout value, for every function. The timeout is the amount of time the system waits to receive the first byte of the function’s output. If the function times out, the request is automatically canceled. This information allows functions to be configured differently in order to better manage certain jobs. This information is not mandatory, and Zion has the last word, assigning default values when necessary. Further, Zion’s functions accept parameters. Parameters can be explicit or implicit. Explicit parameters are provided as headers in the request. Implicit parameters are default parameters that a user specifies ahead of time at the moment of associating a trigger with a function. Explicit parameters take precedence over implicit ones in the case of collision. As an example, consider a function to resize an image, with the image resolution as a parameter. If no argument is passed in the request, the implicit image resolution is taken from the function’s metadata, or an error is thrown accordingly. The same function can have different parameter sets for different triggers. A final important remark is that two different functions cannot intercept the same data flow in Zion, unless they do it in a pipeline fashion, one after another, which raises no consistency issues.
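The following sketch shows how a resize-like function could resolve its resolution parameter, giving precedence to the explicit request header over the implicit default; the header name and accessors are illustrative assumptions, not the exact Zion API:

public class ImageResizer implements IFunction {
    public void invoke(Context ctx, API api) {
        // Explicit parameter: provided as a header in the request
        String resolution = ctx.request.headers.get("X-Function-Param-Resolution");
        if (resolution == null) {
            // Implicit parameter: default set when the trigger was associated
            resolution = ctx.parameters.get("resolution");
        }
        if (resolution == null) {
            // Neither explicit nor implicit: fail the request
            ctx.request.cancel("no resolution parameter provided");
            return;
        }
        // ... resize ctx.object.in_stream to `resolution` and write the
        // result to ctx.object.out_stream ...
    }
}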


Compute Node

In a compute node, functions are executed inside isolated environments, namely Linux containers.

Containers. Each function has its own Linux container in order not to interfere with the other cloud functions. A container with a function running inside is called a worker. A function may have zero, one, or more workers running at the same time, depending on the workload. In the traditional function model, starting a new worker takes around 6-7 seconds [157]. One requirement of our model is that functions have to start running as soon as possible; for this reason we leverage a ready-to-use pool of containers. In any case, our experiments verify that starting a new container takes around 0.9 seconds, which is practical enough for many synchronous and near-real-time applications. After a certain period of time, idle workers are stopped and the corresponding containers are recycled in order to better optimize resource consumption.

Zion Service. This service manages the requests forwarded from the interception layer (Section 6.2.1). When a new request arrives, it takes a container from the pool, installs the libraries and the function code, and sends the execution command to the runtime. As functions may have specific CPU and memory requirements, this service is also in charge of establishing the resource limits of containers according to their configuration parameters. It also load-balances the workload across already started workers, and starts new workers if necessary, thus providing the elasticity of the system.

Runtime. The runtime is integrated into the containers. It accepts functions for a specific programming language and puts them into execution. Our prototype of Zion currently supports Java, but other languages such as Python would be easy to integrate.

6.3 Prototype Implementation

We implemented a prototype of our serverless computing framework [158] on top of OpenStack Swift [13]. We have decided to make Zion as unintrusive as possible. As a result, the only modification to the base Swift architecture is a Swift middleware which intercepts the requests on the proxy server side. The other elements are decoupled from the main Swift architecture, which makes Zion easier to deploy.


6.3.1 Interception Software and Metadata Service

In Swift, the simplest way to intercept requests is to create a Swift middleware. We built a new Swift interception middleware for Zion to accomplish two primary tasks: 1. the management of function code and library deployment, including the triggers that cause the functions to be run; and 2. the redirection of requests and responses through the computation layer when they need to be processed by a function. Upon the assignment of a function, a trigger header of the type onPut, onBeforeGet, onAfterGet, onPost, onHead, or onDelete must be appointed to tell the framework which lifecycle events to intercept. Zion uses Redis [137], a fast in-memory key-value store, as a distributed metadata service to maintain this information. To optimize request matching, Redis is collocated with the proxy nodes. Recall that, as part of the metadata, Zion also includes configuration information for the functions, such as the CPU and memory requirements and cancellation timeouts, as discussed in the preceding section.

Function assignment granularity. Although OpenStack Swift stores the objects in a flat name space, it is possible to simulate a hierarchical structure within a single bucket by adding forward slash characters (/) to the object name. Each folder between the bucket and object names is called a pseudo-folder in Swift. For example, in the object name images/zion.jpg, the prefix images/ is the pseudo-folder. In Zion, we take advantage of this logical hierarchy to enable function assignment at per-object, pseudo-folder and bucket levels. Moreover, we also enable mappings at the suffix level, for example, to run functions on all objects whose name ends with .jpg.

Function execution. After the function assignment process, when a request arrives for an object, the Zion middleware, with the help of the metadata service, checks if that request triggers the execution of a function. For example, if the request is a PUT, the middleware will launch an onPut trigger for the request. If there are functions that respond to this trigger, the middleware will immediately forward the request to an available container. Otherwise, the request will follow the default read/write path.

Parallel processing with functions. In all object stores, there is a limit on the maximum allowed object size. In Swift, this limit is 5GB. To bypass this limitation, Swift uses a special type of object called a Static Large Object [159] (SLO). SLOs are objects split into different parts. The user must upload these parts,


together with a special object called the manifest, which contains the location of the object parts. Getting an object is totally transparent to the users, who make a request to the manifest, and the whole object is returned as if it were stored as a single object in Swift. This Swift SLO approach enables Zion to associate a different function with the manifest and with the parts. With this mechanism, it is possible to create a highly parallel and distributed computational substrate by executing a function on each dataset part, and finally running a reduction function on the filtered parts. In Section 6.4, we present a Hadoop-like use case for object storage implemented with our functions. This is aligned with the recent trend of large-scale data analytics with serverless computing [55, 54].

6.3.2 Compute Layer

In our implementation, the computation layer is composed of a pool of compute nodes. They are located between the proxies and the storage nodes, as shown in Fig. 6.1. It should be noted that the compute nodes assigned to Zion are not shared with the OpenStack compute project, Nova [160]; they are exclusively managed by Zion. A general overview of how Zion operates is the following. Each function is run inside a separate container, in what is called a “worker”. At the time of this writing, Zion’s runtime is Java-based, and consequently, every function is run in a Java Virtual Machine (JVM). At a very high level, a worker can be viewed as a container running a specific function. Every new invocation of the function is handled by a new thread inside the JVM.

Functions

As Zion’s runtime is based on Java, Zion’s functions are also written in Java. This means that functions are plain Java classes. They have an entry point called invoke, which contains the main code of the function and is called upon every new request. The invoke method has two arguments, of types Context and API, which are made available automatically by Zion’s runtime on every new request. The Context encapsulates access to the request headers, the object’s metadata, and the object’s data stream. It also includes a logger facility for logging the execution of the function. The API enables access to external services.


By default, the isolation level of containers precludes functions from having any access to external resources. However, through the API, a function can access some external services, such as RabbitMQ and Redis, in addition to Swift itself. The API class is extensible and has been framed to facilitate interaction with future external services. Listing 6.1 shows the implementation code of a simple function that iterates over the data with no further processing.

Listing 6.1 A function that echoes the data passed to it.

public class Handler implements IFunction {
    public void invoke(Context ctx, API api) {
        // Echo: copy the object's input stream to its output stream
        byte[] data;
        while ((data = ctx.object.in_stream.read()) != null)
            ctx.object.out_stream.write(data);
    }
}

Compute Node

The complete architecture for compute nodes is depicted in Fig. 6.2, and it consists of Docker containers, the Zion Service, and the runtime itself.


Fig. 6.2 High-level architecture overview of a Zion Compute Node

Containers. We use Docker containers to sandbox functions, so they have neither direct network access, nor system execution and thread creation capabilities, nor access to the local file system. Only a minimal ephemeral hard disk space is


provided under the /tmp directory for use by functions. We used the Ubuntu 16.04 Docker image [153] for our implementation.

Zion Service. The Zion service is a Web Server Gateway Interface (WSGI) server whose mission is to spawn workers for processing the requests forwarded by the interception middleware. Consequently, it must be up and running in all compute nodes because, as described in Section 6.2.2, it is responsible for managing containers and executing functions. When a function is running, this service waits until it receives a response from it. The waiting time is limited by the system’s default timeout or by a user-defined timeout specified at deployment time.

Runtime. The runtime is integrated within the Docker containers. With the Java runtime installed on them, this allows new containers to be rapidly spawned and functions executed. As stated in the previous section, each different function is executed in a separate container. If the compute node does not have any worker available for that function, the Zion service takes a Docker container from the pool of containers, executes it, and finally loads the function code and libraries into the runtime. Subsequent requests will then be able to reuse the same worker when processing the same “hot” function. One feature of our serverless computing framework is that it is possible to modify the object’s metadata ({key:value}). As we discussed in Section 5.4.2, Swift is an eventually consistent system that stores 3 copies of each object. This means that at a given time, the replicas of an object may have different data and metadata. As functions are stateless, that is, there is no relation between different function invocations even if they occur in the same worker, modifying the local copy of an object’s metadata implies waiting for Swift to update all the replicas. This behavior is clearly undesirable in terms of function parallelism. In this sense, as with microcontrollers, each function worker is directly attached to a distributed metadata caching system based on Redis. It is an internal feature, totally transparent to the users who develop functions. When a function is executed, the runtime initially loads the object’s metadata into the cache, and then performs all the modifications over it. If there are other requests to the same object, the other functions will use this previously-loaded metadata. When the function terminates, it offloads the metadata from the cache to all the object’s replicas. With this distributed cache, we ensure that the object metadata touched by a function is always consistent in all replicas, even if there are multiple requests at the same time, and even atomically.
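The following sketch illustrates this load-modify-offload pattern with the Jedis client; the key layout and the surrounding write-back step are assumptions for illustration, not Zion’s actual code:

import java.util.Map;
import redis.clients.jedis.Jedis;

public class MetadataCache {
    private final Jedis redis = new Jedis("localhost", 6379);

    // Called when a function starts: seed the cache from the object replica
    public void load(String object, Map<String, String> metadata) {
        String key = "md:" + object;
        if (!redis.exists(key)) {
            redis.hmset(key, metadata);
        }
    }

    // All concurrent invocations mutate the single cached copy
    public void update(String object, String field, String value) {
        redis.hset("md:" + object, field, value);
    }

    // Called when the function terminates: the runtime writes this map
    // back to all the object's replicas
    public Map<String, String> offload(String object) {
        return redis.hgetAll("md:" + object);
    }
}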


6.4 Applications

Here, we show some of the applications that Zion can support, and the ease with which these applications can be built on top of Swift. As expected, the described use cases are not arbitrary; they have been chosen to show the potential of our data-driven model, and in particular, for synchronous interactions with the object store. All of them have been implemented over Swift and evaluated in the next section.

Content-level Access Control. With Zion, it is extremely easy to implement sophisticated forms of access control based upon the contents of the objects themselves. As we already discussed in Section 5.6, access control in object stores such as Swift typically operates at the granularity of buckets, and hence, once an object is accessible to some party, they get the full content of the object. In this chapter, we show how content level access control can also be realized very easily in Swift thanks to our function abstraction, in a similar but cleaner fashion than [44]. By “content level”, we mean that Swift users will be able to access certain parts of an object based on their credentials. This example is ideal for showing the limitations of AWS Lambda. First, it is clear that access control requires the interception of GET requests, which can only be done indirectly with the help of the API Gateway service. Second, in AWS Lambda the response body payload is limited to a certain size (currently 6 MB), so it is not possible to provide large results from big datasets. Third, the processing of the object’s content to satisfy the access control policy must be done inline, as the user needs a timely notification about the status of her request, which again cannot be realized with AWS Lambda. Finally, Zion’s design enables access control to leverage data locality and improve response time, as functions go through the content of the object as it is read from the storage node. To give a concrete example, consider the publicly available Adult dataset from the UCI Machine Learning Repository [161], described in Section 5.6. As stated in the previous chapter, it contains about 48,000 rows of census information. Each row contains attributes like race, sex and marital-status, which, combined with explicit identifiers such as the SSN that identify the record holders, may leak sensitive information about a given individual. As a result, the records of this object should be accessed differently depending upon the user role. For instance, while a “police agent” should be able to access all fields: SSN,


age, education, marital-status, race, sex, relationship, capital-gain and native-country, a "census analyst" could be restricted to a smaller view: age, education, capital-gain, native-country. To implement this example, we have linked a function to the onAfterGet object trigger to enforce content-level access control on the object adult_data.csv. We have defined a simple access policy that, depending on the user role, "police agent" or "census analyst", allows retrieving either all the fields or just a smaller projection view. This simple access policy has been stored as an implicit parameter of the function, that is, in a JSON-formatted string uploaded when we linked the object with the function, as explained in Section 6.2.2. When a GET request comes for the object adult_data.csv, the proxy first checks the Swift ACL. If the object is accessible by that user, the function then reads the content-level policy and filters the data, only if the user has the appropriate role.
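As an illustration, a function along these lines could do the filtering. This is a minimal sketch: the entry point, stream arguments and role-to-fields policy format are our assumptions for the example, not Zion's literal API.

import java.io.*;
import java.util.*;

// Sketch of a content-level access control function. The policy maps a
// user role to the CSV columns that role may see, e.g.
// {"census analyst": ["age", "education", "capital-gain", "native-country"]}.
public class AccessControlFunction {

    public void onAfterGet(BufferedReader objectIn, Writer objectOut,
                           String userRole, Map<String, List<String>> policy)
            throws IOException {
        List<String> allowed = policy.get(userRole);
        String[] header = objectIn.readLine().split(",");

        // Indices of the columns this role is allowed to read.
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < header.length; i++) {
            if (allowed.contains(header[i].trim())) keep.add(i);
        }

        writeProjection(objectOut, header, keep);
        String row;
        while ((row = objectIn.readLine()) != null) {   // stream, never buffer the object
            writeProjection(objectOut, row.split(","), keep);
        }
    }

    private void writeProjection(Writer out, String[] fields, List<Integer> keep)
            throws IOException {
        StringJoiner line = new StringJoiner(",");
        for (int i : keep) line.add(fields[i].trim());
        out.write(line.toString() + "\n");
    }
}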

Compression. A typical data reduction task is the compression of objects. In general, any dynamic content filtering that involves inline transformations of the content is ideally suited for our data-driven model. A data-driven task could compress/decompress a file dynamically "on-the-fly". As Zion acts directly on the data pipeline, that is, as the object is read from or written to the object store, the result will be either a latency improvement or a space reduction, depending upon whether Zion intercepts a GET request or a PUT request.

Here we will merely consider the case of compressing incoming objects "on-the-fly", which means that upon a GET request by the user, the target object will have to undergo decompression. The common approach for doing so in Swift is to implement an ad-hoc compression middleware. However, this approach is problematic. First, by running compression on the storage nodes, compression is repeated as many times as there are replicas. Second, by offloading it to the proxies, we can disrupt Swift's normal operation under heavy load, since the proxies are responsible for looking up the location of objects in the rings and routing the requests accordingly. With Zion, we can easily write a compression function and execute it between the proxies and the storage nodes in a scalable way, without worrying about resources or repeating the task many times. In addition to the possibility of intercepting GET requests, Zion has another advantage over AWS Lambda: it is capable of running compression over objects whose total compression time exceeds the five-minute limit. This is because Zion cancels a request only if the time


for the receipt of the first byte from the function exceeds a timeout value (see Section 6.2.2 for details). This model is ideal for operations such as compression that can be run as data is read or written, which have been the focus of active storage for a long time. To implement the compression function, we utilized gzip, and then we mapped it to a bucket with the onPut trigger. As such, all writes to this bucket will be compressed. We did the reverse process for the onAfterGet trigger, so that when a user requests an object of this bucket, she gets the original uncompressed version. To do so, we made use of an implicit parameter to tell the function what to do: either to compress or decompress. That is, for the onPut trigger, the implicit parameter was set to "compression"; for the onAfterGet trigger, it was set to "decompression".
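A compression function of this kind is only a few lines. The sketch below shows the streaming structure with standard java.util.zip; the trigger and parameter plumbing around it is assumed rather than taken from Zion's API.

import java.io.*;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch of the compression/decompression function. The implicit
// parameter selects the direction, exactly as described in the text.
public class CompressionFunction {

    public void run(InputStream objectIn, OutputStream objectOut,
                    String implicitParam) throws IOException {
        if ("compression".equals(implicitParam)) {        // onPut trigger
            try (GZIPOutputStream gz = new GZIPOutputStream(objectOut)) {
                copy(objectIn, gz);
            }
        } else {                                          // onAfterGet trigger
            try (GZIPInputStream gz = new GZIPInputStream(objectIn)) {
                copy(gz, objectOut);
            }
        }
    }

    // Stream in fixed-size chunks so arbitrarily large objects can be
    // (de)compressed on-the-fly without buffering them in memory.
    private void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
    }
}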

Image processing. One of the archetypal use cases of serverless computing is image resizing, so we found it interesting to evaluate it here. It is ideal to show the potential of asynchronous, event-based functions such as AWS Lambdas, and it is also very useful for tenants that use the object store as a back-end for web image storage. In this case, when the images are uploaded, a function is triggered that resizes and creates all the needed images, for example, for the different possible devices that can request the main web page (e.g., smartphone, PC, tablet, etc.). We did the same with Zion, and coded a function that resizes an image to an arbitrary percentage. To intercept the PUT requests, we linked it to an onPut trigger and specified .jpg as the suffix of the object name. As Zion allows creating new objects as part of the function's output, the function stores the original object and its resized version(s). Because of interception, one interesting feature of Zion is that it does not require fully storing the image before the generation of its resized version(s), since resizing is done "on-the-fly" prior to storage, saving storage bandwidth. Although perhaps of little concern at first glance, this property is very interesting for concurrently processing a vast collection of images as in [54], because the I/O savings add up rapidly for a large number of objects.
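A resizer of this kind can be written with plain Java 2D. The sketch below shows the core of such a function; the onPut entry point, stream arguments and percentage parameter are our assumptions for the example.

import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.*;
import javax.imageio.ImageIO;

// Sketch of an onPut image resizer. The trigger wiring and the implicit
// "percentage" parameter are assumptions; the resizing itself is standard.
public class ImageResizeFunction {

    public void onPut(InputStream objectIn, OutputStream resizedOut,
                      double percentage) throws IOException {
        BufferedImage original = ImageIO.read(objectIn);
        int w = (int) (original.getWidth() * percentage / 100.0);
        int h = (int) (original.getHeight() * percentage / 100.0);

        BufferedImage resized = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = resized.createGraphics();
        g.drawImage(original, 0, 0, w, h, null);   // scale to the target size
        g.dispose();

        // In Zion the original object is stored as well; here we only emit
        // the resized version as a new object.
        ImageIO.write(resized, "jpg", resizedOut);
    }
}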

Signature verification. To demonstrate the versatility of Zion, we proposed the signature verification use case. Online content checking is again a perfect use case for data-driven functions, since it requires an immediate response to the client. The main objective of this function is to verify that the objects (documents) that


are uploaded are signed by the user and that the signature is valid, that is, that the documents are authentic. With Zion it is possible to do this task in near-real time, and notify the user instantly in case of rejection. Also, in case of rejection, we prevent the archival of an inauthentic document, thereby saving storage space. For the same reasons as above, it is readily evident that this use case cannot be implemented with AWS Lambdas.

The scenario is as follows: the user obtains the public key from her RSA key pair and uploads it to a public bucket in the object storage system. Then, the user signs the document with the private key and uploads it, with the signature in a special header, to the object storage system, which executes the function and verifies the document. Note that in this case, the function uses an explicit parameter (the signature) as described above. To do so, we coded a signature verification function. The function is mapped to a bucket's onPut trigger; therefore, all object PUTs to this bucket will be verified. The function first gets the public key from the object store based on the user who is uploading the object. Then, it loads the document content and verifies the signature. If it is valid, the document is stored; otherwise the document is rejected, sending an error message to the user.
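The verification itself uses the standard java.security API, as in the following sketch. Fetching the public key and the signature header from the request context is assumed plumbing.

import java.io.InputStream;
import java.security.PublicKey;
import java.security.Signature;

// Sketch of the signature verification function.
public class SignatureVerifyFunction {

    public boolean onPut(InputStream objectIn, byte[] signatureBytes,
                         PublicKey userPublicKey) throws Exception {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(userPublicKey);

        // Feed the document to the verifier as it streams in.
        byte[] buf = new byte[8192];
        int n;
        while ((n = objectIn.read(buf)) != -1) {
            verifier.update(buf, 0, n);
        }

        // true: store the object; false: reject the PUT with an error.
        return verifier.verify(signatureBytes);
    }
}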

Interactive queries and result aggregation. Finally, interactive queries are a use case that perfectly matches our data-driven model. When we want to perform fast data queries over existing data repositories, our data-driven model avoids moving the entire dataset to a computing cluster. For example, object storage services are commonly used to archive data like log files. Businesses that want to extract fast insights from these data repositories using big data analytics tools must choose between two strategies: 1. moving the data to the computation cluster to take advantage of data locality; or 2. using a connector to allow data analytics tools to read data directly from the remote object storage. With Zion, we offer a third strategy: compute in the storage cluster using functions that filter data and aggregate results inline. The example we present is a top-k query on the access logs of UbuntuOne [162], a personal cloud service. We want to obtain a list of the most active users and the total number of requests each user sent. The implementation of this query in Zion has two functions. The first one receives a fragment of the log file and filters request logs, maintaining a counter for each user, thus exploiting parallelism. The other function receives as input the partial counters of the


various instances of the first function, and it performs the aggregation and sorting needed to produce the desired result. To implement this MapReduce-like example, we tapped into Swift's SLO-based data partitioning (see Section 6.3.1).
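The two-stage structure can be sketched as follows. The log-parsing helper is a placeholder (the real format is that of the UbuntuOne traces), and the stage boundaries are assumptions; only the per-chunk counting plus merge/sort structure follows the text.

import java.io.BufferedReader;
import java.util.*;
import java.util.stream.Collectors;

// Sketch of the two-stage top-k query.
public class TopKQuerySketch {

    // First stage: runs once per SLO chunk, in parallel.
    public Map<String, Long> filterAndCount(BufferedReader chunk) throws Exception {
        Map<String, Long> counters = new HashMap<>();
        String line;
        while ((line = chunk.readLine()) != null) {
            String userId = extractUserId(line);   // assumed log-parsing helper
            if (userId != null) counters.merge(userId, 1L, Long::sum);
        }
        return counters;
    }

    // Second stage: merges the partial counters and keeps the top k users.
    public List<Map.Entry<String, Long>> aggregate(List<Map<String, Long>> partials, int k) {
        Map<String, Long> totals = new HashMap<>();
        for (Map<String, Long> p : partials) {
            p.forEach((user, c) -> totals.merge(user, c, Long::sum));
        }
        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    private String extractUserId(String logLine) {
        // Placeholder: real parsing depends on the UbuntuOne log format.
        String[] parts = logLine.split(",");
        return parts.length > 1 ? parts[1] : null;
    }
}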

6.5 Evaluation

To confirm our suspicions, we first studied how Swift behaves when some computation tasks are collocated in the same storage node. Then, we integrated the prototype of Zion in an OpenStack Swift deployment, and we ran micro-benchmarks to measure the behavior and overheads of our functions. We did so by running standalone experiments and the applications discussed in Section 6.4.

6.5.1 Testbed Characteristics

Our experimental testbed consisted of a host (or client) with 2 VCPUs and 4GB RAM. On the server side, we deployed Zion in an 8-machine rack with an OpenStack Swift (Ocata version) installation formed by 2 proxy nodes (Dell PowerEdge R320, 32GB RAM) and 7 storage nodes (Dell PowerEdge R320, 16GB RAM, each with 4 CPU cores). On the compute side, the computation layer is composed of 3 nodes (Dell PowerEdge R430, 32GB RAM, each with 24 CPU cores). All the server-side machines ran Ubuntu Server 16.04. The client host ran Ubuntu 16.04.1 CloudImage [150].

6.5.2 Swift Resource Contention

We first studied how collocation of compute and data affects Swift. To do so in "ideal" conditions, we restricted this experiment to a single storage node². For this measurement, we used a single proxy node (Dell PowerEdge R320, 12GB RAM) and 1 storage node (Dell PowerEdge R320, 8GB RAM).

Base Swift. The first experiment consisted of evaluating how Swift normally works. We first stored a set of random 10kB files in Swift. Next, using the UNIX httperf tool, we ran distinct workloads, each one differing in the number of transactions per second (TPS). We measured the resulting per-transaction response time (Fig. 6.3a) and CPU usage (Fig. 6.3b) of the storage node.

² We note that the results are easy to extrapolate to larger Swift deployments.


Fig. 6.3 Usual Swift behavior in a given storage node: (a) response times (seconds) and (b) CPU usage (%) versus transactions per second (TPS), for 1 to 4 workers.

Technically, Swift uses workers to handle a workload. In Swift, a worker is nothing but a thread that accepts requests. Each worker normally accepts up to 1,024 concurrent requests, though this is a configurable parameter. This simple experiment confirmed that when a worker exhausts 100% of its core's resources, the response time steeply increases due to queuing delays. For instance, with 1 worker, the core's usage reaches 100% at around 380 TPS, the point beyond which requests start to accumulate, as shown in Fig. 6.3a. This effect can be alleviated by starting new Swift workers, as seen in this figure. However, since our storage nodes have only 4 cores, Swift is limited to 4 workers, and therefore a storage node will eventually experience resource contention even if there are no computations collocated with the data.

Collocated Computation. We repeated the same experiment but collocating computations with the storage service. The results in Fig. 6.4 show how Swift's processing capacity diminishes as a function of the CPU usage taken by the collocated tasks. For example, if Swift were restricted to use only 1 worker for request processing, collocated tasks would have almost no impact on Swift. However, as the number of Swift workers increases to match the number of CPU cores, resource contention begins to produce negative effects in the storage system due to the interference from the collocated computations. Swift's best practices recommend using as many workers as CPU cores. Under heavy use of the CPU by the collocated tasks, this figure shows that Swift request processing would be severely diminished. For instance, when the


collocated tasks consumed 80% of the total CPU, Swift processing fell to 35% in this experiment. This result suggests that the right path is to move computations outside the storage nodes, which is what we did with Zion.

Fig. 6.4 Swift interference measurement in a given storage node: response time (ms) versus TPS under 0%–80% CPU interference from collocated tasks, for (a) 1, (b) 2, (c) 3 and (d) 4 Swift workers.

6.5.3 Workload

Here, we describe the specific workload used to test all the example applications described in Section 6.4. This evaluation setup is specifically designed to test both the elasticity of Zion and the performance of each application function.


In this sense, as object stores usually contain a huge diversity of object types and sizes [162], we set up heterogeneous workloads for all use cases in order to verify the behavior of Zion under different scenarios.

Content-level access control. For testing this application, we used the dataset described in Section 6.4, but trimmed down to different object sizes: 100kB, 1MB, 10MB and 100MB. As stated before, the dataset content is filtered according to the type of user that requested the object. To this aim, we used Swift user roles to return only specific fields to each user. The function, associated to the onAfterGet trigger, reads from an implicit parameter the JSON string containing the allowed fields for a specific user role (e.g., age, education, marital-status), and then returns them to the user.

Compression. It is well-known that the compression ratio of objects affects resource consumption. Objects that contain only zeros will be compressed more quickly and consume fewer resources than a binary object. To control for this effect, we chose to use text documents with a similar compression ratio of around 60%. The documents were of sizes 10kB, 100kB, 1MB and 10MB.

Image processing. As in the previous case, we tried to find the most common scenario for testing this function. We focused on the image sizes commonly used in static web pages. We finally used different .jpg files of 90kB, 400kB, 800kB and 1200kB, and we set an implicit parameter so that the function resizes the image to 50% of its original size.

Signature verification. A sample usage of this function may be to verify official documents and forms in a public institution. For this experiment we also used text documents of different sizes: 10kB, 100kB, 1MB, and 10MB. These documents are signed with an RSA private key. The experiment operates on PUT requests, verifying the correctness of the signature.

Interactive data queries and result aggregation. For this use case, we used different sizes of the publicly available UbuntuOne log file [162]: 100MB, 1GB, and 10GB, respectively. We compared the execution time of Zion's functions to that obtained using Hadoop. We built a 10-node Hadoop cluster of commodity workstations: 4-core i5 at 2.53 GHz and 16GB of RAM. For this application, we issued a Pig query against: 1. the log files stored in HDFS; and 2. Swift, using the Hadoop-Swift connector [163]. For Zion's first-stage filtering functions, we picked chunk sizes of 10MB, 32MB, and 200MB for the 100MB, 1GB and 10GB files, respectively.


6.5.4 Application characteristics

Table 6.1 shows information about our five Zion application functions. The second column gives the number of lines of code required to implement the function, and the third column gives the function size. From this table, it can be seen that our functions are very lightweight for the proposed applications.

Table 6.1 Application function information

Application Function            Lines of Code   Function Size

Content-level Access Control    ≈ 29            2.7 kB
Compression                     ≈ 8             1.8 kB
Image processing                ≈ 17            2.3 kB
Signature verification          ≈ 43            2.9 kB
Interactive data query          ≈ 203           6 kB

6.5.5 Results

We first measured the base overheads produced by Zion. They include the runtime engine startup time and the overheads associated with running functions in the storage path, in a decoupled compute layer.

Docker characteristics. The Java runtime, and then the functions, are executed inside Docker containers in our prototype. Consequently, the first validation was to assess the impact of starting our runtime within Docker containers. If the startup time were too large, it would hamper the inline processing capabilities of Zion. Our experiments, however, revealed that this is not the case. For this experiment, we utilized a pre-compiled Docker image with our Java runtime integrated in it. The experiment consisted of launching 1,000 containers at different rates in the compute cluster and measuring the startup time, along with the amount of memory needed to keep the runtime up and running over time. Fig. 6.5 shows the results of this experiment. As depicted, starting a new Docker container with the Java runtime takes between 0.85 and 0.95 seconds. Regarding RAM consumption, we found that each container consumes around


35MB of memory. These results show that our runtime takes 5X less time to start than the AWS Lambda runtime, which indicates that Zion is lightweight enough for elastic inline processing. As Zion maintains a large pool of already-started containers, the startup time is typically negligible, and only amounts to 0.9 seconds if there are no free containers in the pool.

Fig. 6.5 Zion runtime startup time: distribution of the number of Docker containers by startup time (seconds), between 0.8 and 1.0 seconds.

Zion overhead. As another basic experiment, we studied the extra overhead that Zion adds when running a function. The overhead is the time needed by the interception middleware to redirect the object to a compute node, take an available container, launch a function inside the Docker container, and pass the object data through it. To do so, we made use of the function listed in Listing 6.1. This simple function iterates over the data without processing it. For measuring the overhead, we used a set of 10kB objects. We first launched 5,000 plain GET requests to measure the base time needed to complete a GET request. Then, we launched another 5,000 GET requests for the same objects, but in this case, we set up the function in Listing 6.1 to respond upon the onAfterGet trigger of the objects. The results are plotted in Fig. 6.6. This figure shows that Zion's overhead is 9ms. This time includes a penalty of 5ms for the addition of an internal hop to Swift (redirecting the data through a compute node), plus 4ms to determine whether and which function to run inside a free Docker container in the pool when the request reaches a compute node.
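Listing 6.1 is shown earlier in the chapter; in essence, it is a pass-through function of the following shape (our own sketch, with the streaming interface assumed):

import java.io.InputStream;
import java.io.OutputStream;

// Sketch of a no-op function in the spirit of Listing 6.1: it streams the
// object through unchanged, so any measured slowdown is pure Zion overhead.
public class NoOpFunction {
    public void run(InputStream objectIn, OutputStream objectOut) throws Exception {
        byte[] buf = new byte[8192];
        int n;
        while ((n = objectIn.read(buf)) != -1) {
            objectOut.write(buf, 0, n);   // iterate over the data without processing it
        }
    }
}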


Fig. 6.6 Zion base overhead: distribution of request times (seconds, 0.01–0.04 range) for (a) base Swift requests and (b) Swift+Zion requests.

Performance of collocated functions. We evaluated here the performance of our application functions. For this evaluation, we conducted a stress test for each combination of application, object size and number of function workers. As our storage nodes have 4 CPU cores, we ran each function on 1, 2, 3 and 4 function workers, respectively. The Docker containers were set up with 1 CPU core and 512MB of RAM per worker. The results are depicted in Figs. 6.7–6.10. The first experiments were done collocating the functions in the storage nodes without using Zion. The main objectives of this experiment were: 1. to show how collocation limits the number of TPS; and 2. to show the performance of Zion's functions using 1 CPU core per function worker. As a side effect, notice that the latter makes it possible to faithfully generalize our results to any number of cores. We empirically verified this for the image resize use case. As the cluster is made up of 6 storage nodes, the function with 1 worker was able to process around 9 images per second for the 400kB image case, and around 54 images per second in the entire cluster. For the rest of the applications, we utilized only one storage node. The reason is that to tap into the performance of the entire cluster, we would have to replicate each object 6 times to ensure that a copy of it is available at every storage node. In active storage, the total compute power for an object is proportional to the number of replicas it has.


Fig. 6.7 Compression function performance: response time (ms) versus TPS for 10kB, 100kB, 1MB and 10MB objects, with 1–4 workers.

Fig. 6.8 Content-level access control function performance: response time (ms) versus TPS for 100kB, 1MB, 10MB and 100MB objects, with 1–4 workers.


Notice that in Figs. 6.7–6.10, the curves representing the response time start to increase when the resources of the Docker container are overloaded, that is, when all the CPU cores are at 100% of their capacity. These experiments also show how important the object size is. In all experiments, the larger the object, the lower the number of TPS a function can handle. To wit, for the content-level access control application (Fig. 6.8), the function is unable to handle more than 1 TPS for the 100MB object, irrespective of the number of workers. Concretely, this task takes around 1.6 seconds to complete. The same occurs with the image resizing function: we saw that resizing a 1200kB image takes around 2.36 seconds, leading to < 1 TPS.
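The per-worker ceiling behind these numbers follows from the service time; as a back-of-the-envelope bound (ignoring I/O and queuing effects), with $W$ workers and a per-object processing time of $T$ seconds:

\[ \mathrm{TPS}_{\max} \approx \frac{W}{T}, \qquad \text{e.g., } W = 1,\ T \approx 2.36\,\mathrm{s} \;\Rightarrow\; \mathrm{TPS}_{\max} \approx 0.42 < 1. \]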

Fig. 6.9 Image resizer function performance: response time (ms) versus TPS for 90kB, 400kB, 800kB and 1200kB images, with 1–4 workers.

A function that needs more than one second to process an object can only handle as many objects in parallel as there are function workers running. These examples show that function collocation at the storage nodes would be, in any case, insufficient to absorb a burst of transactions lasting more than one second. They also demonstrate that it is almost impossible to predict resource consumption ahead of time, because resource usage depends on the object size. Consequently, resource management for collocated functions should be dynamic rather than a simple policy such as a single worker per CPU core, and dynamic resource management is hard to achieve in practice.


Fig. 6.10 Signature verification function performance: response time (ms) versus TPS for 10kB, 100kB, 1MB and 10MB objects, with 1–4 workers.

Performance of non-collocated functions. As resource contention is not an issue with non-collocated functions, what is key here is to verify that Zion's storage-disaggregated functions are indeed scalable. In order to test how Zion's compute layer provides better scalability than the storage nodes, we extended the prior evaluation to non-collocated functions. Our compute nodes have 24 cores each, which means that it is possible to start up to 24 function workers in each node. However, to make the comparison with the storage nodes fair, we utilized between 1 and 8 function workers. Also, as in the previous experiments, we used 1 proxy and 1 storage node in addition to 1 compute node, which is enough to assess the scalability of Zion. The results are shown in Figs. 6.11–6.14. In this case, we recorded the maximum number of transactions per second (Max. TPS) that each worker was able to handle with and without collocation. First, we can see how in almost all cases, with 1 to 4 function workers, non-collocated functions can handle more transactions than the storage nodes. This is due to the fact that the compute nodes are more powerful than the storage nodes in terms of CPU; therefore, the ingestion capacity is higher. Second, the maximum TPS that a storage node can handle is always bounded above by the number of available cores. That is, spawning more workers has no benefit


because they will contend for the available resources. This is the reason why the curve for collocated functions (gray line) flattens out beyond 4 workers.

Fig. 6.11 Image resizer function scalability: maximum TPS versus number of workers (1–8) for 90kB, 400kB, 800kB and 1200kB images, comparing storage nodes with the Zion compute layer.

Fig. 6.12 Signature verification function scalability: maximum TPS versus number of workers (1–8) for 10kB, 100kB, 1MB and 10MB objects, comparing storage nodes with the Zion compute layer.


Fig. 6.13 Compression function scalability: maximum TPS versus number of workers (1–8) for 10kB, 100kB, 1MB and 10MB objects, comparing storage nodes with the Zion compute layer.

Fig. 6.14 Content-level access control function scalability: maximum TPS versus number of workers (1–8) for 100kB, 1MB, 10MB and 100MB objects, comparing storage nodes with the Zion compute layer.


However, non-collocated functions (black line) can continue to process transactions by spawning more workers to respond to the demand. A quick inspection of all the figures shows that the scalability of non-collocated functions is practically linear in the number of workers. This suggests that disaggregating storage and compute is a practical solution to scale out computation in cloud object stores.

Interactive data queries. Table 6.2 compares the execution times of Hadoop and Zion for the same query (Listing 6.2). The entries of this table were obtained by running each configuration 30 times and then averaging the results.

Listing 6.2 Evaluated query in Hadoop and Swift + Zion clusters.

select user_id, count(*) total
where (req_t='GetContentResponse' or req_t='PutContentResponse')
and msg='Request done'
group by user_id
order by total DESC
limit 10

Table 6.2 Interactive data queries execution times

Configuration       File size   Chunk size   Time

Pig query - Swift   100 MB      10 MB        81.6 s
Pig query - HDFS    100 MB      10 MB        71.4 s
Zion - 12 Workers   100 MB      10 MB        0.510 s
Zion - 24 Workers   100 MB      10 MB        0.348 s

Pig query - Swift   1 GB        32 MB        156 s
Pig query - HDFS    1 GB        32 MB        75.4 s
Zion - 12 Workers   1 GB        32 MB        4.183 s
Zion - 24 Workers   1 GB        32 MB        2.256 s

Pig query - Swift   10 GB       200 MB       985 s
Pig query - HDFS    10 GB       200 MB       94.6 s
Zion - 12 Workers   10 GB       200 MB       26.253 s
Zion - 24 Workers   10 GB       200 MB       13.392 s


Unsurprisingly, the Hadoop (Pig) configuration that ingests the data from the remote Swift service is the one that presents the highest mean execution time. This is due to network contention: both clusters were geographically far apart, so communication went through Fast Ethernet links within our institution's LAN. More concretely, it took ≈ 917 seconds to transfer the entire 10GB dataset, and ≈ 86 seconds to transfer the 1GB dataset. On the other hand, the configuration that read the data from HDFS had better times, especially for the large datasets. However, it has the disadvantage of requiring all the data to be pre-loaded into HDFS, which may not be practical in many cases. For Zion's functions, the execution times were comparatively small despite using at most 24 workers. This suggests that better times can easily be attained with more workers, which is clearly seen when comparing Zion with 12 and 24 workers, respectively: for the 10GB dataset, 12 workers took 2X more time than 24 workers. Also, this experiment confirms the scalability of Zion without incurring resource contention in the storage cluster. If we had executed this interactive query with 24 workers in the storage nodes, the CPUs of all storage nodes would have reached 100% for 13.4 seconds, leading to resource contention.
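As a rough sanity check, these transfer times match the capacity of a Fast Ethernet link:

\[ \frac{10\,\mathrm{GB}}{917\,\mathrm{s}} \approx 10.9\,\mathrm{MB/s} \approx 87\,\mathrm{Mbit/s}, \qquad \frac{1\,\mathrm{GB}}{86\,\mathrm{s}} \approx 11.6\,\mathrm{MB/s} \approx 93\,\mathrm{Mbit/s}, \]

i.e., both close to the 100 Mbit/s nominal capacity of Fast Ethernet.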

6.6 Summary

This chapter presents Zion, an innovative data-driven serverless computing framework for cloud object storage. Unlike commercial event-driven serverless models, our data-driven functions intercept and operate over objects as they are read from or written to the object store. Since Zion follows serverless computing abstractions, it overcomes the scalability and resource contention problems of active storage, without the need to manage any server or runtime environment. Moreover, by injecting computations into the data pipeline, Zion is ideal for use cases that require synchronous interactions with external users, which is the case for the previous systems designed in this thesis (Crystal and Vertigo). More concretely, examples of these use cases include dynamic content generation, interactive queries, content verification, and access control. In many of them, the data locality of our inline computations helps optimize latency and reduce data transfers outside the data center.

Chapter 7

Conclusions & Future Work

Programmability unlocks the full computing power of a system and significantly increases innovation on top of it. Thus, in this thesis we demonstrated how programmability enables unprecedented flexibility and dynamism in cloud object stores. For example, we have shown how a novel Software-defined Storage (SDS) architecture for cloud object storage considerably improves the management of a storage system. In addition, we have also shown how our novel programmatic microcontroller abstraction permits tenants to adapt object behavior to their specific requirements.

7.1 Overview of Contributions

In our effort to overcome the main research challenge of enhancing the programma- bility of cloud object storage, we posed three major research questions.

Question 1: Is there a lack of flexibility for handling multi-tenant workloads?

In Chapter 4 we designed the first SDS architecture for multi-tenant cloud object stores. It is the first one that separates the control plane from the data plane, by abstracting the control layer from the software and hardware of the storage system. With our novel design, we improve the flexibility, simplicity, programmability, manageability and extensibility of cloud object stores. Moreover, with our system it is possible to differentiate multi-tenant workloads, which is not common in most of today's cloud object stores, where tenants are treated indistinctly. To do so, we created a new centralized control layer that makes it possible to


dynamically manage the data plane. This is carried out by means of high-level descriptive policies based on the If-This-Then-That (IFTTT) paradigm. Thus, establishing new policies is as simple as telling the controller that, for example, when the number of requests per second is greater than five, the compression filter should be applied to the incoming workload. Then, once the condition(s) are met (If-This), it enforces the filter at the data plane (Then-That). One of the key benefits of our control layer is that it is extensible at run-time. This makes it possible to add new functionality to the cloud object store at any time, with zero storage service downtime. Our architecture is mainly designed to provide an efficient use of multi-tenant cloud stores. Moreover, this challenge has led us to create new SDS abstractions, such as what we called controllers. Controllers are programmatic micro-services that allow more advanced storage management, beyond descriptive policies. For example, with a controller it is possible to implement novel algorithms for bandwidth control, which enforce the appropriate bandwidth for each tenant at the data plane. We finally created a prototype called Crystal on top of OpenStack Swift to demonstrate the feasibility of our design through multiple storage management applications. For example, we tested the enforcement of filters (compression, encryption) and bandwidth differentiation capabilities among tenants. The performance evaluation and use case validation show that our novel architecture and abstractions are suitable and relevant enough to be integrated into any shared multi-tenant object store, such as Swift or Ceph.

The next challenge addressed in this thesis was related to how tenants use cloud object storage systems. It can be summarized as follows:

Question 2: Are tenants able to manage their data in cloud object stores?

In Chapter 5, we created a novel policy abstraction called the microcontroller, which allows tenants to individually manage their objects programmatically. Microcontrollers are designed to handle the particularities of the objects, since different objects may require different treatment, and thus, different management. Therefore, our microcontrollers enable the construction of new applications directly in the cloud object store. To enable this fine-grained object management, microcontrollers act as object wrappers, intercepting the desired lifecycle requests to the objects and processing them according to the provided behavior. Moreover, microcontrollers can run synchronously or asynchronously upon receipt of the main storage request. They are executed in a sandboxed environment, and in


the storage nodes where the data is stored. This guarantees the security of the storage system, while at the same time benefiting from data locality and inline processing. We finally created a prototype on top of OpenStack Swift to evaluate our object-based microcontrollers, and we demonstrated the low overheads of the lifecycle interception layer. Moreover, through different use cases, we validated the programmability and flexibility of microcontrollers for managing objects in the data plane. Concretely, we evaluated applications like object prefetching, active storage orchestration, content-level access control, and automated deletion. The experiments and evaluations show how microcontrollers can become a flexible, dynamic and programmatic object management abstraction for enhancing cloud object stores.

The last challenge addressed in this thesis is related to the resource contention and elasticity problems of both previous contributions, as follows:

Question 3: Are there enough compute resources in the storage layer for enhancing the programmability?

In Chapter 6, we designed the first data-driven serverless computing platform for cloud object stores. This novel platform can be integrated within the storage cluster, but in an independent processing layer. This provides elasticity and prevents the resource contention problems derived from running computations directly in the storage nodes. Our approach is different from existing serverless computing frameworks, which are usually event-based. In our case, we propose the novel concept of data-driven serverless computing for cloud object stores, where functions are placed in the storage path and executed synchronously, intercepting the lifecycle of the objects. By placing serverless functions in the storage path, our model is ideal for those use cases that require synchronous interactions. Functions integrate the flexibility and programmability of microcontrollers, in addition to a powerful data-flow processing mechanism to process data in real time, as it comes in and out of the storage cluster. This makes it possible to manage and process data objects with a single piece of code. Moreover, we created a prototype of our platform on top of OpenStack Swift to validate the feasibility of inline functions. We proposed some applications which require synchronous request interception, such as dynamic content generation, interactive queries, content verification, and access control. Finally, we evaluated the platform, obtaining low overheads when requests are routed to


the computing layer. We also tested the previous applications, demonstrating how our data-driven serverless functions placed in the storage path can mitigate resource contention and elasticity problems, while benefiting from the data locality of the storage system.

7.2 Future Research Directions

Enhancing the programmability of cloud object stores is a promising approach thanks to the flexibility and dynamism it provides. Thus, each contribution of this thesis separately opens new research lines. Next, we describe some topics and ideas that deserve future work.

1. Experimental research framework for storage algorithms. Typically, object stores are non-programmable systems, which hinders research on them. Thus, the flexible multi-tenant management control layer we have designed provides, beyond the dynamic management of the storage, a new programmable layer that could allow researchers to easily investigate, implement and test new algorithms, for example, for data placement or read-optimized paths. Furthermore, our data-driven serverless model allows testing them in an elastic way within the storage cluster, without interfering with the storage service.

2. Security and privacy. The fine-grained object interception of our microcontrollers enables future research on object security and data privacy in cloud object stores. Traditionally, in many systems the security layer is located in front of them due to their complexity, and it normally manages security at higher granularities. The advanced object management that microcontrollers provide enables a future research direction on how to offload this centralized layer and attach the appropriate security policies directly to the objects. With our novel model, it is possible to enable smart secure objects in object stores.

3. Rich storage system. Our data-driven serverless computing layer, located in the storage path close to the data, enables future research in different directions. First, unified storage (multiprotocol storage). Companies want to move to cloud storage solutions due to their characteristics and low costs. However, their legacy applications require using local file systems to store the information. In this sense, more and more companies use file systems with a cloud object store as the data back-end. Thus, it would be interesting to investigate


how to manage and process this data coming from diverse sources with our data-driven model. Second, rich specialized databases. The flexibility, simplicity and scalability of cloud object stores could enable the design of databases on top of them in the future. Thus, our rich data-driven computing platform can be an important actor in processing this data elastically within the storage cluster. Finally, web services. Cloud object stores are widely used for storing static web content. One future research direction would focus on investigating how our elastic serverless computing layer may generate dynamic web content. This could potentially create an alternative and scalable stack for web applications.


Bibliography

[1] Data age 2025, Tech. rep., IDC/Seagate (2017).
[2] The growing role of object storage in solving unstructured data challenges, Tech. rep., 451 Research (2017).
[3] DELL, Magic quadrant for distributed file systems and object storage, https://www.gartner.com/doc/3816966/magic-quadrant-distributed-file-systems (2017).
[4] IDG, 2016 data & analytics research, https://www.idg.com/tools-for-marketers/tech-2016-data-analytics-research/ (2016).
[5] M. Factor, K. Meth, D. Naor, O. Rodeh, J. Satran, Object storage: The future building block for storage systems, in: Local to Global Data Interoperability-Challenges and Technologies, 2005, IEEE, 2005, pp. 119–123.
[6] J. Lindsay, Programmability, http://progrium.com/wiki/Programmability/ (2016).
[7] IBM, Cloud object storage, https://www.ibm.com/cloud/object-storage.
[8] Amazon, S3, https://aws.amazon.com/documentation/s3/.
[9] Google, Cloud storage, https://cloud.google.com/storage/.
[10] Microsoft, Azure blob storage, https://azure.microsoft.com/en-us/services/storage/.
[11] EMC, Ecs overview and architecture, Tech. rep., H-14071.11 (2018).
[12] S. A. Weil, S. A. Brandt, E. L. Miller, D. D. E. Long, C. Maltzahn, Ceph: A scalable, high-performance distributed file system, in: Proceedings of the 7th Symposium on Operating Systems Design and Implementation, OSDI '06, USENIX Association, Berkeley, CA, USA, 2006, pp. 307–320. URL http://dl.acm.org/citation.cfm?id=1298455.1298485


[13] OpenStack, Swift, https://docs.openstack.org/swift/.
[14] Is object storage right for your organization?, Tech. rep., Taneja Group (2016).
[15] S. Whitehouse, The gfs2 filesystem, in: Proceedings of the Linux Symposium, Citeseer, 2007, pp. 253–259.
[16] F. B. Schmuck, R. L. Haskin, Gpfs: A shared-disk file system for large computing clusters, in: FAST, Vol. 2, 2002.
[17] IBM, Cloud block storage, https://www.ibm.com/cloud/block-storage.
[18] Amazon, Elastic block store (ebs), https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html.
[19] Microsoft, Premium storage, https://docs.microsoft.com/en-us/azure/virtual-machines/windows/premium-storage.
[20] Google, Persistent disk, https://cloud.google.com/functions/docs/.
[21] A. S. Tanenbaum, M. Van Steen, Distributed systems: principles and paradigms, Prentice-Hall, 2007.
[22] Glusterfs, https://docs.gluster.org/en/latest/.
[23] Amazon, Elastic file system, https://aws.amazon.com/documentation/efs/.
[24] IBM, Cloud file storage, https://www.ibm.com/cloud/file-storage/.
[25] Microsoft, Azure files, https://docs.microsoft.com/en-us/azure/storage/files/.
[26] B. Welch, Object storage technology, https://www.snia.org/sites/default/education/tutorials/2013/spring/file/BrentWelch_Object_Storage_Technology.pdf.
[27] M. Mesnier, G. R. Ganger, E. Riedel, Object-based storage, IEEE Communications Magazine 41 (8) (2003) 84–90.
[28] E. Riedel, S. Iren, Object storage and applications, Tech. rep., SNIA (2007).
[29] D. Nagle, M. Factor, S. Iren, D. Naor, E. Riedel, O. Rodeh, J. Satran, The ansi t10 object-based storage standard and current implementations, IBM Journal of Research and Development 52 (4.5) (2008) 401–411.
[30] P. Deutsch, Deflate compressed data format specification version 1.3, Tech. rep. (1996).


[31] A. B. Primer, Erasure codes for storage systems.
[32] Lustre: A scalable, high-performance file system, Tech. rep., Cluster File Systems, Inc (2003).
[33] B. Welch, M. Unangst, Z. Abbasi, G. A. Gibson, B. Mueller, J. Small, J. Zelenka, B. Zhou, Scalable performance of the panasas parallel file system, in: FAST, Vol. 8, 2008, pp. 1–17.
[34] OpenStack, Cloud computing platform projects, https://www.openstack.org/software/project-navigator.
[35] OpenStack, Swift architectural overview, https://docs.openstack.org/swift/latest/overview_architecture.html.
[36] OpenStack, Swift api, https://developer.openstack.org/api-ref/object-store/.
[37] OpenStack, Swift middleware catalog, https://docs.openstack.org/swift/latest/middleware.html.
[38] J. Piernas, J. Nieplocha, E. J. Felix, Evaluation of active storage strategies for the lustre parallel file system, in: SC, 2007, p. 28.
[39] OpenStack, Storlets, https://docs.openstack.org/storlets/.
[40] S. Rabinovici-Cohen, E. Henis, J. Marberg, K. Nagin, Storlet engine: performing computations in cloud storage, Tech. rep., IBM H-0320 (2014).
[41] L. Coyne, J. Dain, P. Gilmer, P. Guaitani, I. Hancock, A. Maille, T. Pearson, B. Sherman, C. Vollmar, et al., IBM Software-Defined Storage Guide, IBM Redbooks, 2017.
[42] EMC, Emc atmos cloud storage architecture, Tech. rep., H-9505.1 (2014).
[43] Google, Cloud storage - object lifecycle management, https://cloud.google.com/storage/docs/lifecycle.
[44] P. Biswas, F. Patwa, R. Sandhu, Content level access control for openstack swift storage, in: Proceedings of the 5th ACM Conference on Data and Application Security and Privacy, ACM, 2015, pp. 123–126.
[45] D. Mishin, D. Medvedev, A. Szalay, R. Plante, M. Graham, Data sharing and publication using the scidrive service, Proceedings of Astronomical Data Analysis Software and Systems XXIII 465.


[46] N. Kaaniche, M. Laurent, M. El Barbori, Cloudasec: A novel public-key based framework to handle data sharing security in clouds, in: Security and Cryptography (SECRYPT), 2014 11th International Conference on, IEEE, 2014, pp. 1–14.
[47] E. Bacis, S. De Capitani di Vimercati, S. Foresti, D. Guttadoro, S. Paraboschi, M. Rosa, P. Samarati, A. Saullo, Managing data sharing in openstack swift with over-encryption, in: Proceedings of the 2016 ACM on Workshop on Information Sharing and Collaborative Security, ACM, 2016, pp. 39–48.
[48] E. Bacis, S. D. C. di Vimercati, S. Foresti, S. Paraboschi, M. Rosa, P. Samarati, Access control management for secure cloud storage, in: International Conference on Security and Privacy in Communication Systems, Springer, 2016, pp. 353–372.
[49] Amazon, Amazon elastic compute cloud (ec2), https://aws.amazon.com/documentation/ec2/.
[50] IBM, Cloud virtual servers, https://www.ibm.com/cloud/virtual-servers.
[51] Amazon, Lambda, https://aws.amazon.com/lambda/.
[52] Apache, Openwhisk, https://openwhisk.apache.org/.
[53] IBM, Cloud functions, https://www.ibm.com/cloud/functions.
[54] S. Fouladi, R. S. Wahby, B. Shacklett, K. V. Balasubramaniam, W. Zeng, R. Bhalerao, A. Sivaraman, G. Porter, K. Winstein, Encoding, fast and slow: Low-latency video processing using thousands of tiny threads, in: NSDI, 2017.
[55] E. Jonas, S. Venkataraman, I. Stoica, B. Recht, Occupy the cloud: Distributed computing for the 99%, arXiv preprint arXiv:1702.04024.
[56] J. S. Gil Vernik, Process large data sets at massive scale with pywren over ibm cloud functions, https://www.ibm.com/blogs/bluemix/2018/04/process-large-data-sets-massive-scale-pywren-ibm-cloud-functions/ (2018).
[57] G. V. I. Josep Sampé, Pywren for ibm cloud, https://github.com/pywren/pywren-ibm-cloud (2018).
[58] Y. Kim, J. Lin, Serverless data analytics with flint, arXiv preprint arXiv:1803.06354.
[59] Qubole, Spark on lambda, https://github.com/qubole/spark-on-lambda (2018).


[60] B. Congdon, Corral: a mapreduce framework, https://github.com/bcongdon/corral (2018).
[61] Amazon, Lambda edge, https://aws.amazon.com/es/lambda/edge/.
[62] S. Nastic, T. Rausch, O. Scekic, S. Dustdar, M. Gusev, B. Koteska, M. Kostoska, B. Jakimovski, S. Ristov, R. Prodan, A serverless real-time data analytics platform for edge computing, IEEE Internet Computing 21 (4) (2017) 64–71.
[63] S. Yi, Z. Hao, Q. Zhang, Q. Zhang, W. Shi, Q. Li, Lavea: Latency-aware video analytics on edge computing platform, in: Proceedings of the Second ACM/IEEE Symposium on Edge Computing, ACM, 2017, p. 15.
[64] J. Sampé, Towards cooperative analytics in disaggregated big data clusters, in: 2nd URV Doctoral Workshop in Computer Science and Mathematics, Publicacions URV, 2015, pp. 9–13.
[65] Y. Moatti, E. Rom, R. Gracia-Tinedo, D. Naor, D. Chen, J. Sampe, M. Sánchez-Artigas, P. García-López, F. Gluszak, E. Deschdt, et al., Too big to eat: Boosting analytics data ingestion from object stores with scoop, in: Data Engineering (ICDE), 2017 IEEE 33rd International Conference on, IEEE, 2017, pp. 309–320.
[66] IBM, Cloud sql query, https://www.ibm.com/cloud/sql-query.
[67] Amazon, Athena, https://aws.amazon.com/athena/.
[68] Amazon, Redshift spectrum, https://aws.amazon.com/redshift/spectrum/.
[69] Facebook, Presto: Distributed sql engine for big data, https://prestodb.io/.
[70] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, J. Turner, Openflow: enabling innovation in campus networks, ACM SIGCOMM Computer Communication Review 38 (2) (2008) 69–74.
[71] A. Tootoonchian, Y. Ganjali, Hyperflow: A distributed control plane for openflow, in: USENIX INM/WREN'10, 2010, pp. 3–3.
[72] Software-defined storage: A pervasive approach to IT transformation driven by the 3rd platform, https://www.emc.com/collateral/analyst-reports/idc-sds-3rd-platform.pdf.
[73] Dell emc - software-defined storage solutions, https://www.emc.com/en-us/storage/software-defined-storage/solutions.htm.


[74] Ibm software-defined storage, http://www-03.ibm.com/systems/storage/software-defined-storage.
[75] VMware, Vmware vsan 6.5 technical overview, Tech. rep. (2016).
[76] Software-defined storage - vmware products, http://www.vmware.com/products/software-defined-storage.html.
[77] Nutanix enterprise cloud platform, https://www.nutanix.com/products/.
[78] EMC, Emc vipr controller. automate and simplify storage management, Tech. rep., EMC H-1430.8 (2014).
[79] EMC, Emc vipr suite. multivendor storage automation and self-service, Tech. rep., EMC H-1430.8 (2014).
[80] Microsoft, Coprhd, https://coprhd.github.io/.
[81] A. Alba, G. Alatorre, C. Bolik, A. Corrao, T. Clark, S. Gopisetty, R. Haas, R. I. Kat, B. Langston, N. Mandagere, et al., Efficient and agile storage management in software defined environments, IBM Journal of Research and Development 58 (2/3) (2014) 5–1.
[82] Openio, http://www.openio.io/.
[83] E. Thereska, H. Ballani, G. O'Shea, T. Karagiannis, A. Rowstron, T. Talpey, R. Black, T. Zhu, Ioflow: a software-defined storage architecture, in: ACM SOSP'13, 2013, pp. 182–196.
[84] I. Stefanovici, B. Schroeder, G. O'Shea, E. Thereska, sRoute: treating the storage stack like a network, in: USENIX FAST'16, 2016, pp. 197–212.
[85] I. Stefanovici, E. Thereska, G. O'Shea, B. Schroeder, H. Ballani, T. Karagiannis, A. Rowstron, T. Talpey, Software-defined caching: Managing caches in multi-tenant data centers, in: ACM SoCC'15, 2015, pp. 174–181.
[86] J. Mace, P. Bodik, R. Fonseca, M. Musuvathi, Retro: Targeted resource management in multi-tenant distributed systems, in: USENIX NSDI'15, 2015.
[87] S. Peter, J. Li, I. Zhang, D. R. Ports, D. Woos, A. Krishnamurthy, T. Anderson, T. Roscoe, Arrakis: The operating system is the control plane, ACM Transactions on Computer Systems (TOCS) 33 (4) (2016) 11.
[88] R. Raghavendra, P. Dewan, M. Srivatsa, Unifying hdfs and gpfs: Enabling analytics on software-defined storage, in: ACM/IFIP/USENIX Middleware'16, 2016, p. 3.


[89] J. Wires, A. Warfield, Mirador: An active control plane for datacenter storage, in: USENIX FAST'17, 2017, pp. 213–228.
[90] G. A. Alvarez, E. Borowsky, S. Go, T. H. Romer, R. Becker-Szendy, R. Golding, A. Merchant, M. Spasojevic, A. Veitch, J. Wilkes, Minerva: An automated resource provisioning tool for large-scale storage systems, ACM Transactions on Computer Systems (TOCS) 19 (4) (2001) 483–518.
[91] E. Anderson, M. Hobbs, K. Keeton, S. Spence, M. Uysal, A. C. Veitch, Hippodrome: Running circles around storage administration, in: USENIX FAST'02, Vol. 2, 2002, pp. 175–188.
[92] H. V. Madhyastha, J. McCullough, G. Porter, R. Kapoor, S. Savage, A. C. Snoeren, A. Vahdat, scc: cluster storage provisioning informed by application characteristics and slas, in: USENIX FAST'12, 2012, p. 23.
[93] B. Trushkowsky, P. Bodík, A. Fox, M. J. Franklin, M. I. Jordan, D. A. Patterson, The scads director: Scaling a distributed storage system under stringent performance requirements, in: USENIX FAST'11, 2011, pp. 163–176.
[94] H. Vashishtha, E. Stroulia, Enhancing query support in hbase via an extended coprocessors framework, in: ServiceWave, 2011, pp. 75–87.
[95] Apache, Hbase coprocessors, https://blogs.apache.org/hbase/entry/coprocessor_introduction (2012).
[96] Google, Bigtable coprocessors, https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf (2009).
[97] E. Riedel, G. A. Gibson, C. Faloutsos, Active storage for large-scale data mining and multimedia, in: VLDB, 1998, pp. 62–73.
[98] A. Acharya, M. Uysal, J. Saltz, Active disks: Programming model, algorithms and evaluation, ACM SIGPLAN Notices 33 (11) (1998) 81–91.
[99] K. Keeton, D. A. Patterson, J. M. Hellerstein, A case for intelligent disks (idisks), ACM SIGMOD Record 27 (3) (1998) 42–52.
[100] M. Uysal, A. Acharya, J. Saltz, Evaluation of active disks for decision support databases, in: IEEE HPCA'00, 2000, pp. 337–348.
[101] L. Zeng, S. Chen, Q. Wei, D. Feng, Sedas: A self-destructing data system based on active storage framework, in: APMRC, 2012 Digest, IEEE, 2012, pp. 1–8.
[102] Z. Istvan, D. Sidler, G. Alonso, Active pages 20 years later: Active storage for the cloud, IEEE Internet Computing 22 (4) (2018) 6–14.


[103] R. Wickremesinghe, J. S. Chase, J. S. Vitter, Distributed computing with load-managed active storage, in: HPDC, 2002, pp. 13–23.
[104] X. Ma, A. N. Reddy, MVSS: an active storage architecture, IEEE Transactions on Parallel and Distributed Systems 14 (10) (2003) 993–1005.
[105] Q. Xu, K. M. M. Aung, Y. Zhu, K. L. Yong, Building a large-scale object-based active storage platform for data analytics in the internet of things, The Journal of Supercomputing 72 (7) (2016) 2796–2814.
[106] T. M. John, A. T. Ramani, J. A. Chandy, Active storage using object-based devices, in: IEEE Cluster'08, 2008, pp. 472–478.
[107] S. W. Son, S. Lang, P. Carns, R. Ross, R. Thakur, B. Ozisikyilmaz, P. Kumar, W.-K. Liao, A. Choudhary, Enabling active storage on parallel I/O software stacks, in: MSST, 2010, pp. 1–12.
[108] Message Passing Interface (MPI), https://computing.llnl.gov/tutorials/mpi/.
[109] M. T. Runde, W. G. Stevens, P. A. Wortman, J. A. Chandy, An active storage framework for object storage devices, in: MSST, 2012, pp. 1–12.
[110] S. Narayan, J. A. Chandy, ATTEST: attributes-based extendable storage, Journal of Systems and Software 83 (4) (2010) 548–556.
[111] O. Momin, C. Karakoyunlu, M. T. Runde, J. A. Chandy, Creating a programmable object storage stack, in: ACM PFSW'14, 2014, pp. 3–10.
[112] S. Narayan, J. A. Chandy, Extendable storage framework for reliable clustered storage systems, in: IEEE IPDPSW'10, 2010, pp. 1–4.
[113] Y. Xie, D. Feng, Y. Li, D. D. Long, Oasis: an active storage framework for object storage platform, Future Generation Computer Systems 56 (2016) 746–758.
[114] C. Chen, Y. Chen, P. C. Roth, DOSAS: mitigating the resource contention in active storage systems, in: CLUSTER, 2012, pp. 164–172.
[115] R. B. Ross, R. Thakur, et al., PVFS: a parallel file system for Linux clusters, in: 4th Annual Linux Showcase and Conference, 2000, pp. 391–430.
[116] L. Qin, D. Feng, Active storage framework for object-based storage device, in: AINA'06, 2006, pp. 97–101.
[117] P. Rad, V. Lindberg, J. Prevost, W. Zhang, M. Jamshidi, ZeroVM: secure distributed processing for big data analytics, in: WAC, 2014, pp. 1–6.


[118] R. Gracia-Tinedo, J. Sampé, E. Zamora, M. Sánchez-Artigas, P. García-López, Y. Moatti, E. Rom, Crystal: software-defined storage for multi-tenant object stores, in: USENIX FAST'17, 2017, pp. 243–256.
[119] A. Anwar, Y. Cheng, A. Gupta, A. R. Butt, MOS: workload-aware elasticity for cloud object stores, in: ACM HPDC'16, 2016, pp. 177–188.
[120] A. Anwar, Y. Cheng, A. Gupta, A. R. Butt, Taming the cloud object storage with MOS, in: ACM PDSW'15, 2015, pp. 7–12.
[121] M. Factor, G. Vernik, R. Xin, The perfect match: Apache Spark meets Swift, https://www.openstack.org/summit/openstack-paris-summit-2014/session-videos/presentation/the-perfect-match-apache-spark-meets-swift (November 2014).
[122] M. Murugan, K. Kant, A. Raghavan, D. H. Du, FlexStore: a software defined, energy adaptive distributed storage framework, in: IEEE MASCOTS'14, 2014, pp. 81–90.
[123] R. Gracia-Tinedo, P. García-López, M. Sánchez-Artigas, J. Sampé, Y. Moatti, E. Rom, D. Naor, R. Nou, T. Cortés, W. Oppermann, et al., IOStack: software-defined object storage, IEEE Internet Computing 20 (3) (2016) 10–18.
[124] S. R. Jeffery, G. Alonso, M. J. Franklin, W. Hong, J. Widom, A pipelined framework for online cleaning of sensor data streams, in: IEEE ICDE'06, 2006, p. 140.
[125] IFTTT, https://ifttt.com.
[126] G. A. Agha, Actors: a model of concurrent computation in distributed systems, Tech. rep., The MIT Press (1985).
[127] J. Armstrong, Programming Erlang: software for a concurrent world, Pragmatic Bookshelf, 2007.
[128] R. Stutsman, C. Lee, J. Ousterhout, Experience with rules-based programming for distributed, concurrent, fault-tolerant code, in: USENIX ATC'15, 2015, pp. 17–30.
[129] P. T. Eugster, P. A. Felber, R. Guerraoui, A.-M. Kermarrec, The many faces of publish/subscribe, ACM Computing Surveys (CSUR) 35 (2) (2003) 114–131.


[130] D. Harnik, R. Kat, D. Sotnikov, A. Traeger, O. Margalit, To zip or not to zip: effective resource usage for real-time compression, in: USENIX FAST'13, 2013, pp. 229–241.
[131] G. Cugola, A. Margara, Processing flows of information: from data stream to complex event processing, ACM Computing Surveys (CSUR) 44 (3) (2012) 15.
[132] D. Merkel, Docker: lightweight Linux containers for consistent development and deployment, Linux Journal 2014 (239) (2014) 2.
[133] A. Gulati, P. Varman, Lexicographic QoS scheduling for parallel I/O, in: ACM SPAA'05, 2005, pp. 29–38.
[134] Y. Wang, A. Merchant, Proportional-share scheduling for distributed storage systems, in: USENIX FAST'07, 2007.
[135] A. Gulati, A. Merchant, P. J. Varman, mClock: handling throughput variability for hypervisor IO scheduling, in: USENIX OSDI'10, 2010, pp. 1–7.
[136] Crystal, https://github.com/Crystal-SDS.
[137] Redis Labs, Redis, https://redis.io.
[138] D. Barcelona, P. García, PyActor, https://github.com/pedrotgn/pyactor.
[139] SwiftStack, ssbench, https://github.com/swiftstack/ssbench.
[140] IOStack datasets, http://iostack.eu/datasets-menu.
[141] R. Gracia-Tinedo, D. Harnik, D. Naor, D. Sotnikov, S. Toledo, A. Zuck, SDGen: mimicking datasets for content generation in storage benchmarks, in: USENIX FAST'15, 2015, pp. 317–330.
[142] J. Sampé, P. García-López, M. Sánchez-Artigas, Vertigo: programmable micro-controllers for software-defined object storage, in: IEEE CLOUD'16, 2016, pp. 180–187.
[143] J. Sampé, Vertigo framework, https://github.com/JosepSampe/vertigo.
[144] B. Fitzpatrick, Distributed caching with memcached, Linux Journal 2004 (124) (2004) 5.
[145] Amazon, Redshift, https://docs.aws.amazon.com/redshift/.
[146] J. Sampé, Swift prefetching middleware, https://github.com/JosepSampe/swift-prefetching-middleware.


[147] J. Sampé, Swift linking middleware, https://github.com/JosepSampe/swift-linking-middleware.
[148] L. Wang, J. Zhan, C. Luo, Y. Zhu, Q. Yang, Y. He, W. Gao, Z. Jia, Y. Shi, S. Zhang, C. Zheng, G. Lu, K. Zhan, X. Li, B. Qiu, BigDataBench: a big data benchmark suite from internet services, in: IEEE HPCA'14, 2014, pp. 488–499.
[149] University of California, Irvine, Adult data set, http://archive.ics.uci.edu/ml/datasets/Adult.
[150] Ubuntu, Cloud image repository, https://cloud-images.ubuntu.com/.
[151] Google, HTML5 slides, https://code.google.com/archive/p/html5slides/.
[152] Apache, JMeter, http://jmeter.apache.org/.
[153] Ubuntu, Docker image, https://hub.docker.com/r/_/ubuntu/.
[154] J. Sampé, M. Sánchez-Artigas, P. García-López, G. París, Data-driven serverless functions for object storage, in: ACM/IFIP/USENIX Middleware'17, 2017, pp. 121–133.
[155] Google, Cloud Functions, https://cloud.google.com/functions/docs/.
[156] Microsoft, Azure Functions, https://azure.microsoft.com/en-us/services/functions/.
[157] S. Hendrickson, S. Sturdevant, T. Harter, V. Venkataramani, A. C. Arpaci-Dusseau, R. H. Arpaci-Dusseau, Serverless computation with OpenLambda, in: HotCloud, 2016.
[158] J. Sampé, Zion framework, https://github.com/JosepSampe/zion.
[159] OpenStack, Swift large objects, https://docs.openstack.org/swift/latest/overview_large_objects.html.
[160] OpenStack, Nova, https://docs.openstack.org/nova.
[161] University of California, Irvine, Adult data set, http://archive.ics.uci.edu/ml/datasets/Adult (1996).
[162] R. Gracia-Tinedo, Y. Tian, J. Sampé, H. Harkous, J. Lenton, P. García-López, M. Sánchez-Artigas, M. Vukolic, Dissecting UbuntuOne: autopsy of a global-scale personal cloud back-end, in: IMC, 2015, pp. 155–168.
[163] Apache, Hadoop OpenStack support: Swift object store, https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html.

