UNIVERSITAT ROVIRA I VIRGILI
ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE
Josep Sampé Domenech

Josep Sampé Domenech
Enhancing the Programmability of Cloud Object Storage
Doctoral Thesis

Advised by
Dr. Pedro García López
Dr. Marc Sánchez Artigas

Department of Computer Engineering and Mathematics
Tarragona 2018
I STATE that the present study, entitled “Enhancing the Programmability of Cloud Object Storage”, presented by Josep Sampé Domenech for the award of the degree of Doctor, has been carried out under my supervision at the Department of Computer Engineering and Mathematics of this university, and that it fulfills all the requirements to be eligible for the International Doctorate Award.

Tarragona, September 20, 2018

Doctoral thesis supervisors
Dr. Pedro García López
Dr. Marc Sánchez Artigas

Acknowledgements

This dissertation was written during my time at the “Arquitectures i Serveis Telemàtics (AST)” research group at Universitat Rovira i Virgili. First, I would like to thank my advisors, Dr. Pedro García López and Dr. Marc Sánchez Artigas, for their endless patience, help, and advice throughout my training as a researcher. Without their teamwork as advisors and their open-minded view of what doing research means, I would not have been able to complete this thesis. I also thank all the people in the AST research group who shared these years with me, especially Raúl Gracia Tinedo for his help and guidance, and Gerard París for the moments we enjoyed together in the lab.

Secondly, I would like to thank Ofer Biran, Gil Vernik and Dalit Naor for accepting me as an intern in their research group at IBM Haifa Research Labs. The months I spent in Israel were not only fruitful from a professional perspective, but also very enriching from a personal viewpoint.

Last but not least, my deepest appreciation goes to my family and friends.
I would especially like to thank my parents Amado and Teresa, my brother Xavi, his wife Montse, and the newest member of my family, my nephew Aran, for their love and support.

Josep Sampé Domenech
Vilalba dels Arcs, September 2018

This work has been partially funded by the European Union Horizon 2020 Framework Programme, in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182), and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).

Abstract

In a world that is increasingly dependent on technology, digital data is generated at an unprecedented rate. As a result, companies that require large amounts of storage space, such as Netflix or Dropbox, rely on cloud storage solutions, where data is maintained, managed, and backed up remotely in an easy and inexpensive way. In particular, cloud object stores are widely adopted and increasingly used for storing these huge amounts of data, mainly thanks to their built-in characteristics, such as simplicity, scalability and high availability. Moreover, the evolution of cloud computing, for example in data analysis, makes cloud object stores an important actor in today’s cloud ecosystem.

However, cloud object stores face three main challenges: 1) Flexible management of multi-tenant workloads. Cloud object stores are commonly multi-tenant systems, meaning that all tenants share the same system resources, which can lead to interference problems. Furthermore, it is currently complex to manage heterogeneous storage policies at massive scale. 2) Data self-management. Cloud object stores themselves do not offer much flexibility regarding data self-management by tenants.
Typically, they are rigid, non-programmable systems, which prevents tenants from handling the specific requirements of their objects. 3) Elastic computation close to the data. Placing computations close to the data in the storage system can be useful to reduce data transfers. The challenge, however, is how to achieve elasticity in those computations without provoking resource contention and interference in the storage layer.

In this thesis, we present three novel research contributions that address the aforementioned challenges. Firstly, we introduce the first Software-defined Storage (SDS) architecture for cloud object stores that separates the control plane from the data plane, which allows multi-tenant workloads to be managed in a flexible and dynamic way, for example by applying different bandwidth service levels to different tenants. Secondly, we design a novel policy abstraction called microcontroller that transforms common objects into smart objects, enabling tenants to programmatically manage their behavior. For example, a content-level access control microcontroller can be attached to a specific object to filter its content depending on who is accessing it. Finally, we present the first elastic data-driven serverless computing platform that
