Týr: Storage-Based HPC and Big Data Convergence Using Transactional Blobs


Doctoral Program in Artificial Intelligence
Escuela Técnica Superior de Ingenieros Informáticos

PhD Thesis

Týr: Storage-Based HPC and Big Data Convergence Using Transactional Blobs

Author: Pierre Matri
Supervisors: Prof. Dr. María S. Pérez Hernández, Prof. Dr. Gabriel Antoniu

June, 2018

Committee appointed by the Rector of the Universidad Politécnica de Madrid on 11 June 2018.

President: Dr. Antonio Pérez Ambite
Member: Dr. Jesús Carretero Pérez
Member: Dr. Antonio Cortés Roselló
Member: Dr. Christian Pérez
Secretary: Dr. Alberto Sánchez Campos
Substitute: Dr. Antonio García Dopico
Substitute: Dr. Alejandro Calderón Mateos

Thesis defense and reading held on 11 June 2018 at the Escuela Técnica Superior de Ingenieros Informáticos.

Grade:

The President, Member 1, Member 2, Member 3, The Secretary

To Dimitris.
To my family, who make it all worthwhile.

Acknowledgments

My very first comments are directed to my three tremendous advisors, María, Alexandru and Gabriel. You have set an example of excellence as researchers, mentors, instructors, and tactful moderators of my wildest ideas. Needless to say, this thesis would not have been possible without your continuous support, guidance and wisdom.

I cannot thank my fellow Ph.D. students from the BigStorage project enough. I cannot express how grateful I am to you for having been awesome colleagues, and above all awesome friends. Thank you all. Álvaro, Danilo, Dimitris, Eugene, Georgios, Fotis(es), Linh, Michal, Nafiseh, Ovidiu, Thanos, Umar, Yacine, you all helped in your own way to make these three years an unforgettable experience, and so contributed significantly to this thesis.

Many thanks to the members of the Ontology Engineering Group, who were here to welcome me, share support, bring tasty cakes, and force me to learn Spanish all along the way.
For all this, thank you Ahmad, Daniel, Freddy, Idafen, Julia, María, Miguel Ángel, Nelson, Pablo, Raúl, and everybody I don't have the space to name. A special thanks goes to Jose Ángel and Ana for helping me through the endless administrative labyrinth.

Similar thanks go to the tremendous KerData team, with whom I first started research in Rennes and with whom I am finishing this manuscript. Luis, Hadi, Nathanaël, Matthieu, Radu, Luc: you definitely contributed to this thesis, each in your own way, around lunches, coffee and team meetings. I hereby bequeath my beloved coffee machine to the team. Thanks to Aurélie for her administrative support, which prevented premature hair loss.

I also want to thank Rob Ross and Phil Carns for warmly welcoming me at Argonne National Laboratory during three summer months. Besides enjoying a fantastic research environment, I could also share precious time with awesome people whom I simply cannot forget to acknowledge here.

One paragraph, originally written in French, is for my family. I of course want to thank my family for their unconditional support, which has always allowed me to devote myself to making my dreams come true. Because even from afar you are always there, you have stood by me all along, and you have always believed in me. Thank you Mom, Dad. Thank you Céline, François and Caroline. Thank you Valéry, Marie and Arthur. Thank you Sarah. Thank you Gabriel.

[Figure 1: Empirically determined productivity estimation relative to the available coffee supply around the office, with 95% confidence intervals. Vertical axis: productivity (words/min).]

I need to dedicate a paragraph to all those who, willingly or unwillingly, contributed to this thesis. In no particular order, thanks to the reviewers of VLDB, CCGrid and Cluster who, by rejecting my contributions, pushed me to do better. Thanks to Iberia, whose countless delays developed my ninja writing skills in airports around the world.
Thanks to the entire worldwide coffee supply chain, whose hard work and effective caffeine sustained my productivity all along the way, as plotted in Figure 1.

I want to finish with a very special mention for Dimitris, who left us way too soon, way too young, and to whom I dedicate this thesis. Dimitris, I could not thank you enough for being one of the most inspiring and cheerful souls I have ever known. I could not thank you enough for your courage and inspiration. I could not thank you enough for the laughter and the beers. I could not thank you enough for being you. I simply could not thank you enough. May you rest in peace, with all my love, my wishes and my thoughts.

Abstract

The ever-growing data sets processed on HPC platforms raise major challenges for the underlying storage layer. Simpler blobs (binary large objects) are a promising alternative to traditional file-based storage systems. They offer lower overhead and better performance at the cost of largely unused features such as file hierarchies or permissions. In a similar fashion, blobs are increasingly considered for replacing distributed file systems for Big Data Analytics (BDA) or as a base for storage abstractions like key-value stores or time-series databases. Based on these observations, we argue that blobs provide a solid storage model for convergence between HPC and BDA platforms. We identify data consistency as a hard problem to solve in this context because of the different choices made by the two communities: while BDA developers typically rely on the storage system to provide data access coordination, the lack of such semantics on HPC platforms requires developers to use application-level tools for this task. In this thesis we propose the key design principles of Týr, a converging storage system designed to answer the needs of both HPC and BDA applications, natively offering data access coordination in the form of transactions.
We demonstrate the relevance and efficiency of its design for convergence in multiple application contexts from both communities. These experiments validate that Týr delivers on its promise of high throughput and versatility, hence fueling storage-based convergence between HPC and BDA.

Summary

The growing amount of data processed on HPC platforms poses a challenge for the underlying storage system. A promising alternative to file-based storage systems is the use of blobs (Binary Large OBjects). This alternative offers lower overhead and better performance in exchange for removing typical file system features such as directory hierarchies or permissions. Similarly, blobs can be used to replace file systems in the area of Big Data Analytics (BDA) or as a base for other storage abstractions, such as key-value or time-series databases. From these observations, we can conclude that blobs are a solid storage model for achieving convergence between HPC and BDA platforms. In this context, one of the critical problems to solve is data consistency, owing to the different choices made by each community: while BDA developers usually delegate the responsibility for coordinating data access to the storage system, the lack of that capability on HPC platforms requires developers to use application-level tools for this task. This thesis proposes the main design principles of Týr, a converging storage system designed to meet the needs of HPC and BDA applications, natively offering data access coordination in the form of transactions. The thesis demonstrates the relevance and efficiency of this design applied to multiple scenarios from both fields. The experiments carried out show the performance and versatility offered by Týr, which gives an important boost toward the desired convergence between HPC and BDA.

Contents

1 Introduction
  1.1 Objectives
  1.2 Contributions
  1.3 Publications
  1.4 Organization of the manuscript

I Context: distributed storage, consistency for HPC and Big Data

2 HPC and BDA: Divergent stacks, convergent storage needs
  2.1 Comparative overview of HPC and BDA stacks
  2.2 HPC: Centralized, file-based storage
  2.3 BDA: Modular, application-purpose storage
  2.4 Conclusions: challenges of storage convergence between HPC and BDA

3 Distributed storage paradigms
  3.1 Distributed file systems
  3.2 Key-value stores
  3.3 Document, columnar, graph databases
  3.4 Blob storage
  3.5 Conclusions: which storage paradigm for converging storage?

4 Consistency management
  4.1 Storage consistency models
  4.2 Application-specific consistency management
  4.3 Transactional semantics
  4.4 CAP: Consistency, Availability, Partition tolerance
  4.5 Conclusions: which consistency model for converging storage?

II Týr: a transactional, scalable data storage system

5 Blobs as storage model for convergence
  5.1 General overview, intuition and methodology
  5.2 Storage call distribution for HPC and BDA applications
  5.3 Replacing file-based by blob-based storage
  5.4 Conclusion: blob is the right storage paradigm for convergence

6 Týr design principles and architecture
  6.1 Key design principles
  6.2 Predictable data distribution
  6.3 Transparent multi-version concurrency control
  6.4 ACID transactional semantics
  6.5 Atomic transform operations

7 Týr protocols
  7.1 Lightweight transaction protocol
  7.2 Handling reads: direct, multi-chunk and transactional protocols
  7.3 Handling writes: transactional protocol, atomic transforms
