User-Defined Execution Relaxations for Enhanced Programmability in High-Performance Parallel Computing

UNIVERSIDAD COMPLUTENSE DE MADRID
FACULTAD DE INFORMÁTICA

DOCTORAL THESIS

User-defined Execution Relaxations for Enhanced Programmability in High-Performance Parallel Computing
(Relajaciones de Ejecución Definidas por el Usuario para la Mejora de la Programabilidad en Computación Paralela de Altas Prestaciones)

Dissertation submitted for the degree of Doctor in Computer Science by
Andrés Antón Rey Villaverde

Advisors: Francisco Daniel Igual Peña and Manuel Prieto Matías

Madrid, 2019
© Andrés Antón Rey Villaverde, 2019

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

I hereby declare that all the content presented in this thesis, entitled “User-defined Execution Relaxations for Enhanced Programmability in High-Performance Parallel Computing”, has been developed by me, and that all other content has been appropriately referenced.
Andrés Antón Rey Villaverde

This work has been supported by the Spanish Ministry of Innovation, Science and Universities under grants TIN 2015-65277-R, RTI2018-093684-B-I00 and BES-2016-076806, and by the Government of Madrid under grant S2018/TCS-4423. The associated research internships have been supported by the Erasmus+ International Programme and the HiPEAC Network.

To my family.

Acknowledgements

I would like to thank my advisors Fran and Manuel for helping me and counting on me during these years, and for giving me their trust, their support and the right advice at the right moments. I am also very grateful for their support during my research internships abroad, in which I learned so much, and for trusting me to explore some topics that were rather tangential to the core discipline of the thesis, but which proved crucial during the exploratory scientific process.
Thanks to Fran, especially for the valuable feedback received throughout this thesis, and for having helped me so much in achieving the thesis objectives, encouraging me to get things done at the most important moments. Thanks to Manuel, also for his accurate advice, and especially for opening the ArTeCS doors for me, both the first and the second time, for showing me his trust throughout all of these years, not only during the Ph.D. years, and for counting on me when Ph.D. sponsoring opportunities appeared.

Thanks to my family for their constant support, not only during the development of this thesis, but also in those personal decisions in which I prioritized learning, personal introspection and scientific training over other options that were presumably more ordinary, more expected and less precarious (and probably more boring). I want to thank my mother and father, from whom I learned personality traits that are very important in life and were crucial to finishing this Ph.D. degree, such as the culture of effort, the importance of education, persistence and honesty; and also for having prioritized their sons and daughter over anything else, always respecting our independence. I thank my brother for his influence on my education, and for passing his ambition on to me, which is so important for seeing the big picture and keeping the motivation alive. I also thank my sister for her support from the very beginning of these Ph.D. studies and for always passing her positivity on to me.

Thanks to Amparo, especially for supporting me and bearing with me during these last stressful months, and for sharing such amazing vacations with me.

Thanks to my Cisneros friends Javier, Juan Miguel, Antonio, Alejandro, Xabier, Pablo and Eliseo for staying loyal to our (increasingly rare) meetings and (increasingly frequent) weddings. Thanks also to my childhood friends Victoria, Manuel, Valentín, Jaime, Iago and Domingo, for being closer rather than farther after so many years.
Thanks to Agathe, for having unconditionally stayed with me at the beginning of this thesis, in both the good and the bad moments, and to Dominic, especially for those conversations (initiated at the end of the world five years ago and still ongoing), for passing his idealism and motivation on to me, and for his interest in the developments of this thesis.

Thanks to my industrial friends Ángel Rosso, Elena Saiz, Elena Garzón, Mar Robledo, Alberto Palomar and Laura Vallejo, and to my Impanati / Jamadan friends Marco, Davide and Javier, for those meetings, beers and rehearsals that helped me so much to get away from the thesis when I needed it the most. Thanks also to the Impanati guys for patiently bearing with the rehearsal interruptions during my internships abroad.

Thanks to the people in the Computer Architecture division, starting with Iñaki and Manuel, who opened the ArTeCS doors for me, and especially to all those people I had the pleasure to meet throughout these years, such as Daniel Tabas, Jorge Quintás, Roberto Cano, Luis Costero, Nacho Gómez, Javier Setoain, Edgardo Mejía, Juan Carlos Sáez, Luis Piñuel, Christian Tenllado, Fernando Castro, Guillermo Botella, Katzalin Olcoz, Rafael Sánchez, Joaquín Recas and María Guijarro.

Thanks to Jan Prins for agreeing to be my advisor in Chapel Hill, for inviting me to his home, and for our discussions and his accurate insights, which identified the limitations of my ideas and also helped me to address them. Thanks to the people I met in North Carolina, especially Joshua, Christian and Camila, for the interesting conversations and for making my internship so much fun.

Thanks to the Codeplay people, Ruyman, Marya, Peter, Marios, Alex, Gordon, Uwe and Christopher, for giving me the opportunity to work with and learn from them, and for such an excellent research experience in Edinburgh, which helped me so much to focus my further developments.
Recalling my beginnings in the worlds of computation and simulation, I want to thank the professors who helped me gain experience in numerical methods applied to computational physics. I want to thank Carlos Spa, for giving me the opportunity to work with and learn from him in Chile, and for encouraging me to start the Ph.D. studies. I also want to thank Víctor Martín, who introduced me to the fascinating world of statistical mechanics (to which I will return); and Leo González, who introduced me to computational fluid dynamics, valuing my motivation over my previous experience.

I also want to thank the best university professors I had, who reinforced my passion for learning, introduced me to the ways of science, and greatly inspired me during my studies of engineering and physics. Although several years have passed, I still have vivid memories of their lectures and of the sensations they awakened in me back then, which years later somehow guided me toward starting Ph.D. studies. They are Enrique Maciá, Estrella Alonso, Ángela Jiménez, Juan Pedro Villaluenga, Luis Garay, Felipe Llanes, José Ramón Pelaez and, again, Víctor Martín.

I want to thank the reviewers Aleksandar and Ricardo for their valuable suggestions, which, together with those of my advisors, have greatly contributed to enhancing the quality of this dissertation. Moreover, I want to thank the anonymous reviewers of the published articles, as their feedback has also guided the research conducted in this thesis.

I also want to thank, in a general sense, the developer communities that are constantly pushing computer technology to new heights in a passionate and idealistic way, by developing open source tools, contributing to programming language standardization, and producing documentation and disseminating it in open and free media.
Specifically, the developments in this thesis build to a great extent on the work of the Standard C++ committee, and some of the ideas proposed in this thesis would not have been possible without all the new functionality incorporated in modern C++ standards.
