Advancing Compiler Optimizations for General-Purpose & Domain-Specific Parallel Architectures

ADVANCING COMPILER OPTIMIZATIONS FOR GENERAL-PURPOSE & DOMAIN-SPECIFIC PARALLEL ARCHITECTURES A Dissertation Presented to The Academic Faculty By Prasanth Chatarasi In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the School of Computer Science Georgia Institute of Technology December 2020 Copyright c Prasanth Chatarasi 2020 ADVANCING COMPILER OPTIMIZATIONS FOR GENERAL-PURPOSE & DOMAIN-SPECIFIC PARALLEL ARCHITECTURES Approved by: Dr. Vivek Sarkar, Advisor Dr. Tushar Krishna School of Computer Science School of Electrical and Computer Georgia Institute of Technology Engineering Georgia Institute of Technology Dr. Jun Shirako, Co-Advisor School of Computer Science Dr. Richard Vuduc Georgia Institute of Technology School of Computational Science and Engineering Dr. Santosh Pande Georgia Institute of Technology School of Computer Science Georgia Institute of Technology Date Approved: July 27, 2020 “Dream the impossible. Know that you are born in this world to do something wonderful and unique; don’t let this opportunity pass by. Give yourself the freedom to dream and think big.” — Sri Sri Ravi Shankar To universal consciousness, To my family, To my advisors, mentors, teachers, and friends. ACKNOWLEDGEMENTS Taittiriya Upanishad, Shikshavalli I.20 ma;a:ua;de;va;ea Ba;va ; a;pa:ua;de;va;ea Ba;va A;a;C+a.yRa;de;va;ea Ba;va A; a;ta; a;Ta;de;va;ea Ba;va m¯atrudevo bhava pitrudevo bhava ¯ach¯aryadevo bhava atithidevo bhava “Respects to Mother, Father, Guru, and Guest. They are all forms of God”. First and foremost, I would like to express my gratitude to my mother Ch. Anjanee Devi and my father Dr. C. V. Subbaiah for always being there for me, and loving me unconditionally throughout the situations of extreme happiness to sadness. Without the two of you, I don’t know where I would be. If I have learned anything while being away from you, it is that you are the most important people in my life, and I love you both more than anything. I want to express my sincere appreciation to my guru (advisor) Prof. Vivek Sarkar, who always had time for me when I needed him – no matter whether the reason was a technical discussion, an administrative problem, academic development, or the planning of my next steps. There are no words that can express my gratitude towards your efforts. Thank you for believing in me and supporting me. Without you, I would not be able to handle everything that the graduate program and life threw. Having you backing me up 100% allows me to be at peace and do research. You’re much more than an advisor, and thanks for helping me in my personal life too. Thanks for being my inspiration in the path of being a useful scientific researcher to this world. I am also very grateful to my co-advisor Dr. Jun Shirako, with whom I had many very fruitful discussions on various topics of this thesis. Thank you for always being supportive, even when I feel like I can’t do it. Thank you for being a humble teacher in explaining the answers to my questions – Arigatou Gozaimasu! The lunch sessions were great to discuss my ideas and build a relationship with you. I want to thank the rest of my committee Prof. Tushar Krishna, Prof. Santosh Pande, and Prof. Rich Vuduc for agreeing to be part of my thesis committee. I thank you all for your time, feedback, and suggestions in significantly improving the thesis. I always had a great time interacting with you at the school corridors, labs, and conferences. I am very fortunate to collaborate with Tushar Krishna’s group, and thanks, Tushar, for being an integral part of my Ph.D. journey, including the job search. I’m also very grateful to my collaborators, i.e., Albert Cohen from INRIA/Google AI, Tushar Krishna and Hyoukjun Kwon from Synergy Research Group, Angshuman Parashar and Micheal Pellauer from NVIDIA Architecture Research Group, Stephen Neuendorf- v fer, Samuel Bayliss, and Kees Vissers from Xilinx Research Labs, Karthik Murthy from Google AI, and John Mellor-Crummey from Rice University. Thanks for enriching my research experience in the Ph.D. journey, including asking the right questions, thinking critically, positioning the work, writing research papers, giving a broader perspective of the researcher, and so on. I also want to acknowledge the Habanero Extreme Scale Software Research Group, Synergy Research Group, and CRNCH Research Center (Jeffery Young and Jason Riedy) for the research interactions in my Ph.D. journey. In addition to the collaborators mentioned above, I also want to acknowledge and ap- preciate my mentors, especially Dr. Milind Chabbi, Dr. Shams Imam, Dr. Deepak Majeti, and Dr. Rishi Surendran for all the helpful advice in both research and personal life. I cannot thank you enough for everything you taught me during my Ph.D. journey. I greatly value your kindness and the expertise you imparted to me as my mentors. I want to acknowledge my professor Prof. Kesav Nori for piquing my curiosity about compilers during my undergraduate study at IIT Hyderabad. Without you, I would not be where I am today. Also, thanks are due to my bachelor thesis advisors Dr. Aditya Nori, Prof. M. V. Panduranga Rao and Prof. Bheemarjuna Reddy for giving me a research exposure in the undergraduate study. Thanks to all my professors who encouraged me to apply for graduate studies and quickly provide reference letters. Thanks are due to my friends for all the support and encouragement during my stay at Georgia Tech. There are too many of you to mention, but I would especially like to thank Prithayan Barua, Pramod Chunduri, Poulami Das, Anirudh Jain, Hyoukjun Kwon, Ankush Mandal, Girish Mururu, Mayank Parasar, Sriraj Paul, Christopher Porter, Gururaj Saileshwar, Anand Samajdar, Srisehan Srikanth, Kavitha Sthanam, Lechen Yu, and Jimmy Zhong. Thanks for always being up for a good laugh over the years. I would also like to thank the School of Computer Science department staff members for all the help I received during my stay at GaTech, including Ruthie Book and Wanda Purinton. Also, I want to acknowledge Sharon Riehl for being the coach in my leadership development as part of the Georgia Tech Leading Edge program and helping me to self-analyze and improve in many aspects, including establishing trust and taking and giving construc- tive feedback, and prioritization. Furthermore, an essential aspect of my Ph.D. journey is the association with the SKY meditation club at GaTech and also the Art of Living founda- tion. The association helped me much towards the spiritual path and also managing of my mental health – in that regards, I want to acknowledge many people, especially my teachers Sareena Nagpal, Rachana Gadhok, Kunwar Gadhok, Preeti Bhatt. The acknowledgments never end without mentioning siblings, i.e., my elder sister Sree Pavani and my elder brother Sreenivas. First and for most, life may not be that exciting vi without you. We may fight 50% of the time, but I love you a lot for being my best friends, toughening up during tough situations of my life, and celebrating with me during happy moments. As per a Vietnamese proverb, you both are as close as my hands and feet. Also, my life has been more beautiful with my two nieces, i.e., Venkata Tejaswini and Sai Sughandhini – I always felt content hearing the word “mamayya” from them. “Gratitude to the entire creation for showering a lot of love, grace, and blessings!”. vii TABLE OF CONTENTS List of Tables . xiv List of Figures . xvii Summary . xxii Chapter 1: Introduction . 1 1.1 Thesis Statement . 4 1.2 Contributions . 4 1.3 Organization . 8 Chapter 2: Background . 10 2.1 Parallel Architectures . 10 2.1.1 General-Purpose Parallel Architectures . 10 Multi-Core Architectures . 10 Vector Processing (SIMD) Architectures . 12 2.1.2 Domain-Specific Parallel Architectures . 13 Spatial DNN Accelerators . 13 Specialized Vector Processing Units . 14 Thread Migratory Architectures . 15 viii 2.2 High-Performance Applications . 17 2.2.1 Scientific Computing Applications . 17 2.2.2 Deep Learning . 17 2.2.3 Graph Analytics . 19 2.3 Summary . 19 Chapter 3: PoPP: Polyhedral Optimizations of Explicitly-Parallel Programs . 20 3.1 Abstract . 20 3.2 Introduction . 21 3.3 Background . 22 3.3.1 Polyhedral Model . 22 3.3.2 Explicit Parallelism . 23 Loop-level parallelism . 24 Task-level parallelism including dependences . 24 3.4 Motivating Examples . 24 3.4.1 2-D Jacobi . 25 3.4.2 Particle Filter . 26 3.5 Polyhedral optimizer for Parallel Programs (PoPP) . 26 3.5.1 Conservative analysis . 28 3.5.2 Extraction of happens-before relations . 30 Loop-level parallelism . 30 Task parallelism including dependences . 31 3.5.3 Reflection of happens-before relations . 33 ix 3.5.4 PolyAST: a loop optimizer integrating polyhedral and AST-based transformations . 34 3.6 Experimental Evaluation . 35 3.6.1 Experimental setup . 35 3.6.2 KASTORS Suite . 39 3.6.3 Rodinia Suite . 40 3.7 Limitations . 41 3.8 Related Work . 42 3.9 Summary . 43 Chapter 4: PolySIMD: A Unified Approach to Variable Renaming for Enhanced Vectorization . 45 4.1 Abstract . 45 4.2 Introduction . 46 4.3 Discussion on Variable Renaming Transformations . 48 4.3.1 Source Variable Renaming (SoVR) . 49 4.3.2 Sink Variable Renaming (SiVR) . 50 4.3.3 Synergy between SoVR and SiVR . 51 4.4 Motivating Example . 51 4.5 Our Unified Approach to Variable Renaming . 53 4.5.1 Dependence Cycles Finder . 53 4.5.2 Bipartite Graph Constructor . 55 4.5.3 Solver . 55 4.5.4 Transformer . 57 x 4.5.5 Bounding Additional Space .

Advancing Compiler Optimizations for General-Purpose & Domain-Specific Parallel Architectures

SOL: Effortless Device Support for AI Frameworks Without Source Code Changes

Training Professionals and Researchers on Future Intelligent On-Board Embedded Heterogeneous Processing

Plaidml Documentation

Machine Learning Framework for Petrochemical Process Industry

Fast Geometric Learning with Symbolic Matrices

Triton: an Intermediate Language and Compiler for Tiled Neural Network Computations

Benchmarking Contemporary Deep Learning Hardware and Frameworks: a Survey of Qualitative Metrics Wei Dai, Daniel Berleant

Benchmarking Contemporary Deep Learning Hardware And

Stripe: Tensor Compilation Via the Nested Polyhedral Model

Comparing the Costs of Abstraction for DL Frameworks

Towards Transparent Neural Network Acceleration

Package 'Keras'